********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/284/284.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
48025                    1.0000    500  54842                    1.0000    500
42255                    1.0000    500  44208                    1.0000    500
44360                    1.0000    500  34584                    1.0000    500
44392                    1.0000    500  33032                    1.0000    500
40264                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/284/284.seqs.fa -oc motifs/284 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.262 C 0.243 G 0.221 T 0.274
Background letter frequencies (from dataset with add-one prior applied):
A 0.262 C 0.243 G 0.221 T 0.274
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  12  sites =   9  llr = 94  E-value = 1.4e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a1:13:1:::86
pos.-specific     C  :91:19::1::1
probability       G  :::73::8:a:1
matrix            T  ::9221929:22

         bits    2.2          *
                 2.0 *        *
                 1.7 *        *
                 1.5 **   *   *
Relative         1.3 ***  *****
Entropy          1.1 ***  ******
(15.1 bits)      0.9 **** ******
                 0.7 **** ******
                 0.4 **** ******
                 0.2 ************
                 0.0 ------------

Multilevel           ACTGACTGTGAA
consensus               TG  T  TT
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
42255                        11  1.14e-07 CGCATGAACA ACTGACTGTGAA TGCGACCACA
40264                        10  9.68e-07  TTTGCGCTT ACTGTCTGTGAT AAGCAATCGA
33032                       169  1.87e-06 ATAATAATGA ACCGACTGTGAA ATCAGAATCG
34584                        80  3.04e-06 GATGTGCTCG ACTTGCTGTGTA AGTTCCATCC
54842                       159  4.86e-06 TATGGATTCC ACTGGCTTTGAC ACAACAGTCA
44392                       263  8.90e-06 ATACATCACA ACTGACAGTGAG CGAAAATAGC
44208                       152  8.90e-06 TAGTAGTGTC ACTTCCTGTGAT AACTATTCAC
44360                        14  4.26e-05 TTCGCGCCGT ACTAGTTTTGAA GTCAATTCCA
48025                        12  5.62e-05 ATGGTGTTCG AATGTCTGCGTA TGTTGCTTGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42255                             1.1e-07  10_[+1]_478
40264                             9.7e-07  9_[+1]_479
33032                             1.9e-06  168_[+1]_320
34584                               3e-06  79_[+1]_409
54842                             4.9e-06  158_[+1]_330
44392                             8.9e-06  262_[+1]_226
44208                             8.9e-06  151_[+1]_337
44360                             4.3e-05  13_[+1]_475
48025                             5.6e-05  11_[+1]_477
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
42255                    (   11) ACTGACTGTGAA  1
40264                    (   10) ACTGTCTGTGAT  1
33032                    (  169) ACCGACTGTGAA  1
34584                    (   80) ACTTGCTGTGTA  1
54842                    (  159) ACTGGCTTTGAC  1
44392                    (  263) ACTGACAGTGAG  1
44208                    (  152) ACTTCCTGTGAT  1
44360                    (   14) ACTAGTTTTGAA  1
48025                    (   12) AATGTCTGCGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 9.0653 E= 1.4e+001
   193   -982   -982   -982
  -123    187   -982   -982
  -982   -113   -982    170
  -123   -982    159    -30
    35   -113     59    -30
  -982    187   -982   -130
  -123   -982   -982    170
  -982   -982    182    -30
  -982   -113   -982    170
  -982   -982    218   -982
   157   -982   -982    -30
   108   -113    -99    -30
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 1.4e+001
 1.000000  0.000000  0.000000  0.000000
 0.111111  0.888889  0.000000  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.111111  0.000000  0.666667  0.222222
 0.333333  0.111111  0.333333  0.222222
 0.000000  0.888889  0.000000  0.111111
 0.111111  0.000000  0.000000  0.888889
 0.000000  0.000000  0.777778  0.222222
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.000000  1.000000  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.555556  0.111111  0.111111  0.222222
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
ACT[GT][AGT]CT[GT]TG[AT][AT]
--------------------------------------------------------------------------------


Time 0.75 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  15  sites =   6  llr = 85  E-value = 6.7e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  aa2:22a33:::a7:
pos.-specific     C  ::8728:7:::8::5
probability       G  ::::2:::2a:2:2:
matrix            T  :::35:::5:a::25

         bits    2.2          *
                 2.0 **    *  ** *
                 1.7 **    *  ** *
                 1.5 **    *  ** *
Relative         1.3 ***  **  ****
Entropy          1.1 **** *** ****
(20.4 bits)      0.9 **** *** **** *
                 0.7 **** *** ******
                 0.4 **** **********
                 0.2 ***************
                 0.0 ---------------

Multilevel           AACCTCACTGTCAAC
consensus               T   AA     T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
42255                        99  2.46e-09 CGCAAGGATG AACCTCACTGTCAAT AGTAGAGTTC
44392                       443  1.21e-08 AAAAATAGCC AACTTCACTGTCAAT ACTGCTGTTT
34584                       169  1.95e-07 AAGTACACAC AACCAAACTGTCAAC TTCGTGTTAG
54842                       304  3.86e-07 TTCGTATGAA AAACTCACAGTCATC GTAAGAGGCC
40264                       300  4.47e-07 ATTTTACATA AACTGCAAGGTCAAT ACAGGTTTGC
33032                       341  1.08e-06 GTTGCACGAC AACCCCAAAGTGAGC ATGTTGATCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42255                             2.5e-09  98_[+2]_387
44392                             1.2e-08  442_[+2]_43
34584                               2e-07  168_[+2]_317
54842                             3.9e-07  303_[+2]_182
40264                             4.5e-07  299_[+2]_186
33032                             1.1e-06  340_[+2]_145
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=6
42255                    (   99) AACCTCACTGTCAAT  1
44392                    (  443) AACTTCACTGTCAAT  1
34584                    (  169) AACCAAACTGTCAAC  1
54842                    (  304) AAACTCACAGTCATC  1
40264                    (  300) AACTGCAAGGTCAAT  1
33032                    (  341) AACCCCAAAGTGAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4374 bayes= 9.95578 E= 6.7e+001
   193   -923   -923   -923
   193   -923   -923   -923
   -65    177   -923   -923
  -923    145   -923     28
   -65    -54    -40     87
   -65    177   -923   -923
   193   -923   -923   -923
    35    145   -923   -923
    35   -923    -40     87
  -923   -923    218   -923
  -923   -923   -923    186
  -923    177    -40   -923
   193   -923   -923   -923
   135   -923    -40    -72
  -923    104   -923     87
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 6.7e+001
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.166667  0.166667  0.166667  0.500000
 0.166667  0.833333  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.333333  0.000000  0.166667  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.833333  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.000000  0.166667  0.166667
 0.000000  0.500000  0.000000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
AAC[CT]TCA[CA][TA]GTCAA[CT]
--------------------------------------------------------------------------------


Time 1.43 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  14  sites =   7  llr = 86  E-value = 5.0e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1a9a1:17:4::1:
pos.-specific     C  ::1:4:3::191::
probability       G  9:::1a:3a3:13:
matrix            T  ::::3:6::1176a

         bits    2.2      *  *
                 2.0  * * *  *    *
                 1.7  * * *  *    *
                 1.5 ** * *  * *  *
Relative         1.3 **** *  * *  *
Entropy          1.1 **** * ** *  *
(17.8 bits)      0.9 **** * ** ** *
                 0.7 **** **** ****
                 0.4 **** **** ****
                 0.2 **************
                 0.0 --------------

Multilevel           GAAACGTAGACTTT
consensus                T CG G  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
48025                       335  1.68e-08 GATACGCGAC GAAACGTAGACTGT CCTTATCCTA
44208                       384  9.63e-08 GAAGCGAAGA GAAACGTGGACTGT ATCCGACCCT
44360                        39  4.31e-07 AATTCCACCG GAAAGGAAGACTTT CCTTTCCTCT
44392                       163  7.21e-07 CTGTCAATGA GAAAAGTAGGCTAT GTTCTTTCCC
42255                       469  2.29e-06 TTCTCGGCGG GAAATGTGGTCGTT CCTGGATTGA
33032                        52  3.87e-06 AGAGGTAAGG GACATGCAGGCCTT TTTATTGTAT
54842                       108  7.17e-06 ACCCTTTTAC AAAACGCAGCTTTT CTCTTACGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48025                             1.7e-08  334_[+3]_152
44208                             9.6e-08  383_[+3]_103
44360                             4.3e-07  38_[+3]_448
44392                             7.2e-07  162_[+3]_324
42255                             2.3e-06  468_[+3]_18
33032                             3.9e-06  51_[+3]_435
54842                             7.2e-06  107_[+3]_379
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=14 seqs=7
48025                    (  335) GAAACGTAGACTGT  1
44208                    (  384) GAAACGTGGACTGT  1
44360                    (   39) GAAAGGAAGACTTT  1
44392                    (  163) GAAAAGTAGGCTAT  1
42255                    (  469) GAAATGTGGTCGTT  1
33032                    (   52) GACATGCAGGCCTT  1
54842                    (  108) AAAACGCAGCTTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 4383 bayes= 9.89455 E= 5.0e+001
   -87   -945    196   -945
   193   -945   -945   -945
   171    -77   -945   -945
   193   -945   -945   -945
   -87     82    -63      6
  -945   -945    218   -945
   -87     23   -945    106
   145   -945     37   -945
  -945   -945    218   -945
    71    -77     37    -94
  -945    182   -945    -94
  -945    -77    -63    138
   -87   -945     37    106
  -945   -945   -945    187
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 7 E= 5.0e+001
 0.142857  0.000000  0.857143  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.857143  0.142857  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.428571  0.142857  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.285714  0.000000  0.571429
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.142857  0.285714  0.142857
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.142857  0.142857  0.714286
 0.142857  0.000000  0.285714  0.571429
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
GAAA[CT]G[TC][AG]G[AG]CT[TG]T
--------------------------------------------------------------------------------


Time 2.10 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48025                            1.97e-05  11_[+1(5.62e-05)]_311_\
    [+3(1.68e-08)]_152
54842                            3.52e-07  107_[+3(7.17e-06)]_37_\
    [+1(4.86e-06)]_133_[+2(3.86e-07)]_109_[+2(4.82e-05)]_58
42255                            3.63e-11  10_[+1(1.14e-07)]_76_[+2(2.46e-09)]_\
    355_[+3(2.29e-06)]_18
44208                            2.50e-05  151_[+1(8.90e-06)]_220_\
    [+3(9.63e-08)]_103
44360                            2.35e-04  13_[+1(4.26e-05)]_13_[+3(4.31e-07)]_\
    448
34584                            1.98e-05  79_[+1(3.04e-06)]_77_[+2(1.95e-07)]_\
    139_[+2(5.68e-05)]_163
44392                            3.14e-09  162_[+3(7.21e-07)]_86_\
    [+1(8.90e-06)]_168_[+2(1.21e-08)]_43
33032                            2.16e-07  51_[+3(3.87e-06)]_31_[+1(2.38e-05)]_\
    60_[+1(1.87e-06)]_160_[+2(1.08e-06)]_145
40264                            1.05e-05  9_[+1(9.68e-07)]_278_[+2(4.47e-07)]_\
    186
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************