********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/186/186.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47265                    1.0000    500  42126                    1.0000    500
51058                    1.0000    500  54730                    1.0000    500
14284                    1.0000    500  54801                    1.0000    500
47857                    1.0000    500  48134                    1.0000    500
54869                    1.0000    500  48190                    1.0000    500
48313                    1.0000    500  29758                    1.0000    500
3924                     1.0000    500  48951                    1.0000    500
49693                    1.0000    500  23489                    1.0000    500
23794                    1.0000    500  16798                    1.0000    500
23924                    1.0000    500  33120                    1.0000    500
44204                    1.0000    500  39015                    1.0000    500
48559                    1.0000    500  50469                    1.0000    500
48725                    1.0000    500  37567                    1.0000    500
47590                    1.0000    500  50237                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/186/186.seqs.fa -oc motifs/186 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       28    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           14000    N=              28
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.253 C 0.249 G 0.227 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.253 C 0.249 G 0.227 T 0.271
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  11  llr = 142  E-value = 7.0e-006
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::7:::::87:
pos.-specific     C  1:::9::::211
probability       G  5:a:1:9:a:2:
matrix            T  5a:3:a1a:::9

         bits    2.1   *     *
                 1.9  **  * **
                 1.7  **  ****
                 1.5  ** *****  *
Relative         1.3  ** ****** *
Entropy          1.1  ********* *
(18.6 bits)      0.9  ***********
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GTGACTGTGAAT
consensus            T  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
50237                       461  5.77e-08 ACGCGATACA GTGACTGTGAAT GCAGGTACAA
50469                       247  5.77e-08 CGTACGAACT GTGACTGTGAAT TTACGCCGGG
3924                        406  5.77e-08 ACCGAAGCTT GTGACTGTGAAT GTTCCTCGTG
37567                        75  1.27e-07 TGCTATTGCA TTGACTGTGAAT GCCCTATGAT
49693                       288  2.62e-07 CATTTTTTCA TTGTCTGTGAAT TGGTTGTTTC
23489                       161  4.33e-07 AAAAATAACG TTGACTGTGAGT TTAATTTGGT
51058                       314  9.76e-07 CCTTTGTTAA TTGACTGTGAAC ACCTCCGGTT
29758                       446  1.17e-06 CGTACGTATC GTGTCTGTGCAT TATAGTTATC
48190                       444  1.56e-06 GTACGCTTCG CTGACTGTGAGT CGGTACCGTT
54730                        78  4.08e-06 TCATTTACAG GTGACTTTGACT TTGAGACCAG
48725                       215  5.06e-06 CAACAACTTC TTGTGTGTGCAT GCTTTGTGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50237                             5.8e-08  460_[+1]_28
50469                             5.8e-08  246_[+1]_242
3924                              5.8e-08  405_[+1]_83
37567                             1.3e-07  74_[+1]_414
49693                             2.6e-07  287_[+1]_201
23489                             4.3e-07  160_[+1]_328
51058                             9.8e-07  313_[+1]_175
29758                             1.2e-06  445_[+1]_43
48190                             1.6e-06  443_[+1]_45
54730                             4.1e-06  77_[+1]_411
48725                             5.1e-06  214_[+1]_274
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=11
50237                    (  461) GTGACTGTGAAT  1
50469                    (  247) GTGACTGTGAAT  1
3924                     (  406) GTGACTGTGAAT  1
37567                    (   75) TTGACTGTGAAT  1
49693                    (  288) TTGTCTGTGAAT  1
23489                    (  161) TTGACTGTGAGT  1
51058                    (  314) TTGACTGTGAAC  1
29758                    (  446) GTGTCTGTGCAT  1
48190                    (  444) CTGACTGTGAGT  1
54730                    (   78) GTGACTTTGACT  1
48725                    (  215) TTGTGTGTGCAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 13692 bayes= 10.6358 E= 7.0e-006
 -1010   -145    100     75
 -1010  -1010  -1010    188
 -1010  -1010    214  -1010
   152  -1010  -1010      1
 -1010    187   -132  -1010
 -1010  -1010  -1010    188
 -1010  -1010    200   -157
 -1010  -1010  -1010    188
 -1010  -1010    214  -1010
   169    -45  -1010  -1010
   152   -145    -32  -1010
 -1010   -145  -1010    175
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 7.0e-006
 0.000000  0.090909  0.454545  0.454545
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.727273  0.000000  0.000000  0.272727
 0.000000  0.909091  0.090909  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.909091  0.090909
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.818182  0.181818  0.000000  0.000000
 0.727273  0.090909  0.181818  0.000000
 0.000000  0.090909  0.000000  0.909091
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GT]TG[AT]CTGTGAAT
--------------------------------------------------------------------------------


Time  6.47 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =  11  llr = 157  E-value = 3.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  62437614:87:92196:457
pos.-specific     C  4:43::1111:3:1314::31
probability       G  :6:2:4759137156::a61:
matrix            T  :2333:11:::::3:::::12

         bits    2.1                  *
                 1.9                  *
                 1.7         *        *
                 1.5         *   *  * *
Relative         1.3         * ***  * *
Entropy          1.1 *   **  *****  ****
(20.6 bits)      0.9 **  *** ***** ***** *
                 0.6 **  *** ***** ***** *
                 0.4 *** ********* *******
                 0.2 *** *****************
                 0.0 ---------------------

Multilevel           AGAAAAGGGAAGAGGAAGGAA
consensus            C CCTG A  GC TC C AC
sequence               TT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
37567                       267  5.05e-11 ATAGCAACGA AGATAAGAGAACAGGAAGGAA CTTCTGAGAT
50469                       440  1.16e-09 TCCCCTCTCT AGCCTAGGGAACAGCAAGGAA TACGACGCTC
23489                       255  3.55e-08 TGACGGGTGA ATCCAATGGAACATGAAGGAA AGCCTGTAGG
47265                        87  1.11e-07 GACAGGCGGA AGCATGAAGAAGACGAAGAAA ATTAGGAAAT
47857                       427  1.57e-07 CGTTTCGACG AGTCAACGGAAGAAAACGGCA GCACGACATC
42126                       112  1.57e-07 TGTTCTCTTT CAAGTAGAGAGGATCAAGAAA CTCTTCAGAG
50237                       243  2.01e-07 ACGACTTGCG AAAAAAGGGCGGAGGACGGAC ACTTCAAATA
54730                       231  3.24e-07 ACTGCGAGGA CGTTAGGCGAAGGAGACGGCA CGAACCCAAA
33120                       323  4.06e-07 CTTCTCATCC CGTAAAGGGAAGATGCAGAGT AACTCACTAC
47590                        50  4.70e-07 TCTACTTTTG CTCTAGGAGAGGAGCAAGGTT GGACTGCTTT
23794                        45  9.94e-07 GCCTCTAGCC AGAGAGGTCGAGAGGACGACA CGGTATTCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37567                               5e-11  266_[+2]_213
50469                             1.2e-09  439_[+2]_40
23489                             3.5e-08  254_[+2]_225
47265                             1.1e-07  86_[+2]_393
47857                             1.6e-07  426_[+2]_53
42126                             1.6e-07  111_[+2]_368
50237                               2e-07  242_[+2]_237
54730                             3.2e-07  230_[+2]_249
33120                             4.1e-07  322_[+2]_157
47590                             4.7e-07  49_[+2]_430
23794                             9.9e-07  44_[+2]_435
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=11
37567                    (  267) AGATAAGAGAACAGGAAGGAA  1
50469                    (  440) AGCCTAGGGAACAGCAAGGAA  1
23489                    (  255) ATCCAATGGAACATGAAGGAA  1
47265                    (   87) AGCATGAAGAAGACGAAGAAA  1
47857                    (  427) AGTCAACGGAAGAAAACGGCA  1
42126                    (  112) CAAGTAGAGAGGATCAAGAAA  1
50237                    (  243) AAAAAAGGGCGGAGGACGGAC  1
54730                    (  231) CGTTAGGCGAAGGAGACGGCA  1
33120                    (  323) CGTAAAGGGAAGATGCAGAGT  1
47590                    (   50) CTCTAGGAGAGGAGCAAGGTT  1
23794                    (   45) AGAGAGGTCGAGAGGACGACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 13440 bayes= 9.80574 E= 3.0e+002
   133     55  -1010  -1010
   -48  -1010    149    -57
    52     55  -1010      1
    11     13    -32      1
   152  -1010  -1010      1
   133  -1010     68  -1010
  -147   -145    168   -157
    52   -145    100   -157
 -1010   -145    200  -1010
   169   -145   -132  -1010
   152  -1010     26  -1010
 -1010     13    168  -1010
   184  -1010   -132  -1010
   -48   -145    100      1
  -147     13    149  -1010
   184   -145  -1010  -1010
   133     55  -1010  -1010
 -1010  -1010    214  -1010
    52  -1010    149  -1010
   111     13   -132   -157
   152   -145  -1010    -57
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 11 E= 3.0e+002
 0.636364  0.363636  0.000000  0.000000
 0.181818  0.000000  0.636364  0.181818
 0.363636  0.363636  0.000000  0.272727
 0.272727  0.272727  0.181818  0.272727
 0.727273  0.000000  0.000000  0.272727
 0.636364  0.000000  0.363636  0.000000
 0.090909  0.090909  0.727273  0.090909
 0.363636  0.090909  0.454545  0.090909
 0.000000  0.090909  0.909091  0.000000
 0.818182  0.090909  0.090909  0.000000
 0.727273  0.000000  0.272727  0.000000
 0.000000  0.272727  0.727273  0.000000
 0.909091  0.000000  0.090909  0.000000
 0.181818  0.090909  0.454545  0.272727
 0.090909  0.272727  0.636364  0.000000
 0.909091  0.090909  0.000000  0.000000
 0.636364  0.363636  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.363636  0.000000  0.636364  0.000000
 0.545455  0.272727  0.090909  0.090909
 0.727273  0.090909  0.000000  0.181818
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AC]G[ACT][ACT][AT][AG]G[GA]GA[AG][GC]A[GT][GC]A[AC]G[GA][AC]A
--------------------------------------------------------------------------------


Time 12.77 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   7  llr = 120  E-value = 5.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::166::4::::::16:a:::
pos.-specific     C  a:3443::347:67:34:93:
probability       G  :a4::7a131:713916::6a
matrix            T  ::1::::444333:::::11:

         bits    2.1  *    *             *
                 1.9 **    *          *  *
                 1.7 **    *          *  *
                 1.5 **    *       *  ** *
Relative         1.3 **   **    * **  ** *
Entropy          1.1 ** ****   ** ** *** *
(24.8 bits)      0.9 ** ****   ** ** *** *
                 0.6 ** ****   ***********
                 0.4 ** ******************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CGGAAGGATCCGCCGAGACGG
consensus              CCCC TCTTTTG CC  C
sequence                     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
50469                       210  8.79e-11 CTGGAGCTCT CGGCACGACCCGCCGAGACGG CTCAACCGTA
23794                       381  2.16e-09 GTAAGCTCTA CGGAAGGATCTTCCGGGACGG CACCGGATCG
16798                       232  2.43e-09 CAGTACGAGA CGCACGGAGGCGCCGCGACCG ACGAAACGTC
48725                        80  1.12e-08 TCTCCATTTT CGCAACGGTTTGTCGACACGG GCATACGCTC
54869                       100  1.12e-08 CGTCGCGGTC CGTACGGTGCCGGGGAGACCG AATAGGTAGT
48190                       371  2.03e-08 AAGGGTTCGT CGACAGGTTTCGTCGCCATGG CTCACCAAAA
48951                        93  5.95e-08 GCATTCCTAT CGGCCGGTCTCTCGAACACTG TCGTGACGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50469                             8.8e-11  209_[+3]_270
23794                             2.2e-09  380_[+3]_99
16798                             2.4e-09  231_[+3]_248
48725                             1.1e-08  79_[+3]_400
54869                             1.1e-08  99_[+3]_380
48190                               2e-08  370_[+3]_109
48951                               6e-08  92_[+3]_387
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=7
50469                    (  210) CGGCACGACCCGCCGAGACGG  1
23794                    (  381) CGGAAGGATCTTCCGGGACGG  1
16798                    (  232) CGCACGGAGGCGCCGCGACCG  1
48725                    (   80) CGCAACGGTTTGTCGACACGG  1
54869                    (  100) CGTACGGTGCCGGGGAGACCG  1
48190                    (  371) CGACAGGTTTCGTCGCCATGG  1
48951                    (   93) CGGCCGGTCTCTCGAACACTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 13440 bayes= 11.5121 E= 5.1e+002
  -945    200   -945   -945
  -945   -945    214   -945
   -82     20     91    -92
   117     78   -945   -945
   117     78   -945   -945
  -945     20    165   -945
  -945   -945    214   -945
    76   -945    -67     66
  -945     20     33     66
  -945     78    -67     66
  -945    152   -945      8
  -945   -945    165      8
  -945    120    -67      8
  -945    152     33   -945
   -82   -945    191   -945
   117     20    -67   -945
  -945     78    133   -945
   198   -945   -945   -945
  -945    178   -945    -92
  -945     20    133    -92
  -945   -945    214   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 5.1e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.285714  0.428571  0.142857
 0.571429  0.428571  0.000000  0.000000
 0.571429  0.428571  0.000000  0.000000
 0.000000  0.285714  0.714286  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.000000  0.142857  0.428571
 0.000000  0.285714  0.285714  0.428571
 0.000000  0.428571  0.142857  0.428571
 0.000000  0.714286  0.000000  0.285714
 0.000000  0.000000  0.714286  0.285714
 0.000000  0.571429  0.142857  0.285714
 0.000000  0.714286  0.285714  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.571429  0.285714  0.142857  0.000000
 0.000000  0.428571  0.571429  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.285714  0.571429  0.142857
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CG[GC][AC][AC][GC]G[AT][TCG][CT][CT][GT][CT][CG]G[AC][GC]AC[GC]G
--------------------------------------------------------------------------------


Time 19.06 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47265                            7.58e-04  86_[+2(1.11e-07)]_393
42126                            4.73e-05  111_[+2(1.57e-07)]_198_\
    [+3(2.38e-05)]_149
51058                            6.53e-03  313_[+1(9.76e-07)]_175
54730                            5.62e-06  77_[+1(4.08e-06)]_141_\
    [+2(3.24e-07)]_249
14284                            1.36e-01  500
54801                            5.42e-01  500
47857                            6.96e-04  426_[+2(1.57e-07)]_53
48134                            9.86e-01  500
54869                            1.23e-04  99_[+3(1.12e-08)]_380
48190                            8.41e-07  370_[+3(2.03e-08)]_52_\
    [+1(1.56e-06)]_45
48313                            7.15e-01  500
29758                            3.15e-04  302_[+3(2.83e-05)]_122_\
    [+1(1.17e-06)]_43
3924                             6.26e-04  405_[+1(5.77e-08)]_83
48951                            3.14e-04  92_[+3(5.95e-08)]_387
49693                            4.82e-03  287_[+1(2.62e-07)]_201
23489                            7.07e-07  160_[+1(4.33e-07)]_82_\
    [+2(3.55e-08)]_36_[+1(8.21e-05)]_177
23794                            5.37e-08  44_[+2(9.94e-07)]_315_\
    [+3(2.16e-09)]_99
16798                            2.56e-05  231_[+3(2.43e-09)]_1_[+3(8.08e-05)]_\
    226
23924                            9.32e-01  500
33120                            2.65e-03  322_[+2(4.06e-07)]_157
44204                            6.93e-01  500
39015                            1.78e-01  500
48559                            8.88e-02  500
50469                            6.11e-16  209_[+3(8.79e-11)]_16_\
    [+1(5.77e-08)]_181_[+2(1.16e-09)]_40
48725                            1.07e-06  79_[+3(1.12e-08)]_114_\
    [+1(5.06e-06)]_274
37567                            1.96e-10  74_[+1(1.27e-07)]_156_\
    [+1(1.72e-05)]_12_[+2(5.05e-11)]_213
47590                            2.27e-03  49_[+2(4.70e-07)]_430
50237                            4.04e-07  242_[+2(2.01e-07)]_197_\
    [+1(5.77e-08)]_28
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************