********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/163/163.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
24514                    1.0000    500  36566                    1.0000    500
36668                    1.0000    500  13458                    1.0000    500
39174                    1.0000    500  48702                    1.0000    500
10494                    1.0000    500  41476                    1.0000    500
11795                    1.0000    500  27286                    1.0000    500
31834                    1.0000    500  32027                    1.0000    500
47085                    1.0000    500  49935                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/163/163.seqs.fa -oc motifs/163 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.248 C 0.260 G 0.241 T 0.251
Background letter frequencies (from dataset with add-one prior applied):
A 0.248 C 0.260 G 0.241 T 0.251
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 14 llr = 127 E-value = 7.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  4::6:1:::346
pos.-specific     C  2:1:9:::1231
probability       G  1191::8:941:
matrix            T  29:3192a:113

         bits    2.1        *
                 1.8        *
                 1.6  ** *  **
                 1.4  ** ** **
Relative         1.2  ** *****
Entropy          1.0  ** *****
(13.0 bits)      0.8  ** *****  *
                 0.6  ********  *
                 0.4  ********  *
                 0.2 ********* **
                 0.0 ------------

Multilevel           ATGACTGTGGAA
consensus            C  T  T  ACT
sequence             T        C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
11795                       216  5.25e-08 TCCGGCGGGG ATGACTGTGGAA TCCCTCGCCA
47085                       142  1.09e-06 CCTCTGATTC ATGACTGTGAAT ACTATTGTAT
27286                       335  3.70e-06 ACTTGCGGGT CTGACTGTGAAT CCTTCACGAC
13458                       315  1.05e-05 AAAGGAAGCC ATGACTGTGCTT TTGACAGCGT
36668                         6  1.43e-05      TAAGC GTGTCTGTGGGA AGGAAACATA
39174                       423  1.88e-05 GGCCAATGGA TTGACAGTGACA TTTTTCCAAG
36566                        99  2.06e-05 ATCTTCCGTC GTGTCTTTGGAA CCAGGCGGCG
41476                       274  3.98e-05 CTCCTTGTGA CTGACAGTGTCA CTTACAACCC
32027                       177  5.34e-05 CGTCCGGCGG CGGACTGTGTAA CTTGGCGACC
48702                       321  6.69e-05 TTCTTCGGAC ATGTCTGTCGGA GACAGATTGT
24514                       415  7.15e-05 AATAGAGGCA TTGGCTGTGAAC ACTGTCACTT
31834                       392  8.27e-05 TCAATAACCC TTGTCTTTGCCT CTAACACTTG
49935                        47  1.06e-04 AATGTTTTAC ATCACTTTGCCA AATTTATCCC
10494                       377  1.06e-04 CATGGTAAGT ATGGTTGTGGTA GAGGGAATGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11795                             5.2e-08  215_[+1]_273
47085                             1.1e-06  141_[+1]_347
27286                             3.7e-06  334_[+1]_154
13458                               1e-05  314_[+1]_174
36668                             1.4e-05  5_[+1]_483
39174                             1.9e-05  422_[+1]_66
36566                             2.1e-05  98_[+1]_390
41476                               4e-05  273_[+1]_215
32027                             5.3e-05  176_[+1]_312
48702                             6.7e-05  320_[+1]_168
24514                             7.2e-05  414_[+1]_74
31834                             8.3e-05  391_[+1]_97
49935                             0.00011  46_[+1]_442
10494                             0.00011  376_[+1]_112
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=14
11795                    (  216) ATGACTGTGGAA  1
47085                    (  142) ATGACTGTGAAT  1
27286                    (  335) CTGACTGTGAAT  1
13458                    (  315) ATGACTGTGCTT  1
36668                    (    6) GTGTCTGTGGGA  1
39174                    (  423) TTGACAGTGACA  1
36566                    (   99) GTGTCTTTGGAA  1
41476                    (  274) CTGACAGTGTCA  1
32027                    (  177) CGGACTGTGTAA  1
48702                    (  321) ATGTCTGTCGGA  1
24514                    (  415) TTGGCTGTGAAC  1
31834                    (  392) TTGTCTTTGCCT  1
49935                    (   47) ATCACTTTGCCA  1
10494                    (  377) ATGGTTGTGGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 8.93074 E= 7.0e+001
    79    -28    -75    -23
 -1045  -1045   -175    189
 -1045   -186    195  -1045
   120  -1045    -75     19
 -1045    184  -1045   -181
   -80  -1045  -1045    177
 -1045  -1045    170    -23
 -1045  -1045  -1045    200
 -1045   -186    195  -1045
    20    -28     57    -81
    79     14    -75    -81
   137   -186  -1045     19
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 7.0e+001
 0.428571  0.214286  0.142857  0.214286
 0.000000  0.000000  0.071429  0.928571
 0.000000  0.071429  0.928571  0.000000
 0.571429  0.000000  0.142857  0.285714
 0.000000  0.928571  0.000000  0.071429
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  0.785714  0.214286
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.071429  0.928571  0.000000
 0.285714  0.214286  0.357143  0.142857
 0.428571  0.285714  0.142857  0.142857
 0.642857  0.071429  0.000000  0.285714
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[ACT]TG[AT]CT[GT]TG[GAC][AC][AT]
--------------------------------------------------------------------------------

Time 1.69 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 9 llr = 99 E-value = 9.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  88:9987428aa
pos.-specific     C  :221:::2:2::
probability       G  2:8::2:28:::
matrix            T  ::::1:31::::

         bits    2.1           **
                 1.8           **
                 1.6           **
                 1.4    **     **
Relative         1.2 ******  ****
Entropy          1.0 ******* ****
(15.8 bits)      0.8 ******* ****
                 0.6 ******* ****
                 0.4 ******* ****
                 0.2 ************
                 0.0 ------------

Multilevel           AAGAAAAAGAAA
consensus            GCC  GTCAC
sequence                    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
27286                       294  2.06e-07 GTTCTCGGTC AAGAAATAGAAA CATGACCCGG
24514                       154  3.58e-07 TAATGCTTCC AAGAAAAAAAAA GTGTTGCAGT
47085                        66  1.30e-06 TTTACCTTGG AAGAAGACGAAA TCGCGAGCGA
32027                       449  2.42e-06 GGCCCATTGT ACGAAAAAAAAA CGAACATCCA
49935                       393  2.90e-06 CAGCAGCATC AACAAAAAGCAA CCGCTATCTC
13458                       427  3.38e-06 CCAGACAAGG GAGAAATCGAAA CTTCGATACC
10494                        64  1.01e-05 TTCATTGGTT GACAAAATGAAA AGCAGCAAAA
36566                       261  2.07e-05 TATACTTACA AAGCAATGGCAA ATATTTTTTA
36668                       330  2.73e-05 CACAAACCAA ACGATGAGGAAA AGTCCCCGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
27286                             2.1e-07  293_[+2]_195
24514                             3.6e-07  153_[+2]_335
47085                             1.3e-06  65_[+2]_423
32027                             2.4e-06  448_[+2]_40
49935                             2.9e-06  392_[+2]_96
13458                             3.4e-06  426_[+2]_62
10494                               1e-05  63_[+2]_425
36566                             2.1e-05  260_[+2]_228
36668                             2.7e-05  329_[+2]_159
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=9
27286                    (  294) AAGAAATAGAAA  1
24514                    (  154) AAGAAAAAAAAA  1
47085                    (   66) AAGAAGACGAAA  1
32027                    (  449) ACGAAAAAAAAA  1
49935                    (  393) AACAAAAAGCAA  1
13458                    (  427) GAGAAATCGAAA  1
10494                    (   64) GACAAAATGAAA  1
36566                    (  261) AAGCAATGGCAA  1
36668                    (  330) ACGATGAGGAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.70369 E= 9.2e+002
   165   -982    -12   -982
   165    -23   -982   -982
  -982    -23    169   -982
   184   -122   -982   -982
   184   -982   -982   -117
   165   -982    -12   -982
   142   -982   -982     41
    84    -23    -12   -117
   -16   -982    169   -982
   165    -23   -982   -982
   201   -982   -982   -982
   201   -982   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 9.2e+002
 0.777778  0.000000  0.222222  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.000000  0.222222  0.777778  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.777778  0.000000  0.222222  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.444444  0.222222  0.222222  0.111111
 0.222222  0.000000  0.777778  0.000000
 0.777778  0.222222  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AG][AC][GC]AA[AG][AT][ACG][GA][AC]AA
--------------------------------------------------------------------------------

Time 3.68 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 10 llr = 106 E-value = 3.0e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  22a:3:::65::
pos.-specific     C  :1:a:::3:2:1
probability       G  ::::27:13::9
matrix            T  87::53a613a:

         bits    2.1   *   *   *
                 1.8   **  *   *
                 1.6   **  *   **
                 1.4   **  *   **
Relative         1.2 * ** **   **
Entropy          1.0 * ** **   **
(15.2 bits)      0.8 **** ** * **
                 0.6 ********* **
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTACTGTTAATG
consensus            AA  AT CGT
sequence                 G    C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
36566                       449  5.76e-08 CAGTCAAGAA TTACTGTTAATG GAACTGTAAA
24514                       339  2.29e-07 ATCGGCTTCT TTACTGTTGATG TTACCTATAA
27286                       395  1.10e-06 CACAAACGAG TTACGGTTATTG TCACAATAGT
49935                       127  6.05e-06 TTATAAGAAC TTACTTTTTATG GCGGGTAATG
48702                        20  6.70e-06 GAAGACTCCT TTACAGTTAATC ACCCAGGCAC
36668                       185  8.10e-06 GTTGTTGTTG ATACTGTTGCTG CCGGAACAGC
10494                        49  1.26e-05 ATTTGTACAG TAACTTTCATTG GTTGACAAAA
31834                        51  1.49e-05 CCAGTCAATC TCACGGTCAATG CTCTCTGGTG
41476                       435  2.32e-05 TTGTCACTAC TAACAGTCGCTG TCCGTTTAAA
11795                       143  4.01e-05 GCGACGTCGA ATACATTGATTG CGAAAAGTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36566                             5.8e-08  448_[+3]_40
24514                             2.3e-07  338_[+3]_150
27286                             1.1e-06  394_[+3]_94
49935                               6e-06  126_[+3]_362
48702                             6.7e-06  19_[+3]_469
36668                             8.1e-06  184_[+3]_304
10494                             1.3e-05  48_[+3]_440
31834                             1.5e-05  50_[+3]_438
41476                             2.3e-05  434_[+3]_54
11795                               4e-05  142_[+3]_346
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
36566                    (  449) TTACTGTTAATG  1
24514                    (  339) TTACTGTTGATG  1
27286                    (  395) TTACGGTTATTG  1
49935                    (  127) TTACTTTTTATG  1
48702                    (   20) TTACAGTTAATC  1
36668                    (  185) ATACTGTTGCTG  1
10494                    (   49) TAACTTTCATTG  1
31834                    (   51) TCACGGTCAATG  1
41476                    (  435) TAACAGTCGCTG  1
11795                    (  143) ATACATTGATTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 10.3614 E= 3.0e+003
   -31   -997   -997    167
   -31   -138   -997    148
   201   -997   -997   -997
  -997    194   -997   -997
    27   -997    -27    100
  -997   -997    154     26
  -997   -997   -997    199
  -997     21   -127    126
   127   -997     32   -132
   101    -38   -997     26
  -997   -997   -997    199
  -997   -138    190   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 3.0e+003
 0.200000  0.000000  0.000000  0.800000
 0.200000  0.100000  0.000000  0.700000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.300000  0.000000  0.200000  0.500000
 0.000000  0.000000  0.700000  0.300000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.300000  0.100000  0.600000
 0.600000  0.000000  0.300000  0.100000
 0.500000  0.200000  0.000000  0.300000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.100000  0.900000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TA][TA]AC[TAG][GT]T[TC][AG][ATC]TG
--------------------------------------------------------------------------------

Time 5.22 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24514                            1.65e-07  153_[+2(3.58e-07)]_173_\
    [+3(2.29e-07)]_64_[+1(7.15e-05)]_74
36566                            6.11e-07  98_[+1(2.06e-05)]_150_\
    [+2(2.07e-05)]_176_[+3(5.76e-08)]_40
36668                            4.60e-05  5_[+1(1.43e-05)]_167_[+3(8.10e-06)]_\
    133_[+2(2.73e-05)]_159
13458                            2.66e-04  314_[+1(1.05e-05)]_100_\
    [+2(3.38e-06)]_62
39174                            1.18e-01  422_[+1(1.88e-05)]_66
48702                            3.44e-03  19_[+3(6.70e-06)]_289_\
    [+1(6.69e-05)]_168
10494                            1.60e-04  48_[+3(1.26e-05)]_3_[+2(1.01e-05)]_\
    51_[+2(5.64e-05)]_362
41476                            3.55e-03  273_[+1(3.98e-05)]_149_\
    [+3(2.32e-05)]_54
11795                            5.45e-05  142_[+3(4.01e-05)]_61_\
    [+1(5.25e-08)]_273
27286                            2.84e-08  293_[+2(2.06e-07)]_29_\
    [+1(3.70e-06)]_48_[+3(1.10e-06)]_94
31834                            1.06e-02  50_[+3(1.49e-05)]_329_\
    [+1(8.27e-05)]_97
32027                            1.64e-03  176_[+1(5.34e-05)]_260_\
    [+2(2.42e-06)]_40
47085                            2.11e-05  65_[+2(1.30e-06)]_64_[+1(1.09e-06)]_\
    347
49935                            2.84e-05  126_[+3(6.05e-06)]_254_\
    [+2(2.90e-06)]_37_[+2(3.10e-05)]_47
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************