********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/292/292.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42892                    1.0000    500  46420                    1.0000    500
46798                    1.0000    500  37328                    1.0000    500
47560                    1.0000    500  47620                    1.0000    500
48109                    1.0000    500  48444                    1.0000    500
49215                    1.0000    500  50392                    1.0000    500
12562                    1.0000    500  32765                    1.0000    500
48048                    1.0000    500  44281                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/292/292.seqs.fa -oc motifs/292 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 14 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 7000 N= 14 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.280 C 0.221 G 0.234 T 0.265 Background letter frequencies (from dataset with add-one prior applied): A 0.280 C 0.221 G 0.234 T 0.265 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 20 sites = 7 llr = 133 E-value = 1.0e-006 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 6:::a::6:3377:1:::1: pos.-specific C :967:164:761:a1a:::a probability G 4:33:91:a:::::1::::: matrix T :11:::3:::113:6:aa9: bits 2.2 * * * * 2.0 * * *** * 1.7 * * * *** * 1.5 * ** * * *** * Relative 1.3 * *** ** * ***** Entropy 1.1 * *** *** ** ***** (27.3 bits) 0.9 ** *** *** ** ***** 0.7 ************** ***** 0.4 ************** ***** 0.2 ******************** 0.0 -------------------- Multilevel ACCCAGCAGCCAACTCTTTC consensus G GG TC AA T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 50392 
               194  1.05e-12 TTGACTCTCC ACCCAGCAGCCAACTCTTTC CTGTTCACGA
48444          195  1.05e-12 TTGACTCTCC ACCCAGCAGCCAACTCTTTC TTGTTCACGA
37328          195  1.05e-12 TTGACTCTCC ACCCAGCAGCCAACTCTTTC TTGTTCACGA
47620          116  1.17e-09 ACAATCTGAG GCGCAGTCGACAACCCTTTC CCTGAACAAG
48048          207  1.33e-08 ATAAATGGCA ACTCAGTCGCATTCGCTTTC AGTTAGCTAG
44281          431  1.59e-08 GGTGAAGGAA GTCGAGGAGAAATCTCTTTC TCATTTCTGT
47560          320  4.01e-08 GGACCGGAAA GCGGACCCGCTCACACTTAC AATTAGTGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50392                               1e-12  193_[+1]_287
48444                               1e-12  194_[+1]_286
37328                               1e-12  194_[+1]_286
47620                             1.2e-09  115_[+1]_365
48048                             1.3e-08  206_[+1]_274
44281                             1.6e-08  430_[+1]_50
47560                               4e-08  319_[+1]_161
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=7
50392                    (  194) ACCCAGCAGCCAACTCTTTC  1
48444                    (  195) ACCCAGCAGCCAACTCTTTC  1
37328                    (  195) ACCCAGCAGCCAACTCTTTC  1
47620                    (  116) GCGCAGTCGACAACCCTTTC  1
48048                    (  207) ACTCAGTCGCATTCGCTTTC  1
44281                    (  431) GTCGAGGAGAAATCTCTTTC  1
47560                    (  320) GCGGACCCGCTCACACTTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6734 bayes= 9.7521 E= 1.0e-006
   103   -945     87   -945
  -945    196   -945    -89
  -945    137     29    -89
  -945    169     29   -945
   183   -945   -945   -945
  -945    -63    187   -945
  -945    137    -71     11
   103     96   -945   -945
  -945   -945    209   -945
     3    169   -945   -945
     3    137   -945    -89
   135    -63   -945    -89
   135   -945   -945     11
  -945    218   -945   -945
   -97    -63    -71    111
  -945    218   -945   -945
  -945   -945   -945    192
  -945   -945   -945    192
   -97   -945   -945    169
  -945    218   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 1.0e-006
 0.571429  0.000000  0.428571  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.571429  0.285714  0.142857
 0.000000  0.714286  0.285714  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.571429  0.142857  0.285714
 0.571429  0.428571  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.285714  0.714286  0.000000  0.000000
 0.285714  0.571429  0.000000  0.142857
 0.714286  0.142857  0.000000  0.142857
 0.714286  0.000000  0.000000  0.285714
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.142857  0.142857  0.571429
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.000000  0.000000  0.857143
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[AG]C[CG][CG]AG[CT][AC]G[CA][CA]A[AT]CTCTTTC
--------------------------------------------------------------------------------

Time  1.63 secs.

******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 7 llr = 132 E-value = 1.4e-005 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 93:9:3:::::311:9:431: pos.-specific C ::a:7:a:11a7::919333: probability G 17:13::96:::96:::14:a matrix T :::::7:139:::31:11:6: bits 2.2 * * * * 2.0 * * * * 1.7 * * * * 1.5 * ** * * * * * Relative 1.3 * *** ** **** *** * Entropy 1.1 ******** **** *** * (27.1 bits) 0.9 ******** **** *** * 0.7 ***************** ** 0.4 ***************** *** 0.2 ********************* 0.0 --------------------- Multilevel AGCACTCGGTCCGGCACAGTG consensus A GA T A T CAC sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 50392 444 1.13e-13 CGAATGAGCG AGCACTCGGTCCGGCACAGTG CATTCTAGAA 48444 445 1.13e-13 CGAATGAGCA AGCACTCGGTCCGGCACAGTG CATTCTAGAA 37328 444 1.13e-13 CGAATGAGCA AGCACTCGGTCCGGCACAGTG CATTCTAAAA 49215 162 1.58e-08 TGCCAATCAA AACACACTTTCAGTCACGATG GGAAACGCAA 48048 442 2.35e-08 TCTTTAATGA AACAGTCGCTCCGGTCCTCCG TGTGATTGAT 46420 236 2.81e-08 TCTTCCGAGC GGCAGACGGCCCGTCATCCCG AAGAATCTGG 12562 467 3.07e-08 AAGGGCTCCG AGCGCTCGTTCAAACACCAAG ATGTAGAGCC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams 
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50392                             1.1e-13  443_[+2]_36
48444                             1.1e-13  444_[+2]_35
37328                             1.1e-13  443_[+2]_36
49215                             1.6e-08  161_[+2]_318
48048                             2.3e-08  441_[+2]_38
46420                             2.8e-08  235_[+2]_244
12562                             3.1e-08  466_[+2]_13
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=7
50392                    (  444) AGCACTCGGTCCGGCACAGTG  1
48444                    (  445) AGCACTCGGTCCGGCACAGTG  1
37328                    (  444) AGCACTCGGTCCGGCACAGTG  1
49215                    (  162) AACACACTTTCAGTCACGATG  1
48048                    (  442) AACAGTCGCTCCGGTCCTCCG  1
46420                    (  236) GGCAGACGGCCCGTCATCCCG  1
12562                    (  467) AGCGCTCGTTCAAACACCAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6720 bayes= 9.7491 E= 1.4e-005
   161   -945    -71   -945
     3   -945    161   -945
  -945    218   -945   -945
   161   -945    -71   -945
  -945    169     29   -945
     3   -945   -945    143
  -945    218   -945   -945
  -945   -945    187    -89
  -945    -63    129     11
  -945    -63   -945    169
  -945    218   -945   -945
     3    169   -945   -945
   -97   -945    187   -945
   -97   -945    129     11
  -945    196   -945    -89
   161    -63   -945   -945
  -945    196   -945    -89
    61     37    -71    -89
     3     37     87   -945
   -97     37   -945    111
  -945   -945    209   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 1.4e-005
 0.857143  0.000000  0.142857  0.000000
 0.285714  0.000000  0.714286  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.000000  0.714286  0.285714  0.000000
 0.285714  0.000000  0.000000  0.714286
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.142857  0.571429  0.285714
 0.000000  0.142857  0.000000  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.285714  0.714286  0.000000  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.142857  0.000000  0.571429  0.285714
 0.000000  0.857143  0.000000  0.142857
 0.857143  0.142857  0.000000  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.428571  0.285714  0.142857  0.142857
 0.285714  0.285714  0.428571  0.000000
 0.142857  0.285714  0.000000  0.571429
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
A[GA]CA[CG][TA]CG[GT]TC[CA]G[GT]CAC[AC][GAC][TC]G
--------------------------------------------------------------------------------

Time  3.24 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 4 llr = 103 E-value = 2.2e-005 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A a3:a:::::::3::::::::a pos.-specific C :8:::8a::3::3a:a33:a: probability G ::a:a::a::a:::::88a:: matrix T :::::3::a8:88:a:::::: bits 2.2 * * ** * * * ** 2.0 * * *** * *** ** 1.7 * *** *** * *** *** 1.5 * *** *** * *** *** Relative 1.3 ********* * ******** Entropy 1.1 ********************* (37.1 bits) 0.9 ********************* 0.7 ********************* 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel ACGAGCCGTTGTTCTCGGGCA consensus A T C AC CC sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 50392 407 1.28e-13 GAAGAACCCT ACGAGCCGTTGTTCTCGGGCA AAGCTGCGAA 48444 408 1.28e-13 GAAGAACCCC ACGAGCCGTTGTTCTCGGGCA AAGCTGCGAA 37328 407 1.28e-13 GAAGAACCCC ACGAGCCGTTGTTCTCGGGCA AAGCTGCGAA 47620 239 2.43e-11 AATTACTGTA AAGAGTCGTCGACCTCCCGCA TAACATACCG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 50392 1.3e-13 406_[+3]_73 
48444                             1.3e-13  407_[+3]_72
37328                             1.3e-13  406_[+3]_73
47620                             2.4e-11  238_[+3]_241
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=4
50392                    (  407) ACGAGCCGTTGTTCTCGGGCA  1
48444                    (  408) ACGAGCCGTTGTTCTCGGGCA  1
37328                    (  407) ACGAGCCGTTGTTCTCGGGCA  1
47620                    (  239) AAGAGTCGTCGACCTCCCGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6720 bayes= 10.7134 E= 2.2e-005
   183   -865   -865   -865
   -16    176   -865   -865
  -865   -865    209   -865
   183   -865   -865   -865
  -865   -865    209   -865
  -865    176   -865     -8
  -865    218   -865   -865
  -865   -865    209   -865
  -865   -865   -865    191
  -865     18   -865    150
  -865   -865    209   -865
   -16   -865   -865    150
  -865     18   -865    150
  -865    218   -865   -865
  -865   -865   -865    191
  -865    218   -865   -865
  -865     18    168   -865
  -865     18    168   -865
  -865   -865    209   -865
  -865    218   -865   -865
   183   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 2.2e-005
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.250000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
A[CA]GAG[CT]CGT[TC]G[TA][TC]CTC[GC][GC]GCA
--------------------------------------------------------------------------------

Time  5.10 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42892                            7.10e-01  500
46420                            2.19e-04  235_[+2(2.81e-08)]_244
46798                            1.29e-01  447_[+1(8.57e-05)]_33
37328                            4.07e-27  194_[+1(1.05e-12)]_192_\
    [+3(1.28e-13)]_16_[+2(1.13e-13)]_36
47560                            2.46e-04  319_[+1(4.01e-08)]_161
47620                            1.99e-12  115_[+1(1.17e-09)]_103_\
    [+3(2.43e-11)]_115_[+1(5.14e-05)]_106
48109                            3.22e-01  500
48444                            4.07e-27  194_[+1(1.05e-12)]_193_\
    [+3(1.28e-13)]_16_[+2(1.13e-13)]_35
49215                            4.56e-06  161_[+2(1.58e-08)]_318
50392                            4.07e-27  193_[+1(1.05e-12)]_193_\
    [+3(1.28e-13)]_16_[+2(1.13e-13)]_36
12562                            1.87e-04  466_[+2(3.07e-08)]_13
32765                            3.42e-01  500
48048                            1.40e-08  206_[+1(1.33e-08)]_215_\
    [+2(2.35e-08)]_38
44281                            3.01e-04  430_[+1(1.59e-08)]_50
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************