********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/308/308.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1784                     1.0000    500  13834                    1.0000    500
13582                    1.0000    500  47199                    1.0000    500
37440                    1.0000    500  14067                    1.0000    500
38261                    1.0000    500  43684                    1.0000    500
33006                    1.0000    500  49057                    1.0000    500
43857                    1.0000    500  25619                    1.0000    500
44445                    1.0000    500  10723                    1.0000    500
12820                    1.0000    500  42564                    1.0000    500
49095                    1.0000    500  47672                    1.0000    500
44187                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/308/308.seqs.fa -oc motifs/308 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 19 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 9500 N= 19 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.276 C 0.225 G 0.232 T 0.268 Background letter frequencies (from dataset with add-one prior applied): A 0.276 C 0.225 G 0.232 T 0.268 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 19 llr = 194 E-value = 2.0e-004 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A ::8362::a2127222 pos.-specific C :6172::5:2322:41 probability G 141::8:5:5352321 matrix T 9:::2:a::231:527 bits 2.2 1.9 * * 1.7 * * 1.5 * ** * Relative 1.3 ** ** * Entropy 1.1 **** **** (14.8 bits) 0.9 **** **** 0.6 **** **** * * 0.4 ********* *** * 0.2 **************** 0.0 ---------------- Multilevel TCACAGTGAGCGATCT consensus G AC C CGA GGA sequence T TC AT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 12820 18 1.08e-08 GTCGCTACCG TCACAGTCAGCGATTT CTGTCAAGGT 25619 329 3.44e-08 AAATTCATTC 
TCACAGTCAGTCAGCT GACTCCAACT 44187 228 5.24e-07 CACTGTCATG TCACAGTCACGAAACT CGGCAAAACG 38261 258 1.54e-06 TCTGTTTGCT TCACTGTCAGCGCTAT TTTTCTCTTC 37440 89 2.43e-06 GCGTTTTAAC TGACTGTGAAGGATTT CAATAGAAAA 43684 138 3.40e-06 TTTTTATTGG TCGACGTGAGCGAGCT ATGAAGAATG 13582 197 3.40e-06 GGAGGTCCAA TGAACGTCAGGGAAGT CCGACCCGAT 1784 32 3.40e-06 ATTCTGCCAA TCACAGTCATCCCTGT TCCTCTAGCC 47672 90 4.63e-06 ATCAACATGT TGACAGTGATTACTCT CTCCTCACTA 49057 332 4.63e-06 GTATCACGAT TCACAGTCACGCAGAA TCAGATACCC 10723 2 9.82e-06 C TGACTGTGAATGATGA GTCACAGGTA 33006 305 1.07e-05 ACAGTGACAT TGACAGTGAATCGGGT TGGGCTTCCG 49095 162 2.20e-05 TCTGTATCAA TGAATGTGAGCTAACT ATTTCGAGAT 14067 368 2.73e-05 AGAAACGAGC TCAACATGATGGGTCT AAGAAGCTCA 43857 479 2.93e-05 CGGAGCACTC TCGCAATCACTGAGCA ACAAGA 42564 207 3.14e-05 TCTTATATCT TCCACGTCAGGAATTT AGGTGTCAGC 13834 223 4.08e-05 TTTGACTATC TCACAGTGAGAAGTCC CTCGGATGTC 47199 271 4.61e-05 GGAATTTAAG TGACAATGAGTGAATG TCGTAAGGGT 44445 285 5.53e-05 ACGCGGCGAG GCAAAGTGACCGAGAA TGACACGCGG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 12820 1.1e-08 17_[+1]_467 25619 3.4e-08 328_[+1]_156 44187 5.2e-07 227_[+1]_257 38261 1.5e-06 257_[+1]_227 37440 2.4e-06 88_[+1]_396 43684 3.4e-06 137_[+1]_347 13582 3.4e-06 196_[+1]_288 1784 3.4e-06 31_[+1]_453 47672 4.6e-06 89_[+1]_395 49057 4.6e-06 331_[+1]_153 10723 9.8e-06 1_[+1]_483 33006 1.1e-05 304_[+1]_180 49095 2.2e-05 161_[+1]_323 14067 2.7e-05 367_[+1]_117 43857 2.9e-05 478_[+1]_6 42564 3.1e-05 206_[+1]_278 13834 4.1e-05 222_[+1]_262 47199 4.6e-05 270_[+1]_214 44445 5.5e-05 284_[+1]_200 -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=19 12820 ( 18) TCACAGTCAGCGATTT 1 25619 ( 329) TCACAGTCAGTCAGCT 1 44187 ( 228) TCACAGTCACGAAACT 1 38261 ( 258) TCACTGTCAGCGCTAT 1 37440 ( 89) TGACTGTGAAGGATTT 1 43684 ( 138) TCGACGTGAGCGAGCT 1 13582 ( 197) TGAACGTCAGGGAAGT 1 1784 ( 32) TCACAGTCATCCCTGT 1 47672 ( 90) TGACAGTGATTACTCT 1 49057 ( 332) TCACAGTCACGCAGAA 1 10723 ( 2) TGACTGTGAATGATGA 1 33006 ( 305) TGACAGTGAATCGGGT 1 49095 ( 162) TGAATGTGAGCTAACT 1 14067 ( 368) TCAACATGATGGGTCT 1 43857 ( 479) TCGCAATCACTGAGCA 1 42564 ( 207) TCCACGTCAGGAATTT 1 13834 ( 223) TCACAGTGAGAAGTCC 1 47199 ( 271) TGACAATGAGTGAATG 1 44445 ( 285) GCAAAGTGACCGAGAA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 9215 bayes= 9.81767 E= 2.0e-004 -1089 -1089 -214 182 -1089 149 67 -1089 161 -209 -114 -1089 20 161 -1089 -1089 107 -9 -1089 -35 -80 -1089 186 -1089 -1089 -1089 -1089 190 -1089 108 118 -1089 186 -1089 -1089 -1089 -80 -9 103 -76 -239 49 45 24 -39 -9 118 -235 131 -51 -55 -1089 -39 -1089 45 82 -80 91 -14 -35 -39 -209 -214 135 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 19 E= 2.0e-004 0.000000 0.000000 0.052632 0.947368 0.000000 0.631579 0.368421 0.000000 0.842105 0.052632 0.105263 0.000000 0.315789 0.684211 0.000000 0.000000 0.578947 0.210526 0.000000 0.210526 
0.157895 0.000000 0.842105 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.473684 0.526316 0.000000 1.000000 0.000000 0.000000 0.000000 0.157895 0.210526 0.473684 0.157895 0.052632 0.315789 0.315789 0.315789 0.210526 0.210526 0.526316 0.052632 0.684211 0.157895 0.157895 0.000000 0.210526 0.000000 0.315789 0.473684 0.157895 0.421053 0.210526 0.210526 0.210526 0.052632 0.052632 0.684211 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- T[CG]A[CA][ACT]GT[GC]A[GC][CGT][GAC]A[TGA][CGT][TA] -------------------------------------------------------------------------------- Time 3.68 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 10 llr = 110 E-value = 7.5e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::1::19273:: pos.-specific C ::7:a118236: probability G ::22:8:::11a matrix T aa:8::::133: bits 2.2 * * 1.9 ** * * 1.7 ** * * 1.5 ** * * * Relative 1.3 ** ** ** * Entropy 1.1 ** ***** * (15.8 bits) 0.9 ********* ** 0.6 ********* ** 0.4 ********* ** 0.2 ********* ** 0.0 ------------ Multilevel TTCTCGACAACG consensus GG ACCT sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- 
------------ 47199 306 4.55e-08 TTGGGGCAGC TTCTCGACACCG AACGTGGGAC 25619 23 3.78e-07 TCAACTTTTA TTCTCGACAATG AAATATTCAT 1784 254 9.32e-07 CGCCTCATTT TTCTCGAAATCG GATAGTGTTT 10723 295 1.14e-06 AAGGCTTACC TTCTCGACCCTG GTAACACGGT 43684 110 2.03e-06 AATCGAACTG TTCTCCACATCG GTCGTGTTTT 47672 129 3.36e-06 GTGCTCTCAT TTGGCGACAACG CCGCTTTTCA 44187 282 9.10e-06 CAAAAAGCCG TTCGCAACAACG AGGCCGGTTA 13582 78 1.20e-05 CGACGGACTA TTATCGACTCCG TCATTCATGA 12820 328 1.90e-05 AGTTTTGATT TTCTCGAACGTG TGGGAAAGGG 49057 266 2.77e-05 AAGAACAGGG TTGTCGCCATGG GCACACGCCG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 47199 4.6e-08 305_[+2]_183 25619 3.8e-07 22_[+2]_466 1784 9.3e-07 253_[+2]_235 10723 1.1e-06 294_[+2]_194 43684 2e-06 109_[+2]_379 47672 3.4e-06 128_[+2]_360 44187 9.1e-06 281_[+2]_207 13582 1.2e-05 77_[+2]_411 12820 1.9e-05 327_[+2]_161 49057 2.8e-05 265_[+2]_223 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=10 47199 ( 306) TTCTCGACACCG 1 25619 ( 23) TTCTCGACAATG 1 1784 ( 254) TTCTCGAAATCG 1 10723 ( 295) TTCTCGACCCTG 1 43684 ( 110) TTCTCCACATCG 1 47672 ( 129) TTGGCGACAACG 1 44187 ( 282) TTCGCAACAACG 1 13582 ( 78) TTATCGACTCCG 1 12820 ( 328) TTCTCGAACGTG 1 49057 ( 266) TTGTCGCCATGG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix 
-------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 10.1099 E= 7.5e+002 -997 -997 -997 190 -997 -997 -997 190 -146 164 -21 -997 -997 -997 -21 158 -997 215 -997 -997 -146 -117 179 -997 171 -117 -997 -997 -46 183 -997 -997 134 -17 -997 -142 12 42 -121 16 -997 142 -121 16 -997 -997 211 -997 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 7.5e+002 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.100000 0.700000 0.200000 0.000000 0.000000 0.000000 0.200000 0.800000 0.000000 1.000000 0.000000 0.000000 0.100000 0.100000 0.800000 0.000000 0.900000 0.100000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 0.700000 0.200000 0.000000 0.100000 0.300000 0.300000 0.100000 0.300000 0.000000 0.600000 0.100000 0.300000 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- TT[CG][TG]CGA[CA][AC][ACT][CT]G -------------------------------------------------------------------------------- Time 7.15 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 15 sites = 7 llr = 98 E-value = 9.1e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 1:::::3::::1::: pos.-specific C ::11::69:3::6:: probability G ::97:711a:77::6 matrix T 9a:1a3:::7314a4 bits 2.2 * 1.9 * * * * 1.7 * * * * 1.5 ** * ** * Relative 1.3 *** ** ** * * Entropy 1.1 *** ** **** *** (20.3 bits) 0.9 ****** ******** 0.6 *************** 0.4 *************** 0.2 *************** 0.0 --------------- Multilevel TTGGTGCCGTGGCTG consensus TA CT T T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 44445 389 5.18e-09 TTGGTGTTTG TTGGTTCCGTGGCTG CTTCTTTCTA 43857 115 6.31e-09 TGTTAACAGA TTGGTGCCGCGGTTG CTTTTAATGT 1784 83 8.35e-08 CTCGACAACA TTGTTGCCGTTGCTG TGCGATTGAA 25619 264 3.40e-07 TCTCAACCTT ATGGTGACGTTGCTT CACAGGCATT 42564 131 4.18e-07 TCTAAAGCAG TTGCTTACGTGGTTT TCACTATCAA 10723 35 4.18e-07 GTAGATGATG TTGGTGCGGTGTTTT GGACGTCGTC 14067 265 1.10e-06 CCTACCGATA TTCGTGGCGCGACTG GTGACGCCAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 44445 
5.2e-09 388_[+3]_97 43857 6.3e-09 114_[+3]_371 1784 8.4e-08 82_[+3]_403 25619 3.4e-07 263_[+3]_222 42564 4.2e-07 130_[+3]_355 10723 4.2e-07 34_[+3]_451 14067 1.1e-06 264_[+3]_221 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=15 seqs=7 44445 ( 389) TTGGTTCCGTGGCTG 1 43857 ( 115) TTGGTGCCGCGGTTG 1 1784 ( 83) TTGTTGCCGTTGCTG 1 25619 ( 264) ATGGTGACGTTGCTT 1 42564 ( 131) TTGCTTACGTGGTTT 1 10723 ( 35) TTGGTGCGGTGTTTT 1 14067 ( 265) TTCGTGGCGCGACTG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 15 n= 9234 bayes= 10.208 E= 9.1e+002 -95 -945 -945 167 -945 -945 -945 190 -945 -65 189 -945 -945 -65 162 -91 -945 -945 -945 190 -945 -945 162 9 5 135 -70 -945 -945 193 -70 -945 -945 -945 211 -945 -945 35 -945 141 -945 -945 162 9 -95 -945 162 -91 -945 135 -945 68 -945 -945 -945 190 -945 -945 130 68 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 9.1e+002 0.142857 0.000000 0.000000 0.857143 0.000000 0.000000 0.000000 1.000000 0.000000 0.142857 0.857143 0.000000 0.000000 0.142857 0.714286 0.142857 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.714286 0.285714 0.285714 0.571429 0.142857 0.000000 0.000000 0.857143 0.142857 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 
0.285714 0.000000 0.714286 0.000000 0.000000 0.714286 0.285714 0.142857 0.000000 0.714286 0.142857 0.000000 0.571429 0.000000 0.428571 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.571429 0.428571 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- TTGGT[GT][CA]CG[TC][GT]G[CT]T[GT] -------------------------------------------------------------------------------- Time 10.20 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 1784 9.70e-09 31_[+1(3.40e-06)]_35_[+3(8.35e-08)]_\ 156_[+2(9.32e-07)]_235 13834 6.23e-02 222_[+1(4.08e-05)]_262 13582 5.80e-04 77_[+2(1.20e-05)]_107_\ [+1(3.40e-06)]_288 47199 3.43e-05 270_[+1(4.61e-05)]_19_\ [+2(4.55e-08)]_183 37440 1.00e-02 88_[+1(2.43e-06)]_396 14067 5.54e-04 264_[+3(1.10e-06)]_88_\ [+1(2.73e-05)]_117 38261 1.03e-02 257_[+1(1.54e-06)]_227 43684 1.34e-04 109_[+2(2.03e-06)]_16_\ [+1(3.40e-06)]_347 33006 5.04e-02 304_[+1(1.07e-05)]_180 49057 4.86e-04 265_[+2(2.77e-05)]_54_\ [+1(4.63e-06)]_153 43857 3.65e-06 114_[+3(6.31e-09)]_349_\ [+1(2.93e-05)]_6 25619 2.19e-10 22_[+2(3.78e-07)]_111_\ [+1(4.34e-05)]_102_[+3(3.40e-07)]_50_[+1(3.44e-08)]_156 44445 7.61e-06 284_[+1(5.53e-05)]_88_\ [+3(5.18e-09)]_97 10723 1.34e-07 1_[+1(9.82e-06)]_17_[+3(4.18e-07)]_\ 245_[+2(1.14e-06)]_194 12820 
7.15e-06 17_[+1(1.08e-08)]_294_\ [+2(1.90e-05)]_161 42564 2.17e-04 130_[+3(4.18e-07)]_61_\ [+1(3.14e-05)]_278 49095 3.86e-02 161_[+1(2.20e-05)]_323 47672 2.42e-04 89_[+1(4.63e-06)]_23_[+2(3.36e-06)]_\ 360 44187 4.39e-05 227_[+1(5.24e-07)]_38_\ [+2(9.10e-06)]_207 -------------------------------------------------------------------------------- ******************************************************************************** ******************************************************************************** Stopped because nmotifs = 3 reached. ******************************************************************************** CPU: seaotter.hsd1.wa.comcast.net ********************************************************************************