********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/311/311.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31991                    1.0000    500  46359                    1.0000    500
1199                     1.0000    500  47912                    1.0000    500
4936                     1.0000    500  43531                    1.0000    500
25433                    1.0000    500  40329                    1.0000    500
30786                    1.0000    500  50567                    1.0000    500
45673                    1.0000    500  43872                    1.0000    500
38781                    1.0000    500  44934                    1.0000    500
39504                    1.0000    500  43305                    1.0000    500
43423                    1.0000    500  47858                    1.0000    500
47911                    1.0000    500  49598                    1.0000    500
35573                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/311/311.seqs.fa -oc motifs/311 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       21    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10500    N=              21
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.242 G 0.225 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.242 G 0.225 T 0.265
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  16  sites =  10  llr = 131  E-value = 4.3e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :523:2::::a::::1
pos.-specific     C  1:51:17:11::::76
probability       G  5:349:3629:1:a3:
matrix            T  45:217:47::9a::3

         bits    2.2              *
                 1.9           * **
                 1.7     *    ** **
                 1.5     *    *****
Relative         1.3     * *  ******
Entropy          1.1     * ** ******
(18.9 bits)      0.9  *  ***********
                 0.6 *** ************
                 0.4 *** ************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GACGGTCGTGATTGCC
consensus            TTGA AGTG     GT
sequence               AT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
49598                       424  9.03e-09 GCTACACTCA TTCGGTGGTGATTGCC GTGTTCCTAA
43305                       362  9.03e-09 TTCCACCGGT GTCAGTCGTGATTGGC AGACAACCCA
44934                       181  1.45e-08 AGACTGTGTC GACCGTCGTGATTGCC GAAACCCTGC
50567                       112  6.62e-08 GTTCATAATG GACAGTCGGGATTGCT TTACAACTGG
46359                       382  4.12e-07 GACGAACAAG TTGAGTCTTGATTGCA GCAATCGTCG
43531                        37  1.25e-06 ACGGGTCCCG TAGGGACGTGAGTGCT GACACGTTCG
31991                       208  1.64e-06 CTCGGGGACC TTCGTTGTGGATTGCC AACATGAATC
35573                       441  1.98e-06 TAAGAGCGGC GTATGTCTTCATTGCT AGAGGGAGGC
45673                       260  2.66e-06 GGAATCAGTC CAAGGCCTTGATTGGC CAAAGCAGGT
1199                        100  2.66e-06 CGCCCGTGCA GAGTGAGGCGATTGGC CTCAACCACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49598                               9e-09  423_[+1]_61
43305                               9e-09  361_[+1]_123
44934                             1.5e-08  180_[+1]_304
50567                             6.6e-08  111_[+1]_373
46359                             4.1e-07  381_[+1]_103
43531                             1.3e-06  36_[+1]_448
31991                             1.6e-06  207_[+1]_277
35573                               2e-06  440_[+1]_44
45673                             2.7e-06  259_[+1]_225
1199                              2.7e-06  99_[+1]_385
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=10
49598                    (  424) TTCGGTGGTGATTGCC  1
43305                    (  362) GTCAGTCGTGATTGGC  1
44934                    (  181) GACCGTCGTGATTGCC  1
50567                    (  112) GACAGTCGGGATTGCT  1
46359                    (  382) TTGAGTCTTGATTGCA  1
43531                    (   37) TAGGGACGTGAGTGCT  1
31991                    (  208) TTCGTTGTGGATTGCC  1
35573                    (  441) GTATGTCTTCATTGCT  1
45673                    (  260) CAAGGCCTTGATTGGC  1
1199                     (  100) GAGTGAGGCGATTGGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 10185 bayes= 10.9349 E= 4.3e+000
  -997   -127    115     59
    90   -997   -997     91
   -42    105     41   -997
    16   -127     83    -41
  -997   -997    200   -140
   -42   -127   -997    140
  -997    153     41   -997
  -997   -997    141     59
  -997   -127    -17    140
  -997   -127    200   -997
   190   -997   -997   -997
  -997   -997   -117    176
  -997   -997   -997    191
  -997   -997    215   -997
  -997    153     41   -997
  -142    131   -997     18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 4.3e+000
 0.000000  0.100000  0.500000  0.400000
 0.500000  0.000000  0.000000  0.500000
 0.200000  0.500000  0.300000  0.000000
 0.300000  0.100000  0.400000  0.200000
 0.000000  0.000000  0.900000  0.100000
 0.200000  0.100000  0.000000  0.700000
 0.000000  0.700000  0.300000  0.000000
 0.000000  0.000000  0.600000  0.400000
 0.000000  0.100000  0.200000  0.700000
 0.000000  0.100000  0.900000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.700000  0.300000  0.000000
 0.100000  0.600000  0.000000  0.300000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[GT][AT][CGA][GAT]G[TA][CG][GT][TG]GATTG[CG][CT]
--------------------------------------------------------------------------------

Time  4.41 secs.

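Annotation (not part of the MEME output): the two Motif 1 matrices above appear to be related through the background frequencies. The nonzero log-odds entries are close to 100 * log2(p / background), and summing p * log2(p / background) over all positions comes out near the reported relative entropy of 18.9 bits. The sketch below checks this using only numbers copied from this file. The function names are illustrative, and MEME's exact rounding and its treatment of zero probabilities (reported as -997 here) are not reproduced, so treat it as an approximation.

```python
import math

# Background frequencies reported in this file, in the order A, C, G, T.
BACKGROUND = [0.268, 0.242, 0.225, 0.265]

# Motif 1 letter-probability matrix copied from this file (one row per position).
MOTIF1_PPM = [
    [0.0, 0.1, 0.5, 0.4], [0.5, 0.0, 0.0, 0.5], [0.2, 0.5, 0.3, 0.0],
    [0.3, 0.1, 0.4, 0.2], [0.0, 0.0, 0.9, 0.1], [0.2, 0.1, 0.0, 0.7],
    [0.0, 0.7, 0.3, 0.0], [0.0, 0.0, 0.6, 0.4], [0.0, 0.1, 0.2, 0.7],
    [0.0, 0.1, 0.9, 0.0], [1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.1, 0.9],
    [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.7, 0.3, 0.0],
    [0.1, 0.6, 0.0, 0.3],
]


def log_odds_row(probs, background=BACKGROUND):
    """Return 100 * log2(p / bg) per letter, or None where p is zero."""
    return [
        None if p == 0 else 100 * math.log2(p / b)
        for p, b in zip(probs, background)
    ]


def relative_entropy(ppm, background=BACKGROUND):
    """Sum p * log2(p / bg) over every position and letter (in bits)."""
    return sum(
        p * math.log2(p / b)
        for row in ppm
        for p, b in zip(row, background)
        if p > 0
    )


if __name__ == "__main__":
    # Position 5 of Motif 1; the file lists -997 -997 200 -140 for this row.
    print(log_odds_row(MOTIF1_PPM[4]))
    # The file reports 18.9 bits for Motif 1.
    print(round(relative_entropy(MOTIF1_PPM), 2))
```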
********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  15  sites =  10  llr = 126  E-value = 3.3e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::715:::a:3:132
pos.-specific     C  :1:9::1::42217:
probability       G  191:29:a:62:2::
matrix            T  9:2:319:::386:8

         bits    2.2        *
                 1.9        **
                 1.7  *   * **
                 1.5 ** * ****
Relative         1.3 ** * ****  *  *
Entropy          1.1 ** * ***** * **
(18.2 bits)      0.9 **** ***** * **
                 0.6 **** ***** * **
                 0.4 ********** ****
                 0.2 ********** ****
                 0.0 ---------------

Multilevel           TGACAGTGAGATTCT
consensus              T T    CTCGAA
sequence                 G     C
                               G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
39504                       269  4.78e-08 ACTCTACTAT TGACAGTGAGGTTCA TCCTTGCGAC
30786                       447  1.28e-07 TCCATCCTTT TGACGGTGACTTGCT GGTGTTATTG
43423                       118  1.64e-07 AATATATTTT GGACAGTGAGCTTCT GAGGGGCTTT
35573                       314  3.02e-07 ATTTCGAGTT TGACTTTGAGTTTCT GAGATAGTTC
40329                       266  9.88e-07 TCACTGGCTG TCACAGTGACGTTAT TCCAAACCCA
46359                       104  9.88e-07 AGTTCGCGAG TGACAGCGAGTTGAT TGCGGGCGCC
47858                        67  1.24e-06 TATGCGTTAC TGTCAGTGAGCCTCA TCTCTAATAC
47912                       108  1.24e-06 GCAGCAGTTG TGTCGGTGACATCCT CTTTGTCAAT
1199                         39  1.71e-06 AGGTAAATCA TGGATGTGAGATTCT TGTTGAAGCA
44934                       258  2.40e-06 GAGACGTCAC TGACTGTGACACAAT CGCGAGTCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39504                             4.8e-08  268_[+2]_217
30786                             1.3e-07  446_[+2]_39
43423                             1.6e-07  117_[+2]_368
35573                               3e-07  313_[+2]_172
40329                             9.9e-07  265_[+2]_220
46359                             9.9e-07  103_[+2]_382
47858                             1.2e-06  66_[+2]_419
47912                             1.2e-06  107_[+2]_378
1199                              1.7e-06  38_[+2]_447
44934                             2.4e-06  257_[+2]_228
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=10
39504                    (  269) TGACAGTGAGGTTCA  1
30786                    (  447) TGACGGTGACTTGCT  1
43423                    (  118) GGACAGTGAGCTTCT  1
35573                    (  314) TGACTTTGAGTTTCT  1
40329                    (  266) TCACAGTGACGTTAT  1
46359                    (  104) TGACAGCGAGTTGAT  1
47858                    (   67) TGTCAGTGAGCCTCA  1
47912                    (  108) TGTCGGTGACATCCT  1
1199                     (   39) TGGATGTGAGATTCT  1
44934                    (  258) TGACTGTGACACAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 10206 bayes= 10.2456 E= 3.3e+000
  -997   -997   -117    176
  -997   -127    200   -997
   138   -997   -117    -41
  -142    190   -997   -997
    90   -997    -17     18
  -997   -997    200   -140
  -997   -127   -997    176
  -997   -997    215   -997
   190   -997   -997   -997
  -997     73    141   -997
    16    -27    -17     18
  -997    -27   -997    159
  -142   -127    -17    118
    16    153   -997   -997
   -42   -997   -997    159
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 10 E= 3.3e+000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.100000  0.900000  0.000000
 0.700000  0.000000  0.100000  0.200000
 0.100000  0.900000  0.000000  0.000000
 0.500000  0.000000  0.200000  0.300000
 0.000000  0.000000  0.900000  0.100000
 0.000000  0.100000  0.000000  0.900000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
 0.300000  0.200000  0.200000  0.300000
 0.000000  0.200000  0.000000  0.800000
 0.100000  0.100000  0.200000  0.600000
 0.300000  0.700000  0.000000  0.000000
 0.200000  0.000000  0.000000  0.800000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
TG[AT]C[ATG]GTGA[GC][ATCG][TC][TG][CA][TA]
--------------------------------------------------------------------------------

Time  8.52 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  16  sites =   6  llr = 95  E-value = 1.3e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::8:a:5233:::::
pos.-specific     C  ::2:8:23::52:2:a
probability       G  ::8:2:825::8:8::
matrix            T  aa:2::::372:a:a:

         bits    2.2                *
                 1.9 **   *      * **
                 1.7 **   *      * **
                 1.5 *** ***    *****
Relative         1.3 *******    *****
Entropy          1.1 *******  * *****
(22.8 bits)      0.9 *******  * *****
                 0.6 ******* ** *****
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TTGACAGAGTCGTGTC
consensus                   CTAA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
47858                       391  2.58e-09 GTCATGCAAC TTGACAGAGAAGTGTC GCTCGAGAAA
46359                       414  9.84e-09 GTCGCCCGTG TTGACACCGTCGTGTC GACCTGCCCA
49598                       126  1.07e-08 GTGTCAACGG TTGTCAGATTCGTGTC AAAGCCTTCT
40329                       358  6.59e-08 ATTACGTCAC TTGACAGCTTTCTGTC AACGGTCCCA
43531                       322  7.46e-08 GTTTGGTACC TTGACAGGATCGTCTC ACTCGAGCGT
30786                       203  1.17e-07 GCGATAACAC TTCAGAGAGAAGTGTC TCCACGCAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47858                             2.6e-09  390_[+3]_94
46359                             9.8e-09  413_[+3]_71
49598                             1.1e-08  125_[+3]_359
40329                             6.6e-08  357_[+3]_127
43531                             7.5e-08  321_[+3]_163
30786                             1.2e-07  202_[+3]_282
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=6
47858                    (  391) TTGACAGAGAAGTGTC  1
46359                    (  414) TTGACACCGTCGTGTC  1
49598                    (  126) TTGTCAGATTCGTGTC  1
40329                    (  358) TTGACAGCTTTCTGTC  1
43531                    (  322) TTGACAGGATCGTCTC  1
30786                    (  203) TTCAGAGAGAAGTGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 10185 bayes= 11.176 E= 1.3e+001
  -923   -923   -923    191
  -923   -923   -923    191
  -923    -54    189   -923
   163   -923   -923    -67
  -923    178    -43   -923
   190   -923   -923   -923
  -923    -54    189   -923
    90     46    -43   -923
   -68   -923    115     33
    31   -923   -923    133
    31    105   -923    -67
  -923    -54    189   -923
  -923   -923   -923    191
  -923    -54    189   -923
  -923   -923   -923    191
  -923    205   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 1.3e+001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.833333  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.833333  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.500000  0.333333  0.166667  0.000000
 0.166667  0.000000  0.500000  0.333333
 0.333333  0.000000  0.000000  0.666667
 0.333333  0.500000  0.000000  0.166667
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
TTGACAG[AC][GT][TA][CA]GTGTC
--------------------------------------------------------------------------------

Time 12.60 secs.

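Annotation (not part of the MEME output): the Motif 3 regular expression is a coarse summary of the site list. In this file, a letter seen in only one of the six sites at a given position does not appear in the bracket sets, so the expression does not match every reported site. This is an observation about the data here, not a statement of MEME's exact rule. The sketch below applies the expression to the six sites copied from the table above; the function name is illustrative.

```python
import re

# Regular expression copied from the "Motif 3 regular expression" block.
MOTIF3_REGEX = re.compile(r"TTGACAG[AC][GT][TA][CA]GTGTC")

# The six Motif 3 sites listed in this file, keyed by sequence name.
MOTIF3_SITES = {
    "47858": "TTGACAGAGAAGTGTC",
    "46359": "TTGACACCGTCGTGTC",
    "49598": "TTGTCAGATTCGTGTC",
    "40329": "TTGACAGCTTTCTGTC",
    "43531": "TTGACAGGATCGTCTC",
    "30786": "TTCAGAGAGAAGTGTC",
}


def sites_matching(pattern, sites):
    """Return the names of the sites that the pattern matches in full."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


if __name__ == "__main__":
    matched = sites_matching(MOTIF3_REGEX, MOTIF3_SITES)
    print(f"{len(matched)} of {len(MOTIF3_SITES)} sites match: {matched}")
```

For scanning new sequences, the log-odds matrix is likely a better starting point than the regular expression, since it scores every letter at every position.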
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31991                            7.61e-03  207_[+1(1.64e-06)]_277
46359                            1.98e-10  103_[+2(9.88e-07)]_263_\
    [+1(4.12e-07)]_16_[+3(9.84e-09)]_71
1199                             2.54e-05  38_[+2(1.71e-06)]_46_[+1(2.66e-06)]_\
    385
47912                            1.16e-02  107_[+2(1.24e-06)]_378
4936                             3.79e-01  500
43531                            1.84e-06  36_[+1(1.25e-06)]_269_\
    [+3(7.46e-08)]_163
25433                            2.91e-01  500
40329                            1.39e-06  265_[+2(9.88e-07)]_77_\
    [+3(6.59e-08)]_127
30786                            6.34e-07  202_[+3(1.17e-07)]_228_\
    [+2(1.28e-07)]_39
50567                            4.18e-04  111_[+1(6.62e-08)]_373
45673                            1.34e-02  259_[+1(2.66e-06)]_225
43872                            2.19e-02  8_[+3(1.61e-05)]_476
38781                            6.74e-01  500
44934                            1.44e-07  180_[+1(1.45e-08)]_61_\
    [+2(2.40e-06)]_228
39504                            8.33e-04  268_[+2(4.78e-08)]_217
43305                            1.76e-04  361_[+1(9.03e-09)]_123
43423                            9.10e-04  117_[+2(1.64e-07)]_349_\
    [+2(3.59e-05)]_4
47858                            1.05e-07  66_[+2(1.24e-06)]_309_\
    [+3(2.58e-09)]_94
47911                            3.59e-01  500
49598                            2.32e-09  125_[+3(1.07e-08)]_282_\
    [+1(9.03e-09)]_61
35573                            6.05e-06  313_[+2(3.02e-07)]_112_\
    [+1(1.98e-06)]_44
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
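
Annotation (not part of the MEME output): each motif diagram in the summary alternates spacer lengths with bracketed motif occurrences, so the spacers plus the motif widths should add up to the sequence length (500 for every sequence in this file), and each site start is the running offset plus one. The sketch below parses one diagram under those assumptions. The motif widths are taken from the MOTIF header lines above; the function name and the tuple layout are illustrative.

```python
import re

# Motif widths copied from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 16, 2: 15, 3: 16}

# Either a bracketed occurrence such as [+2(9.88e-07)] or a bare spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Parse a combined block diagram into (sites, total_length).

    Each site is (motif, strand, start, p_value) with a 1-based start.
    Line-continuation backslashes and spaces are removed first.
    """
    diagram = diagram.replace("\\", "").replace(" ", "")
    position = 0
    sites = []
    for match in TOKEN.finditer(diagram):
        if match.group(4) is not None:
            # Bare number: a spacer of that many letters.
            position += int(match.group(4))
        else:
            motif = int(match.group(2))
            sites.append(
                (motif, match.group(1), position + 1, float(match.group(3)))
            )
            position += widths[motif]
    return sites, position


if __name__ == "__main__":
    # Diagram for sequence 46359, copied from the summary table.
    text = r"103_[+2(9.88e-07)]_263_\ [+1(4.12e-07)]_16_[+3(9.84e-09)]_71"
    sites, total = parse_diagram(text)
    for motif, strand, start, p_value in sites:
        print(f"motif {motif} strand {strand} start {start} p={p_value:g}")
    print("total length:", total)
```

For sequence 46359 the recovered starts can be compared with the Start column of the per-motif site tables (104, 382 and 414).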