********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/434/434.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42539                    1.0000    500  3241                     1.0000    500
9491                     1.0000    500  43110                    1.0000    500
43220                    1.0000    500  38655                    1.0000    500
29532                    1.0000    500  10350                    1.0000    500
17044                    1.0000    500  20024                    1.0000    500
38650                    1.0000    500  43975                    1.0000    500
48986                    1.0000    500  37267                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/434/434.seqs.fa -oc motifs/434 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.263 C 0.250 G 0.229 T 0.258
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.250 G 0.229 T 0.258
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   9  llr = 109  E-value = 5.1e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:61::a72:
pos.-specific     C  :::9::2::::2
probability       G  2a:1:9:a:348
matrix            T  8:::4:8:::3:

         bits    2.1  *     *
                 1.9  **    **
                 1.7  **  * **
                 1.5  *** * **
Relative         1.3 **** ****  *
Entropy          1.1 **** ***** *
(17.5 bits)      0.9 ********** *
                 0.6 ********** *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGACAGTGAAGG
consensus            G   T C  GTC
sequence                       A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
17044                       246  2.55e-07 AAATTCACAA TGACTGTGAATG CGCTCTCGTC
38655                        70  2.55e-07 CATGCTAGGC TGACTGTGAATG TACCAAACCT
10350                        68  2.98e-07 GTGAGGGGCT TGACTGTGAGGG AGTATGGCCT
38650                       377  4.04e-07 TTTAGCTAGC TGACAGTGAGTG TGAATGCCCG
48986                       220  5.54e-07 TTTTCGTTGG GGACAGTGAAGG ACTTGGACTT
43110                       236  2.21e-06 TGCCTAATTC TGACTGTGAAAC CCCTTCATCG
42539                       246  3.14e-06 TTGTTAATAT TGACAATGAGGG AAGAAACTTA
20024                        49  4.39e-06 CCAACGACAT GGACAGCGAAAG TACCGCCGGT
3241                        166  9.15e-06 TTCCCTGAAC TGAGAGCGAAGC ATTCGAACCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17044                             2.5e-07  245_[+1]_243
38655                             2.5e-07  69_[+1]_419
10350                               3e-07  67_[+1]_421
38650                               4e-07  376_[+1]_112
48986                             5.5e-07  219_[+1]_269
43110                             2.2e-06  235_[+1]_253
42539                             3.1e-06  245_[+1]_243
20024                             4.4e-06  48_[+1]_440
3241                              9.1e-06  165_[+1]_323
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
17044                    (  246) TGACTGTGAATG  1
38655                    (   70) TGACTGTGAATG  1
10350                    (   68) TGACTGTGAGGG  1
38650                    (  377) TGACAGTGAGTG  1
48986                    (  220) GGACAGTGAAGG  1
43110                    (  236) TGACTGTGAAAC  1
42539                    (  246) TGACAATGAGGG  1
20024                    (   49) GGACAGCGAAAG  1
3241                     (  166) TGAGAGCGAAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 10.4181 E= 5.1e-002
  -982   -982     -4    159
  -982   -982    212   -982
   193   -982   -982   -982
  -982    183   -104   -982
   108   -982   -982     79
  -124   -982    195   -982
  -982    -17   -982    159
  -982   -982    212   -982
   193   -982   -982   -982
   134   -982     54   -982
   -24   -982     95     37
  -982    -17    176   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 5.1e-002
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.555556  0.000000  0.000000  0.444444
 0.111111  0.000000  0.888889  0.000000
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.222222  0.000000  0.444444  0.333333
 0.000000  0.222222  0.777778  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TG]GAC[AT]G[TC]GA[AG][GTA][GC]
--------------------------------------------------------------------------------

Time  1.70 secs.
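
The "(17.5 bits)" relative-entropy figure in the Motif 1 description can be cross-checked against the Motif 1 letter-probability matrix and the background letter frequencies printed in this file. The following is a minimal sketch, assuming the figure is the sum over motif columns of sum_a p_a * log2(p_a / b_a) and using the rounded values as printed; the names `BACKGROUND`, `MOTIF_1`, and the two helper functions are illustrative and are not part of MEME.

```python
import math

# Background letter frequencies as printed in this file (order: A, C, G, T).
BACKGROUND = [0.263, 0.250, 0.229, 0.258]

# Motif 1 letter-probability matrix, copied from this file (one row per column
# of the motif, values in A, C, G, T order).
MOTIF_1 = [
    [0.000000, 0.000000, 0.222222, 0.777778],
    [0.000000, 0.000000, 1.000000, 0.000000],
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.000000, 0.888889, 0.111111, 0.000000],
    [0.555556, 0.000000, 0.000000, 0.444444],
    [0.111111, 0.000000, 0.888889, 0.000000],
    [0.000000, 0.222222, 0.000000, 0.777778],
    [0.000000, 0.000000, 1.000000, 0.000000],
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.666667, 0.000000, 0.333333, 0.000000],
    [0.222222, 0.000000, 0.444444, 0.333333],
    [0.000000, 0.222222, 0.777778, 0.000000],
]


def column_relative_entropy(column, background):
    """Relative entropy in bits of one motif column against the background.

    Letters with probability 0 contribute nothing (the limit of p*log p is 0).
    """
    return sum(p * math.log2(p / b) for p, b in zip(column, background) if p > 0)


def motif_relative_entropy(matrix, background):
    """Sum of the per-column relative entropies of a motif."""
    return sum(column_relative_entropy(col, background) for col in matrix)


total_bits = motif_relative_entropy(MOTIF_1, BACKGROUND)
print(f"Motif 1 relative entropy: {total_bits:.2f} bits")
```

The per-column values from `column_relative_entropy` are also what the starred "bits" diagram appears to depict, so the same helper can be used to sanity-check which columns of a motif are well conserved.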

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  14  llr = 131  E-value = 2.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::12:a1::413
pos.-specific     C  3a:4a::4416:
probability       G  1:91::::51:7
matrix            T  6:13::96144:

         bits    2.1
                 1.9  *  **
                 1.7  *  **
                 1.5  *  ***
Relative         1.3  ** ***    *
Entropy          1.1  ** ****   *
(13.5 bits)      0.9 *** ****   *
                 0.6 *** ***** **
                 0.4 *** ***** **
                 0.2 ************
                 0.0 ------------

Multilevel           TCGCCATTGACG
consensus            C  T   CCTTA
sequence                A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
38655                        40  5.47e-08 ATTCCCACTG TCGCCATTGTCG CAAGCAAGCA
29532                       349  1.65e-06 TTTGACTTGG TCGTCATTGATG AAGCGGCACA
48986                        96  7.25e-06 TTCCCGTGAA TCGTCATTCCCG AATGGTGACG
43110                       279  8.92e-06 TTCAAGACTT TCGTCATTCACA GTCACTGTCA
3241                        276  8.92e-06 GATTCCAACT CCGCCATCGGCG CCACCCCTGC
43220                       119  1.93e-05 CCAAAGCCAA CCGCCATCCACA GTCTTCCAGT
20024                        20  2.24e-05 TTCAGCACTT GCGCCATCCTCG GCATCATCCA
43975                       127  2.65e-05 GAATTTTATC TCGACATCGTAG CTTCTAACCA
9491                        250  3.58e-05 AAATGCGTTC TCTCCATCGATG CTGGTCTTGC
17044                       339  3.86e-05 CCGATCCAAG TCGACATTTCTG GAAGGGCCAT
38650                        48  4.40e-05 GATATTGGTG CCGCCATTCGTA GTTGAAGAGA
42539                       414  4.75e-05 AATGAGAAAA TCGACAATGTTG TACAAAAGAG
37267                       356  6.60e-05 AAAGCGTAGA CCATCATCGTCG GATAGATCAC
10350                        26  7.01e-05 CCACTTGTCT TCGGCATTTACA AGGATGGCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38655                             5.5e-08  39_[+2]_449
29532                             1.6e-06  348_[+2]_140
48986                             7.2e-06  95_[+2]_393
43110                             8.9e-06  278_[+2]_210
3241                              8.9e-06  275_[+2]_213
43220                             1.9e-05  118_[+2]_370
20024                             2.2e-05  19_[+2]_469
43975                             2.6e-05  126_[+2]_362
9491                              3.6e-05  249_[+2]_239
17044                             3.9e-05  338_[+2]_150
38650                             4.4e-05  47_[+2]_441
42539                             4.7e-05  413_[+2]_75
37267                             6.6e-05  355_[+2]_133
10350                               7e-05  25_[+2]_463
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=14
38655                    (   40) TCGCCATTGTCG  1
29532                    (  349) TCGTCATTGATG  1
48986                    (   96) TCGTCATTCCCG  1
43110                    (  279) TCGTCATTCACA  1
3241                     (  276) CCGCCATCGGCG  1
43220                    (  119) CCGCCATCCACA  1
20024                    (   20) GCGCCATCCTCG  1
43975                    (  127) TCGACATCGTAG  1
9491                     (  250) TCTCCATCGATG  1
17044                    (  339) TCGACATTTCTG  1
38650                    (   48) CCGCCATTCGTA  1
42539                    (  414) TCGACAATGTTG  1
37267                    (  356) CCATCATCGTCG  1
10350                    (   26) TCGGCATTTACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 8.93074 E= 2.4e+001
 -1045     19   -168    132
 -1045    200  -1045  -1045
  -188  -1045    190   -185
   -29     77   -168     15
 -1045    200  -1045  -1045
   193  -1045  -1045  -1045
  -188  -1045  -1045    185
 -1045     77  -1045    115
 -1045     51    113    -85
    44    -81    -68     47
  -188    119  -1045     47
    12  -1045    164  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 2.4e+001
 0.000000  0.285714  0.071429  0.642857
 0.000000  1.000000  0.000000  0.000000
 0.071429  0.000000  0.857143  0.071429
 0.214286  0.428571  0.071429  0.285714
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.071429  0.000000  0.000000  0.928571
 0.000000  0.428571  0.000000  0.571429
 0.000000  0.357143  0.500000  0.142857
 0.357143  0.142857  0.142857  0.357143
 0.071429  0.571429  0.000000  0.357143
 0.285714  0.000000  0.714286  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TC]CG[CTA]CAT[TC][GC][AT][CT][GA]
--------------------------------------------------------------------------------

Time  3.39 secs.
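
The regular expression reported for each motif can be related back to its letter-probability matrix. The sketch below is a hedged illustration, not MEME's own code: it assumes a letter is kept when its column probability is at least 0.2 and that kept letters are listed from most to least probable, with ties left in A, C, G, T order. That cutoff is an assumption chosen because it reproduces the expressions printed in this file; `matrix_to_regex`, `ALPHABET`, and `MOTIF_2` are illustrative names.

```python
ALPHABET = "ACGT"

# Motif 2 letter-probability matrix, copied from this file (A, C, G, T order).
MOTIF_2 = [
    [0.000000, 0.285714, 0.071429, 0.642857],
    [0.000000, 1.000000, 0.000000, 0.000000],
    [0.071429, 0.000000, 0.857143, 0.071429],
    [0.214286, 0.428571, 0.071429, 0.285714],
    [0.000000, 1.000000, 0.000000, 0.000000],
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.071429, 0.000000, 0.000000, 0.928571],
    [0.000000, 0.428571, 0.000000, 0.571429],
    [0.000000, 0.357143, 0.500000, 0.142857],
    [0.357143, 0.142857, 0.142857, 0.357143],
    [0.071429, 0.571429, 0.000000, 0.357143],
    [0.285714, 0.000000, 0.714286, 0.000000],
]


def matrix_to_regex(matrix, alphabet=ALPHABET, threshold=0.2):
    """Build a character-class regex from a letter-probability matrix.

    The 0.2 threshold is an assumption made for this sketch.
    """
    parts = []
    for column in matrix:
        # Keep letters at or above the threshold, most probable first.
        # sorted() is stable, so equal probabilities stay in alphabet order.
        kept = sorted(
            ((p, letter) for p, letter in zip(column, alphabet) if p >= threshold),
            key=lambda item: -item[0],
        )
        letters = "".join(letter for _, letter in kept)
        # A single surviving letter is written bare; several become a class.
        parts.append(letters if len(letters) == 1 else f"[{letters}]")
    return "".join(parts)


print(matrix_to_regex(MOTIF_2))
```

The resulting string can be passed to any regular-expression engine to scan sequences for candidate sites, bearing in mind that such a scan ignores the position-specific weights that the log-odds matrix carries.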

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   4  llr = 68  E-value = 5.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::5::::::3::3:3:
pos.-specific     C  3:53:a::8::::5::
probability       G  8a:8a:8a3:8a::8a
matrix            T  ::::::3::83:85::

         bits    2.1  *  *  *   *   *
                 1.9  *  ** *   *   *
                 1.7  *  ** *   *   *
                 1.5  *  ** *   *   *
Relative         1.3 ** ****** **  **
Entropy          1.1 ****************
(24.4 bits)      0.9 ****************
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGAGGCGGCTGGTCGG
consensus            C CC  T GAT ATA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
29532                       377  4.10e-09 CACAGGCCCT GGAGGCGGCTTGTTGG ATTCCTGCCT
37267                       129  8.61e-09 GACAGGCTGG CGCGGCGGCAGGTCGG TTTCGATGAT
17044                       100  1.78e-08 AGGCACGGTA GGAGGCTGCTGGTTAG GCCTTTGTTC
43975                        15  1.92e-08 GCCTTTCCGA GGCCGCGGGTGGACGG AATGTTCGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
29532                             4.1e-09  376_[+3]_108
37267                             8.6e-09  128_[+3]_356
17044                             1.8e-08  99_[+3]_385
43975                             1.9e-08  14_[+3]_470
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=4
29532                    (  377) GGAGGCGGCTTGTTGG  1
37267                    (  129) CGCGGCGGCAGGTCGG  1
17044                    (  100) GGAGGCTGCTGGTTAG  1
43975                    (   15) GGCCGCGGGTGGACGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 11.4657 E= 5.7e+002
  -865      0    171   -865
  -865   -865    212   -865
    93    100   -865   -865
  -865      0    171   -865
  -865   -865    212   -865
  -865    199   -865   -865
  -865   -865    171     -4
  -865   -865    212   -865
  -865    158     13   -865
    -7   -865   -865    154
  -865   -865    171     -4
  -865   -865    212   -865
    -7   -865   -865    154
  -865    100   -865     95
    -7   -865    171   -865
  -865   -865    212   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 4 E= 5.7e+002
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.500000  0.000000  0.500000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC]G[AC][GC]GC[GT]G[CG][TA][GT]G[TA][CT][GA]G
--------------------------------------------------------------------------------

Time  4.96 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42539                            1.04e-03  245_[+1(3.14e-06)]_156_\
    [+2(4.75e-05)]_75
3241                             4.81e-04  165_[+1(9.15e-06)]_98_\
    [+2(8.92e-06)]_213
9491                             1.48e-01  249_[+2(3.58e-05)]_239
43110                            7.94e-05  235_[+1(2.21e-06)]_31_\
    [+2(8.92e-06)]_210
43220                            7.97e-02  118_[+2(1.93e-05)]_370
38655                            5.27e-07  39_[+2(5.47e-08)]_18_[+1(2.55e-07)]_\
    419
29532                            2.06e-07  348_[+2(1.65e-06)]_16_\
    [+3(4.10e-09)]_108
10350                            2.93e-04  25_[+2(7.01e-05)]_30_[+1(2.98e-07)]_\
    421
17044                            6.63e-09  99_[+3(1.78e-08)]_130_\
    [+1(2.55e-07)]_81_[+2(3.86e-05)]_150
20024                            1.44e-04  19_[+2(2.24e-05)]_17_[+1(4.39e-06)]_\
    440
38650                            1.66e-04  47_[+2(4.40e-05)]_317_\
    [+1(4.04e-07)]_112
43975                            1.27e-05  14_[+3(1.92e-08)]_96_[+2(2.65e-05)]_\
    362
48986                            3.38e-05  95_[+2(7.25e-06)]_112_\
    [+1(5.54e-07)]_269
37267                            1.51e-05  128_[+3(8.61e-09)]_211_\
    [+2(6.60e-05)]_133
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
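
The combined block diagrams in the summary encode, for each sequence, the gaps between sites and the motif occupying each site, with a trailing `\` marking a wrapped line. As a rough sketch of how such a diagram string could be turned back into 1-based start positions, the code below walks the tokens for sequence 17044. The parsing rules are inferred from this file rather than taken from a MEME specification, and `parse_diagram`, `MOTIF_WIDTHS`, and `SITE` are illustrative names.

```python
import re

# Motif widths as reported in the MOTIF header lines of this file.
MOTIF_WIDTHS = {1: 12, 2: 12, 3: 16}

# A site token looks like "[+3(1.78e-08)]": strand, motif number, p-value.
SITE = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram, widths):
    """Return (sites, total_length) for one combined block diagram.

    Each site is (motif, strand, start, p_value) with a 1-based start.
    """
    # Drop the line-continuation backslashes and any wrapping whitespace.
    flat = "".join(diagram.replace("\\", "").split())
    position = 0  # number of sequence positions consumed so far
    sites = []
    for token in flat.split("_"):
        if not token:
            continue
        match = SITE.fullmatch(token)
        if match:
            strand, motif, p_value = match.group(1), int(match.group(2)), float(match.group(3))
            sites.append((motif, strand, position + 1, p_value))
            position += widths[motif]
        else:
            # A bare number is a gap of that many positions.
            position += int(token)
    return sites, position


# Diagram for sequence 17044, copied from the summary (including the wrap marker).
DIAGRAM_17044 = r"99_[+3(1.78e-08)]_130_\ [+1(2.55e-07)]_81_[+2(3.86e-05)]_150"

sites, length = parse_diagram(DIAGRAM_17044, MOTIF_WIDTHS)
for motif, strand, start, p_value in sites:
    print(f"motif {motif} ({strand}) starts at {start}, p = {p_value:g}")
print("sequence length:", length)
```

For 17044 the recovered starts agree with the Start column of the per-motif site tables, and the consumed length matches the 500 listed in the training set, which is a convenient consistency check when post-processing these files.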