********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/401/401.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47188                    1.0000    500  37532                    1.0000    500
21988                    1.0000    500  22279                    1.0000    500
22315                    1.0000    500  38713                    1.0000    500
14908                    1.0000    500  9897                     1.0000    500
43766                    1.0000    500  49336                    1.0000    500
55150                    1.0000    500  50481                    1.0000    500
11337                    1.0000    500  34373                    1.0000    500
34489                    1.0000    500  45544                    1.0000    500
45559                    1.0000    500  45816                    1.0000    500
36043                    1.0000    500  38633                    1.0000    500
48230                    1.0000    500  38709                    1.0000    500
48309                    1.0000    500  34556                    1.0000    500
43229                    1.0000    500  32443                    1.0000    500
45616                    1.0000    500  49379                    1.0000    500
49461                    1.0000    500  50492                    1.0000    500
38632                    1.0000    500  49007                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/401/401.seqs.fa -oc motifs/401 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       32    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           16000    N=              32
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.244 G 0.214 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.244 G 0.214 T 0.268
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  12  llr = 146  E-value = 5.6e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  5:::8:3::1a5
pos.-specific     C  :21::a::3::3
probability       G  4::a:::a:9:3
matrix            T  189:2:8:8:::

         bits    2.2    *   *
                 2.0    * * *
                 1.8    * * * **
                 1.6   ** * * **
Relative         1.3  *** * * **
Entropy          1.1  **********
(17.5 bits)      0.9  **********
                 0.7 ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ATTGACTGTGAA
consensus            G     A C  C
sequence                        G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
48309                       420  5.43e-08 TCCGAACAAG GTTGACTGTGAA CCTGAACATA
45559                        13  5.43e-08 CTTACTTTCA GTTGACTGTGAA TTACCATGGA
22315                       414  2.21e-07 ACTACAATAC ATTGACTGTGAG CGCTACGAAC
21988                        56  3.31e-07 GTTCACTCTG ATTGACTGTGAC CGTGACACAT
32443                        14  8.67e-07 CTGAAATGTT GTTGACAGTGAG TGACTTTTGA
38709                        48  8.67e-07 CGGAAGAACA ACTGACTGTGAA GGCTGTATGC
50481                       142  1.65e-06 GGGTCGATGC ACTGACTGTGAC TTCGAGTGCG
43229                       379  1.94e-06 CCGAAAAAGA ATCGACTGTGAA GAAGCGAGAA
49461                       329  2.59e-06 TGTGTGTTTC GTTGACTGTAAA TGCTGAGATA
47188                       389  4.27e-06 TTCGTCACCG TTTGACTGCGAG ATAGCGGGCT
48230                       123  5.81e-06 TGGACATAGG ATTGTCAGCGAA CTTACATGAA
38632                        53  7.83e-06 TCCAGGATTG GTTGTCAGCGAC GAATGTCGGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48309                             5.4e-08  419_[+1]_69
45559                             5.4e-08  12_[+1]_476
22315                             2.2e-07  413_[+1]_75
21988                             3.3e-07  55_[+1]_433
32443                             8.7e-07  13_[+1]_475
38709                             8.7e-07  47_[+1]_441
50481                             1.7e-06  141_[+1]_347
43229                             1.9e-06  378_[+1]_110
49461                             2.6e-06  328_[+1]_160
47188                             4.3e-06  388_[+1]_100
48230                             5.8e-06  122_[+1]_366
38632                             7.8e-06  52_[+1]_436
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=12
48309                    (  420) GTTGACTGTGAA  1
45559                    (   13) GTTGACTGTGAA  1
22315                    (  414) ATTGACTGTGAG  1
21988                    (   56) ATTGACTGTGAC  1
32443                    (   14) GTTGACAGTGAG  1
38709                    (   48) ACTGACTGTGAA  1
50481                    (  142) ACTGACTGTGAC  1
43229                    (  379) ATCGACTGTGAA  1
49461                    (  329) GTTGACTGTAAA  1
47188                    (  389) TTTGACTGCGAG  1
48230                    (  123) ATTGTCAGCGAA  1
38632                    (   53) GTTGTCAGCGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 15648 bayes= 10.7954 E= 5.6e-003
    87  -1023     96   -168
 -1023    -55  -1023    164
 -1023   -155  -1023    177
 -1023  -1023    222  -1023
   160  -1023  -1023    -68
 -1023    203  -1023  -1023
   -13  -1023  -1023    149
 -1023  -1023    222  -1023
 -1023      3  -1023    149
  -172  -1023    210  -1023
   187  -1023  -1023  -1023
    87      3     23  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 5.6e-003
 0.500000  0.000000  0.416667  0.083333
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.083333  0.000000  0.916667
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.083333  0.000000  0.916667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AG]TTGAC[TA]G[TC]GA[ACG]
--------------------------------------------------------------------------------

Time 8.31 secs.
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =  13  llr = 145  E-value = 1.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::212:7a:588
pos.-specific     C  :418::2::5:1
probability       G  :65:8a2:a:22
matrix            T  a:22::::::::

         bits    2.2      *  *
                 2.0 *    *  *
                 1.8 *    * **
                 1.6 *    * **
Relative         1.3 *   ** ** *
Entropy          1.1 ** *** ** *
(16.1 bits)      0.9 ** *** *****
                 0.7 ** *********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGGCGGAAGAAA
consensus             CA A    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
34556                       448  8.57e-08 TGGGAAAGAA TGGCGGAAGAAA AATAGTATCC
38713                       171  2.93e-07 AAGGATACTA TGACGGAAGAAA GATATTTGTG
48309                       160  7.87e-07 AAAATGAATA TGTCGGAAGCAA TAATCAAATG
50492                       327  1.67e-06 AAACTGCGAT TCGCAGAAGAAA CCTTCAGTCT
34373                       186  1.67e-06 AGGAAAATGC TGCCGGAAGCAA GTTCTCGCCG
38709                        35  3.38e-06 CACAAACAAT TCGCGGAAGAAC AACTGACTGT
34489                         2  4.23e-06          C TGGTGGAAGAAG ACGCGTCGGT
38633                       295  4.99e-06 CCCCAAGTCA TCACGGCAGAAA ACACACTCGC
38632                       219  5.71e-06 GTCAGCGATA TCGCAGGAGCAA TTGCAGCGAG
21988                       279  7.03e-06 ACACGACCAA TCGCAGAAGCGA CTCGCCATGA
50481                        98  7.59e-06 AATCCTTCCT TGGAGGGAGCAA AATTGTCAAG
45544                        61  9.72e-06 GACACTTTGC TGACGGCAGAGA ACATATCGTC
22279                       204  1.48e-05 ACTCTGTATT TGTTGGAAGCAG CCACGCAAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34556                             8.6e-08  447_[+2]_41
38713                             2.9e-07  170_[+2]_318
48309                             7.9e-07  159_[+2]_329
50492                             1.7e-06  326_[+2]_162
34373                             1.7e-06  185_[+2]_303
38709                             3.4e-06  34_[+2]_454
34489                             4.2e-06  1_[+2]_487
38633                               5e-06  294_[+2]_194
38632                             5.7e-06  218_[+2]_270
21988                               7e-06  278_[+2]_210
50481                             7.6e-06  97_[+2]_391
45544                             9.7e-06  60_[+2]_428
22279                             1.5e-05  203_[+2]_285
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=13
34556                    (  448) TGGCGGAAGAAA  1
38713                    (  171) TGACGGAAGAAA  1
48309                    (  160) TGTCGGAAGCAA  1
50492                    (  327) TCGCAGAAGAAA  1
34373                    (  186) TGCCGGAAGCAA  1
38709                    (   35) TCGCGGAAGAAC  1
34489                    (    2) TGGTGGAAGAAG  1
38633                    (  295) TCACGGCAGAAA  1
38632                    (  219) TCGCAGGAGCAA  1
21988                    (  279) TCGCAGAAGCGA  1
50481                    (   98) TGGAGGGAGCAA  1
45544                    (   61) TGACGGCAGAGA  1
22279                    (  204) TGTTGGAAGCAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 15648 bayes= 11.3971 E= 1.6e+001
 -1035  -1035  -1035    190
 -1035     65    152  -1035
   -25   -166    133    -80
  -183    165  -1035    -80
   -25  -1035    185  -1035
 -1035  -1035    223  -1035
   134    -67    -47  -1035
   187  -1035  -1035  -1035
 -1035  -1035    223  -1035
    97     92  -1035  -1035
   163  -1035    -47  -1035
   149   -166    -47  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 1.6e+001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.384615  0.615385  0.000000
 0.230769  0.076923  0.538462  0.153846
 0.076923  0.769231  0.000000  0.153846
 0.230769  0.000000  0.769231  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.692308  0.153846  0.153846  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.538462  0.461538  0.000000  0.000000
 0.846154  0.000000  0.153846  0.000000
 0.769231  0.076923  0.153846  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[GC][GA]C[GA]GAAG[AC]AA
--------------------------------------------------------------------------------

Time 16.40 secs.
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   8  llr = 116  E-value = 1.8e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  4:135a1:::84::5:
pos.-specific     C  66::::::61::6:::
probability       G  :4855::a493:495:
matrix            T  ::13::9::::6:1:a

         bits    2.2        *
                 2.0        *       *
                 1.8      * *       *
                 1.6      * * *   * *
Relative         1.3      *** *   * *
Entropy          1.1 *** ******* ****
(20.9 bits)      0.9 *** ************
                 0.7 *** ************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CCGGAATGCGATCGAT
consensus            AG AG   G GAG G
sequence                T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
22279                       333  1.25e-08 TCGGTGATAG AGGGAATGCGATCGAT TGCCTCGCGT
14908                       230  2.47e-08 ACACGAGTTA CGGGGATGCGGTGGAT ACTTACCGAA
9897                        437  4.79e-08 CGGGAAAGCG ACGAAATGCGAACGGT CTGGTTGCTC
45816                        48  1.50e-07 CACCAGAGAA CGAGAATGCGAACGAT GATGTCGCAG
49461                       288  2.05e-07 CGTGTTCTGT CCGGGATGGGGTGTGT GTGTGTGTGT
48309                       345  2.05e-07 CAGCCAGAAA CCTTGATGGGATGGGT TGTGAAGCGA
43766                       241  2.05e-07 GGACACGTCG CCGAAATGCCAACGGT ATTGGAGCTC
45616                       321  3.00e-07 TTGATACCCC ACGTGAAGGGATCGAT TTCCCGTGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22279                             1.2e-08  332_[+3]_152
14908                             2.5e-08  229_[+3]_255
9897                              4.8e-08  436_[+3]_48
45816                             1.5e-07  47_[+3]_437
49461                             2.1e-07  287_[+3]_197
48309                             2.1e-07  344_[+3]_140
43766                             2.1e-07  240_[+3]_244
45616                               3e-07  320_[+3]_164
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
22279                    (  333) AGGGAATGCGATCGAT  1
14908                    (  230) CGGGGATGCGGTGGAT  1
9897                     (  437) ACGAAATGCGAACGGT  1
45816                    (   48) CGAGAATGCGAACGAT  1
49461                    (  288) CCGGGATGGGGTGTGT  1
48309                    (  345) CCTTGATGGGATGGGT  1
43766                    (  241) CCGAAATGCCAACGGT  1
45616                    (  321) ACGTGAAGGGATCGAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 15520 bayes= 12.2435 E= 1.8e+003
    45    135   -965   -965
  -965    135     81   -965
  -113   -965    181   -110
   -13   -965    122    -10
    87   -965    122   -965
   187   -965   -965   -965
  -113   -965   -965    171
  -965   -965    222   -965
  -965    135     81   -965
  -965    -97    203   -965
   145   -965     23   -965
    45   -965   -965    122
  -965    135     81   -965
  -965   -965    203   -110
    87   -965    122   -965
  -965   -965   -965    190
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 1.8e+003
 0.375000  0.625000  0.000000  0.000000
 0.000000  0.625000  0.375000  0.000000
 0.125000  0.000000  0.750000  0.125000
 0.250000  0.000000  0.500000  0.250000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.625000  0.375000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.625000  0.375000  0.000000
 0.000000  0.000000  0.875000  0.125000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CA][CG]G[GAT][AG]ATG[CG]G[AG][TA][CG]G[AG]T
--------------------------------------------------------------------------------

Time 24.74 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47188                            6.11e-03  388_[+1(4.27e-06)]_100
37532                            5.37e-01  500
21988                            9.59e-06  55_[+1(3.31e-07)]_211_\
    [+2(7.03e-06)]_210
22279                            3.23e-06  203_[+2(1.48e-05)]_117_\
    [+3(1.25e-08)]_152
22315                            3.37e-03  413_[+1(2.21e-07)]_75
38713                            5.43e-04  170_[+2(2.93e-07)]_318
14908                            5.10e-04  229_[+3(2.47e-08)]_255
9897                             4.85e-04  436_[+3(4.79e-08)]_48
43766                            3.70e-03  240_[+3(2.05e-07)]_244
49336                            4.88e-01  500
55150                            5.46e-01  500
50481                            4.02e-05  97_[+2(7.59e-06)]_32_[+1(1.65e-06)]_\
    347
11337                            3.61e-01  500
34373                            6.85e-03  185_[+2(1.67e-06)]_303
34489                            3.20e-02  1_[+2(4.23e-06)]_487
45544                            1.05e-03  60_[+2(9.72e-06)]_297_\
    [+3(3.14e-05)]_115
45559                            5.74e-04  12_[+1(5.43e-08)]_476
45816                            3.18e-04  47_[+3(1.50e-07)]_437
36043                            8.92e-01  500
38633                            2.61e-02  294_[+2(4.99e-06)]_194
48230                            4.39e-02  122_[+1(5.81e-06)]_366
38709                            2.61e-05  34_[+2(3.38e-06)]_1_[+1(8.67e-07)]_\
    441
48309                            4.17e-10  159_[+2(7.87e-07)]_173_\
    [+3(2.05e-07)]_59_[+1(5.43e-08)]_69
34556                            1.28e-03  447_[+2(8.57e-08)]_41
43229                            8.35e-03  378_[+1(1.94e-06)]_110
32443                            6.72e-03  13_[+1(8.67e-07)]_475
45616                            1.09e-03  320_[+3(3.00e-07)]_164
49379                            9.25e-01  500
49461                            1.49e-05  287_[+3(2.05e-07)]_25_\
    [+1(2.59e-06)]_160
50492                            9.30e-03  326_[+2(1.67e-06)]_162
38632                            8.02e-04  52_[+1(7.83e-06)]_154_\
    [+2(5.71e-06)]_270
49007                            4.25e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************