********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/85/85.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
13244                    1.0000    500  47068                    1.0000    500
52286                    1.0000    500  48233                    1.0000    500
38821                    1.0000    500  39151                    1.0000    500
15138                    1.0000    500  7235                     1.0000    500
48862                    1.0000    500  43731                    1.0000    500
49634                    1.0000    500  44750                    1.0000    500
12762                    1.0000    500  31539                    1.0000    500
42529                    1.0000    500  42644                    1.0000    500
49353                    1.0000    500  49690                    1.0000    500
45763                    1.0000    500  49239                    1.0000    500
50342                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/85/85.seqs.fa -oc motifs/85 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       21    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10500    N=              21
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.259 C 0.251 G 0.217 T 0.274
Background letter frequencies (from dataset with add-one prior applied):
A 0.259 C 0.251 G 0.217 T 0.274
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 7 llr = 127 E-value = 7.2e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:4:::9:a7:31:3a7:1
pos.-specific     C  9::a:3:::1:13341::149
probability       G  :a::1:9a:3:164:94::4:
matrix            T  1:::471:16::1:4:3:11:

         bits    2.2  *     *
                 2.0  ***   *  *      *
                 1.8  ***   *  *      *
                 1.5  ***  **  *    * *
Relative         1.3 ****  *** *    * *  *
Entropy          1.1 **** **** *    * *  *
(26.1 bits)      0.9 **** **** **   * ** *
                 0.7 **** ********  * ****
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CGACATGGATAAGGCGGAACC
consensus                TC   G  CAT A  G
sequence                          C  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
50342                        11  2.61e-11 GACTATGAAA CGACTTGGATAAGACGTAACC AACTAAGGCA
49353                       225  2.61e-11 GACTATGAAA CGACTTGGATAAGACGTAACC AACTAAGGCA
42529                       347  2.44e-10 GACTTGATTG CGACACGGAGAACGTGAAAGC GATCTTTCAA
31539                       347  2.44e-10 GACTTGATTG CGACACGGAGAACGTGAAAGC GATCTTTCAA
7235                        332  2.67e-08 CCGAGGCTTT CGACGTGGATACGGCCGACTC CGCGACCGAG
12762                       177  2.80e-08 TCATTCTACC CGACTTTGATAGTCTGGAAGA TTACCTGGCG
13244                       356  5.11e-08 AGGATATCAC TGACATGGTCAAGCAGGATCC GGACAAGGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50342                             2.6e-11  10_[+1]_469
49353                             2.6e-11  224_[+1]_255
42529                             2.4e-10  346_[+1]_133
31539                             2.4e-10  346_[+1]_133
7235                              2.7e-08  331_[+1]_148
12762                             2.8e-08  176_[+1]_303
13244                             5.1e-08  355_[+1]_124
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=7
50342                    (   11) CGACTTGGATAAGACGTAACC  1
49353                    (  225) CGACTTGGATAAGACGTAACC  1
42529                    (  347) CGACACGGAGAACGTGAAAGC  1
31539                    (  347) CGACACGGAGAACGTGAAAGC  1
7235                     (  332) CGACGTGGATACGGCCGACTC  1
12762                    (  177) CGACTTTGATAGTCTGGAAGA  1
13244                    (  356) TGACATGGTCAAGCAGGATCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 10080 bayes= 10.3346 E= 7.2e-002
  -945    177   -945    -94
  -945   -945    220   -945
   195   -945   -945   -945
  -945    200   -945   -945
    72   -945    -60     65
  -945     19   -945    138
  -945   -945    198    -94
  -945   -945    220   -945
   172   -945   -945    -94
  -945    -81     40    106
   195   -945   -945   -945
   146    -81    -60   -945
  -945     19    140    -94
    14     19     98   -945
   -86     77   -945     65
  -945    -81    198   -945
    14   -945     98      6
   195   -945   -945   -945
   146    -81   -945    -94
  -945     77     98    -94
   -86    177   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 7.2e-002
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.000000  0.142857  0.428571
 0.000000  0.285714  0.000000  0.714286
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.000000  1.000000  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.142857  0.285714  0.571429
 1.000000  0.000000  0.000000  0.000000
 0.714286  0.142857  0.142857  0.000000
 0.000000  0.285714  0.571429  0.142857
 0.285714  0.285714  0.428571  0.000000
 0.142857  0.428571  0.000000  0.428571
 0.000000  0.142857  0.857143  0.000000
 0.285714  0.000000  0.428571  0.285714
 1.000000  0.000000  0.000000  0.000000
 0.714286  0.142857  0.000000  0.142857
 0.000000  0.428571  0.428571  0.142857
 0.142857  0.857143  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CGAC[AT][TC]GGA[TG]AA[GC][GAC][CT]G[GAT]AA[CG]C
--------------------------------------------------------------------------------

Time 3.81 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 20 sites = 13 llr = 174 E-value = 4.2e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  2212:512:55:43186a74
pos.-specific     C  :5269:::7:2:269:3::5
probability       G  2122129:31::3:::::21
matrix            T  725::4:8:44a11:21:1:

         bits    2.2
                 2.0                  *
                 1.8       *    *     *
                 1.5     * *    *  *  *
Relative         1.3     * *    *  ** *
Entropy          1.1     * ***  *  ** *
(19.3 bits)      0.9     * ***  *  ** **
                 0.7 *  ** **** * *******
                 0.4 *  ********* *******
                 0.2 ********************
                 0.0 --------------------

Multilevel           TCTCCAGTCAATACCAAAAC
consensus             TCA T AGTT GA  C GA
sequence                         C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
42529                       202  2.47e-09 GATACACGGG TCTACTGTCAATAACAAAAA GCGAGCAAGG
31539                       202  2.47e-09 GATACACGGG TCTACTGTCAATAACAAAAA GCGAGCAAGG
52286                       311  5.45e-09 CTACTCCCAA TTTCCAGTCTCTGCCAAAAA ATTACAATTA
12762                       355  7.75e-08 CGAAAATTGG GCCGCTGTCTTTACCAAAAA TAAGGTAAGT
50342                       215  1.17e-07 GTCAGATGCA ACGCCAGACTTTGCCACAAC GTCAAATGAG
49353                       429  1.17e-07 GTCAGATGCA ACGCCAGACTTTGCCACAAC GTCAAATGAG
15138                       262  1.43e-07 GCAGGCGCCT TCTCCGGTGTTTCCCAAATC GTGCAGGGCG
39151                       301  6.87e-07 TGTTCTCGGT TTTCCAATGGATACCACAAC TTACACATAT
13244                       318  7.42e-07 CAGTGCTGCT GCCACTGTCAATTACAAAGC TGTGGCTAAG
38821                       339  1.97e-06 ACGTATCCAG TATGCTGAGATTAACTCAAC AGGTACATGA
48862                         8  2.37e-06    GCAATCT TGTCCAGTGACTCCATAAAC ATGCTGTAAC
7235                         11  2.37e-06 ATCTTTCGCA TTACCAGTCAATCCCATAGG TATCTATCGC
49239                       376  3.78e-06 TCAACAAAGA TACCGGGTCAATGTCAAAGA TTATTTCCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42529                             2.5e-09  201_[+2]_279
31539                             2.5e-09  201_[+2]_279
52286                             5.4e-09  310_[+2]_170
12762                             7.7e-08  354_[+2]_126
50342                             1.2e-07  214_[+2]_266
49353                             1.2e-07  428_[+2]_52
15138                             1.4e-07  261_[+2]_219
39151                             6.9e-07  300_[+2]_180
13244                             7.4e-07  317_[+2]_163
38821                               2e-06  338_[+2]_142
48862                             2.4e-06  7_[+2]_473
7235                              2.4e-06  10_[+2]_470
49239                             3.8e-06  375_[+2]_105
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=13
42529                    (  202) TCTACTGTCAATAACAAAAA  1
31539                    (  202) TCTACTGTCAATAACAAAAA  1
52286                    (  311) TTTCCAGTCTCTGCCAAAAA  1
12762                    (  355) GCCGCTGTCTTTACCAAAAA  1
50342                    (  215) ACGCCAGACTTTGCCACAAC  1
49353                    (  429) ACGCCAGACTTTGCCACAAC  1
15138                    (  262) TCTCCGGTGTTTCCCAAATC  1
39151                    (  301) TTTCCAATGGATACCACAAC  1
13244                    (  318) GCCACTGTCAATTACAAAGC  1
38821                    (  339) TATGCTGAGATTAACTCAAC  1
48862                    (    8) TGTCCAGTGACTCCATAAAC  1
7235                     (   11) TTACCAGTCAATCCCATAGG  1
49239                    (  376) TACCGGGTCAATGTCAAAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 10101 bayes= 9.35515 E= 4.2e-002
   -75  -1035    -49    134
   -75    110   -149    -25
  -175    -12    -49     98
   -17    130    -49  -1035
 -1035    188   -149  -1035
    83  -1035    -49     49
  -175  -1035    209  -1035
   -17  -1035  -1035    149
 -1035    147     51  -1035
   105  -1035   -149     49
    83    -70  -1035     49
 -1035  -1035  -1035    187
    57    -12     51   -183
    25    130  -1035   -183
  -175    188  -1035  -1035
   171  -1035  -1035    -83
   125     30  -1035   -183
   195  -1035  -1035  -1035
   142  -1035      9   -183
    57    110   -149  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 13 E= 4.2e-002
 0.153846  0.000000  0.153846  0.692308
 0.153846  0.538462  0.076923  0.230769
 0.076923  0.230769  0.153846  0.538462
 0.230769  0.615385  0.153846  0.000000
 0.000000  0.923077  0.076923  0.000000
 0.461538  0.000000  0.153846  0.384615
 0.076923  0.000000  0.923077  0.000000
 0.230769  0.000000  0.000000  0.769231
 0.000000  0.692308  0.307692  0.000000
 0.538462  0.000000  0.076923  0.384615
 0.461538  0.153846  0.000000  0.384615
 0.000000  0.000000  0.000000  1.000000
 0.384615  0.230769  0.307692  0.076923
 0.307692  0.615385  0.000000  0.076923
 0.076923  0.923077  0.000000  0.000000
 0.846154  0.000000  0.000000  0.153846
 0.615385  0.307692  0.000000  0.076923
 1.000000  0.000000  0.000000  0.000000
 0.692308  0.000000  0.230769  0.076923
 0.384615  0.538462  0.076923  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[CT][TC][CA]C[AT]G[TA][CG][AT][AT]T[AGC][CA]CA[AC]A[AG][CA]
--------------------------------------------------------------------------------

Time 7.36 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 6 llr = 120 E-value = 1.1e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  5a375:82:a283:3288a::
pos.-specific     C  :::::::82:::7::3:::::
probability       G  5:::5:::8:22:a7522:7:
matrix            T  ::73:a2:::7::::::::3a

         bits    2.2              *
                 2.0  *       *   *    *
                 1.8  *   *   *   *    * *
                 1.5  *   *  **   *    * *
Relative         1.3  *   ***** * *  *** *
Entropy          1.1 ** ******* **** *****
(28.9 bits)      0.9 ********** **** *****
                 0.7 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AATAATACGATACGGGAAAGT
consensus            G ATG       A AC   T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
50342                       106  1.19e-11 TAGAATTCCA GATTGTACGATAAGGGAAAGT ATGGTATCTT
49353                       320  1.19e-11 TAGAATTCCA GATTGTACGATAAGGGAAAGT ATGGTATCTT
42529                        31  1.19e-11 TGCGTGCTCC AAAAATACGATACGGCAAAGT CATCGTGGTT
31539                        31  1.15e-10 TGCGTGCTCC AAAAATACGAAACGGCAAAGT CATCGTGGTT
39151                         5  5.99e-09       CGTA AATAATACGAGGCGAAGAATT GGCTTTCGCC
47068                        77  8.25e-09 TTGATATACT GATAGTTACATACGAGAGATT ACAAAATTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50342                             1.2e-11  105_[+3]_374
49353                             1.2e-11  319_[+3]_160
42529                             1.2e-11  30_[+3]_449
31539                             1.2e-10  30_[+3]_449
39151                               6e-09  4_[+3]_475
47068                             8.3e-09  76_[+3]_403
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
50342                    (  106) GATTGTACGATAAGGGAAAGT  1
49353                    (  320) GATTGTACGATAAGGGAAAGT  1
42529                    (   31) AAAAATACGATACGGCAAAGT  1
31539                    (   31) AAAAATACGAAACGGCAAAGT  1
39151                    (    5) AATAATACGAGGCGAAGAATT  1
47068                    (   77) GATAGTTACATACGAGAGATT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 10080 bayes= 11.1611 E= 1.1e-001
    95   -923    121   -923
   195   -923   -923   -923
    36   -923   -923    128
   136   -923   -923     28
    95   -923    121   -923
  -923   -923   -923    187
   168   -923   -923    -71
   -64    173   -923   -923
  -923    -59    194   -923
   195   -923   -923   -923
   -64   -923    -38    128
   168   -923    -38   -923
    36    141   -923   -923
  -923   -923    220   -923
    36   -923    162   -923
   -64     41    121   -923
   168   -923    -38   -923
   168   -923    -38   -923
   195   -923   -923   -923
  -923   -923    162     28
  -923   -923   -923    187
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.1e-001
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.666667  0.000000  0.000000  0.333333
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.833333  0.000000  0.000000  0.166667
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.166667  0.666667
 0.833333  0.000000  0.166667  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.166667  0.333333  0.500000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.833333  0.000000  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AG]A[TA][AT][AG]TACGATA[CA]G[GA][GC]AAA[GT]T
--------------------------------------------------------------------------------

Time 11.26 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
13244                            1.01e-06  317_[+2(7.42e-07)]_18_\
    [+1(5.11e-08)]_124
47068                            3.41e-06  76_[+3(8.25e-09)]_227_\
    [+1(4.32e-05)]_155
52286                            8.74e-05  310_[+2(5.45e-09)]_170
48233                            9.36e-01  500
38821                            1.74e-03  338_[+2(1.97e-06)]_142
39151                            1.69e-07  4_[+3(5.99e-09)]_275_[+2(6.87e-07)]_\
    180
15138                            1.81e-04  261_[+2(1.43e-07)]_219
7235                             1.46e-06  10_[+2(2.37e-06)]_301_\
    [+1(2.67e-08)]_49_[+2(9.76e-05)]_79
48862                            7.88e-03  7_[+2(2.37e-06)]_473
43731                            6.97e-01  500
49634                            3.87e-01  500
44750                            9.90e-01  500
12762                            9.76e-08  176_[+1(2.80e-08)]_157_\
    [+2(7.75e-08)]_126
31539                            8.65e-18  30_[+3(1.15e-10)]_69_[+1(5.72e-05)]_\
    60_[+2(2.47e-09)]_125_[+1(2.44e-10)]_133
42529                            9.82e-19  30_[+3(1.19e-11)]_69_[+1(5.31e-06)]_\
    60_[+2(2.47e-09)]_125_[+1(2.44e-10)]_133
42644                            1.41e-01  84_[+3(5.59e-05)]_395
49353                            4.65e-18  224_[+1(2.61e-11)]_74_\
    [+3(1.19e-11)]_88_[+2(1.17e-07)]_52
49690                            3.64e-01  500
45763                            8.95e-01  500
49239                            6.39e-03  375_[+2(3.78e-06)]_105
50342                            4.65e-18  10_[+1(2.61e-11)]_74_[+3(1.19e-11)]_\
    88_[+2(1.17e-07)]_266
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************