********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/41/41.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42699                    1.0000    500  46775                    1.0000    500
37521                    1.0000    500  49335                    1.0000    500
49352                    1.0000    500  49766                    1.0000    500
16669                    1.0000    500  44106                    1.0000    500
33623                    1.0000    500  44512                    1.0000    500
11021                    1.0000    500  44980                    1.0000    500
45536                    1.0000    500  45563                    1.0000    500
45973                    1.0000    500  43324                    1.0000    500
54087                    1.0000    500  46403                    1.0000    500
46653                    1.0000    500  35895                    1.0000    500
47618                    1.0000    500  49648                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/41/41.seqs.fa -oc motifs/41 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       22    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           11000    N=              22
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.236 G 0.222 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.236 G 0.222 T 0.271
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  17  llr = 187  E-value = 8.3e-010
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::9:1:::952
pos.-specific     C  2:61a2::8::1
probability       G  2:2:::a:1151
matrix            T  6a2::7:a1::6

         bits    2.2     * *
                 2.0  *  * **
                 1.7  *  * **
                 1.5  *  * ** *
Relative         1.3  * ** ****
Entropy          1.1  * ** *****
(15.9 bits)      0.9  * ********
                 0.7 ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTCACTGTCAGT
consensus            G T  C    AA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
45536                       310  7.01e-08 AGGTGGCGTT TTCACTGTCAGT ACCAAAATCG
46775                       469  2.13e-07 CGACTCACCT GTCACTGTCAGT CACTCACACC
43324                        90  4.25e-07 CGGTATTGGC TTCACCGTCAGT AGCGTCATAG
45563                       405  4.25e-07 CGCGGTAGTG TTCACTGTCAGA CCACCACTAT
44512                       191  4.25e-07 AAAACAGAAA TTCACCGTCAGT CAGCAGACCT
49335                       165  4.25e-07 AATTCGTCCC TTCACTGTCAGA ACGAAAATGA
11021                        64  8.82e-07 CAAATTATGA TTTACTGTCAAT TGTAAGTTTA
44106                       469  1.04e-06 GCAGAAACGC CTCACTGTCAAT CGATATATCC
54087                       116  1.26e-06 TCGACTGACA GTGACTGTCAGT AAAAAAGCTC
49648                       403  3.65e-06 TTTTTGCACC TTCACTGTCGGT CCAATCGACG
37521                       362  7.08e-06 TGAAGAGTTC GTCACTGTCAAG AATGGCGCCT
33623                       430  9.15e-06 GCAAAAGGAA TTTACTGTCAAC GCGTCCAAGC
35895                       334  1.19e-05 ACGAAACAAT CTCACAGTCAAT CACAAATCAC
16669                       409  1.56e-05 GCGATAAGGT TTGCCCGTCAAT TCTTTCGGAC
42699                       313  2.09e-05 CGTTTTTACT GTTACCGTTAGT TCGTTCCGAT
44980                       250  4.24e-05 TACCGTACAA CTGACTGTGAAA AGTCTCTTAC
46653                       148  4.62e-05 ATTGACTGTC TTTCCTGTTAAA TACGATACTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45536                               7e-08  309_[+1]_179
46775                             2.1e-07  468_[+1]_20
43324                             4.3e-07  89_[+1]_399
45563                             4.3e-07  404_[+1]_84
44512                             4.3e-07  190_[+1]_298
49335                             4.3e-07  164_[+1]_324
11021                             8.8e-07  63_[+1]_425
44106                               1e-06  468_[+1]_20
54087                             1.3e-06  115_[+1]_373
49648                             3.6e-06  402_[+1]_86
37521                             7.1e-06  361_[+1]_127
33623                             9.2e-06  429_[+1]_59
35895                             1.2e-05  333_[+1]_155
16669                             1.6e-05  408_[+1]_80
42699                             2.1e-05  312_[+1]_176
44980                             4.2e-05  249_[+1]_239
46653                             4.6e-05  147_[+1]_341
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=17
45536                    (  310) TTCACTGTCAGT  1
46775                    (  469) GTCACTGTCAGT  1
43324                    (   90) TTCACCGTCAGT  1
45563                    (  405) TTCACTGTCAGA  1
44512                    (  191) TTCACCGTCAGT  1
49335                    (  165) TTCACTGTCAGA  1
11021                    (   64) TTTACTGTCAAT  1
44106                    (  469) CTCACTGTCAAT  1
54087                    (  116) GTGACTGTCAGT  1
49648                    (  403) TTCACTGTCGGT  1
37521                    (  362) GTCACTGTCAAG  1
33623                    (  430) TTTACTGTCAAC  1
35895                    (  334) CTCACAGTCAAT  1
16669                    (  409) TTGCCCGTCAAT  1
42699                    (  313) GTTACCGTTAGT  1
44980                    (  250) CTGACTGTGAAA  1
46653                    (  148) TTTCCTGTTAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10758 bayes= 10.099 E= 8.3e-010
 -1073    -42      8    112
 -1073  -1073  -1073    188
 -1073    132    -33    -20
   170   -100  -1073  -1073
 -1073    209  -1073  -1073
  -220      0  -1073    138
 -1073  -1073    217  -1073
 -1073  -1073  -1073    188
 -1073    181   -191   -120
   179  -1073   -191  -1073
    79  -1073    125  -1073
   -21   -200   -191    126
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 17 E= 8.3e-010
 0.000000  0.176471  0.235294  0.588235
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.588235  0.176471  0.235294
 0.882353  0.117647  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.058824  0.235294  0.000000  0.705882
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.823529  0.058824  0.117647
 0.941176  0.000000  0.058824  0.000000
 0.470588  0.000000  0.529412  0.000000
 0.235294  0.058824  0.058824  0.647059
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TG]T[CT]AC[TC]GTCA[GA][TA]
--------------------------------------------------------------------------------


Time  4.99 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   9  llr = 113  E-value = 1.7e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::8:4:::a461
pos.-specific     C  1::9::::::2:
probability       G  :a21:a1a:6:9
matrix            T  9:::6:9:::2:

         bits    2.2  *   * *
                 2.0  *   * **
                 1.7  *   * **  *
                 1.5  * * ****  *
Relative         1.3 ** * ****  *
Entropy          1.1 **** ***** *
(18.1 bits)      0.9 ********** *
                 0.7 ********** *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGACTGTGAGAG
consensus              G A    AC
sequence                       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44106                       242  5.10e-08 TCAATTGCAT TGACTGTGAGAG CGACGATGTC
46403                       296  1.02e-07 TTGCAACGAA TGACAGTGAGAG AAATCTGTCC
46775                       188  1.65e-07 ATGCCCAGAC TGACTGTGAAAG CCGGGGAGTA
45973                       395  8.36e-07 GAAAATCAAA TGGCAGTGAAAG AGCCAAATCA
44980                        97  1.36e-06 AAAAGGGCTG TGGCTGTGAACG TCGACGACGA
44512                       254  1.88e-06 GTAGCGGTGT TGACAGTGAGAA CGAGGTGTCA
54087                       425  2.02e-06 ATGAGCGCAA CGACTGTGAGCG CGGATGGAAG
33623                       379  2.27e-06 GTGCTTGGAT TGAGTGTGAGTG CTCGTCATAG
11021                        40  3.44e-06 GAATAGCTCC TGACAGGGAATG GACAAATTAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44106                             5.1e-08  241_[+2]_247
46403                               1e-07  295_[+2]_193
46775                             1.6e-07  187_[+2]_301
45973                             8.4e-07  394_[+2]_94
44980                             1.4e-06  96_[+2]_392
44512                             1.9e-06  253_[+2]_235
54087                               2e-06  424_[+2]_64
33623                             2.3e-06  378_[+2]_110
11021                             3.4e-06  39_[+2]_449
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=9
44106                    (  242) TGACTGTGAGAG  1
46403                    (  296) TGACAGTGAGAG  1
46775                    (  188) TGACTGTGAAAG  1
45973                    (  395) TGGCAGTGAAAG  1
44980                    (   97) TGGCTGTGAACG  1
44512                    (  254) TGACAGTGAGAA  1
54087                    (  425) CGACTGTGAGCG  1
33623                    (  379) TGAGTGTGAGTG  1
11021                    (   40) TGACAGGGAATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10758 bayes= 10.3564 E= 1.7e-001
  -982   -108   -982    171
  -982   -982    217   -982
   152   -982      0   -982
  -982    191   -100   -982
    71   -982   -982    104
  -982   -982    217   -982
  -982   -982   -100    171
  -982   -982    217   -982
   188   -982   -982   -982
    71   -982    132   -982
   103     -8   -982    -28
  -129   -982    200   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 1.7e-001
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.000000  1.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.444444  0.000000  0.000000  0.555556
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.444444  0.000000  0.555556  0.000000
 0.555556  0.222222  0.000000  0.222222
 0.111111  0.000000  0.888889  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TG[AG]C[TA]GTGA[GA][ACT]G
--------------------------------------------------------------------------------


Time  9.79 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  19  sites =   4  llr = 81  E-value = 9.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::35::3::5a3::a8:::
pos.-specific     C  :33:aa::a::8:8:3a::
probability       G  8835::5::5::a3:::a:
matrix            T  3:3:::3a::::::::::a

         bits    2.2     **  *   *   **
                 2.0     ** ** * * * ***
                 1.7     ** ** * * * ***
                 1.5     ** ** * * * ***
Relative         1.3 **  ** ** ***** ***
Entropy          1.1 ** *** ************
(29.1 bits)      0.9 ** *** ************
                 0.7 ** *** ************
                 0.4 ** ****************
                 0.2 ** ****************
                 0.0 -------------------

Multilevel           GGAACCGTCAACGCAACGT
consensus            TCCG  A  G A G C
sequence               G   T
                       T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
46775                       306  3.11e-12 GATGACCTTC GGCGCCGTCGACGCAACGT GGAGGAACGC
45563                       355  5.55e-10 CTACTATATC GGTGCCATCAACGGAACGT TTGCCTTCTG
37521                       164  1.05e-09 GGGAATCGAT TCGACCGTCAACGCAACGT TTGCGCCGTG
45973                        90  1.59e-09 TGCTGGTTCT GGAACCTTCGAAGCACCGT CAAAACTTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46775                             3.1e-12  305_[+3]_176
45563                             5.5e-10  354_[+3]_127
37521                               1e-09  163_[+3]_318
45973                             1.6e-09  89_[+3]_392
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=4
46775                    (  306) GGCGCCGTCGACGCAACGT  1
45563                    (  355) GGTGCCATCAACGGAACGT  1
37521                    (  164) TCGACCGTCAACGCAACGT  1
45973                    (   90) GGAACCTTCGAAGCACCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 10604 bayes= 11.3718 E= 9.4e+002
  -865   -865    175    -12
  -865      9    175   -865
   -12      9     17    -12
    88   -865    117   -865
  -865    208   -865   -865
  -865    208   -865   -865
   -12   -865    117    -12
  -865   -865   -865    188
  -865    208   -865   -865
    88   -865    117   -865
   188   -865   -865   -865
   -12    167   -865   -865
  -865   -865    217   -865
  -865    167     17   -865
   188   -865   -865   -865
   146      9   -865   -865
  -865    208   -865   -865
  -865   -865    217   -865
  -865   -865   -865    188
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 9.4e+002
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.250000  0.250000  0.250000
 0.500000  0.000000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GT][GC][ACGT][AG]CC[GAT]TC[AG]A[CA]G[CG]A[AC]CGT
--------------------------------------------------------------------------------


Time 14.22 secs.

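The "Relative Entropy" figure in each Motif Description appears to be recoverable from that motif's letter-probability matrix and the background letter frequencies. The sketch below is not part of the MEME output; it sums p * log2(p / background) over every position and letter of the Motif 3 matrix. The names `BACKGROUND`, `MOTIF3_PPM` and `relative_entropy_bits` are illustrative, and the background values are the three-decimal figures printed in the COMMAND LINE SUMMARY, so the result should only be expected to land near the reported 29.1 bits, not to reproduce it exactly.

```python
import math

# Background frequencies as printed in the COMMAND LINE SUMMARY
# (rounded to three decimals in the report).
BACKGROUND = {"A": 0.272, "C": 0.236, "G": 0.222, "T": 0.271}

# Motif 3 letter-probability matrix; one row per position, columns A C G T.
MOTIF3_PPM = [
    (0.00, 0.00, 0.75, 0.25), (0.00, 0.25, 0.75, 0.00),
    (0.25, 0.25, 0.25, 0.25), (0.50, 0.00, 0.50, 0.00),
    (0.00, 1.00, 0.00, 0.00), (0.00, 1.00, 0.00, 0.00),
    (0.25, 0.00, 0.50, 0.25), (0.00, 0.00, 0.00, 1.00),
    (0.00, 1.00, 0.00, 0.00), (0.50, 0.00, 0.50, 0.00),
    (1.00, 0.00, 0.00, 0.00), (0.25, 0.75, 0.00, 0.00),
    (0.00, 0.00, 1.00, 0.00), (0.00, 0.75, 0.25, 0.00),
    (1.00, 0.00, 0.00, 0.00), (0.75, 0.25, 0.00, 0.00),
    (0.00, 1.00, 0.00, 0.00), (0.00, 0.00, 1.00, 0.00),
    (0.00, 0.00, 0.00, 1.00),
]


def relative_entropy_bits(ppm, background, letters="ACGT"):
    """Sum p * log2(p / background) over all positions and letters."""
    total = 0.0
    for row in ppm:
        for letter, p in zip(letters, row):
            if p > 0:  # a zero probability contributes nothing to the sum
                total += p * math.log2(p / background[letter])
    return total


if __name__ == "__main__":
    print(f"Motif 3 relative entropy: {relative_entropy_bits(MOTIF3_PPM, BACKGROUND):.1f} bits")
```

The same function can be applied to the Motif 1 and Motif 2 matrices for comparison with their reported 15.9 and 18.1 bits.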
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42699                            3.15e-02  312_[+1(2.09e-05)]_176
46775                            1.00e-14  187_[+2(1.65e-07)]_106_\
                                           [+3(3.11e-12)]_144_[+1(2.13e-07)]_20
37521                            1.59e-07  163_[+3(1.05e-09)]_179_\
                                           [+1(7.08e-06)]_127
49335                            3.68e-03  164_[+1(4.25e-07)]_324
49352                            4.50e-01  500
49766                            9.84e-01  500
16669                            4.90e-02  408_[+1(1.56e-05)]_80
44106                            9.58e-07  241_[+2(5.10e-08)]_215_\
                                           [+1(1.04e-06)]_20
33623                            1.73e-04  204_[+2(3.72e-05)]_162_\
                                           [+2(2.27e-06)]_39_[+1(9.15e-06)]_59
44512                            2.59e-05  190_[+1(4.25e-07)]_51_\
                                           [+2(1.88e-06)]_235
11021                            1.41e-05  39_[+2(3.44e-06)]_12_[+1(8.82e-07)]_\
                                           197_[+1(9.43e-05)]_216
44980                            6.57e-04  96_[+2(1.36e-06)]_142_\
                                           [+2(2.18e-06)]_238
45536                            4.43e-04  309_[+1(7.01e-08)]_66_\
                                           [+1(9.15e-06)]_101
45563                            1.19e-08  354_[+3(5.55e-10)]_31_\
                                           [+1(4.25e-07)]_84
45973                            6.10e-08  89_[+3(1.59e-09)]_286_\
                                           [+2(8.36e-07)]_94
43324                            6.71e-04  89_[+1(4.25e-07)]_218_\
                                           [+1(6.43e-06)]_169
54087                            5.22e-06  115_[+1(1.26e-06)]_297_\
                                           [+2(2.02e-06)]_64
46403                            4.23e-04  295_[+2(1.02e-07)]_193
46653                            7.77e-02  147_[+1(4.62e-05)]_341
35895                            1.49e-02  333_[+1(1.19e-05)]_155
47618                            4.28e-01  500
49648                            2.99e-02  135_[+1(8.62e-05)]_255_\
                                           [+1(3.65e-06)]_86
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
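The motif diagrams in the SUMMARY OF MOTIFS table encode spacer lengths and motif occurrences, so 1-based start positions can be recovered by walking a diagram left to right. The sketch below is not part of the MEME output; it assumes the diagram grammar seen in this report (`gap_[strand motif (p-value)]_gap`, with `\` marking a wrapped line) and takes the motif widths 12, 12 and 19 from the MOTIF header lines. `MOTIF_WIDTHS` and `diagram_to_sites` are hypothetical names chosen for illustration.

```python
import re

# Motif widths as reported in the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 12, 2: 12, 3: 19}

# One motif occurrence, e.g. "[+2(1.65e-07)]": strand, motif number, p-value.
SITE_TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def diagram_to_sites(diagram, widths):
    """Return ([(motif, strand, start, p_value), ...], total_length).

    Starts are 1-based, matching the Start column of the sites tables.
    """
    # Undo line wrapping: drop continuation backslashes and all whitespace.
    flat = "".join(diagram.replace("\\", "").split())
    sites = []
    offset = 0  # number of sequence positions consumed so far
    for token in flat.split("_"):
        match = SITE_TOKEN.fullmatch(token)
        if match:
            strand, motif, p_value = match.group(1), int(match.group(2)), float(match.group(3))
            sites.append((motif, strand, offset + 1, p_value))
            offset += widths[motif]
        elif token:
            offset += int(token)  # a plain number is a spacer length
    return sites, offset


if __name__ == "__main__":
    # Diagram for sequence 46775, including its wrapped second line.
    diagram = "187_[+2(1.65e-07)]_106_\\\n    [+3(3.11e-12)]_144_[+1(2.13e-07)]_20"
    sites, length = diagram_to_sites(diagram, MOTIF_WIDTHS)
    for motif, strand, start, p_value in sites:
        print(f"motif {motif} ({strand}) starts at {start}, p = {p_value:.2e}")
    print(f"sequence length accounted for: {length}")
```

For sequence 46775 this yields starts 188, 306 and 469 for motifs 2, 3 and 1, which agree with the Start column of the three sites tables, and the spacers plus motif widths add up to the sequence length of 500.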