********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/489/489.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43159                    1.0000    500  20872                    1.0000    500
46543                    1.0000    500  38848                    1.0000    500
29633                    1.0000    500  22510                    1.0000    500
54139                    1.0000    500  44390                    1.0000    500
19482                    1.0000    500  44981                    1.0000    500
11830                    1.0000    500  44060                    1.0000    500
47949                    1.0000    500  45168                    1.0000    500
37533                    1.0000    500  39448                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/489/489.seqs.fa -oc motifs/489 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.258 G 0.228 T 0.249
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.257 G 0.228 T 0.249
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 12 sites = 10 llr = 113 E-value = 5.8e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  822:a::::1a6
pos.-specific     C  261::a:::2:2
probability       G  ::2a:::a36::
matrix            T  :25:::a:71:2

         bits    2.1    *   *
                 1.9    *****  *
                 1.7    *****  *
                 1.5    *****  *
Relative         1.3 *  *****  *
Entropy          1.1 *  ****** *
(16.3 bits)      0.9 *  ****** *
                 0.6 ** ****** **
                 0.4 ** *********
                 0.2 ************
                 0.0 ------------

Multilevel           ACTGACTGTGAA
consensus            CAA     GC C
sequence              TG        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44981                       208  3.51e-07 GCCAATATCA ATTGACTGTGAA AACCGAGAAT
38848                       197  3.51e-07 GCCACTTTGG ACAGACTGTGAA TACCATTGAT
45168                        97  4.72e-07 ACAGTCATTG ACTGACTGTGAC TGTACCTTAA
39448                       257  2.26e-06 GGCGGCCAGT CCGGACTGTGAA ACGCCGAACC
37533                       264  3.31e-06 GAAGAAAATG AAGGACTGGGAA CGAACGATCG
46543                       365  3.66e-06 CAGTGTCGCA AACGACTGTGAA GGTCGGACAT
44390                       342  5.26e-06 CGGCGACTCG ACTGACTGGCAT TTACCGACGT
47949                        87  6.51e-06 GGGTCGAGGT ACAGACTGTCAT GCCCGGCGCT
19482                       297  9.16e-06 TGGGCTGGGT ACTGACTGGTAC GCACCGGTAA
20872                       160  1.49e-05 GTGAGGGGAA CTTGACTGTAAA TATTAGACGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44981                             3.5e-07  207_[+1]_281
38848                             3.5e-07  196_[+1]_292
45168                             4.7e-07  96_[+1]_392
39448                             2.3e-06  256_[+1]_232
37533                             3.3e-06  263_[+1]_225
46543                             3.7e-06  364_[+1]_124
44390                             5.3e-06  341_[+1]_147
47949                             6.5e-06  86_[+1]_402
19482                             9.2e-06  296_[+1]_192
20872                             1.5e-05  159_[+1]_329
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=10
44981                    (  208) ATTGACTGTGAA  1
38848                    (  197) ACAGACTGTGAA  1
45168                    (   97) ACTGACTGTGAC  1
39448                    (  257) CCGGACTGTGAA  1
37533                    (  264) AAGGACTGGGAA  1
46543                    (  365) AACGACTGTGAA  1
44390                    (  342) ACTGACTGGCAT  1
47949                    (   87) ACAGACTGTCAT  1
19482                    (  297) ACTGACTGGTAC  1
20872                    (  160) CTTGACTGTAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 9.86175 E= 5.8e+000
   159    -36   -997   -997
   -41    122   -997    -31
   -41   -136    -19    101
  -997   -997    213   -997
   191   -997   -997   -997
  -997    196   -997   -997
  -997   -997   -997    201
  -997   -997    213   -997
  -997   -997     40    149
  -141    -36    140   -131
   191   -997   -997   -997
   117    -36   -997    -31
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 5.8e+000
0.800000  0.200000  0.000000  0.000000
0.200000  0.600000  0.000000  0.200000
0.200000  0.100000  0.200000  0.500000
0.000000  0.000000  1.000000  0.000000
1.000000  0.000000  0.000000  0.000000
0.000000  1.000000  0.000000  0.000000
0.000000  0.000000  0.000000  1.000000
0.000000  0.000000  1.000000  0.000000
0.000000  0.000000  0.300000  0.700000
0.100000  0.200000  0.600000  0.100000
1.000000  0.000000  0.000000  0.000000
0.600000  0.200000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AC][CAT][TAG]GACTG[TG][GC]A[ACT]
--------------------------------------------------------------------------------

Time 2.17 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 16 sites = 7 llr = 99 E-value = 9.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a:1:111:1a:::a3
pos.-specific     C  a:97:16:73:1a3::
probability       G  :::1a3:3:6:::7:6
matrix            T  ::1::4363::9:::1

         bits    2.1     *
                 1.9 **  *     * * *
                 1.7 **  *     * * *
                 1.5 **  *     *** *
Relative         1.3 *** *     *****
Entropy          1.1 *** *   * *****
(20.5 bits)      0.9 *****   * *****
                 0.6 ***** **********
                 0.4 ***** **********
                 0.2 ****************
                 0.0 ----------------

Multilevel           CACCGTCTCGATCGAG
consensus                 GTGTC   C A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
44981                       480  3.83e-09 GTATCAATCA CACCGTTGCGATCGAG AAACA
38848                       342  2.34e-08 CCACGGATCC CACGGTCTTGATCGAG TGTGTGCACC
45168                       467  4.17e-08 GCTTTTCTTT CACCGGATTGATCGAG AGTAGCCGTG
54139                       289  1.75e-07 CCCGTAGTGC CACCGGCTCCACCGAA ACCAACCGTA
22510                       286  3.41e-07 TTTTGTCAAC CACCGCCGCCATCCAA GCTGGCGATT
29633                       335  5.38e-07 TCCATCCACA CATCGATTCGATCGAT CGACCGAATC
19482                       458  8.03e-07 GTGTTCCTCC CACAGTCACAATCCAG ACAGTTCCAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44981                             3.8e-09  479_[+2]_5
38848                             2.3e-08  341_[+2]_143
45168                             4.2e-08  466_[+2]_18
54139                             1.7e-07  288_[+2]_196
22510                             3.4e-07  285_[+2]_199
29633                             5.4e-07  334_[+2]_150
19482                               8e-07  457_[+2]_27
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=7
44981                    (  480) CACCGTTGCGATCGAG  1
38848                    (  342) CACGGTCTTGATCGAG  1
45168                    (  467) CACCGGATTGATCGAG  1
54139                    (  289) CACCGGCTCCACCGAA  1
22510                    (  286) CACCGCCGCCATCCAA  1
29633                    (  335) CATCGATTCGATCGAT  1
19482                    (  458) CACAGTCACAATCCAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 10.7194 E= 9.4e+002
  -945    196   -945   -945
   191   -945   -945   -945
  -945    173   -945    -80
   -90    147    -67   -945
  -945   -945    213   -945
   -90    -85     33     78
   -90    115   -945     20
   -90   -945     33    120
  -945    147   -945     20
   -90     15    133   -945
   191   -945   -945   -945
  -945    -85   -945    178
  -945    196   -945   -945
  -945     15    165   -945
   191   -945   -945   -945
    10   -945    133    -80
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 9.4e+002
0.000000  1.000000  0.000000  0.000000
1.000000  0.000000  0.000000  0.000000
0.000000  0.857143  0.000000  0.142857
0.142857  0.714286  0.142857  0.000000
0.000000  0.000000  1.000000  0.000000
0.142857  0.142857  0.285714  0.428571
0.142857  0.571429  0.000000  0.285714
0.142857  0.000000  0.285714  0.571429
0.000000  0.714286  0.000000  0.285714
0.142857  0.285714  0.571429  0.000000
1.000000  0.000000  0.000000  0.000000
0.000000  0.142857  0.000000  0.857143
0.000000  1.000000  0.000000  0.000000
0.000000  0.285714  0.714286  0.000000
1.000000  0.000000  0.000000  0.000000
0.285714  0.000000  0.571429  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CACCG[TG][CT][TG][CT][GC]ATC[GC]A[GA]
--------------------------------------------------------------------------------

Time 4.53 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 16 llr = 147 E-value = 3.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  86177:692:24
pos.-specific     C  3:6139:13:84
probability       G  :4:3::::5a:2
matrix            T  ::3::14:::::

         bits    2.1          *
                 1.9          *
                 1.7      *   *
                 1.5      * * *
Relative         1.3      * * **
Entropy          1.1 **  **** **
(13.3 bits)      0.9 ** ***** **
                 0.6 ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AACAACAAGGCA
consensus            CGTGC T C  C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
38848                       453  8.31e-08 TAATAATAAC AACAACAAGGCA CAGTCAGTAG
47949                       445  3.81e-07 CTATCTCGAA AGCAACAAGGCC TTGCCGCACA
29633                       106  7.62e-06 ATTGCCAACG AACAACTAGGAA TTCGGTTCCG
54139                       148  8.63e-06 GAAGCCGCCA AACGACAAAGCA TGCAACAAGC
20872                       333  9.83e-06 CACAGCTTCG AACACCAAAGCC GTCCACCGAT
43159                       229  1.53e-05 AAACGATGCT AATAACTAAGCA ATTATATCTA
46543                       441  2.10e-05 CCACCGCAAG AACAACTACGAC GGACTATTTG
37533                         6  2.30e-05      CAACA CATGACAAGGCA TACAGTTAAA
22510                       144  2.50e-05 TACGATTGAT CGCGACAACGCA TCAAACACTC
39448                       124  3.02e-05 TGCATACGAC AGCAACTACGAC CAGGATGTGA
45168                       401  3.55e-05 CGCTGCCACA CAAAACAACGCA TGTCGATTCA
19482                       162  3.55e-05 GTACGAACGA CGTACCAAGGCC GGAAACTCCC
44981                       455  4.18e-05 GCAAAAACCA AGAGACTAGGCC GAAGTATCAA
11830                       237  7.32e-05 TCAAAAGTTC AACACTAAGGCG ATTGCGCTGG
44390                       220  1.11e-04 TGATTCACGA AACACCACCGCG CCAGTGACGA
44060                        20  1.72e-04 ATGTTCTATG AGTCCCTAGGCG TCACCTTCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38848                             8.3e-08  452_[+3]_36
47949                             3.8e-07  444_[+3]_44
29633                             7.6e-06  105_[+3]_383
54139                             8.6e-06  147_[+3]_341
20872                             9.8e-06  332_[+3]_156
43159                             1.5e-05  228_[+3]_260
46543                             2.1e-05  440_[+3]_48
37533                             2.3e-05  5_[+3]_483
22510                             2.5e-05  143_[+3]_345
39448                               3e-05  123_[+3]_365
45168                             3.6e-05  400_[+3]_88
19482                             3.6e-05  161_[+3]_327
44981                             4.2e-05  454_[+3]_34
11830                             7.3e-05  236_[+3]_252
44390                             0.00011  219_[+3]_269
44060                             0.00017  19_[+3]_469
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=16
38848                    (  453) AACAACAAGGCA  1
47949                    (  445) AGCAACAAGGCC  1
29633                    (  106) AACAACTAGGAA  1
54139                    (  148) AACGACAAAGCA  1
20872                    (  333) AACACCAAAGCC  1
43159                    (  229) AATAACTAAGCA  1
46543                    (  441) AACAACTACGAC  1
37533                    (    6) CATGACAAGGCA  1
22510                    (  144) CGCGACAACGCA  1
39448                    (  124) AGCAACTACGAC  1
45168                    (  401) CAAAACAACGCA  1
19482                    (  162) CGTACCAAGGCC  1
44981                    (  455) AGAGACTAGGCC  1
11830                    (  237) AACACTAAGGCG  1
44390                    (  220) AACACCACCGCG  1
44060                    (   20) AGTCCCTAGGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 8.93074 E= 3.2e+001
   149     -4  -1064  -1064
   123  -1064     72  -1064
  -109    128  -1064      1
   137   -204     14  -1064
   137     28  -1064  -1064
 -1064    186  -1064   -199
   123  -1064  -1064     59
   181   -204  -1064  -1064
   -51     28    114  -1064
 -1064  -1064    214  -1064
   -51    166  -1064  -1064
    72     54    -28  -1064
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 3.2e+001
0.750000  0.250000  0.000000  0.000000
0.625000  0.000000  0.375000  0.000000
0.125000  0.625000  0.000000  0.250000
0.687500  0.062500  0.250000  0.000000
0.687500  0.312500  0.000000  0.000000
0.000000  0.937500  0.000000  0.062500
0.625000  0.000000  0.000000  0.375000
0.937500  0.062500  0.000000  0.000000
0.187500  0.312500  0.500000  0.000000
0.000000  0.000000  1.000000  0.000000
0.187500  0.812500  0.000000  0.000000
0.437500  0.375000  0.187500  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AC][AG][CT][AG][AC]C[AT]A[GC]GC[AC]
--------------------------------------------------------------------------------

Time 6.80 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43159                            1.05e-01  228_[+3(1.53e-05)]_260
20872                            1.59e-03  159_[+1(1.49e-05)]_161_\
    [+3(9.83e-06)]_156
46543                            2.31e-04  364_[+1(3.66e-06)]_64_\
    [+3(2.10e-05)]_48
38848                            3.85e-11  196_[+1(3.51e-07)]_133_\
    [+2(2.34e-08)]_95_[+3(8.31e-08)]_36
29633                            7.20e-05  105_[+3(7.62e-06)]_217_\
    [+2(5.38e-07)]_150
22510                            8.97e-05  143_[+3(2.50e-05)]_130_\
    [+2(3.41e-07)]_199
54139                            4.12e-05  147_[+3(8.63e-06)]_129_\
    [+2(1.75e-07)]_196
44390                            3.04e-03  341_[+1(5.26e-06)]_147
19482                            5.05e-06  161_[+3(3.55e-05)]_123_\
    [+1(9.16e-06)]_149_[+2(8.03e-07)]_27
44981                            2.32e-09  207_[+1(3.51e-07)]_184_\
    [+1(6.47e-05)]_39_[+3(4.18e-05)]_13_[+2(3.83e-09)]_5
11830                            7.13e-02  236_[+3(7.32e-05)]_252
44060                            3.88e-01  500
47949                            6.54e-05  86_[+1(6.51e-06)]_346_\
    [+3(3.81e-07)]_44
45168                            2.37e-08  96_[+1(4.72e-07)]_292_\
    [+3(3.55e-05)]_54_[+2(4.17e-08)]_18
37533                            5.62e-04  5_[+3(2.30e-05)]_246_[+1(3.31e-06)]_\
    225
39448                            2.99e-04  123_[+3(3.02e-05)]_121_\
    [+1(2.26e-06)]_232
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************