********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/340/340.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9462                     1.0000    500  42888                    1.0000    500
42902                    1.0000    500  43118                    1.0000    500
22019                    1.0000    500  14518                    1.0000    500
23365                    1.0000    500  49656                    1.0000    500
31367                    1.0000    500  44766                    1.0000    500
35050                    1.0000    500  8045                     1.0000    500
19954                    1.0000    500  40148                    1.0000    500
44732                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/340/340.seqs.fa -oc motifs/340 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.245 G 0.226 T 0.264
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.245 G 0.226 T 0.264
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 15 sites = 7 llr = 96 E-value = 5.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::7:11::63::147
pos.-specific     C  :1:6:::::4::4::
probability       G  :933:9:a:3:9463
matrix            T  a::19:a:4:a1:::

         bits    2.1        *
                 1.9 *     **  *
                 1.7 *     **  *
                 1.5 **   ***  **
Relative         1.3 **  ****  **
Entropy          1.1 *** ****  ** **
(19.8 bits)      0.9 *** ***** ** **
                 0.6 ********* *****
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           TGACTGTGACTGCGA
consensus              GG    TA  GAG
sequence                      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
19954                         3  2.53e-08         AT TGACTGTGAATGGAA AGAGATAGAG
40148                       165  1.00e-07 CAGAATTTGA TGACTATGACTGCGA ATCATGGAAA
23365                       290  1.32e-07 CGGGTTCTGT TGGGTGTGTCTGCAA TGTGGACGAA
22019                        30  1.32e-07 GTCCAAGGAA TGAGTGTGTCTGGAG TAGGAAACAA
42888                       404  1.61e-07 ACAATGTCTC TGACTGTGAGTTCGA TCGTGCTTTT
42902                       242  1.02e-06 CTGGGAGATA TGACAGTGAATGAGG TCGAGGAAGA
44766                       259  1.34e-06 GGGCTGTGAT TCGTTGTGTGTGGGA CGGAAGCGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
19954                             2.5e-08  2_[+1]_483
40148                               1e-07  164_[+1]_321
23365                             1.3e-07  289_[+1]_196
22019                             1.3e-07  29_[+1]_456
42888                             1.6e-07  403_[+1]_82
42902                               1e-06  241_[+1]_244
44766                             1.3e-06  258_[+1]_227
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=7
19954                    (    3) TGACTGTGAATGGAA  1
40148                    (  165) TGACTATGACTGCGA  1
23365                    (  290) TGGGTGTGTCTGCAA  1
22019                    (   30) TGAGTGTGTCTGGAG  1
42888                    (  404) TGACTGTGAGTTCGA  1
42902                    (  242) TGACAGTGAATGAGG  1
44766                    (  259) TCGTTGTGTGTGGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7290 bayes= 9.86668 E= 5.5e+002
  -945   -945   -945    192
  -945    -78    192   -945
   143   -945     34   -945
  -945    122     34    -88
   -89   -945   -945    170
   -89   -945    192   -945
  -945   -945   -945    192
  -945   -945    214   -945
   111   -945   -945     70
    11     81     34   -945
  -945   -945   -945    192
  -945   -945    192    -88
   -89     81     92   -945
    69   -945    134   -945
   143   -945     34   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 5.5e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.142857  0.857143  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.571429  0.285714  0.142857
 0.142857  0.000000  0.000000  0.857143
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.571429  0.000000  0.000000  0.428571
 0.285714  0.428571  0.285714  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.857143  0.142857
 0.142857  0.428571  0.428571  0.000000
 0.428571  0.000000  0.571429  0.000000
 0.714286  0.000000  0.285714  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TG[AG][CG]TGTG[AT][CAG]TG[CG][GA][AG]
--------------------------------------------------------------------------------

Time 1.90 secs.

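The regular expression reported for Motif 1 is a compact summary rather than an exact description of every site. As a minimal sketch (illustrative Python, not produced by MEME; the names `MOTIF1_REGEX` and `MOTIF1_SITES` are made up for this example), the seven sites listed for Motif 1 can be tested against it:

```python
import re

# Regular expression copied from the "Motif 1 regular expression" block.
MOTIF1_REGEX = re.compile(r"TG[AG][CG]TGTG[AT][CAG]TG[CG][GA][AG]")

# Sites copied from the "Motif 1 in BLOCKS format" table (sequence name -> site).
MOTIF1_SITES = {
    "19954": "TGACTGTGAATGGAA",
    "40148": "TGACTATGACTGCGA",
    "23365": "TGGGTGTGTCTGCAA",
    "22019": "TGAGTGTGTCTGGAG",
    "42888": "TGACTGTGAGTTCGA",
    "42902": "TGACAGTGAATGAGG",
    "44766": "TCGTTGTGTGTGGGA",
}

# Keep the names of the sequences whose site matches the expression exactly.
matching = [name for name, site in MOTIF1_SITES.items()
            if MOTIF1_REGEX.fullmatch(site)]

print(matching)  # ['19954', '23365', '22019']
```

In this file only three of the seven sites match exactly. The other four each contain a letter that occurs in a single site at that position (for example the A at position 6 of the 40148 site), and the expression omits those letters.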
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 15 llr = 134 E-value = 1.8e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  5:::13:1:51:
pos.-specific     C  1::85432a18:
probability       G  1:::1::1::11
matrix            T  4aa23377:419

         bits    2.1
                 1.9  **     *
                 1.7  **     *
                 1.5  **     *
Relative         1.3  ***    *  *
Entropy          1.1  ***  * * **
(12.9 bits)      0.9  ***  * * **
                 0.6  ***  ******
                 0.4 **** *******
                 0.2 ************
                 0.0 ------------

Multilevel           ATTCCCTTCACT
consensus            T  TTTCC T
sequence                  A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44732                       431  7.95e-08 TGACTGCTCC ATTCCCTTCACT TTCTAATACC
35050                       380  6.59e-07 CAATCGTGTT TTTCTCTTCACT ACCGCAACCT
19954                       462  1.59e-06 CTCGATCTTC ATTCCATTCTCT CCGGTAAGCA
43118                       359  6.45e-06 AGCTCATCGT TTTCCATCCACT TTACGGAAGT
14518                       372  1.01e-05 AATGGGCAAC ATTCCTCCCACT GCACAATTGC
22019                       401  1.01e-05 AAATGACACA GTTCTCTTCACT TTTACTTTGG
40148                       407  2.34e-05 CTTGCCCATC TTTTTCCTCTCT GATGAATGGA
44766                       441  3.11e-05 ATACCCATTA TTTTTTCTCTCT TTCACTCGAC
23365                       369  3.43e-05 GATATGGCAT TTTTGTTTCTCT TTCCTATAGC
31367                       382  5.93e-05 CGTTCGGGAC ATTCCCTCCTGT CAGCGACAGA
42888                       141  9.05e-05 GTTGGTGTCA ATTCAATTCACG CTACACGTTA
8045                        356  1.12e-04 TTAGCTCCGC ATTCTCTGCTCG ATCGCTTTCG
49656                       132  1.28e-04 ATAGGGGGTA TTTCGTCTCATT TTCTCATTCG
9462                        359  1.36e-04 CGCATGCACC CTTCCTCTCCCT CTATTGCTCC
42902                       469  2.06e-04 ACTCCTGTAC ATTCCATACAAT CTGTCTCCCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44732                             7.9e-08  430_[+2]_58
35050                             6.6e-07  379_[+2]_109
19954                             1.6e-06  461_[+2]_27
43118                             6.5e-06  358_[+2]_130
14518                               1e-05  371_[+2]_117
22019                               1e-05  400_[+2]_88
40148                             2.3e-05  406_[+2]_82
44766                             3.1e-05  440_[+2]_48
23365                             3.4e-05  368_[+2]_120
31367                             5.9e-05  381_[+2]_107
42888                               9e-05  140_[+2]_348
8045                              0.00011  355_[+2]_133
49656                             0.00013  131_[+2]_357
9462                              0.00014  358_[+2]_130
42902                             0.00021  468_[+2]_20
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=15
44732                    (  431) ATTCCCTTCACT  1
35050                    (  380) TTTCTCTTCACT  1
19954                    (  462) ATTCCATTCTCT  1
43118                    (  359) TTTCCATCCACT  1
14518                    (  372) ATTCCTCCCACT  1
22019                    (  401) GTTCTCTTCACT  1
40148                    (  407) TTTTTCCTCTCT  1
44766                    (  441) TTTTTTCTCTCT  1
23365                    (  369) TTTTGTTTCTCT  1
31367                    (  382) ATTCCCTCCTGT  1
42888                    (  141) ATTCAATTCACG  1
8045                     (  356) ATTCTCTGCTCG  1
49656                    (  132) TTTCGTCTCATT  1
9462                     (  359) CTTCCTCTCCCT  1
42902                    (  469) ATTCCATACAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 8.93074 E= 1.8e+003
    82   -188   -176     60
 -1055  -1055  -1055    192
 -1055  -1055  -1055    192
 -1055    171  -1055    -40
  -199     93    -76     34
     1     71  -1055     34
 -1055     44  -1055    134
  -199    -29   -176    134
 -1055    203  -1055  -1055
   101   -188  -1055     60
  -199    171   -176   -198
 -1055  -1055    -76    172
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 15 E= 1.8e+003
 0.466667  0.066667  0.066667  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.800000  0.000000  0.200000
 0.066667  0.466667  0.133333  0.333333
 0.266667  0.400000  0.000000  0.333333
 0.000000  0.333333  0.000000  0.666667
 0.066667  0.200000  0.066667  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.533333  0.066667  0.000000  0.400000
 0.066667  0.800000  0.066667  0.066667
 0.000000  0.000000  0.133333  0.866667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AT]TT[CT][CT][CTA][TC][TC]C[AT]CT
--------------------------------------------------------------------------------

Time 3.69 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 12 llr = 119 E-value = 5.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  83:2:853:::9
pos.-specific     C  :17:9:33::8:
probability       G  223:131:a:3:
matrix            T  :4:8::14:a:1

         bits    2.1         *
                 1.9         **
                 1.7     *   **
                 1.5     *   ** *
Relative         1.3 *  **   ****
Entropy          1.1 * ****  ****
(14.3 bits)      0.9 * ****  ****
                 0.6 * ****  ****
                 0.4 * **********
                 0.2 ************
                 0.0 ------------

Multilevel           ATCTCAATGTCA
consensus             AG  GCC  G
sequence                    A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
40148                       142  6.77e-07 TGATGACGAA ATGTCAATGTCA TCAGAATTTG
22019                       486  9.62e-07 CTACAGGAGT AACTCACCGTCA GTG
9462                        172  2.26e-06 GAAACCAAAA AGCTCACCGTCA ACCTTTTTCT
8045                        162  8.31e-06 CAGCTTGACA AGCTCGAAGTCA AGTGATTCGT
19954                       295  9.00e-06 CTGCCGAAAC GTCTCACAGTCA AAAAGACACC
14518                       415  1.02e-05 GACGTCATGG AACTCGACGTGA ACGACACCGA
44766                        65  1.09e-05 CGAGAGAGAG ACCTCGATGTCA CTCCATTCAT
44732                         1  1.32e-05          . AAGTCATTGTCA GCCAGTCATC
49656                       299  2.17e-05 CGGTGATTGT GAGTCACAGTCA GTCAGTAGAC
31367                       325  3.98e-05 AATTTGCGTG ATGTGAATGTGA GCAAATCAGG
42888                       391  4.24e-05 ACTGTAAACC ATCACAATGTCT CTGACTGTGA
23365                       484  5.67e-05 GCATATTTTG ATCACAGCGTGA TCAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40148                             6.8e-07  141_[+3]_347
22019                             9.6e-07  485_[+3]_3
9462                              2.3e-06  171_[+3]_317
8045                              8.3e-06  161_[+3]_327
19954                               9e-06  294_[+3]_194
14518                               1e-05  414_[+3]_74
44766                             1.1e-05  64_[+3]_424
44732                             1.3e-05  [+3]_488
49656                             2.2e-05  298_[+3]_190
31367                               4e-05  324_[+3]_164
42888                             4.2e-05  390_[+3]_98
23365                             5.7e-05  483_[+3]_5
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=12
40148                    (  142) ATGTCAATGTCA  1
22019                    (  486) AACTCACCGTCA  1
9462                     (  172) AGCTCACCGTCA  1
8045                     (  162) AGCTCGAAGTCA  1
19954                    (  295) GTCTCACAGTCA  1
14518                    (  415) AACTCGACGTGA  1
44766                    (   65) ACCTCGATGTCA  1
44732                    (    1) AAGTCATTGTCA  1
49656                    (  299) GAGTCACAGTCA  1
31367                    (  325) ATGTGAATGTGA  1
42888                    (  391) ATCACAATGTCT  1
23365                    (  484) ATCACAGCGTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 9.70135 E= 5.9e+002
   165  -1023    -44  -1023
    33   -155    -44     66
 -1023    144     56  -1023
   -67  -1023  -1023    166
 -1023    190   -144  -1023
   150  -1023     14  -1023
    92     44   -144   -166
    -8     44  -1023     66
 -1023  -1023    214  -1023
 -1023  -1023  -1023    192
 -1023    161     14  -1023
   179  -1023  -1023   -166
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 5.9e+002
 0.833333  0.000000  0.166667  0.000000
 0.333333  0.083333  0.166667  0.416667
 0.000000  0.666667  0.333333  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.916667  0.083333  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.500000  0.333333  0.083333  0.083333
 0.250000  0.333333  0.000000  0.416667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.916667  0.000000  0.000000  0.083333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
A[TA][CG]TC[AG][AC][TCA]GT[CG]A
--------------------------------------------------------------------------------

Time 5.44 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9462                             3.45e-03  171_[+3(2.26e-06)]_317
42888                            1.07e-05  140_[+2(9.05e-05)]_238_\
    [+3(4.24e-05)]_1_[+1(1.61e-07)]_82
42902                            1.32e-03  241_[+1(1.02e-06)]_244
43118                            1.60e-02  358_[+2(6.45e-06)]_130
22019                            4.15e-08  29_[+1(1.32e-07)]_356_\
    [+2(1.01e-05)]_73_[+3(9.62e-07)]_3
14518                            1.64e-03  371_[+2(1.01e-05)]_31_\
    [+3(1.02e-05)]_74
23365                            4.91e-06  289_[+1(1.32e-07)]_64_\
    [+2(3.43e-05)]_103_[+3(5.67e-05)]_5
49656                            3.85e-03  298_[+3(2.17e-05)]_190
31367                            7.11e-03  324_[+3(3.98e-05)]_45_\
    [+2(5.93e-05)]_107
44766                            8.28e-06  64_[+3(1.09e-05)]_182_\
    [+1(1.34e-06)]_167_[+2(3.11e-05)]_48
35050                            7.82e-03  264_[+2(3.11e-05)]_103_\
    [+2(6.59e-07)]_109
8045                             4.88e-03  161_[+3(8.31e-06)]_327
19954                            1.30e-08  2_[+1(2.53e-08)]_121_[+1(8.05e-05)]_\
    141_[+3(9.00e-06)]_155_[+2(1.59e-06)]_27
40148                            5.04e-08  141_[+3(6.77e-07)]_11_\
    [+1(1.00e-07)]_227_[+2(2.34e-05)]_9_[+3(9.92e-05)]_61
44732                            2.42e-05  [+3(1.32e-05)]_418_[+2(7.95e-08)]_\
    58
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
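
Each combined block diagram reads left to right as spacer lengths separated by motif occurrences such as `[+1(1.32e-07)]` (strand, motif number, p-value). The following is a rough sketch of reading one diagram, not part of the MEME output. It assumes a trailing backslash marks a continued line, takes the motif widths from the MOTIF header lines of this file, and uses the hypothetical helper names `parse_diagram` and `covered_length`:

```python
import re

# Motif widths copied from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 15, 2: 12, 3: 12}

# One occurrence such as "[+1(1.32e-07)]": strand, motif number, p-value.
HIT = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram):
    """Return spacer lengths (int) and hits (strand, motif, p-value) in order."""
    # Join continued lines: drop the backslashes and all whitespace.
    text = "".join(diagram.replace("\\", "").split())
    parts = []
    for token in text.split("_"):
        if not token:
            continue
        hit = HIT.fullmatch(token)
        if hit:
            parts.append((hit.group(1), int(hit.group(2)), float(hit.group(3))))
        else:
            parts.append(int(token))
    return parts


def covered_length(parts):
    """Total bases accounted for: spacers plus the width of each motif hit."""
    return sum(p if isinstance(p, int) else MOTIF_WIDTHS[p[1]] for p in parts)


# Diagram for sequence 22019, copied from the table including its line break.
diagram_22019 = "29_[+1(1.32e-07)]_356_\\\n    [+2(1.01e-05)]_73_[+3(9.62e-07)]_3"
parts = parse_diagram(diagram_22019)
print(parts)
print(covered_length(parts))  # 500
```

For sequence 22019 the spacers and motif widths add up to 500, which agrees with the Length column of the TRAINING SET table.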