********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/481/481.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
14599                    1.0000    500  54920                    1.0000    500
49373                    1.0000    500  16334                    1.0000    500
55137                    1.0000    500  33463                    1.0000    500
50445                    1.0000    500  41409                    1.0000    500
45959                    1.0000    500  20360                    1.0000    500
12375                    1.0000    500  36034                    1.0000    500
32692                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/481/481.seqs.fa -oc motifs/481 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 13 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 6500 N= 13 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.258 C 0.266 G 0.238 T 0.239 Background letter frequencies (from dataset with add-one prior applied): A 0.258 C 0.266 G 0.238 T 0.239 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 20 sites = 10 llr = 142 E-value = 9.8e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :::::11252241365::39 pos.-specific C :2:3:::2:::23:217::: probability G :51749:6483:231:2a:1 matrix T a39:6:9:1:5444141:7: bits 2.1 * * 1.9 * * 1.7 * * ** * 1.4 * * ** * * Relative 1.2 * ** ** * *** Entropy 1.0 * ***** * *** (20.5 bits) 0.8 * ***** * **** 0.6 *********** ***** 0.4 ************ ******* 0.2 ******************** 0.0 -------------------- Multilevel TGTGTGTGAGTATTAACGTA consensus T CG AGAGTCACTG A sequence C C ACGG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 16334 87 3.36e-10 
GGGTTTTGGA TGTGTGTGAGTTCGAACGAA ATCGTCGGGT
41409                        307  1.20e-08 AACACGGCAG TGTGTGTGGGAATAGTCGTA CCTACACGAG
32692                        260  3.40e-08 TAGGGGTGTG TGTGTGTGGGTCGACAGGTA CTCCCACTCG
12375                          4  5.15e-08        GCC TTTGTGTAAGTCGAATCGAA CGGACTTTCC
36034                        352  5.69e-08 TGGGAAAGAC TTGCTGTGAGGTTGATCGTA TAAAGCCTGC
55137                         46  7.61e-08 AAGTGGTACG TGTGTGTGTGGTTTTAGGTA AACGAATCAT
54920                        360  1.69e-07 GATAGAGAAT TCTGGATCGGTACTAACGTA AGCATATCTC
49373                        358  1.10e-06 GTCATCCTGT TCTGGGTAAGATCTACCGTG AGATTGTACG
14599                        141  1.23e-06 AATGGCGAAG TTTCGGTCGAGAATCTCGTA CTTTGCTGTC
45959                         77  1.30e-06 CAACTACGAT TGTCGGAGAATATGAATGAA TAAATGAATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
16334                             3.4e-10  86_[+1]_394
41409                             1.2e-08  306_[+1]_174
32692                             3.4e-08  259_[+1]_221
12375                             5.1e-08  3_[+1]_477
36034                             5.7e-08  351_[+1]_129
55137                             7.6e-08  45_[+1]_435
54920                             1.7e-07  359_[+1]_121
49373                             1.1e-06  357_[+1]_123
14599                             1.2e-06  140_[+1]_340
45959                             1.3e-06  76_[+1]_404
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=10
16334                    (   87) TGTGTGTGAGTTCGAACGAA  1
41409                    (  307) TGTGTGTGGGAATAGTCGTA  1
32692                    (  260) TGTGTGTGGGTCGACAGGTA  1
12375                    (    4) TTTGTGTAAGTCGAATCGAA  1
36034                    (  352) TTGCTGTGAGGTTGATCGTA  1
55137                    (   46) TGTGTGTGTGGTTTTAGGTA  1
54920                    (  360) TCTGGATCGGTACTAACGTA  1
49373                    (  358) TCTGGGTAAGATCTACCGTG  1
14599                    (  141) TTTCGGTCGAGAATCTCGTA  1
45959                    (   77) TGTCGGAGAATATGAATGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6253 bayes= 9.538 E= 9.8e-001
  -997   -997   -997    207
  -997    -41    107     33
  -997   -997   -125    191
  -997     18    156   -997
  -997   -997     75    133
  -136   -997    192   -997
  -136   -997   -997    191
   -37    -41    133   -997
    96   -997     75   -125
   -37   -997    175   -997
   -37   -997     33    107
    63    -41   -997     74
  -136     18    -25     74
    22   -997     33     74
   122    -41   -125   -125
    96   -141   -997     74
  -997    140    -25   -125
  -997   -997    207   -997
    22   -997   -997    155
   180   -997   -125   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 10 E= 9.8e-001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.500000  0.300000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.300000  0.700000  0.000000
 0.000000  0.000000  0.400000  0.600000
 0.100000  0.000000  0.900000  0.000000
 0.100000  0.000000  0.000000  0.900000
 0.200000  0.200000  0.600000  0.000000
 0.500000  0.000000  0.400000  0.100000
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.000000  0.300000  0.500000
 0.400000  0.200000  0.000000  0.400000
 0.100000  0.300000  0.200000  0.400000
 0.300000  0.000000  0.300000  0.400000
 0.600000  0.200000  0.100000  0.100000
 0.500000  0.100000  0.000000  0.400000
 0.000000  0.700000  0.200000  0.100000
 0.000000  0.000000  1.000000  0.000000
 0.300000  0.000000  0.000000  0.700000
 0.900000  0.000000  0.100000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
T[GTC]T[GC][TG]GT[GAC][AG][GA][TGA][ATC][TCG][TAG][AC][AT][CG]G[TA]A -------------------------------------------------------------------------------- Time 1.82 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 8 llr = 108 E-value = 4.2e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :5::::::::::1:3: pos.-specific C :5:::4::::111559 probability G 9::1834:1595::3: matrix T 1:a9346a95:485:1 bits 2.1 * * 1.9 * * 1.7 * * 1.4 * ** ** * * Relative 1.2 * *** ** * * Entropy 1.0 ***** ***** ** * (19.5 bits) 0.8 ***** ***** ** * 0.6 ***** ******** * 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel GATTGCTTTGGGTCCC consensus C TTG T T TA sequence G G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 32692 72 2.60e-08 GCGTTGGGAC GATTTTTTTGGGTTCC GTGCGTTTGT 45959 193 4.08e-08 CGGTGTACCA GCTTGGTTTGGTTTAC CGGACGGAAA 36034 451 8.59e-08 AATCGGCCAA GATGGCTTTGGGTTCC TACCGGCAAA 16334 389 1.66e-07 CGCCTTCTTC GATTGCGTTGGGACCC TCGCAGCACC 12375 91 2.64e-07 CCAGCCTTTT TCTTGTTTTTGTTTGC AGTCCCGTCG 33463 416 7.81e-07 CGACCACCAC GCTTGCGTTTGCCCCC TCTTCTCGCA 14599 37 1.18e-06 GTCTGGCGAG GATTGTTTTTCGTCGT GCCAATTACC 54920 432 1.59e-06 AACAAATCTA GCTTTGGTGTGTTCAC CTTTGCCTAC -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32692                             2.6e-08  71_[+2]_413
45959                             4.1e-08  192_[+2]_292
36034                             8.6e-08  450_[+2]_34
16334                             1.7e-07  388_[+2]_96
12375                             2.6e-07  90_[+2]_394
33463                             7.8e-07  415_[+2]_69
14599                             1.2e-06  36_[+2]_448
54920                             1.6e-06  431_[+2]_53
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=8
32692                    (   72) GATTTTTTTGGGTTCC  1
45959                    (  193) GCTTGGTTTGGTTTAC  1
36034                    (  451) GATGGCTTTGGGTTCC  1
16334                    (  389) GATTGCGTTGGGACCC  1
12375                    (   91) TCTTGTTTTTGTTTGC  1
33463                    (  416) GCTTGCGTTTGCCCCC  1
14599                    (   37) GATTGTTTTTCGTCGT  1
54920                    (  432) GCTTTGGTGTGTTCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 10.3581 E= 4.2e+001
  -965   -965    188    -93
    96     91   -965   -965
  -965   -965   -965    207
  -965   -965    -93    187
  -965   -965    165      7
  -965     50      7     65
  -965   -965     66    139
  -965   -965   -965    207
  -965   -965    -93    187
  -965   -965    107    107
  -965   -109    188   -965
  -965   -109    107     65
  -104   -109   -965    165
  -965     91   -965    107
    -4     91      7   -965
  -965    172   -965    -93
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 4.2e+001
 0.000000  0.000000  0.875000  0.125000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.375000  0.250000  0.375000
 0.000000  0.000000  0.375000  0.625000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.125000  0.875000  0.000000
 0.000000  0.125000  0.500000  0.375000
 0.125000  0.125000  0.000000  0.750000
 0.000000  0.500000  0.000000  0.500000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.875000  0.000000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
G[AC]TT[GT][CTG][TG]TT[GT]G[GT]T[CT][CAG]C
--------------------------------------------------------------------------------

Time 3.39 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 12 sites = 8 llr = 92 E-value = 7.8e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::::3::6a:a5 pos.-specific C :a8::::3:3:3 probability G a::4:8a::6:3 matrix T ::3683:1:1:: bits 2.1 * * 1.9 ** * * * 1.7 ** * * * 1.4 ** * * * Relative 1.2 *** *** * * Entropy 1.0 ******* * * (16.6 bits) 0.8 ******* *** 0.6 *********** 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel GCCTTGGAAGAA consensus TGAT C C C sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 50445 423 5.75e-08 GCTGGTCGGC GCCTTGGAAGAA GCAACCCAGG 41409 437 4.02e-07 TTTATACTGT GCCTTGGAACAA CTCGTGTTCT 20360 162 1.55e-06 TCCCATCAAA GCCTTTGAAGAC GCTTACACGG 32692 185 3.00e-06 TTACACGAGC GCCTTGGAATAG AATACCGACA 12375 401 3.24e-06 CCGTCGACTT GCCGTGGCACAA AATTACGAAA 49373 9 3.24e-06 TTAAAAGA GCCGAGGAAGAC ACTTGATACA 14599 306 7.40e-06 TTCAAACGAT GCTTTGGTAGAG GAAAACGGAC 36034 92 1.65e-05 CAATATAGAG GCTGATGCAGAA AAATAGAAAG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 50445 5.7e-08 
422_[+3]_66
41409                               4e-07  436_[+3]_52
20360                             1.5e-06  161_[+3]_327
32692                               3e-06  184_[+3]_304
12375                             3.2e-06  400_[+3]_88
49373                             3.2e-06  8_[+3]_480
14599                             7.4e-06  305_[+3]_183
36034                             1.6e-05  91_[+3]_397
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
50445                    (  423) GCCTTGGAAGAA  1
41409                    (  437) GCCTTGGAACAA  1
20360                    (  162) GCCTTTGAAGAC  1
32692                    (  185) GCCTTGGAATAG  1
12375                    (  401) GCCGTGGCACAA  1
49373                    (    9) GCCGAGGAAGAC  1
14599                    (  306) GCTTTGGTAGAG  1
36034                    (   92) GCTGATGCAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 10.37 E= 7.8e+002
  -965   -965    207   -965
  -965    191   -965   -965
  -965    150   -965      7
  -965   -965     66    139
    -4   -965   -965    165
  -965   -965    165      7
  -965   -965    207   -965
   128     -9   -965    -93
   195   -965   -965   -965
  -965     -9    139    -93
   195   -965   -965   -965
    96     -9      7   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 7.8e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.375000  0.625000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.625000  0.250000  0.000000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.250000  0.625000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
GC[CT][TG][TA][GT]G[AC]A[GC]A[ACG]
--------------------------------------------------------------------------------

Time 5.01 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14599                            2.84e-07  36_[+2(1.18e-06)]_88_[+1(1.23e-06)]_\
    145_[+3(7.40e-06)]_183
54920                            8.38e-06  359_[+1(1.69e-07)]_52_\
    [+2(1.59e-06)]_53
49373                            1.95e-05  8_[+3(3.24e-06)]_337_[+1(1.10e-06)]_\
    123
16334                            1.09e-09  86_[+1(3.36e-10)]_282_\
    [+2(1.66e-07)]_96
55137                            3.96e-04  45_[+1(7.61e-08)]_435
33463                            9.89e-03  278_[+2(9.82e-05)]_121_\
    [+2(7.81e-07)]_12_[+2(5.95e-05)]_41
50445                            5.34e-04  422_[+3(5.75e-08)]_66
41409                            4.25e-08  306_[+1(1.20e-08)]_110_\
    [+3(4.02e-07)]_52
45959                            1.83e-06  76_[+1(1.30e-06)]_96_[+2(4.08e-08)]_\
    292
20360                            1.24e-02  161_[+3(1.55e-06)]_327
12375                            1.83e-09  3_[+1(5.15e-08)]_67_[+2(2.64e-07)]_\
    294_[+3(3.24e-06)]_88
36034                            3.20e-09  91_[+3(1.65e-05)]_248_\
    [+1(5.69e-08)]_79_[+2(8.59e-08)]_34
32692                            1.34e-10  71_[+2(2.60e-08)]_97_[+3(3.00e-06)]_\
    63_[+1(3.40e-08)]_119_[+1(8.65e-05)]_82
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
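
The motif diagrams in this file encode site positions as alternating gap lengths and motif occurrences, for example 36_[+2(1.18e-06)]_88_[+1(1.23e-06)]_145_[+3(7.40e-06)]_183 for sequence 14599. The following is a minimal Python sketch, not part of MEME itself, of how such a diagram might be decoded into 1-based start positions. The function name sites_from_diagram and the MOTIF_WIDTHS table are hypothetical helpers introduced here; the widths (20, 16, 12) are taken from the MOTIF headers above.

```python
import re

# Motif widths as reported in the "MOTIF n MEME width = ..." headers of this file.
# This table is an illustrative assumption; adjust it for other MEME runs.
MOTIF_WIDTHS = {1: 20, 2: 16, 3: 12}

# One motif occurrence, e.g. "[+2(1.18e-06)]" or "[+1]".
SITE_RE = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def sites_from_diagram(diagram, widths=MOTIF_WIDTHS):
    """Decode a MEME block diagram string.

    Returns (sites, length), where sites is a list of
    (motif, strand, start, p_value) tuples with 1-based starts and
    length is the total sequence length implied by the diagram.
    """
    # Drop line-continuation backslashes and any whitespace.
    diagram = re.sub(r"[\\\s]", "", diagram)
    pos = 0  # number of sequence positions consumed so far
    sites = []
    for token in diagram.split("_"):
        match = SITE_RE.fullmatch(token)
        if match:
            strand, motif = match.group(1), int(match.group(2))
            p_value = float(match.group(3)) if match.group(3) else None
            sites.append((motif, strand, pos + 1, p_value))
            pos += widths[motif]
        elif token:
            pos += int(token)  # a gap between sites
    return sites, pos


if __name__ == "__main__":
    example = "36_[+2(1.18e-06)]_88_[+1(1.23e-06)]_\\\n    145_[+3(7.40e-06)]_183"
    decoded, total = sites_from_diagram(example)
    for motif, strand, start, p_value in decoded:
        print(motif, strand, start, p_value)
    print("implied sequence length:", total)
```

Under these assumptions, the decoded starts for sequence 14599 (37, 141, 306) agree with the Start column of the per-motif site tables, and the implied length matches the 500 listed in the training set.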