********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/300/300.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47463                    1.0000    500  47737                    1.0000    500
10054                    1.0000    500  43672                    1.0000    500
9886                     1.0000    500  49725                    1.0000    500
43797                    1.0000    500  26382                    1.0000    500
35386                    1.0000    500  27234                    1.0000    500
41969                    1.0000    500  31578                    1.0000    500
36125                    1.0000    500  43421                    1.0000    500
34632                    1.0000    500  39510                    1.0000    500
34630                    1.0000    500  44369                    1.0000    500
43778                    1.0000    500  35445                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/300/300.seqs.fa -oc motifs/300 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10000    N=              20
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.284 C 0.226 G 0.216 T 0.274
Background letter frequencies (from dataset with add-one prior applied):
A 0.284 C 0.226 G 0.216 T 0.274
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 20 sites = 10 llr = 146 E-value = 5.2e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::89a:533a3134a47221
pos.-specific     C  2:21:84:1:461::43:1:
probability       G  88:::::52:1366:::849
matrix            T  :2:::2124:2::::2::3:

         bits    2.2
                 2.0
                 1.8     *    *    *    *
                 1.5 *   *    *    *    *
Relative         1.3 ** ***   *    *  * *
Entropy          1.1 ******   *   ** ** *
(21.1 bits)      0.9 ******   * **** ** *
                 0.7 *******  * **** ** *
                 0.4 ******** * ******* *
                 0.2 ******** ***********
                 0.0 --------------------

Multilevel           GGAAACAGTACCGGAAAGGG
consensus            CTC  TCAA AGAA CCAT
sequence                    TG T    T  A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
31578                       444  1.09e-10 CACTCGTTGC GGAAACAGTATCGGACAGTG GTGTGCTGCT
47737                       447  5.07e-10 TACCATTGTG GGAAACAGGACCGAAACGGG CACCGGCACA
43672                       313  2.88e-08 GCTGAGCTCC GTCAACTGTACCGGAAAGGG ATTGACGCTC
43797                       301  3.90e-08 TCTCAACGGA GGAAATCTTACCAGAAAGAG GGATTCAATC
47463                       231  1.49e-07 GGGCACGCAT GGAAACAAAAGGGAACCAGG CCCTGGATGC
41969                       109  2.51e-07 CTTTGAATAA GGACACCGGAAGGGATCGCG TAGGCGCTCC
44369                        31  3.32e-07 ATGATGGAAA GTAAACAATAAGCAACAGAG CGATACTGCT
26382                       347  3.32e-07 TGATTTTCTC CGCAATCTAATCGGACAGTG TTTGATGGGG
43421                       331  4.05e-07 AGAACACGAG CGAAACCGAACCAAATAGTA TGTGAGCAAC
49725                       408  4.60e-07 ATGTACGGGA GGAAACAACAAAAGAAAAGG TTGGAAGGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31578                             1.1e-10  443_[+1]_37
47737                             5.1e-10  446_[+1]_34
43672                             2.9e-08  312_[+1]_168
43797                             3.9e-08  300_[+1]_180
47463                             1.5e-07  230_[+1]_250
41969                             2.5e-07  108_[+1]_372
44369                             3.3e-07  30_[+1]_450
26382                             3.3e-07  346_[+1]_134
43421                             4.1e-07  330_[+1]_150
49725                             4.6e-07  407_[+1]_73
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=10
31578                    (  444) GGAAACAGTATCGGACAGTG  1
47737                    (  447) GGAAACAGGACCGAAACGGG  1
43672                    (  313) GTCAACTGTACCGGAAAGGG  1
43797                    (  301) GGAAATCTTACCAGAAAGAG  1
47463                    (  231) GGAAACAAAAGGGAACCAGG  1
41969                    (  109) GGACACCGGAAGGGATCGCG  1
44369                    (   31) GTAAACAATAAGCAACAGAG  1
26382                    (  347) CGCAATCTAATCGGACAGTG  1
43421                    (  331) CGAAACCGAACCAAATAGTA  1
49725                    (  408) GGAAACAACAAAAGAAAAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 9620 bayes= 9.34207 E= 5.2e-001
  -997    -17    188   -997
  -997   -997    188    -45
   149    -17   -997   -997
   166   -117   -997   -997
   181   -997   -997   -997
  -997    183   -997    -45
    82     83   -997   -145
     8   -997    121    -45
     8   -117    -11     55
   181   -997   -997   -997
     8     83   -111    -45
  -150    141     47   -997
     8   -117    147   -997
    49   -997    147   -997
   181   -997   -997   -997
    49     83   -997    -45
   130     41   -997   -997
   -51   -997    188   -997
   -51   -117     89     13
  -150   -997    205   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 10 E= 5.2e-001
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.800000  0.200000  0.000000  0.000000
 0.900000  0.100000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.500000  0.400000  0.000000  0.100000
 0.300000  0.000000  0.500000  0.200000
 0.300000  0.100000  0.200000  0.400000
 1.000000  0.000000  0.000000  0.000000
 0.300000  0.400000  0.100000  0.200000
 0.100000  0.600000  0.300000  0.000000
 0.300000  0.100000  0.600000  0.000000
 0.400000  0.000000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.400000  0.000000  0.200000
 0.700000  0.300000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.100000  0.400000  0.300000
 0.100000  0.000000  0.900000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[GC][GT][AC]AA[CT][AC][GAT][TAG]A[CAT][CG][GA][GA]A[ACT][AC][GA][GTA]G
--------------------------------------------------------------------------------




Time  3.55 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 12 llr = 141 E-value = 7.0e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::1:3:::24:33:1
pos.-specific     C  17::223:::1a3:::
probability       G  91:55132:82::4:5
matrix            T  :3a43548a:3:43a4

         bits    2.2            *
                 2.0            *
                 1.8 * *     *  *  *
                 1.5 * *     ** *  *
Relative         1.3 * *    *** *  *
Entropy          1.1 * *    *** *  *
(16.9 bits)      0.9 ***    *** *  *
                 0.7 *****  *** *  **
                 0.4 ***** **** *****
                 0.2 ****************
                 0.0 ----------------

Multilevel           GCTGGTTTTGACTGTG
consensus             T TTAG   T CT T
sequence                   C     AA

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
26382                       450  1.06e-07 TCACTGTATT GCTTGCCTTGACCGTG TTTTGGAACG
9886                        203  1.48e-07 AACCGGGGGT GCTGGTGGTGACAGTG GACCGGATCT
43778                       199  3.98e-07 GATCAAAGAA GCTTCTTTTGACTTTT GGTTCATGGG
34630                       221  5.10e-07 CCATAGGTTG GTTGGAGTTGACCATG GCAATGGAGA
10054                       467  8.10e-07 TTTTGAAAAA GCTGCATTTGGCTGTT CCTGGAGCGC
47737                       349  8.10e-07 TGTCTACTGT GCTGGTGTTGTCCATA AAAAAGAGTC
49725                       318  2.79e-06 GGAATTGACG CCTTTTTTTGTCTGTT CAAGCCCAGT
43672                       191  3.27e-06 CCGCGGATCA GCTTGCTTTATCTATG CCAATGCCTT
47463                       104  4.75e-06 TGCGGTGGTG GTTGTGGGTGACTGTG AGTAGATGCC
27234                       413  6.25e-06 TAGAGCGATA GGTGTATTTGTCATTT CGATTAGAAA
41969                       417  7.08e-06 ACAGTTAATT GCTTTTCTTAGCATTT TAAGCTTGAC
43421                        24  7.55e-06 TTCGGAAAAC GTTAGTCTTGCCCTTG CATAAGTTTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
26382                             1.1e-07  449_[+2]_35
9886                              1.5e-07  202_[+2]_282
43778                               4e-07  198_[+2]_286
34630                             5.1e-07  220_[+2]_264
10054                             8.1e-07  466_[+2]_18
47737                             8.1e-07  348_[+2]_136
49725                             2.8e-06  317_[+2]_167
43672                             3.3e-06  190_[+2]_294
47463                             4.8e-06  103_[+2]_381
27234                             6.2e-06  412_[+2]_72
41969                             7.1e-06  416_[+2]_68
43421                             7.6e-06  23_[+2]_461
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=12
26382                    (  450) GCTTGCCTTGACCGTG  1
9886                     (  203) GCTGGTGGTGACAGTG  1
43778                    (  199) GCTTCTTTTGACTTTT  1
34630                    (  221) GTTGGAGTTGACCATG  1
10054                    (  467) GCTGCATTTGGCTGTT  1
47737                    (  349) GCTGGTGTTGTCCATA  1
49725                    (  318) CCTTTTTTTGTCTGTT  1
43672                    (  191) GCTTGCTTTATCTATG  1
47463                    (  104) GTTGTGGGTGACTGTG  1
27234                    (  413) GGTGTATTTGTCATTT  1
41969                    (  417) GCTTTTCTTAGCATTT  1
43421                    (   24) GTTAGTCTTGCCCTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 9700 bayes= 10.105 E= 7.0e+001
 -1023   -143    208  -1023
 -1023    156   -138    -13
 -1023  -1023  -1023    187
  -177  -1023    121     60
 -1023    -44    121     28
   -18    -44   -138     87
 -1023     15     62     60
 -1023  -1023    -38    160
 -1023  -1023  -1023    187
   -77  -1023    194  -1023
    55   -143    -38     28
 -1023    215  -1023  -1023
   -18     56  -1023     60
   -18  -1023     94     28
 -1023  -1023  -1023    187
  -177  -1023    121     60
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 7.0e+001
 0.000000  0.083333  0.916667  0.000000
 0.000000  0.666667  0.083333  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.083333  0.000000  0.500000  0.416667
 0.000000  0.166667  0.500000  0.333333
 0.250000  0.166667  0.083333  0.500000
 0.000000  0.250000  0.333333  0.416667
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.000000  0.833333  0.000000
 0.416667  0.083333  0.166667  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.333333  0.000000  0.416667
 0.250000  0.000000  0.416667  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.083333  0.000000  0.500000  0.416667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
G[CT]T[GT][GT][TA][TGC]TTG[AT]C[TCA][GTA]T[GT]
--------------------------------------------------------------------------------




Time  7.17 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 8 llr = 94 E-value = 2.2e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  8::49a::::::
pos.-specific     C  :::::::::111
probability       G  ::a61:14:4::
matrix            T  3a::::96a599

         bits    2.2   *
                 2.0   *
                 1.8  **  *  *
                 1.5  **  *  *
Relative         1.3  ** *** * **
Entropy          1.1 ********* **
(17.0 bits)      0.9 ********* **
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ATGGAATTTTTT
consensus            T  A   G G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
39510                       129  1.24e-07 ACACCTTTCC ATGGAATTTTTT CTAATCCCAA
43672                       273  5.59e-07 TTTCCAGAAG ATGAAATTTTTT TCTTCAGGGT
27234                       150  1.23e-06 TAATCAGTAA ATGGAATTTCTT GCATCTGTTG
36125                       330  1.40e-06 TCCTATTTTG TTGGAATGTGTT GTCAAGACGA
34630                        92  3.04e-06 GTGGCCAGCA ATGGAAGGTGTT CTCCAAACAA
26382                       184  3.84e-06 CGTGATTCTG ATGAGATTTTTT TCAAGAGTTA
35445                        52  5.86e-06 CTTGCCCTGC TTGGAATTTTCT TAACTTCAGA
49725                       143  5.86e-06 GAAACAAAGG ATGAAATGTGTC TAGTGGCAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39510                             1.2e-07  128_[+3]_360
43672                             5.6e-07  272_[+3]_216
27234                             1.2e-06  149_[+3]_339
36125                             1.4e-06  329_[+3]_159
34630                               3e-06  91_[+3]_397
26382                             3.8e-06  183_[+3]_305
35445                             5.9e-06  51_[+3]_437
49725                             5.9e-06  142_[+3]_346
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=8
39510                    (  129) ATGGAATTTTTT  1
43672                    (  273) ATGAAATTTTTT  1
27234                    (  150) ATGGAATTTCTT  1
36125                    (  330) TTGGAATGTGTT  1
34630                    (   92) ATGGAAGGTGTT  1
26382                    (  184) ATGAGATTTTTT  1
35445                    (   52) TTGGAATTTTCT  1
49725                    (  143) ATGAAATGTGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9780 bayes= 10.2544 E= 2.2e+003
   140   -965   -965    -13
  -965   -965   -965    187
  -965   -965    221   -965
    40   -965    153   -965
   162   -965    -79   -965
   181   -965   -965   -965
  -965   -965    -79    167
  -965   -965     79    119
  -965   -965   -965    187
  -965    -85     79     87
  -965    -85   -965    167
  -965    -85   -965    167
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 2.2e+003
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.375000  0.000000  0.625000  0.000000
 0.875000  0.000000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.000000  0.375000  0.625000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.125000  0.375000  0.500000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.125000  0.000000  0.875000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[AT]TG[GA]AAT[TG]T[TG]TT
--------------------------------------------------------------------------------




Time 10.55 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47463                            1.82e-05  103_[+2(4.75e-06)]_111_\
    [+1(1.49e-07)]_250
47737                            2.78e-08  348_[+2(8.10e-07)]_82_\
    [+1(5.07e-10)]_34
10054                            4.47e-03  466_[+2(8.10e-07)]_18
43672                            2.16e-09  190_[+2(3.27e-06)]_66_\
    [+3(5.59e-07)]_28_[+1(2.88e-08)]_131_[+3(4.28e-05)]_25
9886                             7.42e-04  202_[+2(1.48e-07)]_282
49725                            2.05e-07  142_[+3(5.86e-06)]_163_\
    [+2(2.79e-06)]_74_[+1(4.60e-07)]_73
43797                            1.11e-04  300_[+1(3.90e-08)]_180
26382                            5.17e-09  183_[+3(3.84e-06)]_151_\
    [+1(3.32e-07)]_83_[+2(1.06e-07)]_35
35386                            2.92e-01  500
27234                            1.37e-05  149_[+3(1.23e-06)]_251_\
    [+2(6.25e-06)]_72
41969                            4.89e-05  108_[+1(2.51e-07)]_288_\
    [+2(7.08e-06)]_68
31578                            2.03e-06  443_[+1(1.09e-10)]_37
36125                            7.88e-03  329_[+3(1.40e-06)]_159
43421                            1.26e-05  23_[+2(7.55e-06)]_291_\
    [+1(4.05e-07)]_150
34632                            6.68e-02  166_[+1(6.20e-05)]_314
39510                            4.84e-04  128_[+3(1.24e-07)]_360
34630                            8.10e-06  91_[+3(3.04e-06)]_117_\
    [+2(5.10e-07)]_264
44369                            1.11e-03  30_[+1(3.32e-07)]_450
43778                            3.90e-03  198_[+2(3.98e-07)]_286
35445                            3.40e-02  51_[+3(5.86e-06)]_437
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************