********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/441/441.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9033                     1.0000    500  9388                     1.0000    500
54855                    1.0000    500  49258                    1.0000    500
12863                    1.0000    500  31594                    1.0000    500
47019                    1.0000    500  47936                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/441/441.seqs.fa -oc motifs/441 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.245 C 0.263 G 0.244 T 0.248
Background letter frequencies (from dataset with add-one prior applied):
A 0.245 C 0.263 G 0.244 T 0.248
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 14 sites = 4 llr = 61 E-value = 2.4e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :335a38:a:::aa
pos.-specific     C  a::3::::::8:::
probability       G  :383:83::a3a::
matrix            T  :5:::::a::::::

         bits    2.0     *  *** ***
                 1.8 *   *  *** ***
                 1.6 *   *  *** ***
                 1.4 *   *  *** ***
Relative         1.2 * * **********
Entropy          1.0 * * **********
(22.0 bits)      0.8 * * **********
                 0.6 *** **********
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           CTGAAGATAGCGAA
consensus             AAC AG   G
sequence              G G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
9388                        274  3.26e-09 GATGATGATT CTGAAGATAGCGAA GAGTACGAAA
49258                        25  2.93e-08 CCAAACGGCA CTGAAAATAGCGAA CCACCTCACC
9033                         61  1.66e-07 CTTTAGCTAG CGGCAGGTAGCGAA CCGAGAGTCG
31594                       190  2.72e-07 TAGGGTGATG CAAGAGATAGGGAA ATTCCTGATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9388                              3.3e-09  273_[+1]_213
49258                             2.9e-08  24_[+1]_462
9033                              1.7e-07  60_[+1]_426
31594                             2.7e-07  189_[+1]_297
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=14 seqs=4
9388                     (  274) CTGAAGATAGCGAA  1
49258                    (   25) CTGAAAATAGCGAA  1
9033                     (   61) CGGCAGGTAGCGAA  1
31594                    (  190) CAAGAGATAGGGAA  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 3896 bayes= 9.9263 E= 2.4e+002
  -865    192   -865   -865
     3   -865      4    101
     3   -865    162   -865
   103     -7      4   -865
   202   -865   -865   -865
     3   -865    162   -865
   161   -865      4   -865
  -865   -865   -865    201
   202   -865   -865   -865
  -865   -865    203   -865
  -865    151      4   -865
  -865   -865    203   -865
   202   -865   -865   -865
   202   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 4 E= 2.4e+002
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.250000  0.500000
 0.250000  0.000000  0.750000  0.000000
 0.500000  0.250000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
C[TAG][GA][ACG]A[GA][AG]TAG[CG]GAA
--------------------------------------------------------------------------------

Time 0.59 secs.
********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 19 sites = 2 llr = 46 E-value = 2.3e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::aaa5:5:aa::::::::
pos.-specific     C  ::::::5:a:::::5::::
probability       G  a5:::555:::aaa5:aaa
matrix            T  :5:::::::::::::a:::

         bits    2.0 * ***    ***** ****
                 1.8 * ***   ****** ****
                 1.6 * ***   ****** ****
                 1.4 * ***   ****** ****
Relative         1.2 * ***   ****** ****
Entropy          1.0 *******************
(33.4 bits)      0.8 *******************
                 0.6 *******************
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           GGAAAACACAAGGGCTGGG
consensus             T   GGG      G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
54855                       241  4.30e-11 TCGCAAACCC GGAAAGCGCAAGGGGTGGG CGCCTGATGT
12863                        54  7.75e-11 GTCTTGCCTT GTAAAAGACAAGGGCTGGG GCTTTGGGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54855                             4.3e-11  240_[+2]_241
12863                             7.7e-11  53_[+2]_428
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=2
54855                    (  241) GGAAAGCGCAAGGGGTGGG  1
12863                    (   54) GTAAAAGACAAGGGCTGGG  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 3856 bayes= 10.9121 E= 2.3e+003
  -765   -765    203   -765
  -765   -765    103    101
   202   -765   -765   -765
   202   -765   -765   -765
   202   -765   -765   -765
   102   -765    103   -765
  -765     92    103   -765
   102   -765    103   -765
  -765    192   -765   -765
   202   -765   -765   -765
   202   -765   -765   -765
  -765   -765    203   -765
  -765   -765    203   -765
  -765   -765    203   -765
  -765     92    103   -765
  -765   -765   -765    201
  -765   -765    203   -765
  -765   -765    203   -765
  -765   -765    203   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 2 E= 2.3e+003
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
G[GT]AAA[AG][CG][AG]CAAGGG[CG]TGGG
--------------------------------------------------------------------------------

Time 1.17 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 2 llr = 31 E-value = 3.6e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  5::::a::::::
pos.-specific     C  ::::::::::a:
probability       G  5aaaa:aa5a::
matrix            T  ::::::::5::a

         bits    2.0  ******* * *
                 1.8  ******* ***
                 1.6  ******* ***
                 1.4  ******* ***
Relative         1.2  ******* ***
Entropy          1.0 ************
(22.3 bits)      0.8 ************
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AGGGGAGGGGCT
consensus            G       T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47019                       475  9.82e-08 CGCACTTGTG AGGGGAGGGGCT GAAACAGTGA
12863                       474  1.98e-07 TGGTTCCAGC GGGGGAGGTGCT ACCACGAATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47019                             9.8e-08  474_[+3]_14
12863                               2e-07  473_[+3]_15
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=2
47019                    (  475) AGGGGAGGGGCT  1
12863                    (  474) GGGGGAGGTGCT  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3912 bayes= 10.933 E= 3.6e+003
   102   -765    103   -765
  -765   -765    203   -765
  -765   -765    203   -765
  -765   -765    203   -765
  -765   -765    203   -765
   202   -765   -765   -765
  -765   -765    203   -765
  -765   -765    203   -765
  -765   -765    103    101
  -765   -765    203   -765
  -765    192   -765   -765
  -765   -765   -765    201
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 2 E= 3.6e+003
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[AG]GGGGAGG[GT]GCT
--------------------------------------------------------------------------------

Time 1.67 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9033                             1.92e-03  60_[+1(1.66e-07)]_426
9388                             4.31e-05  273_[+1(3.26e-09)]_213
54855                            1.35e-06  240_[+2(4.30e-11)]_241
49258                            2.30e-04  24_[+1(2.93e-08)]_462
12863                            1.06e-09  53_[+2(7.75e-11)]_401_\
    [+3(1.98e-07)]_15
31594                            2.26e-03  189_[+1(2.72e-07)]_297
47019                            8.73e-04  474_[+3(9.82e-08)]_14
47936                            7.03e-01  500
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************