********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/281/281.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1143                     1.0000    500  50245                    1.0000    500
34639                    1.0000    500  48426                    1.0000    500
33060                    1.0000    500  46078                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/281/281.seqs.fa -oc motifs/281 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        6    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3000    N=               6

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.263 C 0.243 G 0.224 T 0.270
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.243 G 0.224 T 0.270
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =   6  llr = 83  E-value = 9.7e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::33:::::::8:7
pos.-specific     C  :::22::2:2:2:23:
probability       G  a:25272::7:83:53
matrix            T  :a833:88a2a:7:2:

         bits    2.2 *
                 1.9 **      * *
                 1.7 **      * *
                 1.5 **      * **
Relative         1.3 ***   *** ** *
Entropy          1.1 ***  **** **** *
(19.9 bits)      0.9 ***  ********* *
                 0.6 **** ***********
                 0.4 **** ***********
                 0.2 **** ***********
                 0.0 ----------------

Multilevel           GTTGAGTTTGTGTAGA
consensus               TTA      G CG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
46078                       277  7.52e-09 TGCGTCGGAG GTTTGGTTTGTGTAGA GTGTGTAGTT
33060                       297  4.06e-08 ATTTTGATTG GTTGAGTTTTTGTACA GAAACAAATT
34639                       167  1.73e-07 GCACGACAAA GTTTTGTTTGTGTCCG TGCTATTTTA
48426                       247  3.39e-07 GTAAATCTTT GTGGAATCTGTGTAGA TTGATGCCTA
1143                        456  9.56e-07 TTTTACAACC GTTGTGTTTCTCGATA ATGCTGACGG
50245                        83  1.07e-06 TTTTTCATTT GTTCCAGTTGTGGAGG TTTCCCATTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46078                             7.5e-09  276_[+1]_208
33060                             4.1e-08  296_[+1]_188
34639                             1.7e-07  166_[+1]_318
48426                             3.4e-07  246_[+1]_238
1143                              9.6e-07  455_[+1]_29
50245                             1.1e-06  82_[+1]_402
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=6
46078                    (  277) GTTTGGTTTGTGTAGA  1
33060                    (  297) GTTGAGTTTTTGTACA  1
34639                    (  167) GTTTTGTTTGTGTCCG  1
48426                    (  247) GTGGAATCTGTGTAGA  1
1143                     (  456) GTTGTGTTTCTCGATA  1
50245                    (   83) GTTCCAGTTGTGGAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 2910 bayes= 8.91886 E= 9.7e+001
  -923   -923    216   -923
  -923   -923   -923    189
  -923   -923    -42    162
  -923    -54    116     30
    34    -54    -42     30
    34   -923    157   -923
  -923   -923    -42    162
  -923    -54   -923    162
  -923   -923   -923    189
  -923    -54    157    -70
  -923   -923   -923    189
  -923    -54    190   -923
  -923   -923     57    130
   166    -54   -923   -923
  -923     45    116    -70
   134   -923     57   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 9.7e+001
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.166667  0.500000  0.333333
 0.333333  0.166667  0.166667  0.333333
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.666667  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.333333  0.500000  0.166667
 0.666667  0.000000  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
GTT[GT][AT][GA]TTTGTG[TG]A[GC][AG]
--------------------------------------------------------------------------------




Time  0.33 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   4  llr = 54  E-value = 2.1e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  35:a::::::3a
pos.-specific     C  85::5::8::5:
probability       G  ::a::aa3aa3:
matrix            T  ::::5:::::::

         bits    2.2   *  ** **
                 1.9   ** ** ** *
                 1.7   ** ** ** *
                 1.5   ** ** ** *
Relative         1.3 * ** ***** *
Entropy          1.1 **** ***** *
(19.6 bits)      0.9 ********** *
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CAGACGGCGGCA
consensus            AC  T  G  A
sequence                       G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
46078                        90  1.45e-07 AACGCGAAGG CAGATGGCGGCA CGCCGCCTTG
34639                       193  2.09e-07 TGCTATTTTA CAGACGGCGGGA GGCGACATCT
33060                         8  5.70e-07    CAAACCT ACGACGGCGGCA CAACCGCCTA
1143                         96  1.10e-06 ATCTAGCAAA CCGATGGGGGAA TCTGATGAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46078                             1.5e-07  89_[+2]_399
34639                             2.1e-07  192_[+2]_296
33060                             5.7e-07  7_[+2]_481
1143                              1.1e-06  95_[+2]_393
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=4
46078                    (   90) CAGATGGCGGCA  1
34639                    (  193) CAGACGGCGGGA  1
33060                    (    8) ACGACGGCGGCA  1
1143                     (   96) CCGATGGGGGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 2934 bayes= 9.51668 E= 2.1e+002
    -7    162   -865   -865
    93    104   -865   -865
  -865   -865    216   -865
   193   -865   -865   -865
  -865    104   -865     89
  -865   -865    216   -865
  -865   -865    216   -865
  -865    162     16   -865
  -865   -865    216   -865
  -865   -865    216   -865
    -7    104     16   -865
   193   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 4 E= 2.1e+002
 0.250000  0.750000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.500000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[CA][AC]GA[CT]GG[CG]GG[CAG]A
--------------------------------------------------------------------------------




Time  0.69 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   2  llr = 50  E-value = 2.3e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:::5::5:::a:a::a::
pos.-specific     C  :::a5::5:::5:::::::5a
probability       G  aa::5a:5:5a5a:a:a5:5:
matrix            T  ::::::5:a::::::::5:::

         bits    2.2 **   *    * * * *
                 1.9 **** *  * * ***** * *
                 1.7 **** *  * * ***** * *
                 1.5 **** *  * * ***** * *
Relative         1.3 **** *  * * ***** * *
Entropy          1.1 ****** **************
(36.2 bits)      0.9 *********************
                 0.6 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GGACCGACTAGCGAGAGGACC
consensus                G TG G G     T G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
46078                       112  2.79e-12 CGCCGCCTTG GGACCGTGTGGGGAGAGGACC ATTGACAGTG
34639                       302  1.10e-11 ATGGAATGGC GGACGGACTAGCGAGAGTAGC CGAGGACAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46078                             2.8e-12  111_[+3]_368
34639                             1.1e-11  301_[+3]_178
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=2
46078                    (  112) GGACCGTGTGGGGAGAGGACC  1
34639                    (  302) GGACGGACTAGCGAGAGTAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 2880 bayes= 10.4909 E= 2.3e+003
  -765   -765    215   -765
  -765   -765    215   -765
   192   -765   -765   -765
  -765    203   -765   -765
  -765    104    116   -765
  -765   -765    215   -765
    93   -765   -765     88
  -765    104    116   -765
  -765   -765   -765    188
    93   -765    116   -765
  -765   -765    215   -765
  -765    104    116   -765
  -765   -765    215   -765
   192   -765   -765   -765
  -765   -765    215   -765
   192   -765   -765   -765
  -765   -765    215   -765
  -765   -765    116     88
   192   -765   -765   -765
  -765    104    116   -765
  -765    203   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 2 E= 2.3e+003
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
GGAC[CG]G[AT][CG]T[AG]G[CG]GAGAG[GT]A[CG]C
--------------------------------------------------------------------------------




Time  1.08 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1143                             1.66e-05  95_[+2(1.10e-06)]_348_\
    [+1(9.56e-07)]_29
50245                            4.57e-03  82_[+1(1.07e-06)]_402
34639                            3.37e-14  166_[+1(1.73e-07)]_10_\
    [+2(2.09e-07)]_97_[+3(1.10e-11)]_178
48426                            5.33e-03  246_[+1(3.39e-07)]_238
33060                            2.98e-07  7_[+2(5.70e-07)]_277_[+1(4.06e-08)]_\
    188
46078                            3.28e-16  89_[+2(1.45e-07)]_10_[+3(2.79e-12)]_\
    144_[+1(7.52e-09)]_208
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************