********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/359/359.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
13150                    1.0000    500  43688                    1.0000    500
49162                    1.0000    500  49268                    1.0000    500
50299                    1.0000    500  33254                    1.0000    500
5605                     1.0000    500  45070                    1.0000    500
31558                    1.0000    500  48064                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/359/359.seqs.fa -oc motifs/359 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.247 C 0.264 G 0.218 T 0.270
Background letter frequencies (from dataset with add-one prior applied):
A 0.247 C 0.264 G 0.218 T 0.270
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =   12   sites =  10   llr = 103   E-value = 9.7e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :29a77::4836
pos.-specific     C  :1:::::562:4
probability       G  a61:31a2::5:
matrix            T  :1:::2:3::2:

         bits    2.2 *     *
                 2.0 *  *  *
                 1.8 *  *  *
                 1.5 * **  *
Relative         1.3 * **  *  *
Entropy          1.1 * *** *  * *
(14.9 bits)      0.9 * ***** ** *
                 0.7 * ***** ****
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GGAAAAGCCAGA
consensus             A  GT TACAC
sequence                    G  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
49268                        24  2.46e-07 CTTCTCCGAC GGAAAAGGCAGA TTCCTCGTGC
31558                       198  6.11e-07 GAATACTTAT GGAAGAGCCAGC CAATTAGACA
33254                       337  1.83e-06 GTTTGTTCCG GAAAAAGCCAAA CCTTTTTCAG
50299                       151  2.72e-06 TCGTCCTCCG GGAAGAGTAAGC TACGAGGAAC
48064                       479  3.73e-06 ACACCAACAT GTAAAAGCAAGA TGCACGAACC
5605                        446  7.83e-06 CAACCCAGGG GGAAATGGAAGC CGTACCCAGG
45070                       323  1.22e-05 CTGCCTCAGA GGAAGAGTCCAA AAGCAAATCC
49162                       397  1.83e-05 GTGCTTTGGA GGAAAGGCAATC GATAGATTGA
13150                       246  3.67e-05 GGCGCGTTTG GCGAAAGCCAAA GAACTCTTGT
43688                       289  6.40e-05 CCGACCGACT GAAAATGTCCTA TACGACTTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49268                             2.5e-07  23_[+1]_465
31558                             6.1e-07  197_[+1]_291
33254                             1.8e-06  336_[+1]_152
50299                             2.7e-06  150_[+1]_338
48064                             3.7e-06  478_[+1]_10
5605                              7.8e-06  445_[+1]_43
45070                             1.2e-05  322_[+1]_166
49162                             1.8e-05  396_[+1]_92
13150                             3.7e-05  245_[+1]_243
43688                             6.4e-05  288_[+1]_200
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=10
49268                    (   24) GGAAAAGGCAGA  1
31558                    (  198) GGAAGAGCCAGC  1
33254                    (  337) GAAAAAGCCAAA  1
50299                    (  151) GGAAGAGTAAGC  1
48064                    (  479) GTAAAAGCAAGA  1
5605                     (  446) GGAAATGGAAGC  1
45070                    (  323) GGAAGAGTCCAA  1
49162                    (  397) GGAAAGGCAATC  1
13150                    (  246) GCGAAAGCCAAA  1
43688                    (  289) GAAAATGTCCTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 9.18275 E= 9.7e+000
  -997   -997    219   -997
   -30   -140    146   -143
   186   -997   -112   -997
   202   -997   -997   -997
   150   -997     46   -997
   150   -997   -112    -43
  -997   -997    219   -997
  -997     92    -13     15
    69    118   -997   -997
   169    -40   -997   -997
    28   -997    120    -43
   128     60   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 9.7e+000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.100000  0.600000  0.100000
 0.900000  0.000000  0.100000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.700000  0.000000  0.300000  0.000000
 0.700000  0.000000  0.100000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.200000  0.300000
 0.400000  0.600000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.300000  0.000000  0.500000  0.200000
 0.600000  0.400000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[GA]AA[AG][AT]G[CTG][CA][AC][GAT][AC]
--------------------------------------------------------------------------------

Time  0.88 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =   17   sites =   5   llr = 80   E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::4:::22::::22::
pos.-specific     C  8:::2::84:228:2::
probability       G  :a84:a8:4:8:282:a
matrix            T  2:228:2::a:8::4a:

         bits    2.2  *   *          *
                 2.0  *   *   *     **
                 1.8  *   *   *     **
                 1.5  *   *   *   * **
Relative         1.3  **  *** ** ** **
Entropy          1.1 *** **** ***** **
(23.2 bits)      0.9 *** **** ***** **
                 0.7 *** **** ***** **
                 0.4 ************** **
                 0.2 ************** **
                 0.0 -----------------

Multilevel           CGGATGGCCTGTCGTTG
consensus            T TGC TAG CCGAA
sequence                T    A     C
                                   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            -----------------
50299                       280  9.19e-09 CAACCTAAAC CGGGTGGCATGTCAGTG TTTGTGACAA
49162                       301  1.20e-08 TCTTTCCGAA CGGTCGGCCTGTCGATG GCCTTTTTCA
49268                       100  1.65e-08 TTCTCACCCA CGTGTGTCGTGTCGTTG TCCAACAGCT
13150                       382  3.27e-08 ACAATAGTAC CGGATGGCCTCTGGCTG AATGGGCACT
48064                        26  4.52e-08 GTGACGCTGA TGGATGGAGTGCCGTTG GCCGATCAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50299                             9.2e-09  279_[+2]_204
49162                             1.2e-08  300_[+2]_183
49268                             1.6e-08  99_[+2]_384
13150                             3.3e-08  381_[+2]_102
48064                             4.5e-08  25_[+2]_458
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=17 seqs=5
50299                    (  280) CGGGTGGCATGTCAGTG  1
49162                    (  301) CGGTCGGCCTGTCGATG  1
49268                    (  100) CGTGTGTCGTGTCGTTG  1
13150                    (  382) CGGATGGCCTCTGGCTG  1
48064                    (   26) TGGATGGAGTGCCGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 4840 bayes= 10.1691 E= 1.5e+002
  -897    160   -897    -43
  -897   -897    219   -897
  -897   -897    187    -43
    69   -897     87    -43
  -897    -40   -897    156
  -897   -897    219   -897
  -897   -897    187    -43
   -30    160   -897   -897
   -30     60     87   -897
  -897   -897   -897    188
  -897    -40    187   -897
  -897    -40   -897    156
  -897    160    -13   -897
   -30   -897    187   -897
   -30    -40    -13     56
  -897   -897   -897    188
  -897   -897    219   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 5 E= 1.5e+002
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.400000  0.000000  0.400000  0.200000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.200000  0.800000  0.000000  0.000000
 0.200000  0.400000  0.400000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.800000  0.200000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.200000  0.200000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CT]G[GT][AGT][TC]G[GT][CA][CGA]T[GC][TC][CG][GA][TACG]TG
--------------------------------------------------------------------------------

Time  1.94 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =   12   sites =   4   llr = 56   E-value = 6.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a5a::a53aa
pos.-specific     C  a8:::a::3:::
probability       G  :3:5::a:38::
matrix            T  ::::::::::::

         bits    2.2       *
                 2.0 * * ****  **
                 1.8 * * ****  **
                 1.5 * * ****  **
Relative         1.3 * * **** ***
Entropy          1.1 ******** ***
(20.3 bits)      0.9 ******** ***
                 0.7 ******** ***
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CCAAACGAAGAA
consensus             G G    CA
sequence                     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
33254                       113  4.33e-08 CTTTCTCACG CCAGACGAAGAA ACGTATCTGT
48064                       293  3.08e-07 TCTACGGAAT CCAAACGACGAA TTCTACTGTT
13150                       282  3.08e-07 GTTTACCGCT CGAGACGAAGAA CTTGTCAATG
5605                          8  6.94e-07    ATCGGCA CCAAACGAGAAA CCGTGAACGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33254                             4.3e-08  112_[+3]_376
48064                             3.1e-07  292_[+3]_196
13150                             3.1e-07  281_[+3]_207
5605                              6.9e-07  7_[+3]_481
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=4
33254                    (  113) CCAGACGAAGAA  1
48064                    (  293) CCAAACGACGAA  1
13150                    (  282) CGAGACGAAGAA  1
5605                     (    8) CCAAACGAGAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 10.9919 E= 6.5e+002
  -865    192   -865   -865
  -865    150     20   -865
   201   -865   -865   -865
   102   -865    119   -865
   201   -865   -865   -865
  -865    192   -865   -865
  -865   -865    219   -865
   201   -865   -865   -865
   102     -8     20   -865
     2   -865    178   -865
   201   -865   -865   -865
   201   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 4 E= 6.5e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.250000  0.000000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[CG]A[AG]ACGA[ACG][GA]AA
--------------------------------------------------------------------------------

Time  2.74 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
13150                            1.31e-08  245_[+1(3.67e-05)]_24_\
    [+3(3.08e-07)]_88_[+2(3.27e-08)]_102
43688                            6.30e-02  288_[+1(6.40e-05)]_200
49162                            7.98e-06  238_[+2(4.63e-05)]_45_\
    [+2(1.20e-08)]_79_[+1(1.83e-05)]_92
49268                            1.41e-07  23_[+1(2.46e-07)]_64_[+2(1.65e-08)]_\
    384
50299                            5.24e-07  150_[+1(2.72e-06)]_117_\
    [+2(9.19e-09)]_204
33254                            2.79e-06  112_[+3(4.33e-08)]_212_\
    [+1(1.83e-06)]_152
5605                             6.96e-05  7_[+3(6.94e-07)]_222_[+1(8.08e-05)]_\
    192_[+1(7.83e-06)]_43
45070                            2.05e-02  322_[+1(1.22e-05)]_166
31558                            9.53e-03  197_[+1(6.11e-07)]_291
48064                            2.17e-09  25_[+2(4.52e-08)]_250_\
    [+3(3.08e-07)]_174_[+1(3.73e-06)]_10
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************