********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/162/162.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8171                     1.0000    500  10128                    1.0000    500
50742                    1.0000    500  5142                     1.0000    500
45423                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/162/162.seqs.fa -oc motifs/162 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        5    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            2500    N=               5
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.229 G 0.251 T 0.249
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.229 G 0.251 T 0.249
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  18  sites =   4  llr = 70  E-value = 1.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::::a5::::3:33:::
pos.-specific     C  ::88::::a:8::38::a
probability       G  833:3:5::33::3:8::
matrix            T  38:38::a:8:8a3:3a:

         bits    2.1         *        *
                 1.9      * **   *   **
                 1.7      * **   *   **
                 1.5      * **   *   **
Relative         1.3 ****** **** * ****
Entropy          1.1 ****** ****** ****
(25.3 bits)      0.9 ************* ****
                 0.6 ************* ****
                 0.4 ************* ****
                 0.2 ************* ****
                 0.0 ------------------

Multilevel           GTCCTAATCTCTTACGTC
consensus            TGGTG G  GGA CAT
sequence                          G
                                  T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
8171                         52  3.02e-10 GTTGGTGAGG GTCCGAGTCTCTTTCGTC TCTTCGAGAA
10128                       276  3.86e-09 CAAGAGTTAC TGCCTAATCTCTTACGTC AATATCTCTA
5142                         42  9.41e-09 GTTGCTTTCC GTGCTAGTCTGTTGCTTC AATAGCGACG
45423                        54  2.99e-08 ACGGATAGGA GTCTTAATCGCATCAGTC ACATTTCACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8171                                3e-10  51_[+1]_431
10128                             3.9e-09  275_[+1]_207
5142                              9.4e-09  41_[+1]_441
45423                               3e-08  53_[+1]_429
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=4
8171                     (   52) GTCCGAGTCTCTTTCGTC  1
10128                    (  276) TGCCTAATCTCTTACGTC  1
5142                     (   42) GTGCTAGTCTGTTGCTTC  1
45423                    (   54) GTCTTAATCGCATCAGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 2415 bayes= 9.23542 E= 1.0e+002
  -865   -865    158      0
  -865   -865      0    159
  -865    171      0   -865
  -865    171   -865      0
  -865   -865      0    159
   188   -865   -865   -865
    88   -865     99   -865
  -865   -865   -865    200
  -865    212   -865   -865
  -865   -865      0    159
  -865    171      0   -865
   -12   -865   -865    159
  -865   -865   -865    200
   -12     13      0      0
   -12    171   -865   -865
  -865   -865    158      0
  -865   -865   -865    200
  -865    212   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 4 E= 1.0e+002
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.250000  0.750000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.250000  0.250000  0.250000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GT][TG][CG][CT][TG]A[AG]TC[TG][CG][TA]T[ACGT][CA][GT]TC
--------------------------------------------------------------------------------


Time  0.29 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   3  llr = 62  E-value = 7.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::3::::a73::a:::::a:
pos.-specific     C  a3:a777:33aa:37::a:7
probability       G  ::7:3::::3:::7:a3:::
matrix            T  :7:::33:::::::3:7::3

         bits    2.1 *  *      **     *
                 1.9 *  *   *  ***  * **
                 1.7 *  *   *  ***  * **
                 1.5 *  *   *  ***  * **
Relative         1.3 *  *   *  ***  * **
Entropy          1.1 ********* **********
(30.0 bits)      0.9 ********* **********
                 0.6 ********* **********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CTGCCCCAAACCAGCGTCAC
consensus             CA GTT CC   CT G  T
sequence                      G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
8171                        408  2.86e-12 TTGTTACGGA CTGCCCCAACCCACCGTCAC TGACGCTCGC
50742                         8  6.50e-10    GATATCG CTACCTTAAACCAGCGTCAT GACAAAACGA
5142                        469  7.93e-10 GTGTCGTCGT CCGCGCCACGCCAGTGGCAC TTTGCAAGGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8171                              2.9e-12  407_[+2]_73
50742                             6.5e-10  7_[+2]_473
5142                              7.9e-10  468_[+2]_12
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=3
8171                     (  408) CTGCCCCAACCCACCGTCAC  1
50742                    (    8) CTACCTTAAACCAGCGTCAT  1
5142                     (  469) CCGCGCCACGCCAGTGGCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 2405 bayes= 10.093 E= 7.0e+002
  -823    212   -823   -823
  -823     54   -823    142
    30   -823    141   -823
  -823    212   -823   -823
  -823    154     41   -823
  -823    154   -823     42
  -823    154   -823     42
   188   -823   -823   -823
   129     54   -823   -823
    30     54     41   -823
  -823    212   -823   -823
  -823    212   -823   -823
   188   -823   -823   -823
  -823     54    141   -823
  -823    154   -823     42
  -823   -823    199   -823
  -823   -823     41    142
  -823    212   -823   -823
   188   -823   -823   -823
  -823    154   -823     42
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 3 E= 7.0e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.333333  0.000000  0.666667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.666667  0.000000  0.333333
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.333333  0.333333  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[TC][GA]C[CG][CT][CT]A[AC][ACG]CCA[GC][CT]G[TG]CA[CT]
--------------------------------------------------------------------------------


Time  0.54 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  19  sites =   4  llr = 71  E-value = 1.1e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::3853a3a3:::3
pos.-specific     C  :3::::85::3:3:3:a:8
probability       G  88:3aa::::::::3::a:
matrix            T  3:a8::33355:5:3a:::

         bits    2.1                 *
                 1.9   * **     * * ***
                 1.7   * **     * * ***
                 1.5   * **     * * ***
Relative         1.3 *******    * * ****
Entropy          1.1 ******* *  * * ****
(25.6 bits)      0.9 ******* ** * * ****
                 0.6 ********** * * ****
                 0.4 ************** ****
                 0.2 ************** ****
                 0.0 -------------------

Multilevel           GGTTGGCCAATATAATCGC
consensus            TC G  TATTA A C   A
sequence                    T  C C G
                                   T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
50742                       102  2.71e-11 TGTTTCGACG GGTTGGCCAATATAATCGC GGAGATGAAA
45423                       317  1.09e-09 CCTACGGTAA GGTTGGCCATCATAGTCGA AAAAGTGACA
10128                       209  9.66e-09 CCAGATTGAC GGTTGGCATAAAAACTCGC CCACGCTCAT
5142                        290  3.75e-08 AACCGGAAAC TCTGGGTTATTACATTCGC GTGGGCTTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50742                             2.7e-11  101_[+3]_380
45423                             1.1e-09  316_[+3]_165
10128                             9.7e-09  208_[+3]_273
5142                              3.7e-08  289_[+3]_192
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=4
50742                    (  102) GGTTGGCCAATATAATCGC  1
45423                    (  317) GGTTGGCCATCATAGTCGA  1
10128                    (  209) GGTTGGCATAAAAACTCGC  1
5142                     (  290) TCTGGGTTATTACATTCGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 2410 bayes= 9.23242 E= 1.1e+003
  -865   -865    158      0
  -865     13    158   -865
  -865   -865   -865    200
  -865   -865      0    159
  -865   -865    199   -865
  -865   -865    199   -865
  -865    171   -865      0
   -12    113   -865      0
   147   -865   -865      0
    88   -865   -865    100
   -12     13   -865    100
   188   -865   -865   -865
   -12     13   -865    100
   188   -865   -865   -865
   -12     13      0      0
  -865   -865   -865    200
  -865    212   -865   -865
  -865   -865    199   -865
   -12    171   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 1.1e+003
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.500000  0.000000  0.250000
 0.750000  0.000000  0.000000  0.250000
 0.500000  0.000000  0.000000  0.500000
 0.250000  0.250000  0.000000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.250000  0.000000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.250000  0.250000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GT][GC]T[TG]GG[CT][CAT][AT][AT][TAC]A[TAC]A[ACGT]TCG[CA]
--------------------------------------------------------------------------------


Time  0.77 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8171                             3.56e-14  51_[+1(3.02e-10)]_338_\
    [+2(2.86e-12)]_73
10128                            2.67e-09  208_[+3(9.66e-09)]_48_\
    [+1(3.86e-09)]_207
50742                            7.04e-13  7_[+2(6.50e-10)]_74_[+3(2.71e-11)]_\
    380
5142                             2.38e-14  41_[+1(9.41e-09)]_230_\
    [+3(3.75e-08)]_160_[+2(7.93e-10)]_12
45423                            2.15e-09  53_[+1(2.99e-08)]_245_\
    [+3(1.09e-09)]_165
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
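
The site tables, block diagrams and regular expressions in this report describe the same sites in three different ways. The sketch below is not part of MEME; it is a minimal, hypothetical cross-check using only the Motif 1 values copied from this report. The names `parse_block_diagram` and `check_site` are invented for illustration. It assumes that a block diagram of the form `left_[+m]_right` counts the bases before and after the site, so that Start equals `left + 1`, and that every sequence is 500 long as listed under TRAINING SET.

```python
import re

# Values copied from the Motif 1 sections of this report.
MOTIF1_REGEX = "[GT][TG][CG][CT][TG]A[AG]TC[TG][CG][TA]T[ACGT][CA][GT]TC"
MOTIF1_WIDTH = 18
SEQUENCE_LENGTH = 500  # every training sequence is listed with length 500

# sequence name -> (start, site, block diagram)
MOTIF1_SITES = {
    "8171": (52, "GTCCGAGTCTCTTTCGTC", "51_[+1]_431"),
    "10128": (276, "TGCCTAATCTCTTACGTC", "275_[+1]_207"),
    "5142": (42, "GTGCTAGTCTGTTGCTTC", "41_[+1]_441"),
    "45423": (54, "GTCTTAATCGCATCAGTC", "53_[+1]_429"),
}


def parse_block_diagram(diagram):
    """Split a single-motif diagram such as '51_[+1]_431' into its parts."""
    match = re.fullmatch(r"(\d+)_\[([+-])(\d+)\]_(\d+)", diagram)
    if match is None:
        raise ValueError(f"unexpected block diagram: {diagram!r}")
    left, strand, motif, right = match.groups()
    return int(left), strand, int(motif), int(right)


def check_site(start, site, diagram):
    """Return three consistency checks for one reported site."""
    left, _strand, _motif, right = parse_block_diagram(diagram)
    return {
        # The site should be matched in full by the motif's regular expression.
        "regex_matches": re.fullmatch(MOTIF1_REGEX, site) is not None,
        # Assumption: Start is 1-based, so it is one more than the left gap.
        "start_is_left_gap_plus_one": start == left + 1,
        # Left gap + motif width + right gap should cover the whole sequence.
        "spans_whole_sequence": left + MOTIF1_WIDTH + right == SEQUENCE_LENGTH,
    }


if __name__ == "__main__":
    for name, (start, site, diagram) in MOTIF1_SITES.items():
        print(name, check_site(start, site, diagram))
```

The same checks could be repeated for Motifs 2 and 3 by substituting their widths (20 and 19), regular expressions and site rows.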