********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/467/467.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
36310                    1.0000    500  50971                    1.0000    500
47666                    1.0000    500  40612                    1.0000    500
54378                    1.0000    500  41584                    1.0000    500
32886                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/467/467.seqs.fa -oc motifs/467 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.263 C 0.238 G 0.219 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.238 G 0.219 T 0.281
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   4  llr = 78  E-value = 1.1e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  33a::35a5:::::::::::
pos.-specific     C  :8::58::353:33a8:8::
probability       G  8:::::::3::::5::a::a
matrix            T  :::a5:5::58a83:3:3a:

         bits    2.2                 *  *
                 2.0   *    *      * *  *
                 1.8   **   *   *  * * **
                 1.5   **   *   *  * * **
Relative         1.3 **** * *   *  * * **
Entropy          1.1 **** * *  *** ******
(28.0 bits)      0.9 ******** **** ******
                 0.7 ******** ***********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GCATCCAAACTTTGCCGCTG
consensus            AA  TAT CTC CC T T
sequence                     G    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
41584                       311  9.11e-11 GGCAACTGTA GCATTCAAACTTTGCTGCTG TAGATCCACA
40612                       154  6.52e-10 CGCCAACCGG GCATTCTAGTCTTCCCGCTG ACATTTCAAA
32886                       458  1.36e-09 TGCTTCACCT GAATCCAAACTTTTCCGTTG TCTTTGTCGA
36310                       124  3.11e-09 CAACAGTAAA ACATCATACTTTCGCCGCTG CGGTAGCACG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41584                             9.1e-11  310_[+1]_170
40612                             6.5e-10  153_[+1]_327
32886                             1.4e-09  457_[+1]_23
36310                             3.1e-09  123_[+1]_357
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=4
41584                    (  311) GCATTCAAACTTTGCTGCTG  1
40612                    (  154) GCATTCTAGTCTTCCCGCTG  1
32886                    (  458) GAATCCAAACTTTTCCGTTG  1
36310                    (  124) ACATCATACTTTCGCCGCTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3367 bayes= 9.71553 E= 1.1e+003
    -7  -865   177  -865
    -7   165  -865  -865
   193  -865  -865  -865
  -865  -865  -865   183
  -865   107  -865    83
    -7   165  -865  -865
    93  -865  -865    83
   193  -865  -865  -865
    93     7    19  -865
  -865   107  -865    83
  -865     7  -865   142
  -865  -865  -865   183
  -865     7  -865   142
  -865     7   119   -17
  -865   207  -865  -865
  -865   165  -865   -17
  -865  -865   219  -865
  -865   165  -865   -17
  -865  -865  -865   183
  -865  -865   219  -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 1.1e+003
 0.250000  0.000000  0.750000  0.000000
 0.250000  0.750000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.000000  0.500000
 0.250000  0.750000  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.250000  0.500000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA][CA]AT[CT][CA][AT]A[ACG][CT][TC]T[TC][GCT]C[CT]G[CT]TG
--------------------------------------------------------------------------------


Time  0.50 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  13  sites =   2  llr = 35  E-value = 3.2e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::a:::::a:
pos.-specific     C  a:::::a::a:::
probability       G  :5a:a::5a:a:a
matrix            T  :5:a:::5:::::

         bits    2.2   * *   * * *
                 2.0 * * *** *****
                 1.8 * ***** *****
                 1.5 * ***** *****
Relative         1.3 * ***** *****
Entropy          1.1 *************
(24.9 bits)      0.9 *************
                 0.7 *************
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           CGGTGACGGCGAG
consensus             T     T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            -------------
50971                        75  6.33e-09 AGATTTCATC CGGTGACGGCGAG GACGGTGCCA
54378                       297  3.30e-08 TTTTGATTGA CTGTGACTGCGAG TTCACTGGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50971                             6.3e-09  74_[+2]_413
54378                             3.3e-08  296_[+2]_191
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=13 seqs=2
50971                    (   75) CGGTGACGGCGAG  1
54378                    (  297) CTGTGACTGCGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 3416 bayes= 10.7372 E= 3.2e+003
  -765   207  -765  -765
  -765  -765   119    83
  -765  -765   219  -765
  -765  -765  -765   183
  -765  -765   219  -765
   192  -765  -765  -765
  -765   207  -765  -765
  -765  -765   119    83
  -765  -765   219  -765
  -765   207  -765  -765
  -765  -765   219  -765
   192  -765  -765  -765
  -765  -765   219  -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 2 E= 3.2e+003
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[GT]GTGAC[GT]GCGAG
--------------------------------------------------------------------------------


Time  0.95 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  17  sites =   2  llr = 43  E-value = 3.7e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::5a:a5::::aa:
pos.-specific     C  a::a:::::5:::::::
probability       G  :a5:a5:a:::5aa::a
matrix            T  ::5:::::::a5:::::

         bits    2.2  *  *  *    **  *
                 2.0 ** ** ***   *****
                 1.8 ** ** *** * *****
                 1.5 ** ** *** * *****
Relative         1.3 ** ** *** * *****
Entropy          1.1 *****************
(30.9 bits)      0.9 *****************
                 0.7 *****************
                 0.4 *****************
                 0.2 *****************
                 0.0 -----------------

Multilevel           CGGCGAAGAATGGGAAG
consensus              T  G   C T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            -----------------
36310                        21  2.11e-10 ACTCCTGAAC CGTCGGAGAATGGGAAG TGAAAACGAA
50971                       102  2.75e-10 GTGCCAGCGA CGGCGAAGACTTGGAAG ATTTCGAAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36310                             2.1e-10  20_[+3]_463
50971                             2.8e-10  101_[+3]_382
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=17 seqs=2
36310                    (   21) CGTCGGAGAATGGGAAG  1
50971                    (  102) CGGCGAAGACTTGGAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 3388 bayes= 10.7254 E= 3.7e+003
  -765   207  -765  -765
  -765  -765   219  -765
  -765  -765   119    83
  -765   207  -765  -765
  -765  -765   219  -765
    93  -765   119  -765
   192  -765  -765  -765
  -765  -765   219  -765
   192  -765  -765  -765
    93   107  -765  -765
  -765  -765  -765   183
  -765  -765   119    83
  -765  -765   219  -765
  -765  -765   219  -765
   192  -765  -765  -765
   192  -765  -765  -765
  -765  -765   219  -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 2 E= 3.7e+003
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CG[GT]CG[AG]AGA[AC]T[GT]GGAAG
--------------------------------------------------------------------------------


Time  1.43 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36310                            1.42e-11  20_[+3(2.11e-10)]_86_[+1(3.11e-09)]_\
    357
50971                            1.76e-10  74_[+2(6.33e-09)]_14_[+3(2.75e-10)]_\
    382
47666                            3.98e-01  500
40612                            1.05e-05  153_[+1(6.52e-10)]_327
54378                            8.90e-05  296_[+2(3.30e-08)]_191
41584                            4.38e-06  310_[+1(9.11e-11)]_170
32886                            6.39e-05  457_[+1(1.36e-09)]_23
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
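
Note on the "Relative Entropy" totals: the figure printed beside each motif diagram (for example, 24.9 bits for Motif 2) appears to be the sum, over motif positions, of p * log2(p / background) taken from the letter-probability matrix and the background letter frequencies reported above. The Python sketch below is an illustration only and is not part of MEME; the function name and the hard-coded copy of the Motif 2 matrix are assumptions made for this example, and MEME's internal calculation may differ in how it handles pseudocounts and rounding.

```python
import math

# Background letter frequencies, copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.263, "C": 0.238, "G": 0.219, "T": 0.281}

# Motif 2 letter-probability matrix, copied from the report.
# One row per motif position; columns are in alphabet order A, C, G, T.
MOTIF2 = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]


def relative_entropy_bits(matrix, background, alphabet="ACGT"):
    """Sum p * log2(p / background) over every position and letter.

    Hypothetical helper for illustration; zero-probability cells are
    skipped because they contribute nothing to the sum.
    """
    total = 0.0
    for row in matrix:
        for letter, p in zip(alphabet, row):
            if p > 0.0:
                total += p * math.log2(p / background[letter])
    return total


motif2_bits = relative_entropy_bits(MOTIF2, BACKGROUND)
print(f"Motif 2 relative entropy: {motif2_bits:.1f} bits")
```

Under these assumptions the result should land close to the 24.9 bits shown for Motif 2; the same function can be applied to the Motif 1 and Motif 3 matrices to compare against their reported totals.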