********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/222/222.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46971                    1.0000    500  38979                    1.0000    500
40431                    1.0000    500  49851                    1.0000    500
25876                    1.0000    500  33500                    1.0000    500
37408                    1.0000    500  40610                    1.0000    500
48931                    1.0000    500  35355                    1.0000    500
40624                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/222/222.seqs.fa -oc motifs/222 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.251 C 0.234 G 0.235 T 0.280
Background letter frequencies (from dataset with add-one prior applied):
A 0.251 C 0.234 G 0.235 T 0.280
********************************************************************************


********************************************************************************
MOTIF  1 MEME  width =  21  sites =   4  llr = 114  E-value = 3.7e-009
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A ::aa3::a:::aaa:3:a:aa pos.-specific C :a::::::aaa:::a8a:::: probability G a:::8aa:::::::::::::: matrix T ::::::::::::::::::a:: bits 2.1 **** ********** ** ** 1.9 **** ********** ***** 1.7 **** ********** ***** 1.5 **** ********** ***** Relative 1.3 ********************* Entropy 1.0 ********************* (41.1 bits) 0.8 ********************* 0.6 ********************* 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel GCAAGGGACCCAAACCCATAA consensus A A sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 40624 413 1.30e-13 CGGAGGCAGG GCAAGGGACCCAAACCCATAA GTGGTTCTTT 35355 413 1.30e-13 CGGAGGCAGG GCAAGGGACCCAAACCCATAA GTGGTTCTTT 37408 408 1.30e-13 CGGAGGCAGG GCAAGGGACCCAAACCCATAA AGTGGTTCTT 40610 21 5.56e-13 TAACAAAACA GCAAAGGACCCAAACACATAA GCTGTTTAAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 40624 1.3e-13 412_[+1]_67 35355 1.3e-13 412_[+1]_67 37408 1.3e-13 407_[+1]_72 40610 5.6e-13 20_[+1]_459 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format 
-------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=4 40624 ( 413) GCAAGGGACCCAAACCCATAA 1 35355 ( 413) GCAAGGGACCCAAACCCATAA 1 37408 ( 408) GCAAGGGACCCAAACCCATAA 1 40610 ( 21) GCAAAGGACCCAAACACATAA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.3652 E= 3.7e-009 -865 -865 209 -865 -865 209 -865 -865 199 -865 -865 -865 199 -865 -865 -865 -1 -865 167 -865 -865 -865 209 -865 -865 -865 209 -865 199 -865 -865 -865 -865 209 -865 -865 -865 209 -865 -865 -865 209 -865 -865 199 -865 -865 -865 199 -865 -865 -865 199 -865 -865 -865 -865 209 -865 -865 -1 168 -865 -865 -865 209 -865 -865 199 -865 -865 -865 -865 -865 -865 183 199 -865 -865 -865 199 -865 -865 -865 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 3.7e-009 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.250000 0.000000 0.750000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.250000 0.750000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 
0.000000 0.000000 0.000000 0.000000 1.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- GCAA[GA]GGACCCAAAC[CA]CATAA -------------------------------------------------------------------------------- Time 1.09 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 4 llr = 108 E-value = 4.0e-007 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :::::::::a:::a8:::a:: pos.-specific C :3:3:::a::aa8::aaa::a probability G a8:::aa:::::::3:::::: matrix T ::a8a:::a:::3::::::a: bits 2.1 * *** *** * **** * 1.9 * * ******** * ****** 1.7 * * ******** * ****** 1.5 * * ******** * ****** Relative 1.3 *** ***************** Entropy 1.0 ********************* (39.1 bits) 0.8 ********************* 0.6 ********************* 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel GGTTTGGCTACCCAACCCATC consensus C C T G sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 40624 480 1.87e-13 CTTTCTTATT GGTTTGGCTACCCAACCCATC 35355 480 1.87e-13 CTTTCTTATT GGTTTGGCTACCCAACCCATC 37408 480 
1.87e-13 CTTTCTTATT GGTTTGGCTACCCAACCCATC 40610 81 2.92e-12 TTCTCATATT GCTCTGGCTACCTAGCCCATC ATGGTGTTGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 40624 1.9e-13 479_[+2] 35355 1.9e-13 479_[+2] 37408 1.9e-13 479_[+2] 40610 2.9e-12 80_[+2]_399 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=4 40624 ( 480) GGTTTGGCTACCCAACCCATC 1 35355 ( 480) GGTTTGGCTACCCAACCCATC 1 37408 ( 480) GGTTTGGCTACCCAACCCATC 1 40610 ( 81) GCTCTGGCTACCTAGCCCATC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.3652 E= 4.0e-007 -865 -865 209 -865 -865 9 167 -865 -865 -865 -865 183 -865 9 -865 142 -865 -865 -865 183 -865 -865 209 -865 -865 -865 209 -865 -865 209 -865 -865 -865 -865 -865 183 199 -865 -865 -865 -865 209 -865 -865 -865 209 -865 -865 -865 168 -865 -16 199 -865 -865 -865 158 -865 9 -865 -865 209 -865 -865 -865 209 -865 -865 -865 209 -865 -865 199 -865 -865 -865 -865 -865 -865 183 -865 209 -865 -865 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix 
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 4.0e-007
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[GC]T[TC]TGGCTACC[CT]A[AG]CCCATC
--------------------------------------------------------------------------------

Time  2.11 secs.

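As an informal cross-check, the regular expression reported for Motif 2 can be applied to the site strings listed in the Motif 2 BLOCKS section. The sketch below is not part of MEME's output. The helper name `find_motif_hits` and the use of Python's `re` module are assumptions made for illustration, and a regular-expression scan is only a rough stand-in for scoring with the position-specific matrix.

```python
import re

# Regular expression copied from the "Motif 2 regular expression" section.
MOTIF_2_REGEX = "G[GC]T[TC]TGGCTACC[CT]A[AG]CCCATC"

# Site strings copied from the "Motif 2 in BLOCKS format" section,
# keyed by sequence name.
MOTIF_2_SITES = {
    "40624": "GGTTTGGCTACCCAACCCATC",
    "35355": "GGTTTGGCTACCCAACCCATC",
    "37408": "GGTTTGGCTACCCAACCCATC",
    "40610": "GCTCTGGCTACCTAGCCCATC",
}


def find_motif_hits(pattern, sequence):
    """Return 1-based start positions of all matches, including overlapping ones.

    Hypothetical helper: the pattern is wrapped in a lookahead so that
    overlapping occurrences are reported as well.
    """
    lookahead = re.compile("(?=" + pattern + ")")
    return [m.start() + 1 for m in lookahead.finditer(sequence)]


if __name__ == "__main__":
    for name, site in MOTIF_2_SITES.items():
        # Each listed site should match the expression at position 1.
        print(name, find_motif_hits(MOTIF_2_REGEX, site))
```

Applied to a full 500-letter input sequence instead of the bare site, the same helper would report start positions in the 1-based convention used by the "Start" column of this report.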
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 5 llr = 115 E-value = 1.8e-006 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::::22::2a48::::4:::: pos.-specific C a2:2:::a:::::88a2:::2 probability G :8a8:8::8:6:a:2:2::a8 matrix T ::::8:a::::2:2::2aa:: bits 2.1 * * * * * * * 1.9 * * ** * * * *** 1.7 * * ** * * * *** 1.5 **** ** * * ** **** Relative 1.3 **** ***** ***** **** Entropy 1.0 **************** **** (33.1 bits) 0.8 **************** **** 0.6 **************** **** 0.4 **************** **** 0.2 **************** **** 0.0 --------------------- Multilevel CGGGTGTCGAGAGCCCATTGG consensus C CAA A AT TG C C sequence G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 40624 184 1.49e-13 CTCGGAATCA CGGGTGTCGAGAGCCCATTGG AGTCTTCAGG 35355 184 1.49e-13 CTCGGAATCA CGGGTGTCGAGAGCCCATTGG AGTCTTCAGG 37408 179 5.87e-13 CTCGGAATCA CGGGTGTCGAGAGCCCCTTGG AGTCTTCAGG 25876 388 1.66e-10 TGACTCGTGA CGGCTGTCGAATGCGCTTTGG GAGACGGCGA 46971 310 9.34e-10 GACTGATTGG CCGGAATCAAAAGTCCGTTGC CCGTCGCTTG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM 
------------- ---------------- ------------- 40624 1.5e-13 183_[+3]_296 35355 1.5e-13 183_[+3]_296 37408 5.9e-13 178_[+3]_301 25876 1.7e-10 387_[+3]_92 46971 9.3e-10 309_[+3]_170 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=5 40624 ( 184) CGGGTGTCGAGAGCCCATTGG 1 35355 ( 184) CGGGTGTCGAGAGCCCATTGG 1 37408 ( 179) CGGGTGTCGAGAGCCCCTTGG 1 25876 ( 388) CGGCTGTCGAATGCGCTTTGG 1 46971 ( 310) CCGGAATCAAAAGTCCGTTGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 10.2948 E= 1.8e-006 -897 209 -897 -897 -897 -23 177 -897 -897 -897 209 -897 -897 -23 177 -897 -33 -897 -897 151 -33 -897 177 -897 -897 -897 -897 184 -897 209 -897 -897 -33 -897 177 -897 199 -897 -897 -897 67 -897 135 -897 167 -897 -897 -48 -897 -897 209 -897 -897 177 -897 -48 -897 177 -23 -897 -897 209 -897 -897 67 -23 -23 -48 -897 -897 -897 184 -897 -897 -897 184 -897 -897 209 -897 -897 -23 177 -897 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 1.8e-006 0.000000 1.000000 0.000000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.200000 0.800000 0.000000 0.200000 0.000000 0.000000 0.800000 0.200000 0.000000 0.800000 0.000000 0.000000 0.000000 0.000000 
1.000000 0.000000 1.000000 0.000000 0.000000 0.200000 0.000000 0.800000 0.000000 1.000000 0.000000 0.000000 0.000000 0.400000 0.000000 0.600000 0.000000 0.800000 0.000000 0.000000 0.200000 0.000000 0.000000 1.000000 0.000000 0.000000 0.800000 0.000000 0.200000 0.000000 0.800000 0.200000 0.000000 0.000000 1.000000 0.000000 0.000000 0.400000 0.200000 0.200000 0.200000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.200000 0.800000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- C[GC]G[GC][TA][GA]TC[GA]A[GA][AT]G[CT][CG]C[ACGT]TTG[GC] -------------------------------------------------------------------------------- Time 3.12 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 46971 8.41e-06 309_[+3(9.34e-10)]_170 38979 7.41e-01 500 40431 9.42e-01 500 49851 9.27e-01 500 25876 3.49e-06 387_[+3(1.66e-10)]_92 33500 9.91e-01 500 37408 3.82e-27 101_[+3(6.96e-06)]_56_\ [+3(5.87e-13)]_208_[+1(1.30e-13)]_51_[+2(1.87e-13)] 40610 3.51e-16 20_[+1(5.56e-13)]_39_[+2(2.92e-12)]_\ 399 48931 3.60e-01 500 35355 1.01e-27 106_[+3(6.96e-06)]_56_\ [+3(1.49e-13)]_208_[+1(1.30e-13)]_46_[+2(1.87e-13)] 40624 1.01e-27 106_[+3(6.96e-06)]_56_\ 
                                          [+3(1.49e-13)]_208_[+1(1.30e-13)]_46_[+2(1.87e-13)]
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
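
Reading note (not part of MEME's output): the non-zero entries of the log-odds matrices in this report appear to be close to 100 * log2(p / b), where p is the entry in the corresponding letter-probability matrix and b is the background letter frequency from the command line summary. This is an observation from the numbers in this file, not a statement of MEME's exact formula. The run used a Dirichlet prior (b= 0.01 in the command line summary), which the sketch below ignores, so some entries differ from the reported ones by a unit, and zero probabilities are simply skipped rather than mapped to the large negative scores shown in the report. The helper name `approx_log_odds` is hypothetical.

```python
import math

# Background letter frequencies copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.251, "C": 0.234, "G": 0.235, "T": 0.280}


def approx_log_odds(prob, background):
    """Approximate a log-odds entry as round(100 * log2(prob / background)).

    Hypothetical helper. Returns None for prob <= 0, because this sketch
    does not model the prior that produces the large negative entries.
    """
    if prob <= 0:
        return None
    return round(100 * math.log2(prob / background))


if __name__ == "__main__":
    # Motif 1, position 5: A = 0.25, G = 0.75 (reported scores: -1 and 167).
    print(approx_log_odds(0.25, BACKGROUND["A"]))  # -1
    print(approx_log_odds(0.75, BACKGROUND["G"]))  # 167
    # Motif 3, position 2: C = 0.2, G = 0.8 (reported scores: -23 and 177).
    print(approx_log_odds(0.2, BACKGROUND["C"]))   # -23
    print(approx_log_odds(0.8, BACKGROUND["G"]))   # 177
```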