********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/487/487.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10256                    1.0000    500  20889                    1.0000    500
22658                    1.0000    500  25842                    1.0000    500
269066                   1.0000    500  269590                   1.0000    500
3111                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/487/487.seqs.fa -oc motifs/487 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.269 C 0.224 G 0.226 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.269 C 0.224 G 0.226 T 0.281
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 19 sites = 4 llr = 78 E-value = 2.2e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :a:3a::8::8:855:35:
pos.-specific     C  a:a5:a8:a8:a333385:
probability       G  :::3::33:33::338::a
matrix            T  :::::::::::::::::::

         bits    2.2 * *  *  *  *      *
                 1.9 *** **  *  *      *
                 1.7 *** **  *  *      *
                 1.5 *** **  *  *      *
Relative         1.3 *** *** ** *   ** *
Entropy          1.1 *** *********  ****
(28.1 bits)      0.9 *** *********  ****
                 0.6 *************  ****
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           CACCACCACCACAAAGCAG
consensus               A  GG GG CCCCAC
sequence                G         GG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            -------------------
25842                       477  2.91e-11 TGGCCAAAGC CACCACCGCCACAAAGCCG TCTTC
20889                       437  1.93e-10 CTGAGACTGA CACCACCACCGCACCGCCG TTGTTGCGTA
269590                       16  2.80e-09 TACTCAAGAG CACAACCACCACCGGCCAG TTAGTCTCGA
269066                      240  3.60e-09 GGGGAGTACG CACGACGACGACAAAGAAG AAGGAAAGGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25842                             2.9e-11  476_[+1]_5
20889                             1.9e-10  436_[+1]_45
269590                            2.8e-09  15_[+1]_466
269066                            3.6e-09  239_[+1]_242
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=4
25842                    (  477) CACCACCGCCACAAAGCCG  1
20889                    (  437) CACCACCACCGCACCGCCG  1
269590                   (   16) CACAACCACCACCGGCCAG  1
269066                   (  240) CACGACGACGACAAAGAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 3374 bayes= 9.71853 E= 2.2e+000
  -865    215   -865   -865
   189   -865   -865   -865
  -865    215   -865   -865
   -11    115     15   -865
   189   -865   -865   -865
  -865    215   -865   -865
  -865    174     15   -865
   147   -865     15   -865
  -865    215   -865   -865
  -865    174     15   -865
   147   -865     15   -865
  -865    215   -865   -865
   147     16   -865   -865
    89     16     15   -865
    89     16     15   -865
  -865     16    173   -865
   -11    174   -865   -865
    89    115   -865   -865
  -865   -865    214   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 2.2e+000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.500000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
CAC[CAG]AC[CG][AG]C[CG][AG]C[AC][ACG][ACG][GC][CA][AC]G
--------------------------------------------------------------------------------




Time 0.43 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 7 llr = 95 E-value = 1.4e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :9611:47:14::::4
pos.-specific     C  :11::::1::::::::
probability       G  a::9:74:391:a766
matrix            T  ::3:93117:4a:34:

         bits    2.2 *           *
                 1.9 *          **
                 1.7 *          **
                 1.5 *  *     * **
Relative         1.3 ** ***   * ***
Entropy          1.1 ** ***  ** *****
(19.6 bits)      0.9 ** *** *** *****
                 0.6 ** ******* *****
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GAAGTGAATGATGGGG
consensus              T  TG G T  TTA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ----------------
10256                       205  8.60e-09 CATTATGATG GATGTGGATGATGGGA AACAATCGTT
3111                        134  1.11e-07 CAAAGCAAGC GAAGTTGAGGTTGGTA CTTCAACATT
20889                       278  1.84e-07 GATAACCTTT GAAATGAATGTTGTGG TTGTATCCCT
269066                      304  2.04e-07 TTCGAGAGGC GAAGTGAAGAATGGTG TCAACGTCGA
269590                       68  2.89e-07 CCGGACATCA GAAGTGTCTGGTGGGG ATAACAAGAG
25842                         4  7.83e-07        GAT GATGTGATTGTTGTTA GATAGATGGT
22658                        92  1.63e-06 GAGAGAAAGA GCCGATGATGATGGGG AGGAAGAGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10256                             8.6e-09  204_[+2]_280
3111                              1.1e-07  133_[+2]_351
20889                             1.8e-07  277_[+2]_207
269066                              2e-07  303_[+2]_181
269590                            2.9e-07  67_[+2]_417
25842                             7.8e-07  3_[+2]_481
22658                             1.6e-06  91_[+2]_393
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=7
10256                    (  205) GATGTGGATGATGGGA  1
3111                     (  134) GAAGTTGAGGTTGGTA  1
20889                    (  278) GAAATGAATGTTGTGG  1
269066                   (  304) GAAGTGAAGAATGGTG  1
269590                   (   68) GAAGTGTCTGGTGGGG  1
25842                    (    4) GATGTGATTGTTGTTA  1
22658                    (   92) GCCGATGATGATGGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 3395 bayes= 8.91886 E= 1.4e+001
  -945   -945    215   -945
   167    -65   -945   -945
   108    -65   -945      3
   -91   -945    192   -945
   -91   -945   -945    161
  -945   -945    166      3
    67   -945     92    -97
   141    -65   -945    -97
  -945   -945     34    135
   -91   -945    192   -945
    67   -945    -66     61
  -945   -945   -945    183
  -945   -945    215   -945
  -945   -945    166      3
  -945   -945    134     61
    67   -945    134   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 1.4e+001
 0.000000  0.000000  1.000000  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.571429  0.142857  0.000000  0.285714
 0.142857  0.000000  0.857143  0.000000
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  0.714286  0.285714
 0.428571  0.000000  0.428571  0.142857
 0.714286  0.142857  0.000000  0.142857
 0.000000  0.000000  0.285714  0.714286
 0.142857  0.000000  0.857143  0.000000
 0.428571  0.000000  0.142857  0.428571
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.714286  0.285714
 0.000000  0.000000  0.571429  0.428571
 0.428571  0.000000  0.571429  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
GA[AT]GT[GT][AG]A[TG]G[AT]TG[GT][GT][GA]
--------------------------------------------------------------------------------




Time 0.84 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 4 llr = 79 E-value = 3.2e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :33:5a3333::a83a::a:3
pos.-specific     C  88:a3:8:5:3a:38:8:::8
probability       G  3:::3::5::5:::::3a:8:
matrix            T  ::8::::3383::::::::3:

         bits    2.2    *       *     *
                 1.9    * *     **  * **
                 1.7    * *     **  * **
                 1.5    * *     **  * **
Relative         1.3 ** * **    ** *******
Entropy          1.1 **** **  * **********
(28.4 bits)      0.9 **** **  * **********
                 0.6 **** **  ************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CCTCAACGCTGCAACACGAGC
consensus            GAA C AAAAC  CA G  TA
sequence                 G  TT T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
3111                        221  1.32e-10 GGCACATCAG CCTCGACTCTCCAACACGATC GTGCCGATCG
10256                       118  4.89e-10 CACAAAGATA CCACAAAGCTTCAACAGGAGC GAAGGCAACA
269066                      187  8.07e-10 CAGAATACAA CATCCACGTTGCACAACGAGC CAAACAACCA
22658                       250  1.26e-09 TCCCCGCACT GCTCAACAAAGCAACACGAGA CGAGCGGCCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3111                              1.3e-10  220_[+3]_259
10256                             4.9e-10  117_[+3]_362
269066                            8.1e-10  186_[+3]_293
22658                             1.3e-09  249_[+3]_230
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=4
3111                     (  221) CCTCGACTCTCCAACACGATC  1
10256                    (  118) CCACAAAGCTTCAACAGGAGC  1
269066                   (  187) CATCCACGTTGCACAACGAGC  1
22658                    (  250) GCTCAACAAAGCAACACGAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 9.71253 E= 3.2e+002
  -865    174     15   -865
   -11    174   -865   -865
   -11   -865   -865    142
  -865    215   -865   -865
    89     16     15   -865
   189   -865   -865   -865
   -11    174   -865   -865
   -11   -865    115    -17
   -11    115   -865    -17
   -11   -865   -865    142
  -865     16    115    -17
  -865    215   -865   -865
   189   -865   -865   -865
   147     16   -865   -865
   -11    174   -865   -865
   189   -865   -865   -865
  -865    174     15   -865
  -865   -865    214   -865
   189   -865   -865   -865
  -865   -865    173    -17
   -11    174   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 3.2e+002
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.250000  0.500000  0.000000  0.250000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.250000  0.500000  0.250000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.750000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][CA][TA]C[ACG]A[CA][GAT][CAT][TA][GCT]CA[AC][CA]A[CG]GA[GT][CA]
--------------------------------------------------------------------------------




Time 1.23 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10256                            2.84e-10  117_[+3(4.89e-10)]_66_\
    [+2(8.60e-09)]_280
20889                            2.59e-09  277_[+2(1.84e-07)]_143_\
    [+1(1.93e-10)]_45
22658                            2.80e-08  65_[+2(9.69e-05)]_10_[+2(1.63e-06)]_\
    142_[+3(1.26e-09)]_230
25842                            1.86e-09  3_[+2(7.83e-07)]_94_[+2(5.23e-05)]_\
    347_[+1(2.91e-11)]_5
269066                           4.87e-14  186_[+3(8.07e-10)]_32_\
    [+1(3.60e-09)]_45_[+2(2.04e-07)]_181
269590                           2.62e-08  15_[+1(2.80e-09)]_33_[+2(2.89e-07)]_\
    191_[+1(6.96e-05)]_207
3111                             7.43e-11  112_[+3(5.39e-05)]_[+2(1.11e-07)]_\
    71_[+3(1.32e-10)]_259
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************