********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/217/217.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37038                    1.0000    500  37436                    1.0000    500
47599                    1.0000    500  48389                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/217/217.seqs.fa -oc motifs/217 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        4    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            2000    N=               4
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.310 C 0.229 G 0.211 T 0.251
Background letter frequencies (from dataset with add-one prior applied):
A 0.310 C 0.229 G 0.211 T 0.250
********************************************************************************

********************************************************************************
MOTIF  1 MEME    width =  21  sites =   4  llr = 80  E-value = 3.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a:a8::::55:3:3:::::a:
pos.-specific     C  :3:3a:::5:::8:8383a:a
probability       G  :8:::355:58::3:83::::
matrix            T  :::::855::38353::8:::

         bits    2.2
                 2.0     *             * *
                 1.8 * * *             ***
                 1.6 * * *             ***
Relative         1.3 *** **    * * *** ***
Entropy          1.1 *** ****  *** *******
(28.7 bits)      0.9 ************* *******
                 0.7 ************* *******
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AGAACTGGAAGTCTCGCTCAC
consensus             C C GTTCGTATATCGC
sequence                          G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                      Site
-------------             ----- ---------            ---------------------
47599                       197  1.24e-11 AGTTGTAGAG AGAACTGTAAGTCGCGCTCAC GGCTGCCTCG
37038                       360  5.54e-10 CTTTGACATG ACAACTTTCGGACACGCTCAC TTTTTGACGA
48389                       409  8.31e-10 TTGGAACGGA AGACCTGGCGTTCTCGGCCAC ATTTTCAATA
37436                       474  2.27e-09 GAAACTGGCA AGAACGTGAAGTTTTCCTCAC AGCAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47599                             1.2e-11  196_[+1]_283
37038                             5.5e-10  359_[+1]_120
48389                             8.3e-10  408_[+1]_71
37436                             2.3e-09  473_[+1]_6
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=4
47599                    (  197) AGAACTGTAAGTCGCGCTCAC  1
37038                    (  360) ACAACTTTCGGACACGCTCAC  1
48389                    (  409) AGACCTGGCGTTCTCGGCCAC  1
37436                    (  474) AGAACGTGAAGTTTTCCTCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 1920 bayes= 8.90388 E= 3.5e+001
   169   -865   -865   -865
  -865     13    183   -865
   169   -865   -865   -865
   127     13   -865   -865
  -865    213   -865   -865
  -865   -865     24    158
  -865   -865    124    100
  -865   -865    124    100
    69    113   -865   -865
    69   -865    124   -865
  -865   -865    183      0
   -31   -865   -865    158
  -865    171   -865      0
   -31   -865     24    100
  -865    171   -865      0
  -865     13    183   -865
  -865    171     24   -865
  -865     13   -865    158
  -865    213   -865   -865
   169   -865   -865   -865
  -865    213   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 3.5e+001
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.500000  0.500000
 0.500000  0.500000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.000000  0.250000  0.500000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
A[GC]A[AC]C[TG][GT][GT][AC][AG][GT][TA][CT][TAG][CT][GC][CG][TC]CAC
--------------------------------------------------------------------------------

Time  0.17 secs.

********************************************************************************

********************************************************************************
MOTIF  2 MEME    width =  12  sites =   4  llr = 50  E-value = 3.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::3::a::
pos.-specific     C  a3a::5:a::3a
probability       G  :3:a333:8:8:
matrix            T  :5::835:3:::

         bits    2.2    *
                 2.0 * **   *   *
                 1.8 * **   * * *
                 1.6 * **   * * *
Relative         1.3 * ***  *****
Entropy          1.1 * ***  *****
(18.2 bits)      0.9 * ***  *****
                 0.7 ****** *****
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CTCGTCTCGAGC
consensus             C  GGA T C
sequence              G   TG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
48389                       362  2.21e-07 TCAAAGAGAC CTCGTCACGAGC TTTATAGGAA
37436                        56  7.30e-07 GTAAAAGTAA CGCGTGGCGAGC CGAATCTGGT
47599                       319  1.73e-06 ACCTACGATG CCCGGTTCGAGC TTTCTCTCGA
37038                       471  1.91e-06 TCACCACTCT CTCGTCTCTACC GTGATTCAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48389                             2.2e-07  361_[+2]_127
37436                             7.3e-07  55_[+2]_433
47599                             1.7e-06  318_[+2]_170
37038                             1.9e-06  470_[+2]_18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=4
48389                    (  362) CTCGTCACGAGC  1
37436                    (   56) CGCGTGGCGAGC  1
47599                    (  319) CCCGGTTCGAGC  1
37038                    (  471) CTCGTCTCTACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 1956 bayes= 8.93074 E= 3.7e+002
  -865    213   -865   -865
  -865     13     24    100
  -865    213   -865   -865
  -865   -865    224   -865
  -865   -865     24    158
  -865    113     24      0
   -31   -865     24    100
  -865    213   -865   -865
  -865   -865    183      0
   169   -865   -865   -865
  -865     13    183   -865
  -865    213   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 4 E= 3.7e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.250000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.500000  0.250000  0.250000
 0.250000  0.000000  0.250000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[TCG]CG[TG][CGT][TAG]C[GT]A[GC]C
--------------------------------------------------------------------------------

Time  0.33 secs.

********************************************************************************

********************************************************************************
MOTIF  3 MEME    width =  12  sites =   4  llr = 52  E-value = 5.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a::3:3::5:
pos.-specific     C  a::8a:a:8::8
probability       G  :a:3:::::8::
matrix            T  :::::8:83353

         bits    2.2  *
                 2.0 **  * *
                 1.8 *** * *
                 1.6 *** * *
Relative         1.3 ***** * ** *
Entropy          1.1 ********** *
(18.7 bits)      0.9 ************
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGACCTCTCGAC
consensus               G A ATTTT
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
48389                       444  3.11e-08 TCAATACAAG CGACCTCTCGTC GAACCGGGAA
47599                       338  6.97e-08 AGCTTTCTCT CGACCTCTCGAC CATGGCGTAC
37436                       224  2.72e-06 TAGTTCCGTG CGAGCTCTTTAC TCCATATACG
37038                       388  3.19e-06 CACTTTTTGA CGACCACACGTT CAATCGTTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48389                             3.1e-08  443_[+3]_45
47599                               7e-08  337_[+3]_151
37436                             2.7e-06  223_[+3]_265
37038                             3.2e-06  387_[+3]_101
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=4
48389                    (  444) CGACCTCTCGTC  1
47599                    (  338) CGACCTCTCGAC  1
37436                    (  224) CGAGCTCTTTAC  1
37038                    (  388) CGACCACACGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 1956 bayes= 8.93074 E= 5.1e+002
  -865    213   -865   -865
  -865   -865    224   -865
   169   -865   -865   -865
  -865    171     24   -865
  -865    213   -865   -865
   -31   -865   -865    158
  -865    213   -865   -865
   -31   -865   -865    158
  -865    171   -865      0
  -865   -865    183      0
    69   -865   -865    100
  -865    171   -865      0
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 4 E= 5.1e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.750000  0.250000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.750000  0.000000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CGA[CG]C[TA]C[TA][CT][GT][AT][CT]
--------------------------------------------------------------------------------

Time  0.46 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37038                            1.69e-10  359_[+1(5.54e-10)]_7_[+3(3.19e-06)]_\
    71_[+2(1.91e-06)]_18
37436                            2.22e-10  55_[+2(7.30e-07)]_156_\
    [+3(2.72e-06)]_238_[+1(2.27e-09)]_6
47599                            1.20e-13  196_[+1(1.24e-11)]_101_\
    [+2(1.73e-06)]_7_[+3(6.97e-08)]_151
48389                            4.24e-13  361_[+2(2.21e-07)]_35_\
    [+1(8.31e-10)]_14_[+3(3.11e-08)]_45
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************