********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/54/54.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31420                    1.0000    500  43187                    1.0000    500
47445                    1.0000    500  43595                    1.0000    500
49093                    1.0000    500  49147                    1.0000    500
50097                    1.0000    500  33616                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/54/54.seqs.fa -oc motifs/54 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.229 G 0.216 T 0.289
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.229 G 0.216 T 0.289
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   4  llr = 79  E-value = 2.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::::8333538::3:::3::
pos.-specific     C  :::a::3:8:3:a5:8a::8:
probability       G  a5::a333::3:::8::a:3a
matrix            T  :5a:::35:533:5:3::8::

         bits    2.2 *  **       *   **  *
                 2.0 *  **       *   **  *
                 1.8 * ***       *   **  *
                 1.5 * ***       *   **  *
Relative         1.3 * ***   *   * **** **
Entropy          1.1 ******  *  ** *******
(28.7 bits)      0.9 ******  ** **********
                 0.7 ******  ** **********
                 0.4 ****** *** **********
                 0.2 ****** *** **********
                 0.0 ---------------------

Multilevel           GGTCGAATCAAACCGCCGTCG
consensus             T   GCAATCT TAT  AG
sequence                   GG  G
                           T   T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                      Site
-------------             ----- ---------            ---------------------
43595                       215  1.75e-10 GAGCGCGTGA GTTCGAATCTCACCGTCGTCG TGTGAAATTT
49093                       225  5.02e-10 ACTTCCTCCC GTTCGACACTTTCTGCCGTCG GCAGCAGTCG
50097                       401  8.78e-10 ATTGTCTTGG GGTCGATGCAGACTACCGTGG GCCTCGCCCT
33616                       429  9.53e-10 AAGAGTGATT GGTCGGGTAAAACCGCCGACG AGTCGAATCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43595                             1.8e-10  214_[+1]_265
49093                               5e-10  224_[+1]_255
50097                             8.8e-10  400_[+1]_79
33616                             9.5e-10  428_[+1]_51
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=4
43595                    (  215) GTTCGAATCTCACCGTCGTCG  1
49093                    (  225) GTTCGACACTTTCTGCCGTCG  1
50097                    (  401) GGTCGATGCAGACTACCGTGG  1
33616                    (  429) GGTCGGGTAAAACCGCCGACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3840 bayes= 9.90539 E= 2.2e+002
  -865   -865    221   -865
  -865   -865    121     79
  -865   -865   -865    179
  -865    212   -865   -865
  -865   -865    221   -865
   150   -865     21   -865
    -9     12     21    -21
    -9   -865     21     79
    -9    171   -865   -865
    91   -865   -865     79
    -9     12     21    -21
   150   -865   -865    -21
  -865    212   -865   -865
  -865    112   -865     79
    -9   -865    179   -865
  -865    171   -865    -21
  -865    212   -865   -865
  -865   -865    221   -865
    -9   -865   -865    137
  -865    171     21   -865
  -865   -865    221   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 2.2e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.250000  0.250000  0.250000  0.250000
 0.250000  0.000000  0.250000  0.500000
 0.250000  0.750000  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.250000  0.250000  0.250000  0.250000
 0.750000  0.000000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[GT]TCG[AG][ACGT][TAG][CA][AT][ACGT][AT]C[CT][GA][CT]CG[TA][CG]G
--------------------------------------------------------------------------------




Time  0.59 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   4  llr = 55  E-value = 7.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3::a:::::::
pos.-specific     C  ::a::a::8:88
probability       G  8::a::8:3a:3
matrix            T  38::::3a::3:

         bits    2.2   ** *   *
                 2.0   ****   *
                 1.8   **** * *
                 1.5   **** * *
Relative         1.3 * **********
Entropy          1.1 ************
(19.9 bits)      0.9 ************
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GTCGACGTCGCC
consensus            TA    T G TG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
33616                       134  1.54e-07 GGTCCATGTC GTCGACGTCGTC GTCACGTTTG
31420                       465  2.35e-07 CATTGGACTC TTCGACGTCGCC TTGTATTCTC
43595                       154  3.14e-07 TCAAAGCAAA GACGACGTGGCC GAGTGGTTAA
43187                        70  6.49e-07 AACGCTCACA GTCGACTTCGCG ATCGGCTTAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33616                             1.5e-07  133_[+2]_355
31420                             2.3e-07  464_[+2]_24
43595                             3.1e-07  153_[+2]_335
43187                             6.5e-07  69_[+2]_419
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=4
33616                    (  134) GTCGACGTCGTC  1
31420                    (  465) TTCGACGTCGCC  1
43595                    (  154) GACGACGTGGCC  1
43187                    (   70) GTCGACTTCGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3912 bayes= 9.93221 E= 7.4e+002
  -865   -865    179    -21
    -9   -865   -865    137
  -865    212   -865   -865
  -865   -865    221   -865
   191   -865   -865   -865
  -865    212   -865   -865
  -865   -865    179    -21
  -865   -865   -865    179
  -865    171     21   -865
  -865   -865    221   -865
  -865    171   -865    -21
  -865    171     21   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 4 E= 7.4e+002
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.000000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.750000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GT][TA]CGAC[GT]T[CG]G[CT][CG]
--------------------------------------------------------------------------------




Time  1.11 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   5  llr = 64  E-value = 7.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :82:a4::a:2:
pos.-specific     C  ::2::::::::a
probability       G  a::8:62a:a4:
matrix            T  :262::8:::4:

         bits    2.2 *      * * *
                 2.0 *   *  *** *
                 1.8 *   *  *** *
                 1.5 *   *  *** *
Relative         1.3 *  **  *** *
Entropy          1.1 ** ******* *
(18.4 bits)      0.9 ** ******* *
                 0.7 ** ******* *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GATGAGTGAGGC
consensus             TAT AG   T
sequence               C       A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
31420                       200  3.60e-07 AATTACCTGT GACGAGTGAGTC GATCTTTCCA
43595                       476  6.10e-07 AACATTATCA GAAGAATGAGGC TGTAAGAAAC
47445                        24  6.91e-07 GTCTGTCACC GATTAGTGAGGC CAAGGCATTG
43187                       286  8.26e-07 TTTCGAAAAG GTTGAGTGAGTC ACGACTCCAA
50097                        76  1.59e-06 CCCGATGCCA GATGAAGGAGAC TTTAGAGACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31420                             3.6e-07  199_[+3]_289
43595                             6.1e-07  475_[+3]_13
47445                             6.9e-07  23_[+3]_465
43187                             8.3e-07  285_[+3]_203
50097                             1.6e-06  75_[+3]_413
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=5
31420                    (  200) GACGAGTGAGTC  1
43595                    (  476) GAAGAATGAGGC  1
47445                    (   24) GATTAGTGAGGC  1
43187                    (  286) GTTGAGTGAGTC  1
50097                    (   76) GATGAAGGAGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3912 bayes= 9.86175 E= 7.4e+002
  -897   -897    221   -897
   159   -897   -897    -53
   -41    -20   -897    105
  -897   -897    189    -53
   191   -897   -897   -897
    59   -897    147   -897
  -897   -897    -11    147
  -897   -897    221   -897
   191   -897   -897   -897
  -897   -897    221   -897
   -41   -897     89     47
  -897    212   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 5 E= 7.4e+002
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.200000  0.200000  0.000000  0.600000
 0.000000  0.000000  0.800000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.400000  0.400000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[AT][TAC][GT]A[GA][TG]GAG[GTA]C
--------------------------------------------------------------------------------




Time  1.79 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31420                            1.47e-06  199_[+3(3.60e-07)]_253_\
    [+2(2.35e-07)]_24
43187                            9.20e-06  69_[+2(6.49e-07)]_204_\
    [+3(8.26e-07)]_203
47445                            6.52e-03  23_[+3(6.91e-07)]_465
43595                            2.26e-12  153_[+2(3.14e-07)]_49_\
    [+1(1.75e-10)]_240_[+3(6.10e-07)]_13
49093                            2.51e-05  224_[+1(5.02e-10)]_255
49147                            9.77e-01  500
50097                            2.47e-08  75_[+3(1.59e-06)]_313_\
    [+1(8.78e-10)]_79
33616                            2.20e-09  133_[+2(1.54e-07)]_283_\
    [+1(9.53e-10)]_51
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************