********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/238/238.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
15355                    1.0000    500  45253                    1.0000    500
45581                    1.0000    500  46016                    1.0000    500
48392                    1.0000    500  48677                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/238/238.seqs.fa -oc motifs/238 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        6    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3000    N=               6
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.269 C 0.214 G 0.230 T 0.287
Background letter frequencies (from dataset with add-one prior applied):
A 0.269 C 0.214 G 0.230 T 0.287
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 6 llr = 98 E-value = 3.1e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :33777:a3372235:2::::
pos.-specific     C  8:::33a::2::8:52::287
probability       G  277::::::327:2::23823
matrix            T  :::3::::7222:5:877:::

         bits    2.2       *
                 2.0       **
                 1.8       **
                 1.6 *     **    *     **
Relative         1.3 *     **    *     ***
Entropy          1.1 *** ****    * **  ***
(23.6 bits)      0.9 *********   * ** ****
                 0.7 ********* *** *******
                 0.4 ********* ***********
                 0.2 ********* ***********
                 0.0 ---------------------

Multilevel           CGGAAACATAAGCTATTTGCC
consensus             AATCC  AG   AC  G  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
48677                       307  1.35e-11 CTAGGAATAC CGGAAACATAAGCTATTTGCG GTTTGCGGTG
48392                       458  8.09e-10 CTGACAGCCT CGGAAACATAGGCACCTTGCC ATTTCCAAAC
45581                       320  2.07e-08 TGATGTCACC CGGACACAAGAGCACTGGGGG TGAAGACTGA
46016                       125  5.40e-08 TGCGCAACTC GGATACCAACATCTCTTTGCC GTAGCTACGC
15355                       388  6.56e-08 AAAGACGGTC CAGAAACATGAACGATAGCCC GACCGGAAGT
45253                       204  1.31e-07 ACGGTGGATG CAATCCCATTTGATATTTGCC CAGTGTAATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48677                             1.4e-11  306_[+1]_173
48392                             8.1e-10  457_[+1]_22
45581                             2.1e-08  319_[+1]_160
46016                             5.4e-08  124_[+1]_355
15355                             6.6e-08  387_[+1]_92
45253                             1.3e-07  203_[+1]_276
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
48677                   ( 307) CGGAAACATAAGCTATTTGCG  1
48392                   ( 458) CGGAAACATAGGCACCTTGCC  1
45581                   ( 320) CGGACACAAGAGCACTGGGGG  1
46016                   ( 125) GGATACCAACATCTCTTTGCC  1
15355                   ( 388) CAGAAACATGAACGATAGCCC  1
45253                   ( 204) CAATCCCATTTGATATTTGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 2880 bayes= 9.35214 E= 3.1e+001
  -923    196    -47   -923
    31   -923    153   -923
    31   -923    153   -923
   131   -923   -923     22
   131     64   -923   -923
   131     64   -923   -923
  -923    222   -923   -923
   189   -923   -923   -923
    31   -923   -923    121
    31    -36     53    -78
   131   -923    -47    -78
   -69   -923    153    -78
   -69    196   -923   -923
    31   -923    -47     80
    89    122   -923   -923
  -923    -36   -923    154
   -69   -923    -47    121
  -923   -923     53    121
  -923    -36    185   -923
  -923    196    -47   -923
  -923    164     53   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 3.1e+001
 0.000000  0.833333  0.166667  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.666667  0.333333  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.333333  0.166667  0.333333  0.166667
 0.666667  0.000000  0.166667  0.166667
 0.166667  0.000000  0.666667  0.166667
 0.166667  0.833333  0.000000  0.000000
 0.333333  0.000000  0.166667  0.500000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.000000  0.166667  0.666667
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.666667  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
C[GA][GA][AT][AC][AC]CA[TA][AG]AGC[TA][AC]TT[TG]GC[CG]
--------------------------------------------------------------------------------

Time 0.38 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 5 llr = 87 E-value = 3.7e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::8:2:64:4::22:484a
pos.-specific     C  ::a2:22:26:::8::4422:
probability       G  42:::82a2::6a2:82::2:
matrix            T  68:82:4:::a:::8:42:2:

         bits    2.2   *    *    *
                 2.0   *    *    *       *
                 1.8   *    *  * *       *
                 1.6   *    *  * **      *
Relative         1.3   *  * *  * ** *  * *
Entropy          1.1  ***** * *******  * *
(25.1 bits)      0.9 ****** * *******  * *
                 0.7 ****** *********  * *
                 0.4 ****** ************ *
                 0.2 ****** ************ *
                 0.0 ---------------------

Multilevel           TTCTAGTGACTGGCTGCAAAA
consensus            GG CTCA CA A GAATCCC
sequence                   C G       GT G
                           G            T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
45581                       360  3.58e-11 AGAAAGCGGC GTCTAGTGACTAGCTGTAAAA CAGTGCTGGA
48677                       386  2.96e-10 GGTGATGCGT TGCTAGTGACTAGCTGTAAAA AGCCCCAAAC
48392                       294  1.62e-08 GTGTAGTACT TTCCAGGGCATGGCTGGTACA AAGGTGTTAT
45253                       164  2.10e-08 TTCCATGGTA GTCTTGAGGATGGCTGCCCGA TGACATCAAA
15355                       273  4.65e-08 GGTGAGCTAT TTCTACCGACTGGGAACCATA TTCTGGTGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45581                             3.6e-11  359_[+2]_120
48677                               3e-10  385_[+2]_94
48392                             1.6e-08  293_[+2]_186
45253                             2.1e-08  163_[+2]_316
15355                             4.7e-08  272_[+2]_207
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=5
45581                   ( 360) GTCTAGTGACTAGCTGTAAAA  1
48677                   ( 386) TGCTAGTGACTAGCTGTAAAA  1
48392                   ( 294) TTCCAGGGCATGGCTGGTACA  1
45253                   ( 164) GTCTTGAGGATGGCTGCCCGA  1
15355                   ( 273) TTCTACCGACTGGGAACCATA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 2880 bayes= 9.41936 E= 3.7e+002
  -897   -897     79    106
  -897   -897    -20    148
  -897    222   -897   -897
  -897    -10   -897    148
   157   -897   -897    -52
  -897    -10    179   -897
   -43    -10    -20     48
  -897   -897    212   -897
   116    -10    -20   -897
    57    149   -897   -897
  -897   -897   -897    180
    57   -897    138   -897
  -897   -897    212   -897
  -897    190    -20   -897
   -43   -897   -897    148
   -43   -897    179   -897
  -897     90    -20     48
    57     90   -897    -52
   157    -10   -897   -897
    57    -10    -20    -52
   189   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 3.7e+002
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.000000  0.200000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.200000  0.200000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.600000  0.200000  0.200000  0.000000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.400000  0.200000  0.400000
 0.400000  0.400000  0.000000  0.200000
 0.800000  0.200000  0.000000  0.000000
 0.400000  0.200000  0.200000  0.200000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[TG][TG]C[TC][AT][GC][TACG]G[ACG][CA]T[GA]G[CG][TA][GA][CTG][ACT][AC][ACGT]A
--------------------------------------------------------------------------------

Time 0.73 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 4 llr = 54 E-value = 4.0e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a:a::a3:a:aa
pos.-specific     C  :3::8:53:8::
probability       G  :8:a3:38::::
matrix            T  :::::::::3::

         bits    2.2    *
                 2.0 * ** *  * **
                 1.8 * ** *  * **
                 1.6 * ** *  * **
Relative         1.3 ****** *****
Entropy          1.1 ****** *****
(19.5 bits)      0.9 ****** *****
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AGAGCACGACAA
consensus             C  G AC T
sequence                   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
46016                       215  4.52e-08 AGGTTGTGCT AGAGCACGACAA ACGCTTGTTC
48677                       194  1.51e-07 CTTTTGTCAA AGAGCAAGACAA AAAAAGTAAG
45581                       222  1.10e-06 ATAATAGAAA AGAGGACGATAA GAATTTAAGC
15355                        12  1.14e-06 TCTCGCGAAG ACAGCAGCACAA GTTTACCTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46016                             4.5e-08  214_[+3]_274
48677                             1.5e-07  193_[+3]_295
45581                             1.1e-06  221_[+3]_267
15355                             1.1e-06  11_[+3]_477
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=4
46016                   ( 215) AGAGCACGACAA  1
48677                   ( 194) AGAGCAAGACAA  1
45581                   ( 222) AGAGGACGATAA  1
15355                   (  12) ACAGCAGCACAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 2934 bayes= 9.51668 E= 4.0e+002
   189   -865   -865   -865
  -865     23    170   -865
   189   -865   -865   -865
  -865   -865    212   -865
  -865    181     12   -865
   189   -865   -865   -865
   -11    122     12   -865
  -865     23    170   -865
   189   -865   -865   -865
  -865    181   -865    -20
   189   -865   -865   -865
   189   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 4 E= 4.0e+002
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
A[GC]AG[CG]A[CAG][GC]A[CT]AA
--------------------------------------------------------------------------------

Time 1.06 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
15355                            1.72e-10  11_[+3(1.14e-06)]_249_\
    [+2(4.65e-08)]_94_[+1(6.56e-08)]_92
45253                            1.14e-07  163_[+2(2.10e-08)]_19_\
    [+1(1.31e-07)]_276
45581                            6.61e-14  221_[+3(1.10e-06)]_86_\
    [+1(2.07e-08)]_19_[+2(3.58e-11)]_120
46016                            1.03e-07  124_[+1(5.40e-08)]_69_\
    [+3(4.52e-08)]_274
48392                            9.14e-10  293_[+2(1.62e-08)]_143_\
    [+1(8.09e-10)]_22
48677                            6.92e-17  193_[+3(1.51e-07)]_101_\
    [+1(1.35e-11)]_58_[+2(2.96e-10)]_94
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************