********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/12/12.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47806                    1.0000    500  11128                    1.0000    500
45834                    1.0000    500  33402                    1.0000    500
43804                    1.0000    500  47563                    1.0000    500
34004                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/12/12.seqs.fa -oc motifs/12 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.285 C 0.222 G 0.217 T 0.276
Background letter frequencies (from dataset with add-one prior applied):
A 0.285 C 0.222 G 0.217 T 0.276
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  13  sites =   7  llr = 84  E-value = 1.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::7:::9:93:1:
pos.-specific     C  a::::1:3::::1
probability       G  ::3:67:::6:39
matrix            T  :a:a411711a6:

         bits    2.2 *
                 2.0 *
                 1.8 ** *      *
                 1.5 ** *      * *
Relative         1.3 ** *  * * * *
Entropy          1.1 ********* * *
(17.3 bits)      0.9 ********* * *
                 0.7 *************
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           CTATGGATAGTTG
consensus              G T  C A G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            -------------
33402                       208  5.58e-08 ATGCATCTGA CTATTGATAGTTG CCAAGCAGGC
47563                       108  1.27e-07 GCAGAATTTC CTATGGATAATTG CGGGCTTATC
43804                       274  3.53e-07 ATTCAAATAT CTATGCATAGTTG TATGTATAAT
45834                        80  2.22e-06 TGCTCTGGAA CTATGGATTGTAG ACAGATCTCA
47806                        82  3.56e-06 TCATATCTAC CTGTTGTCAGTTG ATCTTTACAT
11128                       434  3.97e-06 ACACCACCGT CTATGGATATTGC TCCAATGAGG
34004                       273  9.85e-06 AATCCTGCAT CTGTTTACAATGG TCGCAAACTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33402                             5.6e-08  207_[+1]_280
47563                             1.3e-07  107_[+1]_380
43804                             3.5e-07  273_[+1]_214
45834                             2.2e-06  79_[+1]_408
47806                             3.6e-06  81_[+1]_406
11128                               4e-06  433_[+1]_54
34004                             9.8e-06  272_[+1]_215
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=13 seqs=7
33402                    (  208) CTATTGATAGTTG  1
47563                    (  108) CTATGGATAATTG  1
43804                    (  274) CTATGCATAGTTG  1
45834                    (   80) CTATGGATTGTAG  1
47806                    (   82) CTGTTGTCAGTTG  1
11128                    (  434) CTATGGATATTGC  1
34004                    (  273) CTGTTTACAATGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 3416 bayes= 8.92778 E= 1.6e+001
  -945    217   -945   -945
  -945   -945   -945    186
   133   -945     39   -945
  -945   -945   -945    186
  -945   -945    139     64
  -945    -64    171    -95
   159   -945   -945    -95
  -945     36   -945    137
   159   -945   -945    -95
     1   -945    139    -95
  -945   -945   -945    186
   -99   -945     39    105
  -945    -64    198   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 7 E= 1.6e+001
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.571429  0.428571
 0.000000  0.142857  0.714286  0.142857
 0.857143  0.000000  0.000000  0.142857
 0.000000  0.285714  0.000000  0.714286
 0.857143  0.000000  0.000000  0.142857
 0.285714  0.000000  0.571429  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.000000  0.285714  0.571429
 0.000000  0.142857  0.857143  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CT[AG]T[GT]GA[TC]A[GA]T[TG]G
--------------------------------------------------------------------------------

Time  0.54 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  14  sites =   6  llr = 79  E-value = 9.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :2::a:::::8:::
pos.-specific     C  8:32::82:8::57
probability       G  2838:8::::25:3
matrix            T  ::3::228a2:55:

         bits    2.2
                 2.0
                 1.8     *   *
                 1.5 ** **** **
Relative         1.3 ** ********  *
Entropy          1.1 ** ***********
(18.9 bits)      0.9 ** ***********
                 0.7 ** ***********
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           CGCGAGCTTCAGCC
consensus              G        TTG
sequence               T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
45834                       371  2.37e-08 AGAAGTTCAA CGGGAGCTTCATTC CAACCGATGA
11128                       374  2.16e-07 CCATGGACGA CGCGAGCTTTATCC AAAGCCTGGT
34004                        51  2.97e-07 TGTGCATCAG CGTGAGTTTCATCC ATAATGCAGC
43804                        94  1.01e-06 CGGACGAAGC GGGGAGCTTCGGCG AAGATCTGAT
47563                       392  1.31e-06 CAAAGACAAA CATCAGCTTCAGTC ACAATTAATG
33402                       452  1.60e-06 CGGGGCGAGG CGCGATCCTCAGTG GAAGTTGCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45834                             2.4e-08  370_[+2]_116
11128                             2.2e-07  373_[+2]_113
34004                               3e-07  50_[+2]_436
43804                               1e-06  93_[+2]_393
47563                             1.3e-06  391_[+2]_95
33402                             1.6e-06  451_[+2]_35
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=6
45834                    (  371) CGGGAGCTTCATTC  1
11128                    (  374) CGCGAGCTTTATCC  1
34004                    (   51) CGTGAGTTTCATCC  1
43804                    (   94) GGGGAGCTTCGGCG  1
47563                    (  392) CATCAGCTTCAGTC  1
33402                    (  452) CGCGATCCTCAGTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 3409 bayes= 9.59577 E= 9.0e+001
  -923    190    -38   -923
   -77   -923    194   -923
  -923     58     62     27
  -923    -41    194   -923
   181   -923   -923   -923
  -923   -923    194    -72
  -923    190   -923    -72
  -923    -41   -923    159
  -923   -923   -923    186
  -923    190   -923    -72
   155   -923    -38   -923
  -923   -923    120     86
  -923    117   -923     86
  -923    158     62   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 6 E= 9.0e+001
 0.000000  0.833333  0.166667  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.333333  0.333333  0.333333
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.833333  0.000000  0.166667
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.666667  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CG[CGT]GAGCTTCA[GT][CT][CG]
--------------------------------------------------------------------------------

Time  1.07 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   5  llr = 87  E-value = 5.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  8::26::4:2:4:::682aaa
pos.-specific     C  :824::8::::2::82:2:::
probability       G  ::8:28262:a26:2222:::
matrix            T  22:422::88:24a:::4:::

         bits    2.2           *
                 2.0           *
                 1.8           *  *    ***
                 1.5   *   *   *  **   ***
Relative         1.3  **  **   *  **   ***
Entropy          1.1 ***  ****** *** * ***
(25.1 bits)      0.9 ***  ****** *** * ***
                 0.7 ***  ****** ***** ***
                 0.4 *********** ***** ***
                 0.2 *********** ***** ***
                 0.0 ---------------------

Multilevel           ACGCAGCGTTGAGTCAATAAA
consensus            TTCTGTGAGA CT GCGA
sequence                AT      G   G C
                                T     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
45834                        43  2.08e-11 TCATGGCAAT ACGCAGCATTGAGTCAAAAAA ATACGGTGCT
43804                       239  2.00e-10 ATACGGACCC ACGAAGCGTTGATTCCATAAA TTGGATTCAA
47563                       365  4.26e-09 GTCAGCACTT ACCTTGCGTTGGGTCGACAAA GACAAACATC
47806                       345  4.24e-08 CAAACAAGGC ACGTATCAGAGTGTGAATAAA AATGCTTCGA
11128                       207  6.97e-08 CCCGGGATGT TTGCGGGGTTGCTTCAGGAAA GTTTCACATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45834                             2.1e-11  42_[+3]_437
43804                               2e-10  238_[+3]_241
47563                             4.3e-09  364_[+3]_115
47806                             4.2e-08  344_[+3]_135
11128                               7e-08  206_[+3]_273
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
45834                    (   43) ACGCAGCATTGAGTCAAAAAA  1
43804                    (  239) ACGAAGCGTTGATTCCATAAA  1
47563                    (  365) ACCTTGCGTTGGGTCGACAAA  1
47806                    (  345) ACGTATCAGAGTGTGAATAAA  1
11128                    (  207) TTGCGGGGTTGCTTCAGGAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 9.64205 E= 5.6e+002
   149   -897   -897    -46
  -897    185   -897    -46
  -897    -15    188   -897
   -51     85   -897     54
   107   -897    -12    -46
  -897   -897    188    -46
  -897    185    -12   -897
    49   -897    146   -897
  -897   -897    -12    154
   -51   -897   -897    154
  -897   -897    220   -897
    49    -15    -12    -46
  -897   -897    146     54
  -897   -897   -897    186
  -897    185    -12   -897
   107    -15    -12   -897
   149   -897    -12   -897
   -51    -15    -12     54
   181   -897   -897   -897
   181   -897   -897   -897
   181   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 5.6e+002
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.400000  0.000000  0.400000
 0.600000  0.000000  0.200000  0.200000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.800000  0.200000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.000000  0.000000  0.600000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.800000  0.200000  0.000000
 0.600000  0.200000  0.200000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.200000  0.200000  0.200000  0.400000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AT][CT][GC][CTA][AGT][GT][CG][GA][TG][TA]G[ACGT][GT]T[CG][ACG][AG][TACG]AAA
--------------------------------------------------------------------------------

Time  1.64 secs.
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47806                            5.50e-06  81_[+1(3.56e-06)]_250_\
    [+3(4.24e-08)]_135
11128                            2.43e-09  206_[+3(6.97e-08)]_146_\
    [+2(2.16e-07)]_46_[+1(3.97e-06)]_54
45834                            8.86e-14  42_[+3(2.08e-11)]_16_[+1(2.22e-06)]_\
    278_[+2(2.37e-08)]_116
33402                            3.00e-06  207_[+1(5.58e-08)]_231_\
    [+2(1.60e-06)]_35
43804                            4.56e-12  93_[+2(1.01e-06)]_131_\
    [+3(2.00e-10)]_14_[+1(3.53e-07)]_214
47563                            3.92e-11  107_[+1(1.27e-07)]_244_\
    [+3(4.26e-09)]_6_[+2(1.31e-06)]_95
34004                            1.80e-05  50_[+2(2.97e-07)]_208_\
    [+1(9.85e-06)]_215
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
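
Reading the combined block diagrams: the sketch below is not part of the MEME
output. It is a minimal, hedged illustration of how one diagram string from the
summary can be turned back into 1-based site start positions. It assumes that
the bare numbers between underscores are spacer lengths, that each bracketed
token is one motif occurrence, and that the motif widths are the ones printed
in the MOTIF header lines of this report (13, 14 and 21). The names
MOTIF_WIDTHS, join_continuation and parse_block_diagram are hypothetical
helpers chosen for this example, not MEME or MAST functions.

```python
import re

# Motif widths copied from the "MOTIF n MEME width = ..." lines of this report.
MOTIF_WIDTHS = {1: 13, 2: 14, 3: 21}

# One motif occurrence in a combined diagram, e.g. "[+3(2.08e-11)]".
SITE = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def join_continuation(raw):
    """Remove the trailing-backslash line continuations used in the report."""
    return re.sub(r"\\\s*", "", raw).strip()


def parse_block_diagram(raw, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for one combined block diagram.

    sites is a list of (motif_id, strand, start, p_value) tuples, where
    start is 1-based. Assumes bare numbers are spacer lengths.
    """
    sites = []
    position = 0  # sequence positions consumed so far
    for token in join_continuation(raw).split("_"):
        match = SITE.fullmatch(token)
        if match:
            strand, motif, p_value = match.groups()
            motif_id = int(motif)
            sites.append((motif_id, strand, position + 1, float(p_value)))
            position += widths[motif_id]
        elif token:
            position += int(token)  # spacer between sites or at either end
    return sites, position


if __name__ == "__main__":
    # Diagram for sequence 45834, copied from the summary table.
    diagram = "42_[+3(2.08e-11)]_16_[+1(2.22e-06)]_\\\n    278_[+2(2.37e-08)]_116"
    sites, total = parse_block_diagram(diagram)
    for motif_id, strand, start, p_value in sites:
        print(f"motif {motif_id} ({strand}) starts at {start}, p = {p_value:g}")
    print("sequence length implied by the diagram:", total)
```

For sequence 45834 this yields starts 43, 80 and 371 and an implied length of
500, which agrees with the Start columns of the per-motif site tables and with
the sequence length listed in the training set.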