********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/172/172.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31846                    1.0000    500  43061                    1.0000    500
13921                    1.0000    500  47918                    1.0000    500
54246                    1.0000    500  35819                    1.0000    500
44122                    1.0000    500  43743                    1.0000    500
49723                    1.0000    500  34419                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/172/172.seqs.fa -oc motifs/172 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.255 C 0.264 G 0.223 T 0.258
Background letter frequencies (from dataset with add-one prior applied):
A 0.255 C 0.264 G 0.223 T 0.258
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 20 sites = 4 llr = 82 E-value = 1.1e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :3a33aaa::88:3a355aa
pos.-specific     C  ::::5:::58:::8:::5::
probability       G  a8:83:::3333a::8::::
matrix            T  ::::::::3:::::::5:::

         bits    2.2 *           *
                 1.9 * *  ***    * *   **
                 1.7 * *  ***    * *   **
                 1.5 * *  ***    * *   **
Relative         1.3 **** ***  *** **  **
Entropy          1.1 **** *** *******  **
(29.7 bits)      0.9 **** *** ***********
                 0.6 **** *** ***********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GGAGCAAACCAAGCAGAAAA
consensus             A AA   GGGG A ATC
sequence                 G   T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
31846                       409  4.01e-11 TGTCATACTG GGAACAAACCAAGCAGTCAA TCCATTCCTT
43061                        49  6.28e-11 CGGAAGTATC GGAGCAAAGCGAGCAGAAAA ATTTCTAGGC
44122                       414  2.86e-10 ACGAAGCGAG GGAGAAAATCAAGCAAACAA ACGGATTCGT
47918                        51  1.81e-09 TGTCCTGTGA GAAGGAAACGAGGAAGTAAA TCGATATCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31846                               4e-11  408_[+1]_72
43061                             6.3e-11  48_[+1]_432
44122                             2.9e-10  413_[+1]_67
47918                             1.8e-09  50_[+1]_430
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL MOTIF 1 width=20 seqs=4
31846                    (  409) GGAACAAACCAAGCAGTCAA  1
43061                    (   49) GGAGCAAAGCGAGCAGAAAA  1
44122                    (  414) GGAGAAAATCAAGCAAACAA  1
47918                    (   51) GAAGGAAACGAGGAAGTAAA  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4810 bayes= 10.2306 E= 1.1e+001
  -865   -865    216   -865
    -3   -865    174   -865
   197   -865   -865   -865
    -3   -865    174   -865
    -3     92     16   -865
   197   -865   -865   -865
   197   -865   -865   -865
   197   -865   -865   -865
  -865     92     16     -5
  -865    151     16   -865
   156   -865     16   -865
   156   -865     16   -865
  -865   -865    216   -865
    -3    151   -865   -865
   197   -865   -865   -865
    -3   -865    174   -865
    97   -865   -865     95
    97     92   -865   -865
   197   -865   -865   -865
   197   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 1.1e+001
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.250000  0.500000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.500000  0.500000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
G[GA]A[GA][CAG]AAA[CGT][CG][AG][AG]G[CA]A[GA][AT][AC]AA
--------------------------------------------------------------------------------

Time 0.98 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 16 sites = 6 llr = 92 E-value = 8.9e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::2:::a:8::282:
pos.-specific     C  :7:55:5:a:228:::
probability       G  :3a:2:5:::72:28:
matrix            T  a::33a:::227:::a

         bits    2.2   *
                 1.9 * *  * **      *
                 1.7 * *  * **      *
                 1.5 * *  * **     **
Relative         1.3 * *  * ***  ****
Entropy          1.1 ***  *****  ****
(22.0 bits)      0.9 ***  ****** ****
                 0.6 ***  ***********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCGCCTCACAGTCAGT
consensus             G TT G
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
47918                       191  9.17e-09 ATAGGTAGCC TCGCCTCACATTCAGT TCAATGTTAG
31846                        37  1.05e-08 CGTGCGTGCG TGGACTCACAGTCAGT CCCGCCCGCC
34419                       410  2.70e-08 ACTGTCAGTC TGGTCTCACAGGCAGT CCGTATCTTA
49723                       440  8.26e-08 ATTCCCCCTA TCGCTTGACACCCAGT GTCAGCTGGT
13921                       117  9.85e-08 TGGAGACACG TCGCTTGACTGTAAGT GACCGCCTGA
54246                       205  2.26e-07 CGCTGGTGGA TCGTGTGACAGTCGAT AGACCAGCCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47918                             9.2e-09  190_[+2]_294
31846                             1.1e-08  36_[+2]_448
34419                             2.7e-08  409_[+2]_75
49723                             8.3e-08  439_[+2]_45
13921                             9.8e-08  116_[+2]_368
54246                             2.3e-07  204_[+2]_280
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL MOTIF 2 width=16 seqs=6
47918                    (  191) TCGCCTCACATTCAGT  1
31846                    (   37) TGGACTCACAGTCAGT  1
34419                    (  410) TGGTCTCACAGGCAGT  1
49723                    (  440) TCGCTTGACACCCAGT  1
13921                    (  117) TCGCTTGACTGTAAGT  1
54246                    (  205) TCGTGTGACAGTCGAT  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 10.105 E= 8.9e+000
  -923   -923   -923    195
  -923    134     58   -923
  -923   -923    216   -923
   -61     92   -923     37
  -923     92    -42     37
  -923   -923   -923    195
  -923     92    116   -923
   197   -923   -923   -923
  -923    192   -923   -923
   171   -923   -923    -63
  -923    -66    158    -63
  -923    -66    -42    137
   -61    166   -923   -923
   171   -923    -42   -923
   -61   -923    190   -923
  -923   -923   -923    195
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 8.9e+000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.500000  0.000000  0.333333
 0.000000  0.500000  0.166667  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.166667  0.666667  0.166667
 0.000000  0.166667  0.166667  0.666667
 0.166667  0.833333  0.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
T[CG]G[CT][CT]T[CG]ACAGTCAGT
--------------------------------------------------------------------------------

Time 1.97 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 5 llr = 68 E-value = 4.0e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:::::a22:
pos.-specific     C  :a:8:::a:6:8
probability       G  :::::a:::222
matrix            T  a::2a:a:::6:

         bits    2.2      *
                 1.9 *** *****
                 1.7 *** *****
                 1.5 *** *****
Relative         1.3 *********  *
Entropy          1.1 *********  *
(19.5 bits)      0.9 *********  *
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCACTGTCACTC
consensus               T     AAG
sequence                      GG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
31846                       273  8.24e-08 GATCGTCATA TCACTGTCACTC ACCCATTCAA
34419                       398  2.23e-07 ATAGAGTCAG TCACTGTCAGTC TGGTCTCACA
13921                       340  5.34e-07 GTCGCCCCAC TCATTGTCACTC CATCAACCGT
54246                        47  7.32e-07 TCCACGCGTG TCACTGTCAAGC ATAGCGTGAA
47918                       256  1.20e-06 GTTTTCTACT TCACTGTCACAG TCAATTACCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31846                             8.2e-08  272_[+3]_216
34419                             2.2e-07  397_[+3]_91
13921                             5.3e-07  339_[+3]_149
54246                             7.3e-07  46_[+3]_442
47918                             1.2e-06  255_[+3]_233
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL MOTIF 3 width=12 seqs=5
31846                    (  273) TCACTGTCACTC  1
34419                    (  398) TCACTGTCAGTC  1
13921                    (  340) TCATTGTCACTC  1
54246                    (   47) TCACTGTCAAGC  1
47918                    (  256) TCACTGTCACAG  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 10.184 E= 4.0e+002
  -897   -897   -897    195
  -897    192   -897   -897
   197   -897   -897   -897
  -897    160   -897    -37
  -897   -897   -897    195
  -897   -897    216   -897
  -897   -897   -897    195
  -897    192   -897   -897
   197   -897   -897   -897
   -35    119    -16   -897
   -35   -897    -16    121
  -897    160    -16   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 5 E= 4.0e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.600000  0.200000  0.000000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.800000  0.200000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
TCA[CT]TGTCA[CAG][TAG][CG]
--------------------------------------------------------------------------------

Time 2.77 secs.
********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************
--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31846                            3.35e-15  36_[+2(1.05e-08)]_220_\
    [+3(8.24e-08)]_124_[+1(4.01e-11)]_72
43061                            1.89e-06  48_[+1(6.28e-11)]_432
13921                            1.33e-06  116_[+2(9.85e-08)]_207_\
    [+3(5.34e-07)]_149
47918                            1.37e-12  50_[+1(1.81e-09)]_120_\
    [+2(9.17e-09)]_49_[+3(1.20e-06)]_233
54246                            2.00e-06  46_[+3(7.32e-07)]_146_\
    [+2(2.26e-07)]_280
35819                            9.54e-01  500
44122                            7.69e-06  158_[+1(7.29e-05)]_235_\
    [+1(2.86e-10)]_67
43743                            8.20e-01  500
49723                            5.71e-05  439_[+2(8.26e-08)]_45
34419                            9.35e-08  397_[+3(2.23e-07)]_[+2(2.70e-08)]_\
    75
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
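
The combined block diagrams in the summary encode each sequence as runs of
unmatched positions interleaved with motif occurrences, for example
36_[+2(1.05e-08)]_220_... for sequence 31846. The sketch below is not part of
MEME; it is a minimal Python illustration of how such a string might be turned
back into 1-based site start positions. The function name parse_diagram and the
MOTIF_WIDTHS table are hypothetical helpers, with the widths copied from the
MOTIF header lines of this report, and the parsing rules are inferred from the
diagrams shown here rather than from a format specification.

```python
import re

# Motif widths copied from the "MOTIF n MEME width = ..." lines of this report.
MOTIF_WIDTHS = {1: 20, 2: 16, 3: 12}

# Either a motif occurrence such as [+2(1.05e-08)] or a spacer length such as 36.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for one combined block diagram.

    Each site is (motif_number, strand, start, p_value), with a 1-based start.
    Assumes the diagram uses the layout seen in this report.
    """
    # Wrapped diagrams end a line with "\" and continue on an indented line.
    diagram = "".join(diagram.replace("\\", "").split())
    offset = 0  # number of sequence positions consumed so far
    sites = []
    for match in TOKEN.finditer(diagram):
        if match.group(4) is not None:
            # Spacer: unmatched positions between sites.
            offset += int(match.group(4))
        else:
            motif = int(match.group(2))
            sites.append((motif, match.group(1), offset + 1, float(match.group(3))))
            offset += widths[motif]
    return sites, offset


if __name__ == "__main__":
    wrapped = "36_[+2(1.05e-08)]_220_\\\n    [+3(8.24e-08)]_124_[+1(4.01e-11)]_72"
    sites, total = parse_diagram(wrapped)
    for motif, strand, start, p_value in sites:
        print(motif, strand, start, p_value)
    print("sequence length:", total)
```

For sequence 31846 this gives start positions 37, 273 and 409 for motifs 2, 3
and 1, which agree with the Start columns of the per-motif site tables, and the
consumed positions add up to the sequence length of 500 given in the training
set. Adjacent sites with no spacer, as in sequence 34419, are handled because a
missing number simply adds nothing to the offset.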