********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/149/149.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
41676                    1.0000    500  43278                    1.0000    500
13076                    1.0000    500  13526                    1.0000    500
47054                    1.0000    500  47264                    1.0000    500
47929                    1.0000    500  48080                    1.0000    500
38686                    1.0000    500  43544                    1.0000    500
32846                    1.0000    500  41511                    1.0000    500
26029                    1.0000    500  12161                    1.0000    500
43320                    1.0000    500  40435                    1.0000    500
47627                    1.0000    500  45836                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/149/149.seqs.fa -oc motifs/149 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.262 C 0.250 G 0.227 T 0.261
Background letter frequencies (from dataset with add-one prior applied):
A 0.262 C 0.250 G 0.227 T 0.261
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 12 sites = 11 llr = 128 E-value = 8.7e-003
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  7:::a:51::82
pos.-specific     C  :82::a1:::14
probability       G  2:1a::19:a:5
matrix            T  127:::4:a:1:

         bits    2.1    *     *
                 1.9    ***  **
                 1.7    *** ***
                 1.5    *** ***
Relative         1.3  * *** ***
Entropy          1.1  * *** ****
(16.8 bits)      0.9 ****** ****
                 0.6 ****** *****
                 0.4 ****** *****
                 0.2 ************
                 0.0 ------------

Multilevel           ACTGACAGTGAG
consensus                  T    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             -----  --------            ------------
38686                       224  5.33e-08 ATTGACAGTG ACTGACAGTGAG TAGAATGCCA
47054                       273  1.65e-07 CGATACTGAA ACTGACAGTGAC CGTCACAATT
45836                       412  3.31e-07 CCGTGAGTGA GCTGACAGTGAG TGACTGACGT
12161                       301  4.43e-07 AAATCTCCAT ACTGACTGTGAA TGCCGATTGC
43544                       329  4.89e-07 CACAGACAGG ACTGACGGTGAG CGGAAACAGC
47627                       240  9.06e-07 CACTGATATT ATTGACTGTGAG GCTTCTTGTG
43278                       196  9.06e-07 TACTTTCACT ATTGACTGTGAG ATGCTGGTCT
43320                       467  8.62e-06 CCGCTTTTCC ACCGACTGTGTC CTTTGGATCC
47929                        19  9.55e-06 GCAAATAACA GCGGACAGTGAA CTTGAAGTTT
13526                       350  9.55e-06 CCCGTGCCAT ACTGACCATGAC GACTTCTGTA
41676                       483  2.27e-05 CACGGTATTA TCCGACAGTGCC ATTTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38686                             5.3e-08  223_[+1]_265
47054                             1.6e-07  272_[+1]_216
45836                             3.3e-07  411_[+1]_77
12161                             4.4e-07  300_[+1]_188
43544                             4.9e-07  328_[+1]_160
47627                             9.1e-07  239_[+1]_249
43278                             9.1e-07  195_[+1]_293
43320                             8.6e-06  466_[+1]_22
47929                             9.5e-06  18_[+1]_470
13526                             9.5e-06  349_[+1]_139
41676                             2.3e-05  482_[+1]_6
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=11
38686                    (  224) ACTGACAGTGAG  1
47054                    (  273) ACTGACAGTGAC  1
45836                    (  412) GCTGACAGTGAG  1
12161                    (  301) ACTGACTGTGAA  1
43544                    (  329) ACTGACGGTGAG  1
47627                    (  240) ATTGACTGTGAG  1
43278                    (  196) ATTGACTGTGAG  1
43320                    (  467) ACCGACTGTGTC  1
47929                    (   19) GCGGACAGTGAA  1
13526                    (  350) ACTGACCATGAC  1
41676                    (  483) TCCGACAGTGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 9.99787 E= 8.7e-003
   147  -1010    -32   -152
 -1010    171  -1010    -52
 -1010    -46   -132    148
 -1010  -1010    214  -1010
   193  -1010  -1010  -1010
 -1010    200  -1010  -1010
    80   -146   -132     48
  -152  -1010    200  -1010
 -1010  -1010  -1010    194
 -1010  -1010    214  -1010
   164   -146  -1010   -152
   -52     54    100  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 8.7e-003
 0.727273  0.000000  0.181818  0.090909
 0.000000  0.818182  0.000000  0.181818
 0.000000  0.181818  0.090909  0.727273
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.454545  0.090909  0.090909  0.363636
 0.090909  0.000000  0.909091  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.818182  0.090909  0.000000  0.090909
 0.181818  0.363636  0.454545  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
ACTGAC[AT]GTGA[GC]
--------------------------------------------------------------------------------

Time 3.66 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 21 sites = 12 llr = 166 E-value = 5.7e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  882565287a452828382:9
pos.-specific     C  :1834:3:3::14:82326a:
probability       G  21:1:453::33331:312:1
matrix            T  :::1:1::::312:::1:1::

         bits    2.1
                 1.9          *         *
                 1.7          *         *
                 1.5          *         **
Relative         1.3 * *      *     *   **
Entropy          1.1 *** *  ***   * *   **
(19.9 bits)      0.9 *** *  ***   *** * **
                 0.6 *** ******   *** * **
                 0.4 ************ *** ****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AACAAAGAAAAACACAAACCA
consensus               CCGCGC TGGG  C
sequence                       G     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             -----  --------            ---------------------
41511                         5  1.13e-09       GACA AACAAGCAAAGGAACACACCA ATTCCACACG
38686                       324  2.46e-09 TGGACATCGC AACCAAGAAAAACACAGATCA GCGCCAAATC
12161                       148  1.80e-08 CACAGTCTGA AACGCAGACAGACGCACACCA CGGGAACACC
45836                       336  4.41e-08 CAAAATAATG AACTCGGGCATGGACAAACCA GATGGCAAGA
47929                       473  8.97e-08 GTACGAAACG AGCAAGCAAAAAAACATACCA ATACACG
13076                       317  2.23e-07 CCCATTCAAA GAACCAGAAAAATACAAAGCA TAGCAAACGA
47627                       111  3.38e-07 GACGGTTCAC ACCACACAAAACCACCGACCA CCGTTTGGAT
43544                       153  6.25e-07 AGTGCTGATT GACAAACACATGGAGACCCCA CGGAATACGT
43278                       376  6.25e-07 AATCGGAGCT AACCAGGAAAATGGCACAACG CGAACGACAA
13526                       291  6.72e-07 CGACTGCGTT AACAATAAAATGCAAAAGCCA AACAACACTG
47054                        45  1.16e-06 GCTATATCAT AAACAGAGCATACACCAAGCA TTTGATTATT
43320                        11  1.41e-06 ACTACGGGAC AACACAGGAAGATGAAGCACA CACCTTTCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41511                             1.1e-09  4_[+2]_475
38686                             2.5e-09  323_[+2]_156
12161                             1.8e-08  147_[+2]_332
45836                             4.4e-08  335_[+2]_144
47929                               9e-08  472_[+2]_7
13076                             2.2e-07  316_[+2]_163
47627                             3.4e-07  110_[+2]_369
43544                             6.3e-07  152_[+2]_327
43278                             6.3e-07  375_[+2]_104
13526                             6.7e-07  290_[+2]_189
47054                             1.2e-06  44_[+2]_435
43320                             1.4e-06  10_[+2]_469
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=12
41511                    (    5) AACAAGCAAAGGAACACACCA  1
38686                    (  324) AACCAAGAAAAACACAGATCA  1
12161                    (  148) AACGCAGACAGACGCACACCA  1
45836                    (  336) AACTCGGGCATGGACAAACCA  1
47929                    (  473) AGCAAGCAAAAAAACATACCA  1
13076                    (  317) GAACCAGAAAAATACAAAGCA  1
47627                    (  111) ACCACACAAAACCACCGACCA  1
43544                    (  153) GACAAACACATGGAGACCCCA  1
43278                    (  376) AACCAGGAAAATGGCACAACG  1
13526                    (  291) AACAATAAAATGCAAAAGCCA  1
47054                    (   45) AAACAGAGCATACACCAAGCA  1
43320                    (   11) AACACAGGAAGATGAAGCACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 9.93784 E= 5.7e-001
   167  -1023    -44  -1023
   167   -158   -144  -1023
   -65    173  -1023  -1023
    93     41   -144   -165
   116     73  -1023  -1023
    93  -1023     88   -165
   -65     41    114  -1023
   152  -1023     14  -1023
   135     41  -1023  -1023
   193  -1023  -1023  -1023
    67  -1023     14     35
    93   -158     55   -165
   -65     73     14    -65
   152  -1023     14  -1023
   -65    158   -144  -1023
   167    -59  -1023  -1023
    35     41     14   -165
   152    -59   -144  -1023
   -65    122    -44   -165
 -1023    200  -1023  -1023
   181  -1023   -144  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 12 E= 5.7e-001
 0.833333  0.000000  0.166667  0.000000
 0.833333  0.083333  0.083333  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.500000  0.333333  0.083333  0.083333
 0.583333  0.416667  0.000000  0.000000
 0.500000  0.000000  0.416667  0.083333
 0.166667  0.333333  0.500000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.666667  0.333333  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.416667  0.000000  0.250000  0.333333
 0.500000  0.083333  0.333333  0.083333
 0.166667  0.416667  0.250000  0.166667
 0.750000  0.000000  0.250000  0.000000
 0.166667  0.750000  0.083333  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.333333  0.333333  0.250000  0.083333
 0.750000  0.166667  0.083333  0.000000
 0.166667  0.583333  0.166667  0.083333
 0.000000  1.000000  0.000000  0.000000
 0.916667  0.000000  0.083333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
AAC[AC][AC][AG][GC][AG][AC]A[ATG][AG][CG][AG]CA[ACG]ACCA
--------------------------------------------------------------------------------

Time 7.21 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 16 sites = 2 llr = 45 E-value = 2.5e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::::a:::::
pos.-specific     C  ::::::a::5:a:::a
probability       G  aaa:aa:aa5::aaa:
matrix            T  :::a::::::::::::

         bits    2.1 *** ** **   ***
                 1.9 ********* ******
                 1.7 ********* ******
                 1.5 ********* ******
Relative         1.3 ********* ******
Entropy          1.1 ****************
(32.3 bits)      0.9 ****************
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGGTGGCGGCACGGGC
consensus                     G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             -----  --------            ----------------
38686                       387  8.81e-11 GTGTAGAATT GGGTGGCGGGACGGGC TTCAACGTGA
32846                        23  1.85e-10 AAGGTTCGGC GGGTGGCGGCACGGGC AAGGAAGCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38686                             8.8e-11  386_[+3]_98
32846                             1.9e-10  22_[+3]_462
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=2
38686                    (  387) GGGTGGCGGGACGGGC  1
32846                    (   23) GGGTGGCGGCACGGGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 12.0914 E= 2.5e+002
  -765   -765    213   -765
  -765   -765    213   -765
  -765   -765    213   -765
  -765   -765   -765    193
  -765   -765    213   -765
  -765   -765    213   -765
  -765    199   -765   -765
  -765   -765    213   -765
  -765   -765    213   -765
  -765     99    114   -765
   193   -765   -765   -765
  -765    199   -765   -765
  -765   -765    213   -765
  -765   -765    213   -765
  -765   -765    213   -765
  -765    199   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 2 E= 2.5e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
GGGTGGCGG[CG]ACGGGC
--------------------------------------------------------------------------------

Time 10.40 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41676                            7.10e-02  482_[+1(2.27e-05)]_6
43278                            1.88e-05  195_[+1(9.06e-07)]_168_\
    [+2(6.25e-07)]_104
13076                            1.88e-04  255_[+1(6.46e-05)]_49_\
    [+2(2.23e-07)]_163
13526                            7.99e-05  290_[+2(6.72e-07)]_38_\
    [+1(9.55e-06)]_139
47054                            6.53e-06  44_[+2(1.16e-06)]_207_\
    [+1(1.65e-07)]_216
47264                            1.16e-01  298_[+3(5.95e-05)]_186
47929                            4.61e-06  18_[+1(9.55e-06)]_442_\
    [+2(8.97e-08)]_7
48080                            2.92e-01  500
38686                            1.17e-15  223_[+1(5.33e-08)]_88_\
    [+2(2.46e-09)]_42_[+3(8.81e-11)]_98
43544                            7.52e-06  152_[+2(6.25e-07)]_155_\
    [+1(4.89e-07)]_160
32846                            7.78e-07  22_[+3(1.85e-10)]_102_\
    [+3(6.97e-05)]_344
41511                            1.94e-05  4_[+2(1.13e-09)]_475
26029                            2.55e-01  500
12161                            2.06e-07  147_[+2(1.80e-08)]_7_[+2(6.10e-05)]_\
    104_[+1(4.43e-07)]_188
43320                            2.32e-04  10_[+2(1.41e-06)]_435_\
    [+1(8.62e-06)]_22
40435                            2.48e-01  500
47627                            2.38e-06  110_[+2(3.38e-07)]_108_\
    [+1(9.06e-07)]_249
45836                            5.41e-07  335_[+2(4.41e-08)]_40_\
    [+1(2.05e-05)]_3_[+1(3.31e-07)]_77
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************