********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/164/164.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17504                    1.0000    500  42867                    1.0000    500
43081                    1.0000    500  46871                    1.0000    500
47559                    1.0000    500  49777                    1.0000    500
10261                    1.0000    500  45064                    1.0000    500
19661                    1.0000    500  27278                    1.0000    500
43939                    1.0000    500  47103                    1.0000    500
43205                    1.0000    500  46474                    1.0000    500
36529                    1.0000    500  35416                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/164/164.seqs.fa -oc motifs/164 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.258 C 0.261 G 0.217 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.258 C 0.260 G 0.217 T 0.265
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 12 sites = 16 llr = 148 E-value = 2.7e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:8:4:::8542
pos.-specific     C  :2:81::113:3
probability       G  :812:a:81234
matrix            T  9:1:6:a1:132

         bits    2.2      *
                 2.0      **
                 1.8      **
                 1.5 **   **
Relative         1.3 ** * ***
Entropy          1.1 ** * ****
(13.3 bits)      0.9 **** ****
                 0.7 *********
                 0.4 ********* *
                 0.2 ***********
                 0.0 ------------

Multilevel           TGACTGTGAAAG
consensus                A    CTC
sequence                       G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47559                       388  4.75e-08 AGCTAGTCGA TGACTGTGAAAG ACCCGGTACC
46871                       261  1.42e-06 TCCTTGTTGT TGACAGTGAATC TACGGTATAC
45064                       343  2.39e-06 TCATACCGAC TGACTGTGACTC ACACGCCAAA
36529                        70  3.74e-06 TGGATATCGT TGACAGTGACTC CCACCTCGAT
49777                       341  4.08e-06 GGTATTTGAC TGACTGTGAGGT TTATCTGTGA
35416                         2  4.89e-06          A TGGCAGTGAATG CGAGTTGCCA
47103                       387  1.23e-05 CTGACGAGAC AGACAGTGAAAG ATCCGACCAT
43205                       154  2.45e-05 GTAACAAAAT TGACTGTTAAAA TTATGAAGGC
19661                       105  2.69e-05 TTTCAATCAC TGACAGTCACAC ACACTATGCT
43939                       284  3.70e-05 ATCCGAATTT TCACAGTGACGT GTTTTGCCCC
43081                       224  4.24e-05 GTTGACACAT TGTCTGTGATTG GTTGGTATAA
27278                       289  4.93e-05 GATTGACCCG TCACTGTGCAAT ACGTCACGCC
10261                       285  7.07e-05 ATCTTTCCTA TGACTGTCGAGG TCTCGGATGC
42867                       389  7.07e-05 TGTGTGTGTG TGTGTGTGAGAA GTGTCTACCC
46474                        50  1.64e-04 TCGGCGACGA TGGGTGTGCGGG TGGTACCTGA
17504                       238  1.64e-04 CAGCACTTAC TCAGCGTGAAAA TATGTCGGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47559                             4.8e-08  387_[+1]_101
46871                             1.4e-06  260_[+1]_228
45064                             2.4e-06  342_[+1]_146
36529                             3.7e-06  69_[+1]_419
49777                             4.1e-06  340_[+1]_148
35416                             4.9e-06  1_[+1]_487
47103                             1.2e-05  386_[+1]_102
43205                             2.4e-05  153_[+1]_335
19661                             2.7e-05  104_[+1]_384
43939                             3.7e-05  283_[+1]_205
43081                             4.2e-05  223_[+1]_265
27278                             4.9e-05  288_[+1]_200
10261                             7.1e-05  284_[+1]_204
42867                             7.1e-05  388_[+1]_100
46474                             0.00016  49_[+1]_439
17504                             0.00016  237_[+1]_251
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=16
47559                    (  388) TGACTGTGAAAG  1
46871                    (  261) TGACAGTGAATC  1
45064                    (  343) TGACTGTGACTC  1
36529                    (   70) TGACAGTGACTC  1
49777                    (  341) TGACTGTGAGGT  1
35416                    (    2) TGGCAGTGAATG  1
47103                    (  387) AGACAGTGAAAG  1
43205                    (  154) TGACTGTTAAAA  1
19661                    (  105) TGACAGTCACAC  1
43939                    (  284) TCACAGTGACGT  1
43081                    (  224) TGTCTGTGATTG  1
27278                    (  289) TCACTGTGCAAT  1
10261                    (  285) TGACTGTCGAGG  1
42867                    (  389) TGTGTGTGAGAA  1
46474                    (   50) TGGGTGTGCGGG  1
17504                    (  238) TCAGCGTGAAAA  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 8.93074 E= 2.7e+000
  -204  -1064  -1064    182
 -1064    -47    190  -1064
   154  -1064    -80   -108
 -1064    164    -21  -1064
    54   -206  -1064    109
 -1064  -1064    220  -1064
 -1064  -1064  -1064    192
 -1064   -106    190   -208
   166   -106   -179  -1064
    96     -6    -21   -208
    76  -1064     20     24
   -46     -6     79    -50
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 2.7e+000
 0.062500  0.000000  0.000000  0.937500
 0.000000  0.187500  0.812500  0.000000
 0.750000  0.000000  0.125000  0.125000
 0.000000  0.812500  0.187500  0.000000
 0.375000  0.062500  0.000000  0.562500
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.125000  0.812500  0.062500
 0.812500  0.125000  0.062500  0.000000
 0.500000  0.250000  0.187500  0.062500
 0.437500  0.000000  0.250000  0.312500
 0.187500  0.250000  0.375000  0.187500
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
TGAC[TA]GTGA[AC][ATG][GC]
--------------------------------------------------------------------------------

Time 2.08 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 12 sites = 16 llr = 147 E-value = 7.4e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  52599817a7:8
pos.-specific     C  1741:21::352
probability       G  411:::83::5:
matrix            T  1:1:11:::1::

         bits    2.2
                 2.0         *
                 1.8         *
                 1.5    **   *
Relative         1.3    **   *  *
Entropy          1.1    ** *** **
(13.2 bits)      0.9  * *********
                 0.7  * *********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ACAAAAGAAACA
consensus            G C    G CG
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47103                       428  6.20e-08 TGAGTCGTAC ACAAAAGAAAGA CGGCTCAGCA
43939                       196  5.56e-07 AGTCATGTGC ACAAAAGGAAGA TTTTCAGCAA
36529                       394  6.04e-06 GACAATTATA TCAAAAGAAACA AACATCCAGC
27278                       308  1.20e-05 AATACGTCAC GCCAATGAAAGA CCCCCGGATA
49777                       119  1.20e-05 AGCAATGGCG ACACAAGAAACA TGAGCAGTTG
17504                       295  1.44e-05 CGAGACTATC GAAAAAGGACGA CAACCGAAAA
46871                        40  1.87e-05 GAGGGCGCCC GAGAAAGAAAGA ACGATTGCCG
43081                        55  1.87e-05 AACGCCCAAA ACCAACGAAACC GACCCAACCA
35416                       420  2.23e-05 AAGAGATACT ACCAAACAACCA AAGCTGTCCA
42867                        27  2.23e-05 ACACTAGTTG ACCAAACAACCA AAAGCATCGT
43205                       131  2.70e-05 GGGGCAGCAA GAAAAAGGAACC CGTAACAAAA
19661                        36  3.71e-05 GTGCCAACTA GCCAAAAGACGA GGTCGCTGAC
47559                       407  6.82e-05 AAGACCCGGT ACCAAAAAATGA AGCAAAAGCG
46474                       488  7.36e-05 GCTTTCCTAA ACTATAGAAAGA A
45064                       293  1.02e-04 CAGAATATTT CGAAACGAAACA TTGCAACATT
10261                       449  1.09e-04 AGCCGCCTAT GGAAACGGAACC GCCTATGGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47103                             6.2e-08  427_[+2]_61
43939                             5.6e-07  195_[+2]_293
36529                               6e-06  393_[+2]_95
27278                             1.2e-05  307_[+2]_181
49777                             1.2e-05  118_[+2]_370
17504                             1.4e-05  294_[+2]_194
46871                             1.9e-05  39_[+2]_449
43081                             1.9e-05  54_[+2]_434
35416                             2.2e-05  419_[+2]_69
42867                             2.2e-05  26_[+2]_462
43205                             2.7e-05  130_[+2]_358
19661                             3.7e-05  35_[+2]_453
47559                             6.8e-05  406_[+2]_82
46474                             7.4e-05  487_[+2]_1
45064                              0.0001  292_[+2]_196
10261                             0.00011  448_[+2]_40
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=16
47103                    (  428) ACAAAAGAAAGA  1
43939                    (  196) ACAAAAGGAAGA  1
36529                    (  394) TCAAAAGAAACA  1
27278                    (  308) GCCAATGAAAGA  1
49777                    (  119) ACACAAGAAACA  1
17504                    (  295) GAAAAAGGACGA  1
46871                    (   40) GAGAAAGAAAGA  1
43081                    (   55) ACCAACGAAACC  1
35416                    (  420) ACCAAACAACCA  1
42867                    (   27) ACCAAACAACCA  1
43205                    (  131) GAAAAAGGAACC  1
19661                    (   36) GCCAAAAGACGA  1
47559                    (  407) ACCAAAAAATGA  1
46474                    (  488) ACTATAGAAAGA  1
45064                    (  293) CGAAACGAAACA  1
10261                    (  449) GGAAACGGAACC  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 9.66888 E= 7.4e+001
    96   -206     79   -208
   -46    140    -80  -1064
    96     53   -179   -208
   186   -206  -1064  -1064
   186  -1064  -1064   -208
   154    -47  -1064   -208
  -104   -106    179  -1064
   141  -1064     52  -1064
   196  -1064  -1064  -1064
   141     -6  -1064   -208
 -1064     94    120  -1064
   166    -47  -1064  -1064
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 7.4e+001
 0.500000  0.062500  0.375000  0.062500
 0.187500  0.687500  0.125000  0.000000
 0.500000  0.375000  0.062500  0.062500
 0.937500  0.062500  0.000000  0.000000
 0.937500  0.000000  0.000000  0.062500
 0.750000  0.187500  0.000000  0.062500
 0.125000  0.125000  0.750000  0.000000
 0.687500  0.000000  0.312500  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.687500  0.250000  0.000000  0.062500
 0.000000  0.500000  0.500000  0.000000
 0.812500  0.187500  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[AG]C[AC]AAAG[AG]A[AC][CG]A
--------------------------------------------------------------------------------

Time 4.18 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 9 llr = 103 E-value = 6.1e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::4::::::21:
pos.-specific     C  2::a:::::422
probability       G  :a4:a2a:734:
matrix            T  8:1::8:a3:28

         bits    2.2  *  * *
                 2.0  * ** **
                 1.8  * ** **
                 1.5  * ** **
Relative         1.3  * *****
Entropy          1.1 ** ******  *
(16.6 bits)      0.9 ** ******  *
                 0.7 *********  *
                 0.4 ********** *
                 0.2 ************
                 0.0 ------------

Multilevel           TGACGTGTGCGT
consensus            C G  G  TGCC
sequence                      AT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
42867                        49  1.40e-07 AAAGCATCGT TGACGTGTGGGT ACGGATTGGA
47103                       242  7.77e-07 AGACAGCAGC TGGCGGGTGGGT CATTCGAAGC
17504                        10  2.34e-06  TTCCTGTGC TGACGTGTTGTT CAGGTTTGCG
43081                       319  2.69e-06 CCGAGACATT CGGCGTGTGCCT GTCGTCCGTT
27278                       385  3.78e-06 CCTGAACAGG TGACGTGTTCGC GCAGTCGAAT
19661                       238  4.44e-06 GCTCCAACCC TGTCGTGTTCGT TCGTTCGCTC
43205                       185  4.85e-06 CCGAACACGT TGGCGGGTGATT CCGAACCAAC
45064                       180  5.47e-06 GTATTCTCTG CGGCGTGTGACT CATGTGAATC
36529                       313  6.29e-06 CAATAGGGAC TGACGTGTGCAC CAGAACAGCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42867                             1.4e-07  48_[+3]_440
47103                             7.8e-07  241_[+3]_247
17504                             2.3e-06  9_[+3]_479
43081                             2.7e-06  318_[+3]_170
27278                             3.8e-06  384_[+3]_104
19661                             4.4e-06  237_[+3]_251
43205                             4.9e-06  184_[+3]_304
45064                             5.5e-06  179_[+3]_309
36529                             6.3e-06  312_[+3]_176
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
42867                    (   49) TGACGTGTGGGT  1
47103                    (  242) TGGCGGGTGGGT  1
17504                    (   10) TGACGTGTTGTT  1
43081                    (  319) CGGCGTGTGCCT  1
27278                    (  385) TGACGTGTTCGC  1
19661                    (  238) TGTCGTGTTCGT  1
43205                    (  185) TGGCGGGTGATT  1
45064                    (  180) CGGCGTGTGACT  1
36529                    (  313) TGACGTGTGCAC  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 9.89655 E= 6.1e+001
  -982    -23   -982    155
  -982   -982    220   -982
    79   -982    103   -125
  -982    194   -982   -982
  -982   -982    220   -982
  -982   -982      3    155
  -982   -982    220   -982
  -982   -982   -982    192
  -982   -982    162     33
   -21     77     62   -982
  -121    -23    103    -25
  -982    -23   -982    155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 6.1e+001
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.000000  1.000000  0.000000
 0.444444  0.000000  0.444444  0.111111
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.666667  0.333333
 0.222222  0.444444  0.333333  0.000000
 0.111111  0.222222  0.444444  0.222222
 0.000000  0.222222  0.000000  0.777778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[TC]G[AG]CG[TG]GT[GT][CGA][GCT][TC]
--------------------------------------------------------------------------------

Time 6.18 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17504                            7.25e-05  9_[+3(2.34e-06)]_273_[+2(1.44e-05)]_\
    194
42867                            4.32e-06  26_[+2(2.23e-05)]_10_[+3(1.40e-07)]_\
    328_[+1(7.07e-05)]_100
43081                            3.25e-05  54_[+2(1.87e-05)]_157_\
    [+1(4.24e-05)]_83_[+3(2.69e-06)]_170
46871                            2.56e-04  39_[+2(1.87e-05)]_209_\
    [+1(1.42e-06)]_228
47559                            8.22e-05  387_[+1(4.75e-08)]_7_[+2(6.82e-05)]_\
    82
49777                            7.28e-04  118_[+2(1.20e-05)]_210_\
    [+1(4.08e-06)]_148
10261                            8.57e-03  284_[+1(7.07e-05)]_204
45064                            2.13e-05  179_[+3(5.47e-06)]_151_\
    [+1(2.39e-06)]_146
19661                            6.14e-05  35_[+2(3.71e-05)]_57_[+1(2.69e-05)]_\
    121_[+3(4.44e-06)]_251
27278                            3.38e-05  288_[+1(4.93e-05)]_7_[+2(1.20e-05)]_\
    65_[+3(3.78e-06)]_104
43939                            1.45e-04  195_[+2(5.56e-07)]_76_\
    [+1(3.70e-05)]_205
47103                            2.06e-08  241_[+3(7.77e-07)]_133_\
    [+1(1.23e-05)]_29_[+2(6.20e-08)]_61
43205                            4.64e-05  130_[+2(2.70e-05)]_11_\
    [+1(2.45e-05)]_19_[+3(4.85e-06)]_304
46474                            4.81e-02  487_[+2(7.36e-05)]_1
36529                            2.97e-06  69_[+1(3.74e-06)]_231_\
    [+3(6.29e-06)]_69_[+2(6.04e-06)]_95
35416                            1.30e-03  1_[+1(4.89e-06)]_406_[+2(2.23e-05)]_\
    69
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************