********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/131/131.seqs.fa
ALPHABET= ACGT
Sequence name  Weight Length  Sequence name  Weight Length
-------------  ------ ------  -------------  ------ ------
10525          1.0000    500  11594          1.0000    500
2120           1.0000    500  22183          1.0000    500
23544          1.0000    500  24207          1.0000    500
24273          1.0000    500  30310          1.0000    500
3176           1.0000    500  38652          1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/131/131.seqs.fa -oc motifs/131 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.263 C 0.234 G 0.244 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.234 G 0.244 T 0.260
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 9 llr = 123 E-value = 2.6e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:14:47::2364183:339a
pos.-specific     C  :9:2a419714212:3277::
probability       G  :122:1::2:::14:36::1:
matrix            T  9:71::211722322:2::::

         bits    2.1     *
                 1.9     *               *
                 1.7  *  *  *            *
                 1.5 **  *  *           **
Relative         1.3 **  *  *      *    **
Entropy          1.0 **  *  *      *  ****
(19.7 bits)      0.8 *** *  **     *  ****
                 0.6 *** ****** *  * *****
                 0.4 *** ********  *******
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TCTACAACCTCAAGAAGCCAA
consensus              GC CT GAACTCTCCAA
sequence                G      TT T GT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value            Site
-------------  ----- ---------            ---------------------
11594            194  1.05e-11 TCTACCGAGA TCTACAACCTAATGACGCCAA ATCTGCTCGC
24273              7  5.17e-09     AGAGTA TCGACCACCAAAATAGGCCAA CACTTTGAAA
3176             185  2.39e-08 TGAACTGCCT TCTACAACGTCCAAAAGCAAA GGAACTGAAA
22183            426  2.20e-07 CACGCACTAA TCTTCCACCTCTGCTCTCCAA TGCCACAGAT
10525             21  2.61e-07 CCTCTCTGTC TCTGCACCCTCCTGAGGACGA TGCACAAGCA
2120              48  7.68e-07 CGACGGTGCA ACTGCAACGTTTTGAGCACAA AACTGACACA
23544            386  1.31e-06 TGATGGTAGC TCGCCCTCTTCAACTACCAAA GATATCTTGT
24207            125  1.99e-06 CCAAAAACAA TCACCCTCCAAACTACTACAA CTCAATAGAG
38652            289  2.63e-06 AGAAACGCCT TGTACGATCCTAAGAAGCAAA TGTGCGATCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
11594                     1e-11  193_[+1]_286
24273                   5.2e-09  6_[+1]_473
3176                    2.4e-08  184_[+1]_295
22183                   2.2e-07  425_[+1]_54
10525                   2.6e-07  20_[+1]_459
2120                    7.7e-07  47_[+1]_432
23544                   1.3e-06  385_[+1]_94
24207                     2e-06  124_[+1]_355
38652                   2.6e-06  288_[+1]_191
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
11594          ( 194) TCTACAACCTAATGACGCCAA  1
24273          (   7) TCGACCACCAAAATAGGCCAA  1
3176           ( 185) TCTACAACGTCCAAAAGCAAA  1
22183          ( 426) TCTTCCACCTCTGCTCTCCAA  1
10525          (  21) TCTGCACCCTCCTGAGGACGA  1
2120           (  48) ACTGCAACGTTTTGAGCACAA  1
23544          ( 386) TCGCCCTCTTCAACTACCAAA  1
24207          ( 125) TCACCCTCCAAACTACTACAA  1
38652          ( 289) TGTACGATCCTAAGAAGCAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 9.19073 E= 2.6e+002
  -124  -982  -982   177
  -982   193  -113  -982
  -124  -982   -13   136
    76    -7   -13  -122
  -982   210  -982  -982
    76    93  -113  -982
   134  -107  -982   -22
  -982   193  -982  -122
  -982   151   -13  -122
   -24  -107  -982   136
    34    93  -982   -22
   108    -7  -982   -22
    76  -107  -113    36
  -124    -7    87   -22
   156  -982  -982   -22
    34    51    45  -982
  -982    -7   119   -22
    34   151  -982  -982
    34   151  -982  -982
   176  -982  -113  -982
   193  -982  -982  -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 2.6e+002
 0.111111  0.000000  0.000000  0.888889
 0.000000  0.888889  0.111111  0.000000
 0.111111  0.000000  0.222222  0.666667
 0.444444  0.222222  0.222222  0.111111
 0.000000  1.000000  0.000000  0.000000
 0.444444  0.444444  0.111111  0.000000
 0.666667  0.111111  0.000000  0.222222
 0.000000  0.888889  0.000000  0.111111
 0.000000  0.666667  0.222222  0.111111
 0.222222  0.111111  0.000000  0.666667
 0.333333  0.444444  0.000000  0.222222
 0.555556  0.222222  0.000000  0.222222
 0.444444  0.111111  0.111111  0.333333
 0.111111  0.222222  0.444444  0.222222
 0.777778  0.000000  0.000000  0.222222
 0.333333  0.333333  0.333333  0.000000
 0.000000  0.222222  0.555556  0.222222
 0.333333  0.666667  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.888889  0.000000  0.111111  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
TC[TG][ACG]C[AC][AT]C[CG][TA][CAT][ACT][AT][GCT][AT][ACG][GCT][CA][CA]AA
--------------------------------------------------------------------------------




Time 0.89 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 15 sites = 10 llr = 112 E-value = 4.8e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::2:::15::2:3:3
pos.-specific     C  :95::234222:1:1
probability       G  412a:36::51a:::
matrix            T  6:1:a5:1835:6a6

         bits    2.1    *       *
                 1.9    **      * *
                 1.7  * **      * *
                 1.5  * **      * *
Relative         1.3  * **   *  * *
Entropy          1.0 ** **   *  * *
(16.1 bits)      0.8 ** ** * *  * *
                 0.6 ** ** **** ****
                 0.4 ** ******* ****
                 0.2 ***************
                 0.0 ---------------

Multilevel           TCCGTTGATGTGTTT
consensus            G A  GCCCTA A A
sequence               G  C   CC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value            Site
-------------  ----- ---------            ---------------
23544            481  1.07e-09 ATTTCGTCCT TCCGTTGATGTGTTT ATTGG
3176             478  1.73e-06 AGAAGCAATC GCAGTGGCCGTGTTT TGCAACTC
38652            456  2.38e-06 GATTGTTCCA TCCGTTACTTCGTTT ACTTCATCTC
2120              16  3.80e-06 CTTGTAGTTG GCTGTGGCTGTGATA CTGCCGACGA
10525             85  3.80e-06 TCCATCAAGG TGCGTTGATGGGTTT GACTAGCTGG
24207            458  4.93e-06 TTATTCGTGT TCGGTTCATCAGTTA TATCTGTTAG
11594            327  4.93e-06 CTGGGTGAAT TCCGTCGACTTGATA GGAGCCGCCA
22183             71  5.35e-06 CACTCATTCA TCGGTTCATCTGCTT GACGGTGGTT
30310            311  1.04e-05 CGCTCGAGTT GCAGTGGCTTCGTTC TCAAAGATGG
24273             49  1.10e-05 ACGGAGACAT GCCGTCCTTGAGATT GCCGACGGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
23544                   1.1e-09  480_[+2]_5
3176                    1.7e-06  477_[+2]_8
38652                   2.4e-06  455_[+2]_30
2120                    3.8e-06  15_[+2]_470
10525                   3.8e-06  84_[+2]_401
24207                   4.9e-06  457_[+2]_28
11594                   4.9e-06  326_[+2]_159
22183                   5.4e-06  70_[+2]_415
30310                     1e-05  310_[+2]_175
24273                   1.1e-05  48_[+2]_437
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=10
23544          ( 481) TCCGTTGATGTGTTT  1
3176           ( 478) GCAGTGGCCGTGTTT  1
38652          ( 456) TCCGTTACTTCGTTT  1
2120           (  16) GCTGTGGCTGTGATA  1
10525          (  85) TGCGTTGATGGGTTT  1
24207          ( 458) TCGGTTCATCAGTTA  1
11594          ( 327) TCCGTCGACTTGATA  1
22183          (  71) TCGGTTCATCTGCTT  1
30310          ( 311) GCAGTGGCTTCGTTC  1
24273          (  49) GCCGTCCTTGAGATT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4860 bayes= 8.92184 E= 4.8e+001
  -997  -997    71   121
  -997   194  -128  -997
   -39   110   -29  -137
  -997  -997   204  -997
  -997  -997  -997   194
  -997   -22    30    94
  -139    36   130  -997
    93    78  -997  -137
  -997   -22  -997   162
  -997   -22   104    21
   -39   -22  -128    94
  -997  -997   204  -997
    19  -122  -997   121
  -997  -997  -997   194
    19  -122  -997   121
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 10 E= 4.8e+001
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.900000  0.100000  0.000000
 0.200000  0.500000  0.200000  0.100000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.300000  0.500000
 0.100000  0.300000  0.600000  0.000000
 0.500000  0.400000  0.000000  0.100000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.200000  0.500000  0.300000
 0.200000  0.200000  0.100000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.300000  0.100000  0.000000  0.600000
 0.000000  0.000000  0.000000  1.000000
 0.300000  0.100000  0.000000  0.600000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[TG]C[CAG]GT[TGC][GC][AC][TC][GTC][TAC]G[TA]T[TA]
--------------------------------------------------------------------------------




Time 2.07 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 6 llr = 102 E-value = 3.7e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :83:72:83a:a2a5:852a
pos.-specific     C  7::8:22::::::::2253:
probability       G  3::233725:a:8:38::5:
matrix            T  :27::32:2:::::2:::::

         bits    2.1           *
                 1.9          *** *     *
                 1.7          *** *     *
                 1.5    *     ***** *   *
Relative         1.3 ** *   * ***** **  *
Entropy          1.0 *****  * ***** *** *
(24.5 bits)      0.8 ***** ** ***** *** *
                 0.6 ***** ******** *****
                 0.4 ***** **************
                 0.2 ***** **************
                 0.0 --------------------

Multilevel           CATCAGGAGAGAGAAGAAGA
consensus            G A GT  A     G  CC
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value            Site
-------------  ----- ---------            --------------------
3176               3  1.09e-09         CA GATGAGGAGAGAGAAGAAGA ATGTGAATGG
22183            322  2.19e-09 GATTGACCTT CAACACGATAGAGAGGAAGA GTTTGCTTGC
23544            292  4.95e-09 AGCTGCCATT GATCATGGGAGAGATGAAGA AAGTATGAGA
38652             53  1.37e-08 TTGATCATAT CTTCGGTAAAGAGAAGACCA CAATCAGTGC
24273            238  1.85e-08 CAGTGTTGAG CAACAAGAGAGAGAACCCCA TCTCGCGCGC
2120             346  3.77e-08 TTAAATTTTG CATCGTCAAAGAAAGGACAA CAAAATCGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
3176                    1.1e-09  2_[+3]_478
22183                   2.2e-09  321_[+3]_159
23544                     5e-09  291_[+3]_189
38652                   1.4e-08  52_[+3]_428
24273                   1.8e-08  237_[+3]_243
2120                    3.8e-08  345_[+3]_135
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=6
3176           (   3) GATGAGGAGAGAGAAGAAGA  1
22183          ( 322) CAACACGATAGAGAGGAAGA  1
23544          ( 292) GATCATGGGAGAGATGAAGA  1
38652          (  53) CTTCGGTAAAGAGAAGACCA  1
24273          ( 238) CAACAAGAGAGAGAACCCCA  1
2120           ( 346) CATCGTCAAAGAAAGGACAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4810 bayes= 9.30354 E= 3.7e+002
  -923   151    45  -923
   166  -923  -923   -64
    34  -923  -923   136
  -923   183   -55  -923
   134  -923    45  -923
   -66   -49    45    36
  -923   -49   145   -64
   166  -923   -55  -923
    34  -923   103   -64
   193  -923  -923  -923
  -923  -923   203  -923
   193  -923  -923  -923
   -66  -923   177  -923
   193  -923  -923  -923
    93  -923    45   -64
  -923   -49   177  -923
   166   -49  -923  -923
    93   110  -923  -923
   -66    51   103  -923
   193  -923  -923  -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 3.7e+002
 0.000000  0.666667  0.333333  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.833333  0.166667  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.166667  0.166667  0.333333  0.333333
 0.000000  0.166667  0.666667  0.166667
 0.833333  0.000000  0.166667  0.000000
 0.333333  0.000000  0.500000  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.333333  0.166667
 0.000000  0.166667  0.833333  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.166667  0.333333  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[CG]A[TA]C[AG][GT]GA[GA]AGAGA[AG]GA[AC][GC]A
--------------------------------------------------------------------------------




Time 2.89 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME  COMBINED P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
10525                  1.97e-05  20_[+1(2.61e-07)]_43_[+2(3.80e-06)]_\
    401
11594                  2.31e-09  193_[+1(1.05e-11)]_112_\
    [+2(4.93e-06)]_159
2120                   4.21e-09  15_[+2(3.80e-06)]_17_[+1(7.68e-07)]_\
    277_[+3(3.77e-08)]_135
22183                  1.29e-10  70_[+2(5.35e-06)]_236_\
    [+3(2.19e-09)]_84_[+1(2.20e-07)]_54
23544                  4.97e-13  291_[+3(4.95e-09)]_74_\
    [+1(1.31e-06)]_74_[+2(1.07e-09)]_5
24207                  2.14e-04  124_[+1(1.99e-06)]_312_\
    [+2(4.93e-06)]_28
24273                  5.59e-11  6_[+1(5.17e-09)]_21_[+2(1.10e-05)]_\
    174_[+3(1.85e-08)]_243
30310                  1.77e-02  310_[+2(1.04e-05)]_175
3176                   2.91e-12  2_[+3(1.09e-09)]_162_[+1(2.39e-08)]_\
    272_[+2(1.73e-06)]_8
38652                  3.36e-09  52_[+3(1.37e-08)]_216_\
    [+1(2.63e-06)]_146_[+2(2.38e-06)]_30
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
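
Each MOTIF DIAGRAM entry in the combined block diagrams alternates gap lengths with bracketed sites, e.g. `20_[+1(2.61e-07)]_43_[+2(3.80e-06)]_401`. The following is a minimal Python sketch, not part of the MEME software, of how such a string might be turned back into 1-based start positions. The function name `parse_diagram` and the `WIDTHS` table are illustrative assumptions; the widths are copied from the three MOTIF header lines of this file.

```python
import re

# Motif widths copied from the MOTIF header lines of this file (assumption:
# the diagram being parsed comes from this same run).
WIDTHS = {1: 21, 2: 15, 3: 20}

# One bracketed site, e.g. "[+1(2.61e-07)]": strand, motif number, p-value.
SITE = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram, widths=WIDTHS):
    """Return (sites, total_length) for a combined block diagram string.

    Each site is a tuple (motif, strand, start, p_value) with a 1-based start.
    """
    # Remove line-continuation backslashes and any whitespace from wrapping.
    cleaned = re.sub(r"[\\\s]", "", diagram)
    offset = 0  # number of sequence positions consumed so far
    sites = []
    for part in cleaned.split("_"):
        if not part:
            continue
        match = SITE.fullmatch(part)
        if match:
            strand = match.group(1)
            motif = int(match.group(2))
            p_value = float(match.group(3))
            sites.append((motif, strand, offset + 1, p_value))
            offset += widths[motif]
        else:
            # A bare integer is a gap of that many positions.
            offset += int(part)
    return sites, offset


if __name__ == "__main__":
    sites, length = parse_diagram("20_[+1(2.61e-07)]_43_[+2(3.80e-06)]_401")
    for motif, strand, start, p_value in sites:
        print(motif, strand, start, p_value)
    print("sequence length:", length)
```

Applied to the diagram for sequence 10525, the sketch yields starts 21 and 85 and a total length of 500, which agrees with the Start columns of the per-motif site tables and the Length column of the training set.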