********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/368/368.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42576                    1.0000    500  46555                    1.0000    500
46556                    1.0000    500  47669                    1.0000    500
48727                    1.0000    500  16261                    1.0000    500
44522                    1.0000    500  45737                    1.0000    500
48461                    1.0000    500  42887                    1.0000    500
50576                    1.0000    500  44276                    1.0000    500
43252                    1.0000    500  49606                    1.0000    500
46889                    1.0000    500  49991                    1.0000    500
34848                    1.0000    500  49122                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/368/368.seqs.fa -oc motifs/368 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.245 G 0.226 T 0.257
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.245 G 0.226 T 0.257
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  13  llr = 131  E-value = 2.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  2:9a5182::2:
pos.-specific     C  2:::2::28a:2
probability       G  2a1:29:42::2
matrix            T  4:::::23::87

         bits    2.1  *
                 1.9  * *     *
                 1.7  * * *   *
                 1.5  *** *   *
Relative         1.3  *** *  ***
Entropy          1.1  *** ** ***
(14.6 bits)      0.9  *** ** ****
                 0.6  *** ** ****
                 0.4  ****** ****
                 0.2  ***********
                 0.0 ------------

Multilevel           TGAAAGAGCCTT
consensus            C   C TTG
sequence             G   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
48727                        33  5.01e-07 CAAGTATCCG TGAACGAGCCTT ATAATTATGC
47669                       180  1.97e-06 GGCCACCCCA CGAAAGACCCTT CACGAGAAAC
49991                       177  2.22e-06 TCACTACCGC CGAAAGAGGCTT ACTTCTACAC
49122                       411  4.91e-06 AGCCCATCCA GGAAAGATCCTC TTGAGCACCT
46555                       460  4.91e-06 GTTCGCTAAC TGAAGGTTCCTT CTACCAAAGA
16261                        76  7.27e-06 AGCGAAATGC TGAACGATCCTG TGTCGCCCTA
42887                       269  8.38e-06 ACCTAGTTTC GGAAAGTGGCTT AAAAGTTATA
46556                       298  9.60e-06 GCTCTTTGGT AGAAGGAACCTT CAGTTAGCGA
48461                        21  1.63e-05 GAAGAAATAT TGAAAGATGCAT TTCCGATTTA
45737                       158  2.01e-05 ACGACGACTC GGAAGGAACCTG GGTTTGACCG
46889                       308  2.53e-05 CCGTACCCCG CGAACGTGCCTC GGGAGGGAAG
34848                         7  3.28e-05     ATCTAA TGGAAGAGCCAT ACCAAAAGTG
44522                        49  4.54e-05 GTGAACCCTC AGAAAAACCCTT GTGTTATTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48727                               5e-07  32_[+1]_456
47669                               2e-06  179_[+1]_309
49991                             2.2e-06  176_[+1]_312
49122                             4.9e-06  410_[+1]_78
46555                             4.9e-06  459_[+1]_29
16261                             7.3e-06  75_[+1]_413
42887                             8.4e-06  268_[+1]_220
46556                             9.6e-06  297_[+1]_191
48461                             1.6e-05  20_[+1]_468
45737                               2e-05  157_[+1]_331
46889                             2.5e-05  307_[+1]_181
34848                             3.3e-05  6_[+1]_482
44522                             4.5e-05  48_[+1]_440
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=13
48727                    (   33) TGAACGAGCCTT  1
47669                    (  180) CGAAAGACCCTT  1
49991                    (  177) CGAAAGAGGCTT  1
49122                    (  411) GGAAAGATCCTC  1
46555                    (  460) TGAAGGTTCCTT  1
16261                    (   76) TGAACGATCCTG  1
42887                    (  269) GGAAAGTGGCTT  1
46556                    (  298) AGAAGGAACCTT  1
48461                    (   21) TGAAAGATGCAT  1
45737                    (  158) GGAAGGAACCTG  1
46889                    (  308) CGAACGTGCCTC  1
34848                    (    7) TGGAAGAGCCAT  1
44522                    (   49) AGAAAAACCCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 9.15623 E= 2.3e+001
   -82     -9      3     58
 -1035  -1035    215  -1035
   176  -1035   -155  -1035
   188  -1035  -1035  -1035
    98     -9      3  -1035
  -182  -1035    203  -1035
   150  -1035  -1035    -15
   -82    -67     77     26
 -1035    165      3  -1035
 -1035    203  -1035  -1035
   -82  -1035  -1035    172
 -1035    -67    -55    143
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 2.3e+001
 0.153846  0.230769  0.230769  0.384615
 0.000000  0.000000  1.000000  0.000000
 0.923077  0.000000  0.076923  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.538462  0.230769  0.230769  0.000000
 0.076923  0.000000  0.923077  0.000000
 0.769231  0.000000  0.000000  0.230769
 0.153846  0.153846  0.384615  0.307692
 0.000000  0.769231  0.230769  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.153846  0.000000  0.000000  0.846154
 0.000000  0.153846  0.153846  0.692308
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TCG]GAA[ACG]G[AT][GT][CG]CTT
--------------------------------------------------------------------------------

Time  3.10 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =   9  llr = 119  E-value = 7.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  2:::6a:18217::69
pos.-specific     C  :3::2:a:23::93::
probability       G  :2a72::9:16:174:
matrix            T  84:3:::::333:::1

         bits    2.1   *
                 1.9   *  **
                 1.7   *  ***
                 1.5   *  ***    *
Relative         1.3   *  ***    ** *
Entropy          1.1 * ** ****  *****
(19.1 bits)      0.9 * ** ****  *****
                 0.6 * ** **** ******
                 0.4 ********* ******
                 0.2 ********* ******
                 0.0 ----------------

Multilevel           TTGGAACGACGACGAA
consensus            AC TC   CTTT CG
sequence              G  G    A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ----------------
46556                       194  3.06e-08 ATCCACCTTC TGGGCACGACGACGGA CCGATGGGGC
45737                        26  3.67e-08 GCCATCATTT TCGGAACGACTTCGAA TGTTCGTAAA
48727                       125  3.67e-08 AACGGACGGG ACGGAACGACGACGGA AAAAAGTGTT
46889                        79  1.90e-07 GTCCGGTTCG TCGGAACGAGTTCGGA GCATCCGATC
50576                        53  2.90e-07 TTACGTTTTT TGGTGACGATGACCAA AACGTCCTCG
48461                       353  5.40e-07 TGCAAGCTCG TTGGAACACTGACGAA CAAATTATGG
49606                       428  1.20e-06 CACAGTCAGC TTGTCACGATGAGCAA CTTTTGAACA
16261                       365  1.55e-06 CCGGAGTCTT TTGTGACGAATACGGT CAGGTTCGAG
43252                       285  3.47e-06 AGCCAAGCCA ATGGAACGCAATCCAA CCCCACTTAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46556                             3.1e-08  193_[+2]_291
45737                             3.7e-08  25_[+2]_459
48727                             3.7e-08  124_[+2]_360
46889                             1.9e-07  78_[+2]_406
50576                             2.9e-07  52_[+2]_432
48461                             5.4e-07  352_[+2]_132
49606                             1.2e-06  427_[+2]_57
16261                             1.5e-06  364_[+2]_120
43252                             3.5e-06  284_[+2]_200
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=9
46556                    (  194) TGGGCACGACGACGGA  1
45737                    (   26) TCGGAACGACTTCGAA  1
48727                    (  125) ACGGAACGACGACGGA  1
46889                    (   79) TCGGAACGAGTTCGGA  1
50576                    (   53) TGGTGACGATGACCAA  1
48461                    (  353) TTGGAACACTGACGAA  1
49606                    (  428) TTGTCACGATGAGCAA  1
16261                    (  365) TTGTGACGAATACGGT  1
43252                    (  285) ATGGAACGCAATCCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 10.0548 E= 7.3e+001
   -29   -982   -982    160
  -982     44     -2     79
  -982   -982    215   -982
  -982   -982    156     38
   103    -14     -2   -982
   188   -982   -982   -982
  -982    203   -982   -982
  -129   -982    198   -982
   151    -14   -982   -982
   -29     44   -102     38
  -129   -982    130     38
   129   -982   -982     38
  -982    186   -102   -982
  -982     44    156   -982
   103   -982     98   -982
   171   -982   -982   -121
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 7.3e+001
 0.222222  0.000000  0.000000  0.777778
 0.000000  0.333333  0.222222  0.444444
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.555556  0.222222  0.222222  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.222222  0.333333  0.111111  0.333333
 0.111111  0.000000  0.555556  0.333333
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.888889  0.111111  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.555556  0.000000  0.444444  0.000000
 0.888889  0.000000  0.000000  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TA][TCG]G[GT][ACG]ACG[AC][CTA][GT][AT]C[GC][AG]A
--------------------------------------------------------------------------------

Time  5.71 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  15  sites =   6  llr = 86  E-value = 4.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :a7:::22::::3a:
pos.-specific     C  ::2:7223:::3:::
probability       G  a:28:85:a:257:a
matrix            T  :::23:25:a82:::

         bits    2.1 *       *     *
                 1.9 **      **   **
                 1.7 **      **   **
                 1.5 ** * *  **   **
Relative         1.3 ** * *  ***  **
Entropy          1.1 ** ***  *** ***
(20.8 bits)      0.9 ** ***  *** ***
                 0.6 ******  *******
                 0.4 ****** ********
                 0.2 ***************
                 0.0 ---------------

Multilevel           GAAGCGGTGTTGGAG
consensus                T  C   CA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
49122                       363  1.73e-09 TTGTCAAGGA GAAGCGGTGTTCGAG TCTTTACCCA
46555                        89  3.60e-09 GGCAAAATGC GAAGCGGTGTTGAAG ACTTCGTCTG
43252                        37  2.45e-07 ACGCTTTTAC GACGTGACGTTGGAG GACAAACAAA
34848                        86  2.96e-07 TTCGAAAGGT GAGGCGTCGTTTGAG TTTTCACGAT
44522                        25  5.07e-07 ATGGCGAAAA GAATTGCTGTTGAAG TGAACCCTCA
16261                       336  6.17e-07 GTTGTCACAA GAAGCCGAGTGCGAG TGGGCCGGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49122                             1.7e-09  362_[+3]_123
46555                             3.6e-09  88_[+3]_397
43252                             2.4e-07  36_[+3]_449
34848                               3e-07  85_[+3]_400
44522                             5.1e-07  24_[+3]_461
16261                             6.2e-07  335_[+3]_150
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=6
49122                    (  363) GAAGCGGTGTTCGAG  1
46555                    (   89) GAAGCGGTGTTGAAG  1
43252                    (   37) GACGTGACGTTGGAG  1
34848                    (   86) GAGGCGTCGTTTGAG  1
44522                    (   25) GAATTGCTGTTGAAG  1
16261                    (  336) GAAGCCGAGTGCGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8748 bayes= 10.9565 E= 4.8e+002
  -923   -923    215   -923
   188   -923   -923   -923
   129    -56    -44   -923
  -923   -923    188    -62
  -923    144   -923     38
  -923    -56    188   -923
   -71    -56    115    -62
   -71     44   -923     96
  -923   -923    215   -923
  -923   -923   -923    196
  -923   -923    -44    170
  -923     44    115    -62
    29   -923    156   -923
   188   -923   -923   -923
  -923   -923    215   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 4.8e+002
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.166667  0.166667  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.166667  0.833333  0.000000
 0.166667  0.166667  0.500000  0.166667
 0.166667  0.333333  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.333333  0.500000  0.166667
 0.333333  0.000000  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GAAG[CT]GG[TC]GTT[GC][GA]AG
--------------------------------------------------------------------------------

Time  8.62 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42576                            7.44e-01  500
46555                            5.48e-07  88_[+3(3.60e-09)]_356_\
    [+1(4.91e-06)]_29
46556                            6.33e-06  193_[+2(3.06e-08)]_88_\
    [+1(9.60e-06)]_191
47669                            2.34e-03  179_[+1(1.97e-06)]_309
48727                            5.08e-07  32_[+1(5.01e-07)]_80_[+2(3.67e-08)]_\
    360
16261                            1.93e-07  75_[+1(7.27e-06)]_248_\
    [+3(6.17e-07)]_14_[+2(1.55e-06)]_120
44522                            3.05e-04  24_[+3(5.07e-07)]_9_[+1(4.54e-05)]_\
    440
45737                            1.98e-05  25_[+2(3.67e-08)]_116_\
    [+1(2.01e-05)]_331
48461                            7.70e-05  20_[+1(1.63e-05)]_320_\
    [+2(5.40e-07)]_132
42887                            2.07e-02  268_[+1(8.38e-06)]_220
50576                            5.85e-03  52_[+2(2.90e-07)]_432
44276                            9.99e-01  500
43252                            1.73e-05  36_[+3(2.45e-07)]_233_\
    [+2(3.47e-06)]_200
49606                            6.73e-03  427_[+2(1.20e-06)]_57
46889                            9.69e-05  78_[+2(1.90e-07)]_213_\
    [+1(2.53e-05)]_181
49991                            3.03e-03  176_[+1(2.22e-06)]_312
34848                            2.13e-04  6_[+1(3.28e-05)]_67_[+3(2.96e-07)]_\
    400
49122                            3.23e-07  362_[+3(1.73e-09)]_33_\
    [+1(4.91e-06)]_78
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************