********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/113/113.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8990                     1.0000    500  27923                    1.0000    500
27976                    1.0000    500  46599                    1.0000    500
29016                    1.0000    500  40430                    1.0000    500
50072                    1.0000    500  41515                    1.0000    500
42398                    1.0000    500  44474                    1.0000    500
12004                    1.0000    500  42015                    1.0000    500
36499                    1.0000    500  46531                    1.0000    500
47491                    1.0000    500  47498                    1.0000    500
50434                    1.0000    500
********************************************************************************


********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/113/113.seqs.fa -oc motifs/113 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.240 C 0.249 G 0.234 T 0.276
Background letter frequencies (from dataset with add-one prior applied):
A 0.240 C 0.249 G 0.234 T 0.276
********************************************************************************


********************************************************************************
MOTIF  1 MEME  width =  15  sites =   6  llr = 89  E-value = 8.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::8a8:83:::237a
pos.-specific     C  a::::::3::5::::
probability       G  :a2::7223a587::
matrix            T  ::::23:27::::3:

         bits    2.1 ** *     *    *
                 1.9 ** *     *    *
                 1.7 ** *     *    *
                 1.5 ***** *  * *  *
Relative         1.3 ***** *  * ** *
Entropy          1.0 ******* *******
(21.4 bits)      0.8 ******* *******
                 0.6 ******* *******
                 0.4 ******* *******
                 0.2 ******* *******
                 0.0 ---------------

Multilevel           CGAAAGAATGCGGAA
consensus                 T CG G AT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------
46531                        51  5.17e-10 GCAGGACTAA CGAAAGAATGGGGAA ACAGCTCTCT
27976                       195  5.17e-10 TCTGCTTCCT CGAAAGAATGGGGAA GTTCCATTAG
29016                       171  1.87e-07 GGGTTCGTCT CGAAAGACGGCAGTA GAGATCTTTC
36499                       100  2.38e-07 AACATAGCGA CGAAATGCTGCGGTA CTTTGTCTGA
47491                       422  2.86e-07 CCATCACCAA CGAATGATGGGGAAA CGTCGGATTA
40430                       327  3.10e-07 TGGCACGAGT CGGAATAGTGCGAAA GGATAGTTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46531                             5.2e-10  50_[+1]_435
27976                             5.2e-10  194_[+1]_291
29016                             1.9e-07  170_[+1]_315
36499                             2.4e-07  99_[+1]_386
47491                             2.9e-07  421_[+1]_64
40430                             3.1e-07  326_[+1]_159
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=6
46531                    (   51) CGAAAGAATGGGGAA  1
27976                    (  195) CGAAAGAATGGGGAA  1
29016                    (  171) CGAAAGACGGCAGTA  1
36499                    (  100) CGAAATGCTGCGGTA  1
47491                    (  422) CGAATGATGGGGAAA  1
40430                    (  327) CGGAATAGTGCGAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8262 bayes= 11.5264 E= 8.8e+001
  -923    201   -923   -923
  -923   -923    209   -923
   179   -923    -49   -923
   205   -923   -923   -923
   179   -923   -923    -73
  -923   -923    151     27
   179   -923    -49   -923
    47     42    -49    -73
  -923   -923     51    127
  -923   -923    209   -923
  -923    101    109   -923
   -53   -923    183   -923
    47   -923    151   -923
   147   -923   -923     27
   205   -923   -923   -923
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 8.8e+001
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.000000  0.666667  0.333333
 0.833333  0.000000  0.166667  0.000000
 0.333333  0.333333  0.166667  0.166667
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.666667  0.000000  0.000000  0.333333
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CGAAA[GT]A[AC][TG]G[CG]G[GA][AT]A
--------------------------------------------------------------------------------

Time  3.49 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME  width =  15  sites =  11  llr = 131  E-value = 7.1e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :1:7461:5:a:119
pos.-specific     C  :41:23381a:119:
probability       G  :5434:::4::92:1
matrix            T  a:5:11621:::6::

         bits    2.1          **
                 1.9 *        **
                 1.7 *        *** **
                 1.5 *        *** **
Relative         1.3 *  *   * *** **
Entropy          1.0 *  *   * *** **
(17.1 bits)      0.8 ** * * * *** **
                 0.6 **** *** *** **
                 0.4 **** **********
                 0.2 ***************
                 0.0 ---------------

Multilevel           TGTAAATCACAGTCA
consensus             CGGGCC G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------
36499                        13  2.51e-08 CTCGGCACAA TGTAGACCGCAGTCA TACCGTAGCC
46531                       198  5.35e-08 ATCCCTTTTG TCGACATCACAGTCA AGAAGGGTTG
50434                       149  6.77e-08 GAACGCGTGC TCGAGCTCACAGTCA CTGTCAAAAA
27976                       345  3.64e-07 TGTCGCTAAC TGTAACTCTCAGTCA GTAACATCGC
40430                       138  4.56e-07 GCTTCTACCC TGTAAACCGCAGCCA CGCCATGCGC
50072                       168  6.78e-07 CACCTATGCA TGCATATCACAGTCA GCTGTGCCGT
47491                       244  3.73e-06 TGACTGGAAC TGTAAACCACACACA GAAACCAATG
29016                       442  5.64e-06 CGCTGAGTCA TCTGGATTGCAGTAA TTGAAAGTAC
46599                       336  8.20e-06 CACTGCACGG TAGGGCACGCAGTCA TCGTCCAGCG
41515                       397  8.71e-06 ACCACGTTTG TCTACTTTACAGGCA CCACGCTAGC
44474                       212  1.04e-05 CACGTTTGGA TGGGAATCCCAGGCG ATGGATTGGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36499                             2.5e-08  12_[+2]_473
46531                             5.4e-08  197_[+2]_288
50434                             6.8e-08  148_[+2]_337
27976                             3.6e-07  344_[+2]_141
40430                             4.6e-07  137_[+2]_348
50072                             6.8e-07  167_[+2]_318
47491                             3.7e-06  243_[+2]_242
29016                             5.6e-06  441_[+2]_44
46599                             8.2e-06  335_[+2]_150
41515                             8.7e-06  396_[+2]_89
44474                               1e-05  211_[+2]_274
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=11
36499                    (   13) TGTAGACCGCAGTCA  1
46531                    (  198) TCGACATCACAGTCA  1
50434                    (  149) TCGAGCTCACAGTCA  1
27976                    (  345) TGTAACTCTCAGTCA  1
40430                    (  138) TGTAAACCGCAGCCA  1
50072                    (  168) TGCATATCACAGTCA  1
47491                    (  244) TGTAAACCACACACA  1
29016                    (  442) TCTGGATTGCAGTAA  1
46599                    (  336) TAGGGCACGCAGTCA  1
41515                    (  397) TCTACTTTACAGGCA  1
44474                    (  212) TGGGAATCCCAGGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8262 bayes= 9.90644 E= 7.1e+000
 -1010  -1010  -1010    185
  -140     55    122  -1010
 -1010   -145     63     98
   160  -1010     22  -1010
    60    -45     63   -160
   140     13  -1010   -160
  -140     13  -1010    120
 -1010    172  -1010    -60
    92   -145     63   -160
 -1010    201  -1010  -1010
   206  -1010  -1010  -1010
 -1010   -145    195  -1010
  -140   -145    -37    120
  -140    187  -1010  -1010
   192  -1010   -136  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 11 E= 7.1e+000
 0.000000  0.000000  0.000000  1.000000
 0.090909  0.363636  0.545455  0.000000
 0.000000  0.090909  0.363636  0.545455
 0.727273  0.000000  0.272727  0.000000
 0.363636  0.181818  0.363636  0.090909
 0.636364  0.272727  0.000000  0.090909
 0.090909  0.272727  0.000000  0.636364
 0.000000  0.818182  0.000000  0.181818
 0.454545  0.090909  0.363636  0.090909
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.090909  0.909091  0.000000
 0.090909  0.090909  0.181818  0.636364
 0.090909  0.909091  0.000000  0.000000
 0.909091  0.000000  0.090909  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[GC][TG][AG][AG][AC][TC]C[AG]CAGTCA
--------------------------------------------------------------------------------

Time  6.72 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME  width =  15  sites =  17  llr = 164  E-value = 2.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :6:::2:::2111::
pos.-specific     C  :1:9:1272:4:222
probability       G  5::121136436178
matrix            T  52a:878:152261:

         bits    2.1
                 1.9   *
                 1.7   *
                 1.5   **          *
Relative         1.3   **   *      *
Entropy          1.0 * ***  *     **
(14.0 bits)      0.8 ***** ***  * **
                 0.6 *********  * **
                 0.4 ********** ****
                 0.2 ***************
                 0.0 ---------------

Multilevel           TATCTTTCGTCGTGG
consensus            GT  G  GCGGT C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------
12004                        60  6.98e-09 CACACCACGA GATCTTTCGTGGTGG GTAGAAAGCG
8990                        458  1.03e-07 TCCAAAGCGC GATCTTTCGTCGAGG GACAAATGTT
50434                       319  1.42e-06 AGATGTTGAC TTTCGTTCGTTGTGG TTGTTCTTCT
40430                        29  1.82e-06 CGTGTCCGAT TATCGTTCGTCGCCG CCAGTCGTCG
44474                        92  3.28e-06 GGGAACGCGA TATGTTTCCACGTGG TAATATTGGA
27976                       242  4.45e-06 CAGTCGGTAG TATCTTTCGGCTTCC CAGTTTTCTC
50072                        43  6.72e-06 GTTTTCGGTG TATCTATGGGAGTCG GCATTGGCGC
29016                       354  8.19e-06 TCATCGTCTT GCTCTTCCGTGTTGG CGCTCTCGTC
27923                       262  1.09e-05 TGTCGGACCG GATCTTTGGGCAGGG TTTCGAACGA
42398                        91  1.30e-05 GCAAACGGAA TTTCTATCGTGTTCG CATTGCACAA
47491                       269  2.37e-05 GAAACCAATG GATCGATCGATGTGC CCTACACGTG
42015                       175  3.78e-05 GATTGGTGAA TATCTGTCTACGCGG TACTGCATTC
47498                       435  4.07e-05 ACACATTTTT GTTCTTGCCTCGAGG ATCAAAGCTC
46599                        37  4.07e-05 TCCCCGATCG TATCGTCGCTGATGG CGTCCACCAT
41515                       165  6.20e-05 CTCTCTTGGC TCTCTTTGCGGTTGC TGGTACGTGA
36499                       210  1.16e-04 TTCATTGTCA GTTGTCTGGGAGTGG AAACACAAGT
46531                       407  1.54e-04 CTGTTTGAGA GATCTTCCTGTGCTG CACAAACCCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12004                               7e-09  59_[+3]_426
8990                                1e-07  457_[+3]_28
50434                             1.4e-06  318_[+3]_167
40430                             1.8e-06  28_[+3]_457
44474                             3.3e-06  91_[+3]_394
27976                             4.4e-06  241_[+3]_244
50072                             6.7e-06  42_[+3]_443
29016                             8.2e-06  353_[+3]_132
27923                             1.1e-05  261_[+3]_224
42398                             1.3e-05  90_[+3]_395
47491                             2.4e-05  268_[+3]_217
42015                             3.8e-05  174_[+3]_311
47498                             4.1e-05  434_[+3]_51
46599                             4.1e-05  36_[+3]_449
41515                             6.2e-05  164_[+3]_321
36499                             0.00012  209_[+3]_276
46531                             0.00015  406_[+3]_79
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=17
12004                    (   60) GATCTTTCGTGGTGG  1
8990                     (  458) GATCTTTCGTCGAGG  1
50434                    (  319) TTTCGTTCGTTGTGG  1
40430                    (   29) TATCGTTCGTCGCCG  1
44474                    (   92) TATGTTTCCACGTGG  1
27976                    (  242) TATCTTTCGGCTTCC  1
50072                    (   43) TATCTATGGGAGTCG  1
29016                    (  354) GCTCTTCCGTGTTGG  1
27923                    (  262) GATCTTTGGGCAGGG  1
42398                    (   91) TTTCTATCGTGTTCG  1
47491                    (  269) GATCGATCGATGTGC  1
42015                    (  175) TATCTGTCTACGCGG  1
47498                    (  435) GTTCTTGCCTCGAGG  1
46599                    (   37) TATCGTCGCTGATGG  1
41515                    (  165) TCTCTTTGCGGTTGC  1
36499                    (  210) GTTGTCTGGGAGTGG  1
46531                    (  407) GATCTTCCTGTGCTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8262 bayes= 8.99152 E= 2.7e+002
 -1073  -1073    101     94
   143   -108  -1073    -23
 -1073  -1073  -1073    185
 -1073    183    -99  -1073
 -1073  -1073      1    147
   -45   -208   -199    135
 -1073    -49   -199    147
 -1073    150     33  -1073
 -1073     -8    146   -123
   -45  -1073     59     77
  -103     73     33    -65
  -103  -1073    146    -23
  -103    -49   -199    123
 -1073     -8    159   -223
 -1073    -49    181  -1073
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 17 E= 2.7e+002
 0.000000  0.000000  0.470588  0.529412
 0.647059  0.117647  0.000000  0.235294
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.882353  0.117647  0.000000
 0.000000  0.000000  0.235294  0.764706
 0.176471  0.058824  0.058824  0.705882
 0.000000  0.176471  0.058824  0.764706
 0.000000  0.705882  0.294118  0.000000
 0.000000  0.235294  0.647059  0.117647
 0.176471  0.000000  0.352941  0.470588
 0.117647  0.411765  0.294118  0.176471
 0.117647  0.000000  0.647059  0.235294
 0.117647  0.176471  0.058824  0.647059
 0.000000  0.235294  0.705882  0.058824
 0.000000  0.176471  0.823529  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TG][AT]TC[TG]TT[CG][GC][TG][CG][GT]T[GC]G
--------------------------------------------------------------------------------

Time  9.79 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8990                             2.42e-03  35_[+3(2.78e-05)]_407_\
    [+3(1.03e-07)]_28
27923                            6.42e-02  261_[+3(1.09e-05)]_224
27976                            4.61e-11  82_[+1(3.77e-05)]_97_[+1(5.17e-10)]_\
    32_[+3(4.45e-06)]_88_[+2(3.64e-07)]_141
46599                            3.55e-03  36_[+3(4.07e-05)]_284_\
    [+2(8.20e-06)]_150
29016                            2.34e-07  170_[+1(1.87e-07)]_168_\
    [+3(8.19e-06)]_73_[+2(5.64e-06)]_44
40430                            9.45e-09  28_[+3(1.82e-06)]_94_[+2(4.56e-07)]_\
    174_[+1(3.10e-07)]_159
50072                            4.99e-05  42_[+3(6.72e-06)]_110_\
    [+2(6.78e-07)]_318
41515                            4.02e-03  164_[+3(6.20e-05)]_217_\
    [+2(8.71e-06)]_89
42398                            2.50e-02  90_[+3(1.30e-05)]_395
44474                            2.07e-04  91_[+3(3.28e-06)]_105_\
    [+2(1.04e-05)]_274
12004                            9.46e-05  59_[+3(6.98e-09)]_218_\
    [+3(3.50e-05)]_193
42015                            1.51e-02  174_[+3(3.78e-05)]_311
36499                            2.29e-08  12_[+2(2.51e-08)]_72_[+1(2.38e-07)]_\
    386
46531                            2.04e-10  50_[+1(5.17e-10)]_132_\
    [+2(5.35e-08)]_288
47491                            6.16e-07  243_[+2(3.73e-06)]_10_\
    [+3(2.37e-05)]_138_[+1(2.86e-07)]_64
47498                            2.86e-02  434_[+3(4.07e-05)]_51
50434                            3.23e-06  148_[+2(6.77e-08)]_155_\
    [+3(1.42e-06)]_167
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************