********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/295/295.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43259                    1.0000    500  46855                    1.0000    500
37372                    1.0000    500  14084                    1.0000    500
48567                    1.0000    500  32823                    1.0000    500
43871                    1.0000    500  34324                    1.0000    500
11954                    1.0000    500  8469                     1.0000    500
45868                    1.0000    500  33641                    1.0000    500
44552                    1.0000    500  34125                    1.0000    500
33638                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/295/295.seqs.fa -oc motifs/295 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.270 C 0.249 G 0.225 T 0.256
Background letter frequencies (from dataset with add-one prior applied):
A 0.270 C 0.249 G 0.225 T 0.256
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =   20   sites =   6   llr = 107   E-value = 4.1e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  5::2::5::::::53::5:2
pos.-specific     C  5:::8:2::3:8:532:5::
probability       G  :2852a3:5:a2a::22:::
matrix            T  :823:::a57::::378:a8

         bits    2.2      *    * *
                 1.9      * *  * *     *
                 1.7      * *  * *     *
                 1.5   *  * *  * *     *
Relative         1.3  ** ** *  ***   * **
Entropy          1.1  ** ** ******   * **
(25.8 bits)      0.9 *** ** *******  ****
                 0.6 ****** ******* *****
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           ATGGCGATGTGCGAATTATT
consensus            C  T  G TC   CC  C
sequence                           T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
43871                       146  8.73e-11 CAACGGAGTG CTGGCGATTCGCGCCTTCTT CAAGCGGGGT
46855                       272  2.64e-10 TCCTTGAACC CTGTCGATGCGCGACTTATT GTCTCATGTC
34125                        50  3.48e-09 ATCGGTCGGT CGGACGGTGTGCGCTTTCTT CGCAGTTACC
33641                         2  6.84e-09          T ATGGGGGTGTGCGATTTATA TGCATGAGTA
48567                       258  1.05e-08 CAACTACCGT ATGGCGCTTTGCGAACGCTT GCTTGTAGTA
44552                       116  2.43e-08 ACGCACACTC ATTTCGATTTGGGCAGTATT ATCGTATACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43871                             8.7e-11  145_[+1]_335
46855                             2.6e-10  271_[+1]_209
34125                             3.5e-09  49_[+1]_431
33641                             6.8e-09  1_[+1]_479
48567                             1.1e-08  257_[+1]_223
44552                             2.4e-08  115_[+1]_365
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=6
43871                    (  146) CTGGCGATTCGCGCCTTCTT  1
46855                    (  272) CTGTCGATGCGCGACTTATT  1
34125                    (   50) CGGACGGTGTGCGCTTTCTT  1
33641                    (    2) ATGGGGGTGTGCGATTTATA  1
48567                    (  258) ATGGCGCTTTGCGAACGCTT  1
44552                    (  116) ATTTCGATTTGGGCAGTATT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7215 bayes= 10.6784 E= 4.1e+000
    89    101   -923   -923
  -923   -923    -43    170
  -923   -923    189    -62
   -69   -923    115     38
  -923    174    -43   -923
  -923   -923    215   -923
    89    -58     56   -923
  -923   -923   -923    196
  -923   -923    115     96
  -923     42   -923    138
  -923   -923    215   -923
  -923    174    -43   -923
  -923   -923    215   -923
    89    101   -923   -923
    30     42   -923     38
  -923    -58    -43    138
  -923   -923    -43    170
    89    101   -923   -923
  -923   -923   -923    196
   -69   -923   -923    170
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 4.1e+000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.833333  0.166667
 0.166667  0.000000  0.500000  0.333333
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.166667  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.000000  0.166667  0.166667  0.666667
 0.000000  0.000000  0.166667  0.833333
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.000000  0.000000  0.833333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AC]TG[GT]CG[AG]T[GT][TC]GCG[AC][ACT]TT[AC]TT
--------------------------------------------------------------------------------


Time  2.47 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =   12   sites =  11   llr = 116   E-value = 5.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  12::a:3:::87
pos.-specific     C  65:7:83::9::
probability       G  2::3:2:a1121
matrix            T  14a:::5:9::2

         bits    2.2        *
                 1.9   * *  *
                 1.7   * *  *
                 1.5   * *  ***
Relative         1.3   **** ****
Entropy          1.1   **** ****
(15.2 bits)      0.9   **** *****
                 0.6   **** *****
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CCTCACTGTCAA
consensus             T G  A
sequence                   C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44552                       298  2.13e-07 GACTGTCATT CCTCACCGTCAA AACAGTCGCA
33638                       356  5.78e-07 CCGGCCACTA CATCACTGTCAA CTGTCCCCAC
33641                       444  1.17e-06 TCGGCGGCAA CATCACCGTCAA GAACGACCAA
32823                       124  2.36e-06 ATGGGTTGGT CCTCACTGTCAG TTCTTCGATC
46855                       373  2.36e-06 GGGATGCCTC GTTCACAGTCAA TTCTAATCCG
34324                        15  7.85e-06 ATGATGATTT CTTCACAGGCAA CAACCCATAG
37372                         8  9.05e-06    CTTTCGT CCTGACTGTGAA AGACTTTTGG
48567                       148  9.51e-06 CCCTGTTTCT CCTCAGCGTCGA TCACTTTCGG
43259                       437  2.33e-05 TCCCCGATTA ACTGACTGTCAT AATAACGAAC
8469                        156  2.79e-05 TAGTAAATAT TTTGAGTGTCAA TGTTTTCCCT
45868                       350  3.12e-05 GGTTCCCATC GTTCACAGTCGT CATCCCGTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44552                             2.1e-07  297_[+2]_191
33638                             5.8e-07  355_[+2]_133
33641                             1.2e-06  443_[+2]_45
32823                             2.4e-06  123_[+2]_365
46855                             2.4e-06  372_[+2]_116
34324                             7.8e-06  14_[+2]_474
37372                               9e-06  7_[+2]_481
48567                             9.5e-06  147_[+2]_341
43259                             2.3e-05  436_[+2]_52
8469                              2.8e-05  155_[+2]_333
45868                             3.1e-05  349_[+2]_139
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=11
44552                    (  298) CCTCACCGTCAA  1
33638                    (  356) CATCACTGTCAA  1
33641                    (  444) CATCACCGTCAA  1
32823                    (  124) CCTCACTGTCAG  1
46855                    (  373) GTTCACAGTCAA  1
34324                    (   15) CTTCACAGGCAA  1
37372                    (    8) CCTGACTGTGAA  1
48567                    (  148) CCTCAGCGTCGA  1
43259                    (  437) ACTGACTGTCAT  1
8469                     (  156) TTTGAGTGTCAA  1
45868                    (  350) GTTCACAGTCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 9.73455 E= 5.5e+001
  -157    135    -31   -149
   -57     87  -1010     50
 -1010  -1010  -1010    196
 -1010    155     28  -1010
   189  -1010  -1010  -1010
 -1010    172    -31  -1010
     2     13  -1010     83
 -1010  -1010    215  -1010
 -1010  -1010   -131    183
 -1010    187   -131  -1010
   160  -1010    -31  -1010
   143  -1010   -131    -49
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 5.5e+001
 0.090909  0.636364  0.181818  0.090909
 0.181818  0.454545  0.000000  0.363636
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.727273  0.272727  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.818182  0.181818  0.000000
 0.272727  0.272727  0.000000  0.454545
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.090909  0.909091
 0.000000  0.909091  0.090909  0.000000
 0.818182  0.000000  0.181818  0.000000
 0.727273  0.000000  0.090909  0.181818
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[CT]T[CG]AC[TAC]GTCAA
--------------------------------------------------------------------------------


Time  4.76 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =   16   sites =  15   llr = 155   E-value = 3.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  3663:31:3:::131:
pos.-specific     C  53465::a6:1:6::2
probability       G  21::579::424:383
matrix            T  :1:1::::16763315

         bits    2.2
                 1.9        *
                 1.7        *
                 1.5       **
Relative         1.3       **      *
Entropy          1.1     **** * *  *
(14.9 bits)      0.9   * **** ***  *
                 0.6 * *********** **
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CAACCGGCCTTTCAGT
consensus            ACCAGA  AGGGTG G
sequence             G            T C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
8469                        411  8.81e-08 ATCTTGTCGG CCACGGGCATTTCGGT GTTTGCTGTG
45868                       458  5.71e-07 GCCCACTGTT CACCGGGCAGTTTTGT TCCGGCTAAC
44552                       239  9.90e-07 ATTAATTCGC CGCCGGGCCTTGCAGT GAGAGGAGGG
48567                       378  1.12e-06 CATGGATAAA GAAACGGCCTTGTTGT AATGTCAACG
37372                       332  1.12e-06 AGTGACAGAA CAAACAGCCTGTCTGT CACTCCTTGT
33638                       397  1.46e-06 CATCCTGCAA GCCCCAGCCTTGCTGT TCGTGTTCCT
32823                        55  6.53e-06 CCGAAAGTCA ATCCGGGCCTGTCGGT AGAGATTTTA
46855                       149  8.59e-06 TTCGAAACGA CAATCAGCCTTTAAGG AGACTCATGG
43259                        73  8.59e-06 CATGCAGTAT CCACGAGCCGTGCGAG TTTGAAACGC
33641                       131  1.21e-05 GAGTGAGCGC GAAACGGCTTTTAAGT AAAATTTACA
14084                       135  1.66e-05 ACGCCAAATC CAAACGACCGGTCAGC ACGCCCCCAT
34324                       297  1.80e-05 AACTCCGATG ACACCGGCATTTCAAC ACCACCGCCA
43871                       330  1.94e-05 CGGTGTGATG CAACGGGCAGTTTGTC GCCAAAACCC
11954                       117  2.95e-05 GCAGAACGCG AACCCAACCGTGTTGG CAGTTGAAAA
34125                       296  7.77e-05 ACGCCTGCAA AACTGGGCTGCGCGGG TTCTTCGGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8469                              8.8e-08  410_[+3]_74
45868                             5.7e-07  457_[+3]_27
44552                             9.9e-07  238_[+3]_246
48567                             1.1e-06  377_[+3]_107
37372                             1.1e-06  331_[+3]_153
33638                             1.5e-06  396_[+3]_88
32823                             6.5e-06  54_[+3]_430
46855                             8.6e-06  148_[+3]_336
43259                             8.6e-06  72_[+3]_412
33641                             1.2e-05  130_[+3]_354
14084                             1.7e-05  134_[+3]_350
34324                             1.8e-05  296_[+3]_188
43871                             1.9e-05  329_[+3]_155
11954                               3e-05  116_[+3]_368
34125                             7.8e-05  295_[+3]_189
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=15
8469                     (  411) CCACGGGCATTTCGGT  1
45868                    (  458) CACCGGGCAGTTTTGT  1
44552                    (  239) CGCCGGGCCTTGCAGT  1
48567                    (  378) GAAACGGCCTTGTTGT  1
37372                    (  332) CAAACAGCCTGTCTGT  1
33638                    (  397) GCCCCAGCCTTGCTGT  1
32823                    (   55) ATCCGGGCCTGTCGGT  1
46855                    (  149) CAATCAGCCTTTAAGG  1
43259                    (   73) CCACGAGCCGTGCGAG  1
33641                    (  131) GAAACGGCTTTTAAGT  1
14084                    (  135) CAAACGACCGGTCAGC  1
34324                    (  297) ACACCGGCATTTCAAC  1
43871                    (  330) CAACGGGCAGTTTGTC  1
11954                    (  117) AACCCAACCGTGTTGG  1
34125                    (  296) AACTGGGCTGCGCGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 9.59421 E= 3.3e+002
    -2    110    -17  -1055
   115     10   -175   -194
   115     69  -1055  -1055
    -2    127  -1055    -94
 -1055    110    105  -1055
    30  -1055    157  -1055
  -102  -1055    194  -1055
 -1055    201  -1055  -1055
    -2    127  -1055    -94
 -1055  -1055     83    123
 -1055   -190    -17    152
 -1055  -1055     83    123
  -102    127  -1055      6
    30  -1055     57     38
  -102  -1055    183   -194
 -1055    -31     24    106
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 15 E= 3.3e+002
 0.266667  0.533333  0.200000  0.000000
 0.600000  0.266667  0.066667  0.066667
 0.600000  0.400000  0.000000  0.000000
 0.266667  0.600000  0.000000  0.133333
 0.000000  0.533333  0.466667  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.133333  0.000000  0.866667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.266667  0.600000  0.000000  0.133333
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.066667  0.200000  0.733333
 0.000000  0.000000  0.400000  0.600000
 0.133333  0.600000  0.000000  0.266667
 0.333333  0.000000  0.333333  0.333333
 0.133333  0.000000  0.800000  0.066667
 0.000000  0.200000  0.266667  0.533333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CAG][AC][AC][CA][CG][GA]GC[CA][TG][TG][TG][CT][AGT]G[TGC]
--------------------------------------------------------------------------------


Time  6.80 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43259                            1.92e-03  72_[+3(8.59e-06)]_348_\
    [+2(2.33e-05)]_52
46855                            2.59e-10  148_[+3(8.59e-06)]_107_\
    [+1(2.64e-10)]_47_[+2(2.23e-05)]_22_[+2(2.36e-06)]_116
37372                            2.29e-04  7_[+2(9.05e-06)]_312_[+3(1.12e-06)]_\
    153
14084                            7.38e-02  134_[+3(1.66e-05)]_350
48567                            4.33e-09  147_[+2(9.51e-06)]_98_\
    [+1(1.05e-08)]_100_[+3(1.12e-06)]_107
32823                            5.32e-05  54_[+3(6.53e-06)]_53_[+2(2.36e-06)]_\
    51_[+2(2.79e-05)]_302
43871                            9.80e-08  145_[+1(8.73e-11)]_164_\
    [+3(1.94e-05)]_155
34324                            9.94e-04  14_[+2(7.85e-06)]_270_\
    [+3(1.80e-05)]_188
11954                            7.47e-02  116_[+3(2.95e-05)]_368
8469                             1.86e-05  155_[+2(2.79e-05)]_243_\
    [+3(8.81e-08)]_74
45868                            1.94e-04  349_[+2(3.12e-05)]_96_\
    [+3(5.71e-07)]_27
33641                            3.79e-09  1_[+1(6.84e-09)]_109_[+3(1.21e-05)]_\
    297_[+2(1.17e-06)]_45
44552                            2.50e-10  115_[+1(2.43e-08)]_103_\
    [+3(9.90e-07)]_43_[+2(2.13e-07)]_191
34125                            5.22e-06  49_[+1(3.48e-09)]_227_\
    [+1(7.52e-05)]_184
33638                            1.06e-05  355_[+2(5.78e-07)]_29_\
    [+3(1.46e-06)]_88
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************