********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/109/109.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10460                    1.0000    500  21617                    1.0000    500
22246                    1.0000    500  24511                    1.0000    500
25156                    1.0000    500  261141                   1.0000    500
263656                   1.0000    500  269327                   1.0000    500
28998                    1.0000    500  3233                     1.0000    500
32934                    1.0000    500  3428                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/109/109.seqs.fa -oc motifs/109 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.278 C 0.241 G 0.224 T 0.257
Background letter frequencies (from dataset with add-one prior applied):
A 0.278 C 0.242 G 0.224 T 0.257
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  12  llr = 133  E-value = 1.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  2323:::4::::2:::
pos.-specific     C  6:::81:3::53:322
probability       G  31:63:822233:::7
matrix            T  :782:93188338782

         bits    2.2
                 1.9
                 1.7
                 1.5      *
Relative         1.3   * *** **  * *
Entropy          1.1   * *** **  ***
(15.9 bits)      0.9  ** *** **  ****
                 0.6 ******* *** ****
                 0.4 ******* ********
                 0.2 ****************
                 0.0 ----------------

Multilevel           CTTGCTGATTCCTTTG
consensus            GA AG TC  GG C
sequence                       TT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                   Site
-------------            -----  ---------            ----------------
25156                      265  7.83e-09  ATTCTTAATT CTTACTGATTCCTTTG TGAAGGAGAT
24511                       94  4.00e-07  GCAATTGGTG GATGCTGATGCGTTTG GACGACGGAG
261141                     365  5.10e-07  CTCTCTACTT CTTGGTGCTTGCTTCG GTGAGGGAAG
263656                     234  6.33e-07  ACGTGTAAAC CTTGGTGCTGTTTTTG TACGACCGAC
269327                       8  1.80e-06     TCGTTCC CATTCTTCTTCTTCTG CATGGGGCAC
21617                       25  1.80e-06  AATAAAGCGG GTTGCTGGTTTGTCTC TTTGACGGAC
3428                       336  2.40e-06  ACATTTGTCT CTTGGTTAGTTCTTTG TCGAGTAAAG
22246                      286  3.46e-06  CTAACCAATG CTTGCTGGGTGGTCTC GAGTGTGGCC
3233                       298  1.20e-05  AACGTTAGTT AGATCTGATTCCTTTG TGTTGATGAA
10460                       14  1.47e-05  ATATTTTTCA AATACTGTTTCTTTTT AGTTGTGAAA
32934                      293  1.56e-05  ATTGGTTGTG CTAACTGCTTCTACTT TATTCTTCTT
28998                       87  4.50e-05  GTACCCCGCC GTTGCCTATTGGATCG CTGGAGCCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25156                             7.8e-09  264_[+1]_220
24511                               4e-07  93_[+1]_391
261141                            5.1e-07  364_[+1]_120
263656                            6.3e-07  233_[+1]_251
269327                            1.8e-06  7_[+1]_477
21617                             1.8e-06  24_[+1]_460
3428                              2.4e-06  335_[+1]_149
22246                             3.5e-06  285_[+1]_199
3233                              1.2e-05  297_[+1]_187
10460                             1.5e-05  13_[+1]_471
32934                             1.6e-05  292_[+1]_192
28998                             4.5e-05  86_[+1]_398
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=12
25156                    (  265) CTTACTGATTCCTTTG  1
24511                    (   94) GATGCTGATGCGTTTG  1
261141                   (  365) CTTGGTGCTTGCTTCG  1
263656                   (  234) CTTGGTGCTGTTTTTG  1
269327                   (    8) CATTCTTCTTCTTCTG  1
21617                    (   25) GTTGCTGGTTTGTCTC  1
3428                     (  336) CTTGGTTAGTTCTTTG  1
22246                    (  286) CTTGCTGGGTGGTCTC  1
3233                     (  298) AGATCTGATTCCTTTG  1
10460                    (   14) AATACTGTTTCTTTTT  1
32934                    (  293) CTAACTGCTTCTACTT  1
28998                    (   87) GTTGCCTATTGGATCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 8.91886 E= 1.7e+001
   -74    127     16  -1023
   -15  -1023   -142    137
   -74  -1023  -1023    170
   -15  -1023    138    -63
 -1023    163     16  -1023
 -1023   -153  -1023    183
 -1023  -1023    175     -4
    58     46    -42   -162
 -1023  -1023    -42    170
 -1023  -1023    -42    170
 -1023    105     16     -4
 -1023     46     58     37
   -74  -1023  -1023    170
 -1023     46  -1023    137
 -1023    -53  -1023    170
 -1023    -53    158    -63
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 1.7e+001
 0.166667  0.583333  0.250000  0.000000
 0.250000  0.000000  0.083333  0.666667
 0.166667  0.000000  0.000000  0.833333
 0.250000  0.000000  0.583333  0.166667
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.083333  0.000000  0.916667
 0.000000  0.000000  0.750000  0.250000
 0.416667  0.333333  0.166667  0.083333
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.333333  0.333333  0.333333
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.166667  0.666667  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG][TA]T[GA][CG]T[GT][AC]TT[CGT][CGT]T[TC]TG
--------------------------------------------------------------------------------

Time  1.54 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =   6  llr = 92  E-value = 3.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :22:2::::255:3::
pos.-specific     C  :7::8a:::2:::7:2
probability       G  a:8a::::72552:3:
matrix            T  :2::::aa35::8:78

         bits    2.2 *  *
                 1.9 *  * ***
                 1.7 *  * ***
                 1.5 * ** ***
Relative         1.3 * ******    *  *
Entropy          1.1 * ******* ******
(22.1 bits)      0.9 * ******* ******
                 0.6 ********* ******
                 0.4 ********* ******
                 0.2 ****************
                 0.0 ----------------

Multilevel           GCGGCCTTGTAATCTT
consensus                    T GG AG
sequence
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                   Site
-------------            -----  ---------            ----------------
28998                      365  6.44e-09  CAACGTTTGC GCGGCCTTGTAGTAGT CCAGCAGATG
3233                       241  7.61e-09  TGGAATTTTG GCGGACTTGTGGTCTT GTCAGAAGAT
22246                      306  4.37e-08  GGTCTCGAGT GTGGCCTTGAAATCTT CACGAATCTG
32934                      198  6.04e-08  ACGGAAGTGA GAGGCCTTGGGGTATT TTGGGATTGT
21617                      227  1.38e-07  ACACCGCAAC GCGGCCTTTTGAGCTC ACATCATTCG
269327                      86  2.15e-07  CTTCTATTCT GCAGCCTTTCAATCGT CATCAACACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
28998                             6.4e-09  364_[+2]_120
3233                              7.6e-09  240_[+2]_244
22246                             4.4e-08  305_[+2]_179
32934                               6e-08  197_[+2]_287
21617                             1.4e-07  226_[+2]_258
269327                            2.1e-07  85_[+2]_399
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=6
28998                    (  365) GCGGCCTTGTAGTAGT  1
3233                     (  241) GCGGACTTGTGGTCTT  1
22246                    (  306) GTGGCCTTGAAATCTT  1
32934                    (  198) GAGGCCTTGGGGTATT  1
21617                    (  227) GCGGCCTTTTGAGCTC  1
269327                   (   86) GCAGCCTTTCAATCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 10.3682 E= 3.2e+001
  -923   -923    216   -923
   -74    146   -923    -62
   -74   -923    190   -923
  -923   -923    216   -923
   -74    179   -923   -923
  -923    205   -923   -923
  -923   -923   -923    196
  -923   -923   -923    196
  -923   -923    157     37
   -74    -53    -42     96
    85   -923    116   -923
    85   -923    116   -923
  -923   -923    -42    169
    26    146   -923   -923
  -923   -923     58    137
  -923    -53   -923    169
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 3.2e+001
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.666667  0.000000  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.666667  0.333333
 0.166667  0.166667  0.166667  0.500000
 0.500000  0.000000  0.500000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.166667  0.000000  0.833333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GCGGCCTT[GT]T[AG][AG]T[CA][TG]T
--------------------------------------------------------------------------------

Time  3.04 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =  10  llr = 121  E-value = 5.5e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::5511245a:68:
pos.-specific     C  :::142::::2:12:1
probability       G  92a91:66461:8219
matrix            T  18:::3334:2:1:1:

         bits    2.2   *
                 1.9   *        *
                 1.7 * **       *   *
                 1.5 * **       *   *
Relative         1.3 ****       **  *
Entropy          1.1 ****     * **  *
(17.4 bits)      0.9 ****  ** * ** **
                 0.6 ***** ** * *****
                 0.4 ********** *****
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTGGAAGGGGAAGAAG
consensus             G  CTTTTAC  C
sequence                  C  A T  G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start   P-value                   Site
-------------            -----  ---------            ----------------
32934                      342  1.93e-08  ACGAGTGGAT GTGGAATGGGCAGAAG ATGCAGCGTG
21617                       86  1.08e-07  TGAGTTCCAC GTGGCATTGGAAGGAG TTTCCTCTCT
269327                     306  4.14e-07  GTTGTAGAGA GTGGCAGTAGTAGCAG AAGCAGTGGC
25156                       20  8.94e-07  GTTTGAATGA GTGGATGGTAAATCAG CTGGAGTATC
22246                      123  9.82e-07  ACGTTCGACG GTGGACGGTACACAAG ATATTTCGAT
3233                       140  1.29e-06  TGCACAACTT GTGCGTGGGAAAGAAG CTTCGTAGAA
10460                      136  1.79e-06  AATTAAGAAT TTGGACTGTGAAGGAG GAAAACGAGA
28998                      179  2.85e-06  CAGTATGTGT GGGGCTGTGGTAGATG GAAGATGGTA
263656                     412  5.00e-06  GGGGTAGACA GGGGCAAATAAAGAAG TTGCCATTTT
24511                      297  6.06e-06  GAAACCGACT GTGGAAGGAGGAGAGC CCCCAAGTAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32934                             1.9e-08  341_[+3]_143
21617                             1.1e-07  85_[+3]_399
269327                            4.1e-07  305_[+3]_179
25156                             8.9e-07  19_[+3]_465
22246                             9.8e-07  122_[+3]_362
3233                              1.3e-06  139_[+3]_345
10460                             1.8e-06  135_[+3]_349
28998                             2.9e-06  178_[+3]_306
263656                              5e-06  411_[+3]_73
24511                             6.1e-06  296_[+3]_188
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=10
32934                    (  342) GTGGAATGGGCAGAAG  1
21617                    (   86) GTGGCATTGGAAGGAG  1
269327                   (  306) GTGGCAGTAGTAGCAG  1
25156                    (   20) GTGGATGGTAAATCAG  1
22246                    (  123) GTGGACGGTACACAAG  1
3233                     (  140) GTGCGTGGGAAAGAAG  1
10460                    (  136) TTGGACTGTGAAGGAG  1
28998                    (  179) GGGGCTGTGGTAGATG  1
263656                   (  412) GGGGCAAATAAAGAAG  1
24511                    (  297) GTGGAAGGAGGAGAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 9.43433 E= 5.5e+000
  -997   -997    201   -136
  -997   -997    -16    164
  -997   -997    216   -997
  -997   -127    201   -997
    85     73   -116   -997
    85    -27   -997     22
  -147   -997    142     22
  -147   -997    142     22
   -47   -997     84     64
    53   -997    142   -997
    85    -27   -116    -36
   185   -997   -997   -997
  -997   -127    184   -136
   111    -27    -16   -997
   152   -997   -116   -136
  -997   -127    201   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 5.5e+000
 0.000000  0.000000  0.900000  0.100000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.100000  0.900000  0.000000
 0.500000  0.400000  0.100000  0.000000
 0.500000  0.200000  0.000000  0.300000
 0.100000  0.000000  0.600000  0.300000
 0.100000  0.000000  0.600000  0.300000
 0.200000  0.000000  0.400000  0.400000
 0.400000  0.000000  0.600000  0.000000
 0.500000  0.200000  0.100000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.100000  0.800000  0.100000
 0.600000  0.200000  0.200000  0.000000
 0.800000  0.000000  0.100000  0.100000
 0.000000  0.100000  0.900000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[TG]GG[AC][ATC][GT][GT][GTA][GA][ACT]AG[ACG]AG
--------------------------------------------------------------------------------

Time  4.52 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10460                            4.60e-04  13_[+1(1.47e-05)]_106_\
    [+3(1.79e-06)]_349
21617                            1.16e-09  24_[+1(1.80e-06)]_45_[+3(1.08e-07)]_\
    125_[+2(1.38e-07)]_258
22246                            5.65e-09  122_[+3(9.82e-07)]_147_\
    [+1(3.46e-06)]_4_[+2(4.37e-08)]_179
24511                            5.00e-05  93_[+1(4.00e-07)]_187_\
    [+3(6.06e-06)]_188
25156                            3.52e-07  19_[+3(8.94e-07)]_229_\
    [+1(7.83e-09)]_220
261141                           4.27e-03  364_[+1(5.10e-07)]_120
263656                           7.30e-05  233_[+1(6.33e-07)]_162_\
    [+3(5.00e-06)]_73
269327                           6.07e-09  7_[+1(1.80e-06)]_62_[+2(2.15e-07)]_\
    204_[+3(4.14e-07)]_179
28998                            2.71e-08  86_[+1(4.50e-05)]_76_[+3(2.85e-06)]_\
    170_[+2(6.44e-09)]_120
3233                             4.55e-09  139_[+3(1.29e-06)]_85_\
    [+2(7.61e-09)]_41_[+1(1.20e-05)]_187
32934                            8.08e-10  197_[+2(6.04e-08)]_79_\
    [+1(1.56e-05)]_33_[+3(1.93e-08)]_143
3428                             1.96e-02  193_[+1(5.17e-05)]_126_\
    [+1(2.40e-06)]_149
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************