********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/64/64.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1600                     1.0000    500  20963                    1.0000    500
2322                     1.0000    500  25115                    1.0000    500
25367                    1.0000    500  262925                   1.0000    500
270201                   1.0000    500  270334                   1.0000    500
31635                    1.0000    500  32510                    1.0000    500
37944                    1.0000    500  4022                     1.0000    500
7853                     1.0000    500  9423                     1.0000    500
bd2050                   1.0000    500  ThpsCp032                1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/64/64.seqs.fa -oc motifs/64 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 16 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8000 N= 16 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.280 C 0.220 G 0.231 T 0.269 Background letter frequencies (from dataset with add-one prior applied): A 0.280 C 0.220 G 0.231 T 0.269 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 21 sites = 8 llr = 152 E-value = 1.2e-008 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 6:::3:143a4:4::8913:: pos.-specific C :aaa5a918::8::a3:6:43 probability G ::::3:::::3::3:::3::6 matrix T 4::::::5::4368::1:861 bits 2.2 *** * * 2.0 *** * * 1.7 *** * * * 1.5 *** ** * * Relative 1.3 *** ** ** * * * Entropy 1.1 *** ** ** * **** ** (27.5 bits) 0.9 **** ** ** ********** 0.7 ******* ** ********** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel ACCCCCCTCAACTTCAACTTG consensus T A AA TTAG C GACC sequence G G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- 
                                                     ---------------------
7853                        355  5.49e-12 CAATCAATCA ACCCCCCTCAACATCAACTCG ATGCTTTGCT
270334                      274  2.13e-10 CGGCTCTTGT TCCCGCCAAATCTTCAACTTG AAGTGTCAAG
262925                      275  2.13e-10 CGGCTCTTGT TCCCGCCAAATCTTCAACTTG AAGTGTCAAG
20963                       467  1.12e-09 TACCCCCCCT TCCCCCCTCAACTTCCTCTTC GTACGTCTTG
31635                         2  1.58e-09          A ACCCCCCTCAGTTGCAAGATG GTAGGATCCT
270201                        1  1.58e-09          . ACCCCCCTCAGTTGCAAGATG GTAGGATCCT
37944                       345  2.35e-09 TGAACCGTCG ACCCACCACATCATCAAATCC TTCTTCGTCA
32510                       415  1.69e-08 CCTCTCCTAC ACCCACACCAACATCCACTCT CACAAAACGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7853                              5.5e-12  354_[+1]_125
270334                            2.1e-10  273_[+1]_206
262925                            2.1e-10  274_[+1]_205
20963                             1.1e-09  466_[+1]_13
31635                             1.6e-09  1_[+1]_478
270201                            1.6e-09  [+1]_479
37944                             2.3e-09  344_[+1]_135
32510                             1.7e-08  414_[+1]_65
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=8
7853                     (  355) ACCCCCCTCAACATCAACTCG  1
270334                   (  274) TCCCGCCAAATCTTCAACTTG  1
262925                   (  275) TCCCGCCAAATCTTCAACTTG  1
20963                    (  467) TCCCCCCTCAACTTCCTCTTC  1
31635                    (    2) ACCCCCCTCAGTTGCAAGATG  1
270201                   (    1) ACCCCCCTCAGTTGCAAGATG  1
37944                    (  345) ACCCACCACATCATCAAATCC  1
32510                    (  415) ACCCACACCAACATCCACTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 9.90539 E= 1.2e-008
   116  -965  -965    48
  -965   218  -965  -965
  -965   218  -965  -965
  -965   218  -965  -965
   -16   118    12  -965
  -965   218  -965  -965
  -116   199  -965  -965
    42   -82  -965    89
   -16   177  -965  -965
   184  -965  -965  -965
    42  -965    12    48
  -965   177  -965   -11
    42  -965  -965   121
  -965  -965    12   148
  -965   218  -965  -965
   142    18  -965  -965
   164  -965  -965  -111
  -116   150    12  -965
   -16  -965  -965   148
  -965    77  -965   121
  -965    18   144  -111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 1.2e-008
 0.625000  0.000000  0.000000  0.375000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.500000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.375000  0.125000  0.000000  0.500000
 0.250000  0.750000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.375000  0.000000  0.250000  0.375000
 0.000000  0.750000  0.000000  0.250000
 0.375000  0.000000  0.000000  0.625000
 0.000000  0.000000  0.250000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.125000  0.625000  0.250000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.375000  0.000000  0.625000
 0.000000  0.250000  0.625000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[AT]CCC[CAG]CC[TA][CA]A[ATG][CT][TA][TG]C[AC]A[CG][TA][TC][GC]
--------------------------------------------------------------------------------

Time  1.99 secs.

******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 11 llr = 172 E-value = 7.3e-007 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 9245a:65:681:::7::3:1 pos.-specific C :255:a::62:7116:9131: probability G :4::::::12:2584:17:9: matrix T 1321::453:2:51:3:25:9 bits 2.2 * 2.0 * 1.7 ** * * 1.5 ** * ** Relative 1.3 * ** ** * ** Entropy 1.1 * ** ** ***** ** (22.5 bits) 0.9 * ***** ** ***** ** 0.7 * *************** ** 0.4 * ******************* 0.2 * ******************* 0.0 --------------------- Multilevel AGCAACAACAACGGCACGTGT consensus TAC TTT T GT A sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- ThpsCp032 390 9.26e-10 TTCAATATTT ACAAACTACAACTGGACGCGT GTTATTAAAT bd2050 391 9.26e-10 TTCAATATTT ACAAACTACAACTGGACGCGT GTTATTAAAT 25115 93 2.89e-09 CCTCGGAAGC AACAACAACAACGGCACCAGT CTCGTGGAAG 31635 86 1.18e-08 GTTGAGACAC ATTCACATCATGTGCACGTGT CAACTTCACA 270201 85 1.18e-08 GTTGAGACAC ATTCACATCATGTGCACGTGT CAACTTCACA 270334 253 1.91e-08 TGGAAGGTAG AGCCACTTTCACGGCTCTTGT TCCCGCCAAA 262925 254 1.91e-08 TGGAAGGTAG AGCCACTTTCACGGCTCTTGT TCCCGCCAAA 37944 317 5.82e-08 CCGACTGCTA AAACACATCAACCGCTCGTGA ACCGTCGACC 32510 175 1.62e-07 TATTGCAAGC AGCAACAAGAAAGCGACGCGT GTTGTTTTGC 7853 442 3.43e-07 GCGGACAACC ATCTACAACGACTTCACGACT TCATACGTTA 2322 123 3.63e-07 ATGAATATAT TGAAACAATGACGGGAGGAGT CGTATTGTTC 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ThpsCp032                         9.3e-10  389_[+2]_90
bd2050                            9.3e-10  390_[+2]_89
25115                             2.9e-09  92_[+2]_387
31635                             1.2e-08  85_[+2]_394
270201                            1.2e-08  84_[+2]_395
270334                            1.9e-08  252_[+2]_227
262925                            1.9e-08  253_[+2]_226
37944                             5.8e-08  316_[+2]_163
32510                             1.6e-07  174_[+2]_305
7853                              3.4e-07  441_[+2]_38
2322                              3.6e-07  122_[+2]_357
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=11
ThpsCp032                (  390) ACAAACTACAACTGGACGCGT  1
bd2050                   (  391) ACAAACTACAACTGGACGCGT  1
25115                    (   93) AACAACAACAACGGCACCAGT  1
31635                    (   86) ATTCACATCATGTGCACGTGT  1
270201                   (   85) ATTCACATCATGTGCACGTGT  1
270334                   (  253) AGCCACTTTCACGGCTCTTGT  1
262925                   (  254) AGCCACTTTCACGGCTCTTGT  1
37944                    (  317) AAACACATCAACCGCTCGTGA  1
32510                    (  175) AGCAACAAGAAAGCGACGCGT  1
7853                     (  442) ATCTACAACGACTTCACGACT  1
2322                     (  123) TGAAACAATGACGGGAGGAGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 9.80094 E= 7.3e-007
   170 -1010 -1010  -156
   -62   -28    66     2
    38   104 -1010   -57
    70   104 -1010  -156
   184 -1010 -1010 -1010
 -1010   218 -1010 -1010
   119 -1010 -1010    43
    96 -1010 -1010    75
 -1010   153  -134     2
   119   -28   -34 -1010
   155 -1010 -1010   -57
  -162   172   -34 -1010
 -1010  -128    98    75
 -1010  -128   183  -156
 -1010   153    66 -1010
   138 -1010 -1010     2
 -1010   204  -134 -1010
 -1010  -128   166   -57
    -4    31 -1010    75
 -1010  -128   198 -1010
  -162 -1010 -1010   175
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 11 E= 7.3e-007
 0.909091  0.000000  0.000000  0.090909
 0.181818  0.181818  0.363636  0.272727
 0.363636  0.454545  0.000000  0.181818
 0.454545  0.454545  0.000000  0.090909
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.636364  0.000000  0.000000  0.363636
 0.545455  0.000000  0.000000  0.454545
 0.000000  0.636364  0.090909  0.272727
 0.636364  0.181818  0.181818  0.000000
 0.818182  0.000000  0.000000  0.181818
 0.090909  0.727273  0.181818  0.000000
 0.000000  0.090909  0.454545  0.454545
 0.000000  0.090909  0.818182  0.090909
 0.000000  0.636364  0.363636  0.000000
 0.727273  0.000000  0.000000  0.272727
 0.000000  0.909091  0.090909  0.000000
 0.000000  0.090909  0.727273  0.181818
 0.272727  0.272727  0.000000  0.454545
 0.000000  0.090909  0.909091  0.000000
 0.090909  0.000000  0.000000  0.909091
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
A[GT][CA][AC]AC[AT][AT][CT]AAC[GT]G[CG][AT]CG[TAC]GT
--------------------------------------------------------------------------------

Time  3.80 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 11 llr = 176 E-value = 2.0e-007 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::1::5:412a::9:263111 pos.-specific C :14354:69::6::5:3::59 probability G 575:5:8::8:18135:29:: matrix T 52:7:12::::32:3415:5: bits 2.2 2.0 1.7 * * * * 1.5 * * * * * Relative 1.3 * *** ** * * Entropy 1.1 ** ** ***** ** * * (23.1 bits) 0.9 ***** ******** * * 0.7 *************** * *** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel TGGTGAGCCGACGACGATGCC consensus G CCCC A T GTCA T sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 31635 222 2.20e-10 GGTGAAGGCA TGCCCCGCCGACGAGTATGCC GAGATTAAGA 270334 75 2.20e-10 AGATGCACCT TGCTCAGACGACGACGAAGTC GACGAAGCAG 270201 221 2.20e-10 GGTGAAGGCA TGCCCCGCCGACGAGTATGCC GAGATTAAGA 262925 76 2.20e-10 AGATGCACCT TGCTCAGACGACGACGAAGTC GACGAAGCAG ThpsCp032 87 1.72e-08 CACCAGCAAA TGGTGAGCCAATTATTATGCC TAGTCAAGAT bd2050 88 1.72e-08 CACCAGCAAA TGGTGAGCCAATTATTATGCC TAGTCAAGAT 9423 4 7.72e-08 TGT GGATGATACGACGACGAAGAC GATAACCAGC 4022 146 8.89e-08 TTGTGTCATT GCGTCCGCCGATGGCACTGCC AAGATCGGAA 25367 318 2.31e-07 AGAGATGGAT GGGCGAGCCGACGATGCGATA GGAGAGAGAG 32510 37 2.74e-07 AAAATTTAAT GTGTGCGCAGAGGAGGCGGTC TGTGGCCGCC 25115 294 3.42e-07 GGGTTGAAGC GTGTGTTACGACGACATTGTC ACTTTACGTG 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31635                             2.2e-10  221_[+3]_258
270334                            2.2e-10  74_[+3]_405
270201                            2.2e-10  220_[+3]_259
262925                            2.2e-10  75_[+3]_404
ThpsCp032                         1.7e-08  86_[+3]_393
bd2050                            1.7e-08  87_[+3]_392
9423                              7.7e-08  3_[+3]_476
4022                              8.9e-08  145_[+3]_334
25367                             2.3e-07  317_[+3]_162
32510                             2.7e-07  36_[+3]_443
25115                             3.4e-07  293_[+3]_186
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=11
31635                    (  222) TGCCCCGCCGACGAGTATGCC  1
270334                   (   75) TGCTCAGACGACGACGAAGTC  1
270201                   (  221) TGCCCCGCCGACGAGTATGCC  1
262925                   (   76) TGCTCAGACGACGACGAAGTC  1
ThpsCp032                (   87) TGGTGAGCCAATTATTATGCC  1
bd2050                   (   88) TGGTGAGCCAATTATTATGCC  1
9423                     (    4) GGATGATACGACGACGAAGAC  1
4022                     (  146) GCGTCCGCCGATGGCACTGCC  1
25367                    (  318) GGGCGAGCCGACGATGCGATA  1
32510                    (   37) GTGTGCGCAGAGGAGGCGGTC  1
25115                    (  294) GTGTGTTACGACGACATTGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 9.80094 E= 2.0e-007
 -1010 -1010    98   102
 -1010  -128   166   -57
  -162    72   124 -1010
 -1010    31 -1010   143
 -1010   104   124 -1010
    96    72 -1010  -156
 -1010 -1010   183   -57
    38   153 -1010 -1010
  -162   204 -1010 -1010
   -62 -1010   183 -1010
   184 -1010 -1010 -1010
 -1010   153  -134     2
 -1010 -1010   183   -57
   170 -1010  -134 -1010
 -1010   104    24     2
   -62 -1010    98    43
   119    31 -1010  -156
    -4 -1010   -34   102
  -162 -1010   198 -1010
  -162   104 -1010    75
  -162   204 -1010 -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 11 E= 2.0e-007
 0.000000  0.000000  0.454545  0.545455
 0.000000  0.090909  0.727273  0.181818
 0.090909  0.363636  0.545455  0.000000
 0.000000  0.272727  0.000000  0.727273
 0.000000  0.454545  0.545455  0.000000
 0.545455  0.363636  0.000000  0.090909
 0.000000  0.000000  0.818182  0.181818
 0.363636  0.636364  0.000000  0.000000
 0.090909  0.909091  0.000000  0.000000
 0.181818  0.000000  0.818182  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.636364  0.090909  0.272727
 0.000000  0.000000  0.818182  0.181818
 0.909091  0.000000  0.090909  0.000000
 0.000000  0.454545  0.272727  0.272727
 0.181818  0.000000  0.454545  0.363636
 0.636364  0.272727  0.000000  0.090909
 0.272727  0.000000  0.181818  0.545455
 0.090909  0.000000  0.909091  0.000000
 0.090909  0.454545  0.000000  0.454545
 0.090909  0.909091  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[TG]G[GC][TC][GC][AC]G[CA]CGA[CT]GA[CGT][GT][AC][TA]G[CT]C
--------------------------------------------------------------------------------

Time  5.60 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1600                             2.53e-01  500
20963                            8.20e-06  466_[+1(1.12e-09)]_13
2322                             5.70e-03  122_[+2(3.63e-07)]_357
25115                            4.32e-08  92_[+2(2.89e-09)]_180_\
                                           [+3(3.42e-07)]_186
25367                            5.19e-03  317_[+3(2.31e-07)]_162
262925                           9.88e-17  75_[+3(2.20e-10)]_157_\
                                           [+2(1.91e-08)]_[+1(2.13e-10)]_205
270201                           4.24e-16  [+1(1.58e-09)]_63_[+2(1.18e-08)]_\
                                           115_[+3(2.20e-10)]_259
270334                           9.88e-17  74_[+3(2.20e-10)]_157_\
                                           [+2(1.91e-08)]_[+1(2.13e-10)]_206
31635                            4.24e-16  1_[+1(1.58e-09)]_63_[+2(1.18e-08)]_\
                                           115_[+3(2.20e-10)]_258
32510                            4.02e-11  36_[+3(2.74e-07)]_117_\
                                           [+2(1.62e-07)]_219_[+1(1.69e-08)]_65
37944                            2.57e-09  316_[+2(5.82e-08)]_7_[+1(2.35e-09)]_\
                                           77_[+1(6.71e-05)]_37
4022                             9.58e-04  145_[+3(8.89e-08)]_334
7853                             1.11e-10  354_[+1(5.49e-12)]_66_\
                                           [+2(3.43e-07)]_38
9423                             5.51e-05  3_[+3(7.72e-08)]_476
bd2050                           1.11e-09  87_[+3(1.72e-08)]_282_\
                                           [+2(9.26e-10)]_89
ThpsCp032                        1.11e-09  86_[+3(1.72e-08)]_282_\
                                           [+2(9.26e-10)]_90
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************