********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/62/62.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
bd1258                   1.0000    500  bd1396                   1.0000    500
bd93                     1.0000    500  ThpsCp003                1.0000    500
ThpsCp004                1.0000    500  ThpsCp061                1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/62/62.seqs.fa -oc motifs/62 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        6    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3000    N=               6

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.334 C 0.130 G 0.139 T 0.396
Background letter frequencies (from dataset with add-one prior applied):
A 0.334 C 0.130 G 0.139 T 0.396
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   4  llr = 107  E-value = 2.6e-008
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::5a:::::::aa55:a5::5
pos.-specific     C  5a::::5::::::::a:5a::
probability       G  5:5:a:5a::5::55::::a5
matrix            T  :::::a::aa5::::::::::

         bits    2.9  *  *  *       *  **
                 2.6  *  *  *       *  **
                 2.4  *  *  *       *  **
                 2.1  *  *  *       *  **
Relative         1.8 **  * **       *  **
Entropy          1.5 ** ******* **  ** **
(38.7 bits)      1.2 *********************
                 0.9 *********************
                 0.6 *********************
                 0.3 *********************
                 0.0 ---------------------

Multilevel           CCAAGTCGTTGAAAACAACGA
consensus            G G   G   T  GG  C  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
ThpsCp004                   377  2.23e-15 GCATTCGTGG CCGAGTGGTTGAAGGCACCGG ACTCATAATC
bd1396                      340  2.23e-15 GCATTCGTGG CCGAGTGGTTGAAGGCACCGG ACTCATAATC
ThpsCp003                   199  4.10e-12 ACGTGGAAAA GCAAGTCGTTTAAAACAACGA TTTGAATAAT
bd1258                      159  4.10e-12 ACGTGGAAAA GCAAGTCGTTTAAAACAACGA TTTGAATAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ThpsCp004                         2.2e-15  376_[+1]_103
bd1396                            2.2e-15  339_[+1]_140
ThpsCp003                         4.1e-12  198_[+1]_281
bd1258                            4.1e-12  158_[+1]_321
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=4
ThpsCp004                (  377) CCGAGTGGTTGAAGGCACCGG  1
bd1396                   (  340) CCGAGTGGTTGAAGGCACCGG  1
ThpsCp003                (  199) GCAAGTCGTTTAAAACAACGA  1
bd1258                   (  159) GCAAGTCGTTTAAAACAACGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 2880 bayes= 10.2276 E= 2.6e-008
  -865    194    184   -865
  -865    293   -865   -865
    58   -865    184   -865
   158   -865   -865   -865
  -865   -865    284   -865
  -865   -865   -865    133
  -865    194    184   -865
  -865   -865    284   -865
  -865   -865   -865    133
  -865   -865   -865    133
  -865   -865    184     34
   158   -865   -865   -865
   158   -865   -865   -865
    58   -865    184   -865
    58   -865    184   -865
  -865    293   -865   -865
   158   -865   -865   -865
    58    194   -865   -865
  -865    293   -865   -865
  -865   -865    284   -865
    58   -865    184   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 2.6e-008
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.500000  0.500000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG]C[AG]AGT[CG]GTT[GT]AA[AG][AG]CA[AC]CG[AG]
--------------------------------------------------------------------------------

Time  0.32 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   6  llr = 123  E-value = 2.1e-008
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :73737:::::::a7:::333
pos.-specific     C  3::33::::7::a::::773:
probability       G  :37::3:7a3:3::3::3::7
matrix            T  7:::3:a3::a7:::aa::3:

         bits    2.9         *   *
                 2.6         *   *
                 2.4         *   *
                 2.1         **  *    *
Relative         1.8         **  *    *
Entropy          1.5   *   ***** ** **** *
(29.7 bits)      1.2  *** ****** ******* *
                 0.9 **** ************** *
                 0.6 **** ************** *
                 0.3 *********************
                 0.0 ---------------------

Multilevel           TAGAAATGGCTTCAATTCCAG
consensus            CGACCG T G G  G  GACA
sequence                 T              T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
ThpsCp061                   230  1.06e-11 TTAGGAGTTT TAGCTATGGCTGCAGTTCCTG CAGGATTAAT
bd93                         82  1.06e-11 TTAGGAGTTT TAGCTATGGCTGCAGTTCCTG CAGGATTAAT
ThpsCp004                   333  3.47e-11 AAACTTTAGA TGGACATGGGTTCAATTCCCA TCAGCTCCAA
bd1396                      296  3.47e-11 AAACTTTAGA TGGACATGGGTTCAATTCCCA TCAGCTCCAA
ThpsCp003                   116  3.22e-09 ATACATTCTC CAAAAGTTGCTTCAATTGAAG TAATTCGTCA
bd1258                       76  3.22e-09 ATACATTCTC CAAAAGTTGCTTCAATTGAAG TAATTCGTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ThpsCp061                         1.1e-11  229_[+2]_250
bd93                              1.1e-11  81_[+2]_398
ThpsCp004                         3.5e-11  332_[+2]_147
bd1396                            3.5e-11  295_[+2]_184
ThpsCp003                         3.2e-09  115_[+2]_364
bd1258                            3.2e-09  75_[+2]_404
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
ThpsCp061                (  230) TAGCTATGGCTGCAGTTCCTG  1
bd93                     (   82) TAGCTATGGCTGCAGTTCCTG  1
ThpsCp004                (  333) TGGACATGGGTTCAATTCCCA  1
bd1396                   (  296) TGGACATGGGTTCAATTCCCA  1
ThpsCp003                (  116) CAAAAGTTGCTTCAATTGAAG  1
bd1258                   (   76) CAAAAGTTGCTTCAATTGAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 2880 bayes= 9.35214 E= 2.1e-008
  -923    135   -923     75
    99   -923    126   -923
     0   -923    226   -923
    99    135   -923   -923
     0    135   -923    -25
    99   -923    126   -923
  -923   -923   -923    134
  -923   -923    226    -25
  -923   -923    284   -923
  -923    235    126   -923
  -923   -923   -923    134
  -923   -923    126     75
  -923    294   -923   -923
   158   -923   -923   -923
    99   -923    126   -923
  -923   -923   -923    134
  -923   -923   -923    134
  -923    235    126   -923
     0    235   -923   -923
     0    135   -923    -25
     0   -923    226   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.1e-008
 0.000000  0.333333  0.000000  0.666667
 0.666667  0.000000  0.333333  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.666667  0.333333  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.333333  0.000000  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TC][AG][GA][AC][ACT][AG]T[GT]G[CG]T[TG]CA[AG]TT[CG][CA][ACT][GA]
--------------------------------------------------------------------------------

Time  0.64 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   6  llr = 130  E-value = 2.1e-011
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::3:::7733:373:::
pos.-specific     C  :7a::7::7a:3:::3:7:37
probability       G  :::a::3:3::::3a:::a::
matrix            T  a3::a33a::3:73:33::73

         bits    2.9   **     *    *   *
                 2.6   **     *    *   *
                 2.4   **     *    *   *
                 2.1   **    **    *   *
Relative         1.8   **    **    *   *
Entropy          1.5 ****** ***    *  ** *
(31.2 bits)      1.2 ****** *** *  *  ** *
                 0.9 ****** *** *  *  ****
                 0.6 ****** ****** * *****
                 0.3 *********************
                 0.0 ---------------------

Multilevel           TCCGTCATCCAATAGAACGTC
consensus             T   TG G TCAG CTA CT
sequence                   T      T T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
ThpsCp004                   406  3.07e-11 GGACTCATAA TCCGTTTTCCTCTGGAACGTC ACTGGTTCGA
bd1396                      369  3.07e-11 GGACTCATAA TCCGTTTTCCTCTGGAACGTC ACTGGTTCGA
ThpsCp061                   305  3.45e-11 CAAAATCCAT TCCGTCGTCCAATTGCAAGTT TAGTTTTTAT
bd93                        157  3.45e-11 CAAAATCCAT TCCGTCGTCCAATTGCAAGTT TAGTTTTTAT
ThpsCp003                   140  2.55e-10 ATTGAAGTAA TTCGTCATGCAAAAGTTCGCC GATCAAAACT
bd1258                      100  2.55e-10 ATTGAAGTAA TTCGTCATGCAAAAGTTCGCC GATCAAAACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ThpsCp004                         3.1e-11  405_[+3]_74
bd1396                            3.1e-11  368_[+3]_111
ThpsCp061                         3.5e-11  304_[+3]_175
bd93                              3.5e-11  156_[+3]_323
ThpsCp003                         2.6e-10  139_[+3]_340
bd1258                            2.6e-10  99_[+3]_380
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
ThpsCp004                (  406) TCCGTTTTCCTCTGGAACGTC  1
bd1396                   (  369) TCCGTTTTCCTCTGGAACGTC  1
ThpsCp061                (  305) TCCGTCGTCCAATTGCAAGTT  1
bd93                     (  157) TCCGTCGTCCAATTGCAAGTT  1
ThpsCp003                (  140) TTCGTCATGCAAAAGTTCGCC  1
bd1258                   (  100) TTCGTCATGCAAAAGTTCGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 2880 bayes= 9.35214 E= 2.1e-011
  -923   -923   -923    134
  -923    235   -923    -25
  -923    294   -923   -923
  -923   -923    284   -923
  -923   -923   -923    134
  -923    235   -923    -25
     0   -923    126    -25
  -923   -923   -923    134
  -923    235    126   -923
  -923    294   -923   -923
    99   -923   -923    -25
    99    135   -923   -923
     0   -923   -923     75
     0   -923    126    -25
  -923   -923    284   -923
     0    135   -923    -25
    99   -923   -923    -25
     0    235   -923   -923
  -923   -923    284   -923
  -923    135   -923     75
  -923    235   -923    -25
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.1e-011
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.666667  0.000000  0.333333
 0.333333  0.000000  0.333333  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.666667  0.333333  0.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.333333  0.000000  0.333333  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.666667  0.000000  0.000000  0.333333
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.666667  0.000000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[CT]CGT[CT][AGT]T[CG]C[AT][AC][TA][AGT]G[ACT][AT][CA]G[TC][CT]
--------------------------------------------------------------------------------

Time  1.01 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd1258                           4.72e-19  75_[+2(3.22e-09)]_3_[+3(2.55e-10)]_\
    38_[+1(4.10e-12)]_321
bd1396                           5.46e-25  295_[+2(3.47e-11)]_23_\
    [+1(2.23e-15)]_8_[+3(3.07e-11)]_111
bd93                             5.62e-14  81_[+2(1.06e-11)]_54_[+3(3.45e-11)]_\
    323
ThpsCp003                        4.72e-19  115_[+2(3.22e-09)]_3_[+3(2.55e-10)]_\
    38_[+1(4.10e-12)]_281
ThpsCp004                        5.46e-25  332_[+2(3.47e-11)]_23_\
    [+1(2.23e-15)]_8_[+3(3.07e-11)]_74
ThpsCp061                        8.70e-15  229_[+2(1.06e-11)]_54_\
    [+3(3.45e-11)]_175
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************