********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/359/359.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10848                    1.0000    500  10996                    1.0000    500
11618                    1.0000    500  1751                     1.0000    500
2049                     1.0000    500  20692                    1.0000    500
21318                    1.0000    500  25153                    1.0000    500
264601                   1.0000    500  3054                     1.0000    500
3055                     1.0000    500  3099                     1.0000    500
4630                     1.0000    500  5210                     1.0000    500
7239                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/359/359.seqs.fa -oc motifs/359 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.244 G 0.218 T 0.266
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.244 G 0.218 T 0.266
********************************************************************************


********************************************************************************
MOTIF 1 MEME   width = 15   sites = 15   llr = 151   E-value = 5.0e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :11:32:1::511::
pos.-specific     C  a:2851a:2a124:9
probability       G  :51:11:31:1111:
matrix            T  :46217:67:35491

         bits    2.2
                 2.0 *     *  *
                 1.8 *     *  *    *
                 1.5 *     *  *    *
Relative         1.3 *  *  *  *   **
Entropy          1.1 *  *  *  *   **
(14.6 bits)      0.9 *  *  * **   **
                 0.7 ** * *****   **
                 0.4 **** *****   **
                 0.2 ***************
                 0.0 ---------------

Multilevel           CGTCCTCTTCATCTC
consensus             TCTAA GC TCT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
20692                       234  2.50e-07 TGCCGCTCGC CGTCATCTTCATGTC CCGCAGCCGC
3054                        444  3.92e-07 AGCACTTGAT CTTCATCTTCAATTC ACTCGACCTC
10996                       261  3.92e-07 TACCTCGCTA CTACCTCGTCATCTC TACCGTGTCT
7239                        136  1.03e-06 TTGTTGTTGT CGTCCTCTCCGTCTC GCTTTTTCTT
1751                        105  1.33e-06 CAGTTTCTGT CATCCTCTTCTTATC CGAGCTGTCA
2049                        324  6.83e-06 AGAACGCTTT CGTCATCGGCAGCTC ATGTCTCGCT
5210                        182  8.24e-06 TAGAGCATCA CTGCCTCTTCCCTTC GTGTGGTCGG
4630                         60  9.04e-06 GGTCTCTTCT CGTTTGCTTCATTTC TTGTTTTGTC
10848                        25  1.40e-05 GAATCATATC CTCCTACTTCCTCTC AGCATCGTCA
11618                       480  1.80e-05 TTCACCACAA CTTCCTCGCCTCCGC CATTCC
25153                        35  2.11e-05 AATGATGCAT CGATGTCTTCATTGC TAGCAGTATC
3055                        210  2.28e-05 ACAAGACAAA CTTCGTCATCATTTT GAGAAGCTAT
264601                      384  2.85e-05 ACGCTACTTC CGCCCCCTCCTCCTC CTCCAATCCA
21318                       461  3.30e-05 GCGCAAGTTA CGCTCACGTCTGTTC CAAAAGCCCA
3099                        453  7.11e-05 CACAACAACT CATCAACATCAAATC ATCGTCACCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20692                             2.5e-07  233_[+1]_252
3054                              3.9e-07  443_[+1]_42
10996                             3.9e-07  260_[+1]_225
7239                                1e-06  135_[+1]_350
1751                              1.3e-06  104_[+1]_381
2049                              6.8e-06  323_[+1]_162
5210                              8.2e-06  181_[+1]_304
4630                                9e-06  59_[+1]_426
10848                             1.4e-05  24_[+1]_461
11618                             1.8e-05  479_[+1]_6
25153                             2.1e-05  34_[+1]_451
3055                              2.3e-05  209_[+1]_276
264601                            2.9e-05  383_[+1]_102
21318                             3.3e-05  460_[+1]_25
3099                              7.1e-05  452_[+1]_33
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=15
20692                    (  234) CGTCATCTTCATGTC  1
3054                     (  444) CTTCATCTTCAATTC  1
10996                    (  261) CTACCTCGTCATCTC  1
7239                     (  136) CGTCCTCTCCGTCTC  1
1751                     (  105) CATCCTCTTCTTATC  1
2049                     (  324) CGTCATCGGCAGCTC  1
5210                     (  182) CTGCCTCTTCCCTTC  1
4630                     (   60) CGTTTGCTTCATTTC  1
10848                    (   25) CTCCTACTTCCTCTC  1
11618                    (  480) CTTCCTCGCCTCCGC  1
25153                    (   35) CGATGTCTTCATTGC  1
3055                     (  210) CTTCGTCATCATTTT  1
264601                   (  384) CGCCCCCTCCTCCTC  1
21318                    (  461) CGCTCACGTCTGTTC  1
3099                     (  453) CATCAACATCAAATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7290 bayes= 8.92184 E= 5.0e-001
 -1055    204  -1055  -1055
  -103  -1055    110     59
  -103    -29   -170    117
 -1055    171  -1055    -41
    -3     94    -71   -100
   -44   -187   -170    132
 -1055    204  -1055  -1055
  -103  -1055     29    117
 -1055    -29   -170    146
 -1055    204  -1055  -1055
    97    -87   -170      0
  -103    -29    -71    100
  -103     71   -170     59
 -1055  -1055    -71    170
 -1055    194  -1055   -200
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 15 E= 5.0e-001
 0.000000  1.000000  0.000000  0.000000
 0.133333  0.000000  0.466667  0.400000
 0.133333  0.200000  0.066667  0.600000
 0.000000  0.800000  0.000000  0.200000
 0.266667  0.466667  0.133333  0.133333
 0.200000  0.066667  0.066667  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.133333  0.000000  0.266667  0.600000
 0.000000  0.200000  0.066667  0.733333
 0.000000  1.000000  0.000000  0.000000
 0.533333  0.133333  0.066667  0.266667
 0.133333  0.200000  0.133333  0.533333
 0.133333  0.400000  0.066667  0.400000
 0.000000  0.000000  0.133333  0.866667
 0.000000  0.933333  0.000000  0.066667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[GT][TC][CT][CA][TA]C[TG][TC]C[AT][TC][CT]TC
--------------------------------------------------------------------------------

Time  1.95 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME   width = 16   sites = 7   llr = 103   E-value = 1.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :1:7:1::9341::76
pos.-specific     C  ::::::::::4:43::
probability       G  a6a379aa:6:96434
matrix            T  :3::3:::111::3::

         bits    2.2 * *   **
                 2.0 * *   **
                 1.8 * *   **
                 1.5 * *  ***   *
Relative         1.3 * * *****  *
Entropy          1.1 * *******  ** **
(21.2 bits)      0.9 * *******  ** **
                 0.7 ********** ** **
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGGAGGGGAGAGGGAA
consensus             T GT    AC CCGG
sequence                          T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ----------------
10996                        62  1.94e-09 AGTTTTCTTT GGGAGGGGAACGGGAG ATTAAATGTT
3099                         90  3.38e-09 TTTGAGCGGC GGGGGGGGAGCGCGAG GGGTTCGATA
5210                        226  8.28e-08 CGAATTTTGA GGGGGGGGAGTGGCGA CAGCAGGCTC
21318                        69  9.90e-08 ATCATGAGAT GGGATGGGATAGGCAA CCCTCTGCAG
20692                        25  2.29e-07 TAGAGATGGA GTGATGGGTGAGCGAA GTTGAGATGG
25153                       442  2.78e-07 AAAGAGAGAT GTGAGGGGAAAAGTAA GACATGATGA
2049                        130  5.20e-07 CATCGTCTCG GAGAGAGGAGCGCTGG CCTCTGACGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10996                             1.9e-09  61_[+2]_423
3099                              3.4e-09  89_[+2]_395
5210                              8.3e-08  225_[+2]_259
21318                             9.9e-08  68_[+2]_416
20692                             2.3e-07  24_[+2]_460
25153                             2.8e-07  441_[+2]_43
2049                              5.2e-07  129_[+2]_355
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=7
10996                    (   62) GGGAGGGGAACGGGAG  1
3099                     (   90) GGGGGGGGAGCGCGAG  1
5210                     (  226) GGGGGGGGAGTGGCGA  1
21318                    (   69) GGGATGGGATAGGCAA  1
20692                    (   25) GTGATGGGTGAGCGAA  1
25153                    (  442) GTGAGGGGAAAAGTAA  1
2049                     (  130) GAGAGAGGAGCGCTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 11.2432 E= 1.1e+001
  -945   -945    220   -945
   -93   -945    139     10
  -945   -945    220   -945
   139   -945     39   -945
  -945   -945    171     10
   -93   -945    198   -945
  -945   -945    220   -945
  -945   -945    220   -945
   165   -945   -945    -90
     7   -945    139    -90
    65     81   -945    -90
   -93   -945    198   -945
  -945     81    139   -945
  -945     23     98     10
   139   -945     39   -945
   107   -945     98   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 1.1e+001
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.000000  0.571429  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.000000  0.000000  0.714286  0.285714
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.285714  0.000000  0.571429  0.142857
 0.428571  0.428571  0.000000  0.142857
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.428571  0.571429  0.000000
 0.000000  0.285714  0.428571  0.285714
 0.714286  0.000000  0.285714  0.000000
 0.571429  0.000000  0.428571  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[GT]G[AG][GT]GGGA[GA][AC]G[GC][GCT][AG][AG]
--------------------------------------------------------------------------------

Time  3.92 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME   width = 21   sites = 6   llr = 106   E-value = 3.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :3:53:8223:3a5:7::a37
pos.-specific     C  :72:3::22:a2:3::3::::
probability       G  a:832a2273:3::a:7a:73
matrix            T  :::22::5:3:2:2:3:::::

         bits    2.2 *    *        *  *
                 2.0 *    *    * * *  **
                 1.8 *    *    * * *  **
                 1.5 * *  *    * * *  **
Relative         1.3 * *  **   * * * ***
Entropy          1.1 ***  **   * * * *****
(25.5 bits)      0.9 ***  ** * * * *******
                 0.7 ***  ** * * * *******
                 0.4 **** ** *** *********
                 0.2 **** ****** *********
                 0.0 ---------------------

Multilevel           GCGAAGATGACAAAGAGGAGA
consensus             A GC    G G C TC  AG
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------------
5210                         51  5.86e-10 CTTTGAGATT GCGAGGACGACGACGAGGAGG AGGGCGGCGA
25153                       238  6.72e-10 GATTGATATT GCGGCGGTGGCTACGAGGAGA GGGACTATTG
2049                         46  6.39e-09 GGTGGAGGTG GCGAAGAGAACGAAGAGGAAG ATATCTATCG
3054                        186  8.88e-09 GATGGATTTG GAGGTGAAGTCAAAGTCGAGA GAACACAACG
3055                        114  9.57e-09 AGTCGTAGCC GAGTAGATGTCAAAGTCGAAA GAACACAATG
21318                       279  1.27e-08 ACAGTTCCAA GCCACGATCGCCATGAGGAGA CGCACCGAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5210                              5.9e-10  50_[+3]_429
25153                             6.7e-10  237_[+3]_242
2049                              6.4e-09  45_[+3]_434
3054                              8.9e-09  185_[+3]_294
3055                              9.6e-09  113_[+3]_366
21318                             1.3e-08  278_[+3]_201
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
5210                     (   51) GCGAGGACGACGACGAGGAGG  1
25153                    (  238) GCGGCGGTGGCTACGAGGAGA  1
2049                     (   46) GCGAAGAGAACGAAGAGGAAG  1
3054                     (  186) GAGGTGAAGTCAAAGTCGAGA  1
3055                     (  114) GAGTAGATGTCAAAGTCGAAA  1
21318                    (  279) GCCACGATCGCCATGAGGAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 9.88626 E= 3.1e+001
  -923   -923    220   -923
    29    145   -923   -923
  -923    -55    194   -923
    88   -923     61    -68
    29     45    -38    -68
  -923   -923    220   -923
   161   -923    -38   -923
   -71    -55    -38     91
   -71    -55    161   -923
    29   -923     61     32
  -923    203   -923   -923
    29    -55     61    -68
   188   -923   -923   -923
    88     45   -923    -68
  -923   -923    220   -923
   129   -923   -923     32
  -923     45    161   -923
  -923   -923    220   -923
   188   -923   -923   -923
    29   -923    161   -923
   129   -923     61   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 3.1e+001
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.500000  0.000000  0.333333  0.166667
 0.333333  0.333333  0.166667  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.166667  0.166667  0.166667  0.500000
 0.166667  0.166667  0.666667  0.000000
 0.333333  0.000000  0.333333  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.166667  0.333333  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.333333  0.000000  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.666667  0.000000  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[CA]G[AG][AC]GATG[AGT]C[AG]A[AC]G[AT][GC]GA[GA][AG]
--------------------------------------------------------------------------------

Time  5.73 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10848                            3.97e-02  24_[+1(1.40e-05)]_461
10996                            1.19e-08  61_[+2(1.94e-09)]_183_\
    [+1(3.92e-07)]_33_[+1(4.63e-05)]_140_[+1(4.93e-05)]_22
11618                            1.82e-02  479_[+1(1.80e-05)]_6
1751                             1.51e-03  104_[+1(1.33e-06)]_381
2049                             9.83e-10  45_[+3(6.39e-09)]_11_[+2(6.09e-05)]_\
    36_[+2(5.20e-07)]_178_[+1(6.83e-06)]_162
20692                            1.41e-07  24_[+2(2.29e-07)]_193_\
    [+1(2.50e-07)]_252
21318                            1.71e-09  68_[+2(9.90e-08)]_194_\
    [+3(1.27e-08)]_161_[+1(3.30e-05)]_25
25153                            1.93e-10  34_[+1(2.11e-05)]_188_\
    [+3(6.72e-10)]_183_[+2(2.78e-07)]_43
264601                           3.77e-02  383_[+1(2.85e-05)]_102
3054                             1.11e-07  185_[+3(8.88e-09)]_74_\
    [+1(9.90e-06)]_148_[+1(3.92e-07)]_42
3055                             5.57e-06  113_[+3(9.57e-09)]_75_\
    [+1(2.28e-05)]_276
3099                             2.03e-06  89_[+2(3.38e-09)]_347_\
    [+1(7.11e-05)]_33
4630                             8.78e-02  59_[+1(9.04e-06)]_426
5210                             2.28e-11  50_[+3(5.86e-10)]_110_\
    [+1(8.24e-06)]_29_[+2(8.28e-08)]_128_[+1(7.11e-05)]_116
7239                             1.63e-02  135_[+1(1.03e-06)]_350
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
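
Note on reading the matrices: the nonzero entries of the log-odds matrices
above appear to be close to 100 * log2(p / background), where p is the
matching entry of the letter-probability matrix. The Python sketch below
checks this on row 2 of Motif 1. It is an illustration only, not MEME's own
code: the helper name log_odds is hypothetical, the background values are the
rounded figures printed in this file, and zero probabilities (printed as
-1055 for Motif 1) are not modeled, so agreement is expected only to within
about one unit.

```python
import math

# Background letter frequencies as printed in this file (rounded to 3 decimals).
BACKGROUND = {"A": 0.272, "C": 0.244, "G": 0.218, "T": 0.266}


def log_odds(prob, letter, background=BACKGROUND):
    """Return 100 * log2(prob / background[letter]).

    Hypothetical helper: returns None for a zero probability, because the
    file does not say how the large negative entries for such cells are derived.
    """
    if prob <= 0.0:
        return None
    return 100.0 * math.log2(prob / background[letter])


# Row 2 of the Motif 1 letter-probability matrix, columns in A C G T order.
row = dict(zip("ACGT", [0.133333, 0.000000, 0.466667, 0.400000]))
scores = {letter: log_odds(p, letter) for letter, p in row.items()}

# Matching row of the Motif 1 log-odds matrix, as reported in this file.
reported = {"A": -103, "C": -1055, "G": 110, "T": 59}

for letter in "ACGT":
    print(letter, scores[letter], reported[letter])
```

Rounded background frequencies and any pseudocounts applied by the program
can shift individual entries by a unit, so the comparison is approximate
rather than exact.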