********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/447/447.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46603                    1.0000    500  7778                     1.0000    500
41756                    1.0000    500  55097                    1.0000    500
43878                    1.0000    500  33498                    1.0000    500
44694                    1.0000    500  42742                    1.0000    500
49303                    1.0000    500  49323                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/447/447.seqs.fa -oc motifs/447 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.243 G 0.217 T 0.275
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.243 G 0.217 T 0.275
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  15  sites =   5  llr = 80  E-value = 1.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  4a:::a26:::a6a:
pos.-specific     C  ::2:::8:26::2::
probability       G  6:::a:::8:a:2:8
matrix            T  ::8a:::4:4::::2

         bits    2.2     *     *
                 2.0  *  **    ** *
                 1.8  * ***    ** *
                 1.5  * ***  * ** *
Relative         1.3  * **** * ** **
Entropy          1.1 ******* **** **
(23.0 bits)      0.9 ************ **
                 0.7 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GATTGACAGCGAAAG
consensus            A C   ATCT  C T
sequence                         G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
46603                       166  1.52e-08 AACAAATCGA AATTGACTGTGAAAG CCTACGTCAT
49303                       304  2.79e-08 GAAAAGGAAT GATTGACACTGAAAG ACCGCATGGC
7778                        454  2.79e-08 CTCGCGAAGG AATTGAAAGCGAAAG ATCTTATCAA
49323                       259  3.26e-08 GAAATTGCGT GACTGACAGCGAGAG AAATACACCG
42742                       308  7.76e-08 GGCAGTCCTG GATTGACTGCGACAT TCTATCCTAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46603                             1.5e-08  165_[+1]_320
49303                             2.8e-08  303_[+1]_182
7778                              2.8e-08  453_[+1]_32
49323                             3.3e-08  258_[+1]_227
42742                             7.8e-08  307_[+1]_178
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=5
46603                    (  166) AATTGACTGTGAAAG  1
49303                    (  304) GATTGACACTGAAAG  1
7778                     (  454) AATTGAAAGCGAAAG  1
49323                    (  259) GACTGACAGCGAGAG  1
42742                    (  308) GATTGACTGCGACAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4860 bayes= 10.1751 E= 1.3e+001
    60   -897    147   -897
   192   -897   -897   -897
  -897    -28   -897    154
  -897   -897   -897    186
  -897   -897    220   -897
   192   -897   -897   -897
   -40    171   -897   -897
   118   -897   -897     54
  -897    -28    188   -897
  -897    130   -897     54
  -897   -897    220   -897
   192   -897   -897   -897
   118    -28    -11   -897
   192   -897   -897   -897
  -897   -897    188    -46
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 1.3e+001
 0.400000  0.000000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.600000  0.200000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA]A[TC]TGA[CA][AT][GC][CT]GA[ACG]A[GT]
--------------------------------------------------------------------------------

Time 0.98 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =   6  llr = 87  E-value = 1.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::8::2:3:5:22:
pos.-specific     C  8::2272::3:3:85:
probability       G  :a8:::78::8:a:2:
matrix            T  2:28:32:a322::2a

         bits    2.2  *          *
                 2.0  *          *
                 1.8  *      *   *  *
                 1.5  **    ** * *  *
Relative         1.3 *****  ** * ** *
Entropy          1.1 ****** ** * ** *
(20.9 bits)      0.9 ********* * ** *
                 0.7 ********* * ** *
                 0.4 ************** *
                 0.2 ****************
                 0.0 ----------------

Multilevel           CGGTACGGTAGAGCCT
consensus                 T   C C
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
43878                        61  5.99e-09 AGGCCCAACA CGGTACTGTCGAGCCT TTCTTCCCAG
49303                       204  2.41e-08 GTGGCCTGAC CGGTATGATAGAGCCT ATTTTATCTT
33498                       240  1.27e-07 CGACAATGCT TGGTACGGTATCGCCT ACGGCAACAA
49323                        93  1.92e-07 TCGTTGCCCA CGTTACGGTTGTGCTT ACTGTCCCTT
42742                        16  2.66e-07 GTGTAGACAT CGGCATCGTCGAGCGT AGTATCCATC
55097                       146  2.66e-07 CTGCCCGGTC CGGTCCGGTTGCGAAT TCCTATACGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43878                               6e-09  60_[+2]_424
49303                             2.4e-08  203_[+2]_281
33498                             1.3e-07  239_[+2]_245
49323                             1.9e-07  92_[+2]_392
42742                             2.7e-07  15_[+2]_469
55097                             2.7e-07  145_[+2]_339
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=6
43878                    (   61) CGGTACTGTCGAGCCT  1
49303                    (  204) CGGTATGATAGAGCCT  1
33498                    (  240) TGGTACGGTATCGCCT  1
49323                    (   93) CGTTACGGTTGTGCTT  1
42742                    (   16) CGGCATCGTCGAGCGT  1
55097                    (  146) CGGTCCGGTTGCGAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 10.105 E= 1.3e+002
  -923    177   -923    -72
  -923   -923    220   -923
  -923   -923    194    -72
  -923    -55   -923    160
   165    -55   -923   -923
  -923    145   -923     28
  -923    -55    162    -72
   -67   -923    194   -923
  -923   -923   -923    186
    33     45   -923     28
  -923   -923    194    -72
    92     45   -923    -72
  -923   -923    220   -923
   -67    177   -923   -923
   -67    104    -38    -72
  -923   -923   -923    186
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 1.3e+002
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.166667  0.000000  0.833333
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.166667  0.666667  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.333333  0.333333  0.000000  0.333333
 0.000000  0.000000  0.833333  0.166667
 0.500000  0.333333  0.000000  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.166667  0.500000  0.166667  0.166667
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CGGTA[CT]GGT[ACT]G[AC]GCCT
--------------------------------------------------------------------------------

Time 1.83 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   7  llr = 109  E-value = 4.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  791:1a:431:73:931:16:
pos.-specific     C  11::7:91::7:39:7:693:
probability       G  1:691:1:61:13:1:94:16
matrix            T  ::31:::4173111::::::4

         bits    2.2
                 2.0      *
                 1.8      *
                 1.5    * **         *
Relative         1.3  * * **      ** * *
Entropy          1.1  * * **   *  ****** *
(22.4 bits)      0.9 ** ****   ** ****** *
                 0.7 ******* **** ********
                 0.4 ************ ********
                 0.2 ************ ********
                 0.0 ---------------------

Multilevel           AAGGCACAGTCAACACGCCAG
consensus              T    TA T C  A G CT
sequence                         G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
7778                         63  2.33e-09 GTGCGTCTTC GAGGCACAGTCAATACGCCCG GAGAACCGGG
43878                       282  8.90e-09 AGTCCTGGGA AAAGCACTTTCTGCACGGCAG GCCTCTGCAG
49323                       132  1.55e-08 CTTCCGTTTC AATTGACAGTTACCACGCCAG AACCAAACAC
42742                       418  2.02e-08 GAGAGGTAGG CAGGCAGTATCATCACGGCAT CGGTGTTCTT
33498                        93  3.08e-08 GGAGCCTTCA AAGGCACCGTCGACGAGCCCT TCAACATCGC
41756                       219  1.36e-07 GAGAAAAGGA ACGGCACAAACACCACGCAGT GCAGAAAAAT
44694                        84  1.88e-07 CGCGATATCA AATGAACTGGTAGCAAAGCAG CAATACCTAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7778                              2.3e-09  62_[+3]_417
43878                             8.9e-09  281_[+3]_198
49323                             1.5e-08  131_[+3]_348
42742                               2e-08  417_[+3]_62
33498                             3.1e-08  92_[+3]_387
41756                             1.4e-07  218_[+3]_261
44694                             1.9e-07  83_[+3]_396
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=7
7778                     (   63) GAGGCACAGTCAATACGCCCG  1
43878                    (  282) AAAGCACTTTCTGCACGGCAG  1
49323                    (  132) AATTGACAGTTACCACGCCAG  1
42742                    (  418) CAGGCAGTATCATCACGGCAT  1
33498                    (   93) AAGGCACCGTCGACGAGCCCT  1
41756                    (  219) ACGGCACAAACACCACGCAGT  1
44694                    (   84) AATGAACTGGTAGCAAAGCAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 9.263 E= 4.7e+002
   143    -77    -60   -945
   169    -77   -945   -945
   -89   -945    140      5
  -945   -945    198    -94
   -89    155    -60   -945
   192   -945   -945   -945
  -945    181    -60   -945
    69    -77   -945     64
    11   -945    140    -94
   -89   -945    -60    137
  -945    155   -945      5
   143   -945    -60    -94
    11     23     40    -94
  -945    181   -945    -94
   169   -945    -60   -945
    11    155   -945   -945
   -89   -945    198   -945
  -945    123     98   -945
   -89    181   -945   -945
   111     23    -60   -945
  -945   -945    140     64
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 4.7e+002
 0.714286  0.142857  0.142857  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.142857  0.000000  0.571429  0.285714
 0.000000  0.000000  0.857143  0.142857
 0.142857  0.714286  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.857143  0.142857  0.000000
 0.428571  0.142857  0.000000  0.428571
 0.285714  0.000000  0.571429  0.142857
 0.142857  0.000000  0.142857  0.714286
 0.000000  0.714286  0.000000  0.285714
 0.714286  0.000000  0.142857  0.142857
 0.285714  0.285714  0.285714  0.142857
 0.000000  0.857143  0.000000  0.142857
 0.857143  0.000000  0.142857  0.000000
 0.285714  0.714286  0.000000  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.571429  0.428571  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.571429  0.285714  0.142857  0.000000
 0.000000  0.000000  0.571429  0.428571
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AA[GT]GCAC[AT][GA]T[CT]A[ACG]CA[CA]G[CG]C[AC][GT]
--------------------------------------------------------------------------------

Time 2.66 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46603                            3.16e-04  165_[+1(1.52e-08)]_320
7778                             3.51e-09  62_[+3(2.33e-09)]_370_\
    [+1(2.79e-08)]_32
41756                            6.87e-04  218_[+3(1.36e-07)]_261
55097                            4.38e-03  145_[+2(2.66e-07)]_129_\
    [+2(7.36e-05)]_194
43878                            1.17e-09  60_[+2(5.99e-09)]_176_\
    [+3(2.38e-05)]_8_[+3(8.90e-09)]_198
33498                            1.21e-07  92_[+3(3.08e-08)]_126_\
    [+2(1.27e-07)]_245
44694                            2.88e-03  83_[+3(1.88e-07)]_396
42742                            2.38e-11  15_[+2(2.66e-07)]_205_\
    [+2(2.13e-05)]_55_[+1(7.76e-08)]_95_[+3(2.02e-08)]_62
49303                            3.15e-08  203_[+2(2.41e-08)]_84_\
    [+1(2.79e-08)]_38_[+1(1.35e-05)]_129
49323                            6.03e-12  92_[+2(1.92e-07)]_23_[+3(1.55e-08)]_\
    106_[+1(3.26e-08)]_227
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************