********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/213/213.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42953                    1.0000    500  46408                    1.0000    500
47239                    1.0000    500  47612                    1.0000    500
47667                    1.0000    500  15281                    1.0000    500
43507                    1.0000    500  36425                    1.0000    500
47771                    1.0000    500  45208                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/213/213.seqs.fa -oc motifs/213 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.275 C 0.222 G 0.222 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.275 C 0.222 G 0.222 T 0.281
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 6 llr = 92 E-value = 1.1e+002
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :::5a27:2:7::732 pos.-specific C 2:a:::3::a:::::8 probability G :a:5:7:2::3283:: matrix T 8::::2:88::82:7: bits 2.2 ** * 2.0 ** * * 1.7 ** * * 1.5 ** * * * * Relative 1.3 *** * * * ** * Entropy 1.1 ***** ******** * (22.1 bits) 0.9 **************** 0.7 **************** 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel TGCAAGATTCATGATC consensus G C G GA sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 47667 346 1.24e-09 GATGTGAATC TGCGAGATTCGTGATC GATTCTGTAG 47239 50 1.23e-08 TCTTTGAGTT TGCGAGATTCATTATC GATTCAGTAG 36425 270 3.60e-08 TAGAGATTCG TGCAAGATTCATGAAA CGACTTGTTC 45208 256 1.06e-07 TACAAAATAA CGCGAGCTTCGGGATC TTTTTCATTG 42953 110 1.56e-07 ACATCGTCAT TGCAATCGTCATGGTC ACCAGCAGCA 47612 425 1.93e-07 CGATAATTAT TGCAAAATACATGGAC TCATTATTTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 47667 1.2e-09 345_[+1]_139 47239 1.2e-08 49_[+1]_435 36425 3.6e-08 269_[+1]_215 45208 1.1e-07 255_[+1]_229 42953 1.6e-07 109_[+1]_375 47612 1.9e-07 424_[+1]_60 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format 
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=6
47667                    (  346) TGCGAGATTCGTGATC  1
47239                    (   50) TGCGAGATTCATTATC  1
36425                    (  270) TGCAAGATTCATGAAA  1
45208                    (  256) CGCGAGCTTCGGGATC  1
42953                    (  110) TGCAATCGTCATGGTC  1
47612                    (  425) TGCAAAATACATGGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 9.31551 E= 1.1e+002
  -923    -42   -923    157
  -923   -923    217   -923
  -923    217   -923   -923
    86   -923    117   -923
   186   -923   -923   -923
   -72   -923    158    -75
   128     58   -923   -923
  -923   -923    -42    157
   -72   -923   -923    157
  -923    217   -923   -923
   128   -923     58   -923
  -923   -923    -42    157
  -923   -923    190    -75
   128   -923     58   -923
    28   -923   -923    125
   -72    190   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 1.1e+002
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.666667  0.166667
 0.666667  0.333333  0.000000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.166667  0.000000  0.000000  0.833333
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.833333  0.166667
 0.666667  0.000000  0.333333  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.166667  0.833333  0.000000  0.000000
--------------------------------------------------------------------------------
-------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- TGC[AG]AG[AC]TTC[AG]TG[AG][TA]C -------------------------------------------------------------------------------- Time 0.87 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 6 llr = 100 E-value = 2.4e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 2:53:3::2:3222:::::a: pos.-specific C 8:22:2:22:53:57::22:: probability G :a23:3223::28:::::8:a matrix T ::22a2873a23:33aa8::: bits 2.2 * * 2.0 * ** 1.7 * * * ** ** 1.5 ** * * * ** *** Relative 1.3 ** * * * * ****** Entropy 1.1 ** * * * * ******* (23.9 bits) 0.9 ** * * * * ******* 0.7 ** * ** ** ********* 0.4 ** * ** ** ********* 0.2 *** * ** ** ********* 0.0 --------------------- Multilevel CGAATATTGTCCGCCTTTGAG consensus G G T AT TT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 43507 195 1.65e-09 GGCTGTGAAT CGTGTGTTTTCTGCCTTTCAG ATATTGCAAT 47612 371 1.85e-09 TATGCTGCTG CGAATCTTGTACACCTTTGAG GCCGTAGATT 46408 152 1.19e-08 AAACCTACAC CGAATGTTTTTCGTTTTCGAG AAAGAAAAAG 47667 472 1.96e-08 TTGTCACTTT CGCCTTTGGTCTGACTTTGAG GAGTCTCC 36425 11 3.50e-08 GAATTAGTTA CGGGTAGTATAAGCTTTTGAG AATAGTCTGT 47239 27 4.59e-08 CACCAAAACA AGATTATCCTCGGTCTTTGAG 
TTTGCGAGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43507                             1.6e-09  194_[+2]_285
47612                             1.9e-09  370_[+2]_109
46408                             1.2e-08  151_[+2]_328
47667                               2e-08  471_[+2]_8
36425                             3.5e-08  10_[+2]_469
47239                             4.6e-08  26_[+2]_453
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
43507                    (  195) CGTGTGTTTTCTGCCTTTCAG  1
47612                    (  371) CGAATCTTGTACACCTTTGAG  1
46408                    (  152) CGAATGTTTTTCGTTTTCGAG  1
47667                    (  472) CGCCTTTGGTCTGACTTTGAG  1
36425                    (   11) CGGGTAGTATAAGCTTTTGAG  1
47239                    (   27) AGATTATCCTCGGTCTTTGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 9.30053 E= 2.4e+002
   -72    190   -923   -923
  -923   -923    217   -923
    86    -42    -42    -75
    28    -42     58    -75
  -923   -923   -923    183
    28    -42     58    -75
  -923   -923    -42    157
  -923    -42    -42    125
   -72    -42     58     25
  -923   -923   -923    183
    28    117   -923    -75
   -72     58    -42     25
   -72   -923    190   -923
   -72    117   -923     25
  -923    158   -923     25
  -923   -923   -923    183
  -923   -923   -923    183
  -923    -42   -923    157
  -923    -42    190   -923
   186   -923   -923   -923
  -923   -923    217   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.4e+002
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.166667  0.166667  0.166667
 0.333333  0.166667  0.333333  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.333333  0.166667  0.333333  0.166667
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.166667  0.166667  0.666667
 0.166667  0.166667  0.333333  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.333333  0.500000  0.000000  0.166667
 0.166667  0.333333  0.166667  0.333333
 0.166667  0.000000  0.833333  0.000000
 0.166667  0.500000  0.000000  0.333333
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
CGA[AG]T[AG]TT[GT]T[CA][CT]G[CT][CT]TTTGAG
--------------------------------------------------------------------------------

Time 1.87 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 20 sites = 6 llr = 98 E-value = 9.6e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 23:878:33a:::73::22: pos.-specific C ::::3::3:::822737587 probability G :2a2:2a27:2282:523:3 matrix T 85:::::2::8::::22::: bits 2.2 * * 2.0 * * * 1.7 * * * 1.5 * * * ** * Relative 1.3 ** ** **** ** Entropy 1.1 * ***** ***** * ** (23.7 bits) 0.9 * ***** ***** * * ** 0.7 * ***** ************ 0.4 ******* ************ 0.2 ******* ************ 0.0 -------------------- Multilevel TTGAAAGAGATCGACGCCCC consensus A C CA AC G G sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 42953 37 1.21e-09 GAATAGAAGT TAGACAGAGATCGACGGCCG CAGACCAACA 15281 437 3.32e-09 GTTTGTTTAG TGGAAAGCAATCGCAGCCCC GGGCATATGG 47667 237 8.03e-09 TTCTCTTTCT AAGAAAGAGATCGAACCGCG ACTAAACGAC 47771 433 1.88e-08 AAGGGCAACT TTGGAAGGGATCGACCTACC GTTGCAACTT 47612 46 3.68e-08 AGTAGTCGTA TTGACAGTGAGGCACGCCCC TCAATGTGCG 43507 383 7.96e-08 CTTTCCATCG TTGAAGGCAATCGGCTCGAC TCTTCCTTCT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- 
----------------  -------------
42953                             1.2e-09  36_[+3]_444
15281                             3.3e-09  436_[+3]_44
47667                               8e-09  236_[+3]_244
47771                             1.9e-08  432_[+3]_48
47612                             3.7e-08  45_[+3]_435
43507                               8e-08  382_[+3]_98
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=6
42953                    (   37) TAGACAGAGATCGACGGCCG  1
15281                    (  437) TGGAAAGCAATCGCAGCCCC  1
47667                    (  237) AAGAAAGAGATCGAACCGCG  1
47771                    (  433) TTGGAAGGGATCGACCTACC  1
47612                    (   46) TTGACAGTGAGGCACGCCCC  1
43507                    (  383) TTGAAGGCAATCGGCTCGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4810 bayes= 10.7456 E= 9.6e+002
   -72   -923   -923    157
    28   -923    -42     83
  -923   -923    217   -923
   160   -923    -42   -923
   128     58   -923   -923
   160   -923    -42   -923
  -923   -923    217   -923
    28     58    -42    -75
    28   -923    158   -923
   186   -923   -923   -923
  -923   -923    -42    157
  -923    190    -42   -923
  -923    -42    190   -923
   128    -42    -42   -923
    28    158   -923   -923
  -923     58    117    -75
  -923    158    -42    -75
   -72    117     58   -923
   -72    190   -923   -923
  -923    158     58   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 9.6e+002
 0.166667  0.000000  0.000000  0.833333
 0.333333  0.000000  0.166667  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.333333  0.166667  0.166667
 0.333333  0.000000  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.666667  0.166667  0.166667  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.333333  0.500000  0.166667
 0.000000  0.666667  0.166667  0.166667
 0.166667  0.500000  0.333333  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
T[TA]GA[AC]AG[AC][GA]ATCGA[CA][GC]C[CG]C[CG]
--------------------------------------------------------------------------------

Time 2.76 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42953                            1.34e-08  36_[+3(1.21e-09)]_53_[+1(1.56e-07)]_\
    375
46408                            1.30e-04  151_[+2(1.19e-08)]_328
47239                            2.74e-08  26_[+2(4.59e-08)]_2_[+1(1.23e-08)]_\
    435
47612                            9.13e-13  45_[+3(3.68e-08)]_305_\
    [+2(1.85e-09)]_33_[+1(1.93e-07)]_60
47667                            1.70e-14  236_[+3(8.03e-09)]_29_\
    [+3(3.03e-05)]_40_[+1(1.24e-09)]_110_[+2(1.96e-08)]_8
15281                            7.68e-05  436_[+3(3.32e-09)]_44
43507                            1.69e-09  194_[+2(1.65e-09)]_167_\
    [+3(7.96e-08)]_98
36425                            1.92e-08  10_[+2(3.50e-08)]_238_\
    [+1(3.60e-08)]_215
47771                            1.60e-04  432_[+3(1.88e-08)]_48
45208                            1.72e-03  255_[+1(1.06e-07)]_229
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************