********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/201/201.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1060                     1.0000    500  11286                    1.0000    500
21785                    1.0000    500  2399                     1.0000    500
25295                    1.0000    500  268172                   1.0000    500
5180                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/201/201.seqs.fa -oc motifs/201 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.253 C 0.219 G 0.255 T 0.273
Background letter frequencies (from dataset with add-one prior applied):
A 0.253 C 0.219 G 0.255 T 0.273
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 6 llr = 78 E-value = 1.1e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  227::a2:::32
pos.-specific     C  88:aa:8a:828
probability       G  ::3:::::3:::
matrix            T  ::::::::725:

         bits    2.2    **  *
                 2.0    *** *
                 1.8    *** *
                 1.5 ** ***** * *
Relative         1.3 ** ***** * *
Entropy          1.1 ******** * *
(18.6 bits)      0.9 ********** *
                 0.7 ********** *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CCACCACCTCTC
consensus              G     G A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
1060                         27  7.22e-08 GTCAACTCGT CCACCACCGCTC GCTGTACCAA
5180                        477  9.77e-08 AATAATCTCA CCGCCACCTCTC CACTCCTCGC
25295                       426  5.05e-07 ACCGCAATGG CAACCACCTCAC ACCATCCATC
2399                        428  1.06e-06 TTTGTCTTGG CCACCACCTTCC CTCCTTCGTC
268172                      423  1.70e-06 CTCGCCGCCG CCGCCACCGCAA CTGCAGTTCG
11286                       479  1.93e-06 TGTCGTACAT ACACCAACTCTC ACACCTACAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1060                              7.2e-08  26_[+1]_462
5180                              9.8e-08  476_[+1]_12
25295                               5e-07  425_[+1]_63
2399                              1.1e-06  427_[+1]_61
268172                            1.7e-06  422_[+1]_66
11286                             1.9e-06  478_[+1]_10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=6
1060                     (   27) CCACCACCGCTC  1
5180                     (  477) CCGCCACCTCTC  1
25295                    (  426) CAACCACCTCAC  1
2399                     (  428) CCACCACCTTCC  1
268172                   (  423) CCGCCACCGCAA  1
11286                    (  479) ACACCAACTCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 10.2544 E= 1.1e-001
   -60    193   -923   -923
   -60    193   -923   -923
   140   -923     39   -923
  -923    219   -923   -923
  -923    219   -923   -923
   198   -923   -923   -923
   -60    193   -923   -923
  -923    219   -923   -923
  -923   -923     39    128
  -923    193   -923    -71
    40    -39   -923     87
   -60    193   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 1.1e-001
 0.166667  0.833333  0.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.833333  0.000000  0.166667
 0.333333  0.166667  0.000000  0.500000
 0.166667  0.833333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
CC[AG]CCACC[TG]C[TA]C
--------------------------------------------------------------------------------

Time 0.43 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 7 llr = 80 E-value = 7.6e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  19::::::3114
pos.-specific     C  7:a496:a:99:
probability       G  ::::11::4::6
matrix            T  11:6:3a:3:::

         bits    2.2   *    *
                 2.0   *   **
                 1.8   *   **
                 1.5   * * ** **
Relative         1.3  ** * ** **
Entropy          1.1  **** ** ***
(16.4 bits)      0.9 ***** ** ***
                 0.7 ******** ***
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CACTCCTCGCCG
consensus               C T  A  A
sequence                     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
5180                        489  5.34e-08 GCCACCTCTC CACTCCTCGCCG
268172                      467  3.26e-07 GGGAGACGTG CACTCCTCTCCA TCCACCCTCA
1060                        474  3.18e-06 AAGCAAGTTA CACCCCTCAACA AACAACCAAA
2399                        470  3.84e-06 AGAAGTGGCG CACCGTTCGCCG TCACTAACGC
25295                       408  4.61e-06 TTCTGCAGTG TACTCTTCACCG CAATGGCAAC
11286                        78  6.91e-06 TAACGAGGTC ATCTCCTCGCCG AGCGAGACTC
21785                       329  1.20e-05 GAGGTCGTCT CACCCGTCTCAA ATCGTCGGCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5180                              5.3e-08  488_[+2]
268172                            3.3e-07  466_[+2]_22
1060                              3.2e-06  473_[+2]_15
2399                              3.8e-06  469_[+2]_19
25295                             4.6e-06  407_[+2]_81
11286                             6.9e-06  77_[+2]_411
21785                             1.2e-05  328_[+2]_160
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=7
5180                     (  489) CACTCCTCGCCG  1
268172                   (  467) CACTCCTCTCCA  1
1060                     (  474) CACCCCTCAACA  1
2399                     (  470) CACCGTTCGCCG  1
25295                    (  408) TACTCTTCACCG  1
11286                    (   78) ATCTCCTCGCCG  1
21785                    (  329) CACCCGTCTCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 10.1548 E= 7.6e+000
   -82    171   -945    -93
   176   -945   -945    -93
  -945    219   -945   -945
  -945     97   -945    106
  -945    197    -84   -945
  -945    138    -84      6
  -945   -945   -945    187
  -945    219   -945   -945
    18   -945     75      6
   -82    197   -945   -945
   -82    197   -945   -945
    76   -945    116   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 7.6e+000
 0.142857  0.714286  0.000000  0.142857
 0.857143  0.000000  0.000000  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.428571  0.000000  0.571429
 0.000000  0.857143  0.142857  0.000000
 0.000000  0.571429  0.142857  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.285714  0.000000  0.428571  0.285714
 0.142857  0.857143  0.000000  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.428571  0.000000  0.571429  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
CAC[TC]C[CT]TC[GAT]CC[GA]
--------------------------------------------------------------------------------

Time 0.89 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 5 llr = 76 E-value = 1.2e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  222:::2::::82a:
pos.-specific     C  86826:2a6:a:8:a
probability       G  ::::4::::2:::::
matrix            T  :2:8:a6:48:2:::

         bits    2.2        *  *   *
                 2.0      * *  *  **
                 1.8      * *  *  **
                 1.5 * *  * *  * ***
Relative         1.3 * ** * *  *****
Entropy          1.1 * **** ********
(21.9 bits)      0.9 * **** ********
                 0.7 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CCCTCTTCCTCACAC
consensus            AAACG A TG TA
sequence              T    C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
21785                       470  1.78e-08 AGATCAATTA CCCTCTTCTTCAAAC AATATTCTAC
268172                      378  3.85e-08 CGAAACCAGT CTCTCTTCCTCTCAC CAACTTCACC
11286                       388  5.64e-08 GTTGTAGTAC ACCTCTCCTTCACAC TCTCACTCAG
1060                        449  6.50e-08 TATTTCACCA CCACGTTCCTCACAC AAGCAAGTTA
25295                       373  1.56e-07 AAATGCCGAT CACTGTACCGCACAC AGCAGCCTCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21785                             1.8e-08  469_[+3]_16
268172                            3.8e-08  377_[+3]_108
11286                             5.6e-08  387_[+3]_98
1060                              6.5e-08  448_[+3]_37
25295                             1.6e-07  372_[+3]_113
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=5
21785                    (  470) CCCTCTTCTTCAAAC  1
268172                   (  378) CTCTCTTCCTCTCAC  1
11286                    (  388) ACCTCTCCTTCACAC  1
1060                     (  449) CCACGTTCCTCACAC  1
25295                    (  373) CACTGTACCGCACAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 3402 bayes= 10.3526 E= 1.2e+001
   -34    187   -897   -897
   -34    145   -897    -45
   -34    187   -897   -897
  -897    -13   -897    155
  -897    145     65   -897
  -897   -897   -897    187
   -34    -13   -897    113
  -897    219   -897   -897
  -897    145   -897     55
  -897   -897    -35    155
  -897    219   -897   -897
   166   -897   -897    -45
   -34    187   -897   -897
   198   -897   -897   -897
  -897    219   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 1.2e+001
 0.200000  0.800000  0.000000  0.000000
 0.200000  0.600000  0.000000  0.200000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.200000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.000000  0.200000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.200000  0.800000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[CA][CAT][CA][TC][CG]T[TAC]C[CT][TG]C[AT][CA]AC
--------------------------------------------------------------------------------

Time 1.32 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1060                             6.84e-10  26_[+1(7.22e-08)]_410_\
    [+3(6.50e-08)]_10_[+2(3.18e-06)]_15
11286                            2.55e-08  77_[+2(6.91e-06)]_298_\
    [+3(5.64e-08)]_76_[+1(1.93e-06)]_10
21785                            7.17e-06  328_[+2(1.20e-05)]_129_\
    [+3(1.78e-08)]_16
2399                             4.60e-05  427_[+1(1.06e-06)]_30_\
    [+2(3.84e-06)]_19
25295                            1.31e-08  372_[+3(1.56e-07)]_20_\
    [+2(4.61e-06)]_6_[+1(5.05e-07)]_63
268172                           9.51e-10  377_[+3(3.85e-08)]_15_\
    [+2(9.98e-05)]_3_[+1(1.70e-06)]_32_[+2(3.26e-07)]_22
5180                             1.24e-07  476_[+1(9.77e-08)]_[+2(5.34e-08)]
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************