********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/167/167.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17300                    1.0000    500  36720                    1.0000    500
46727                    1.0000    500  51017                    1.0000    500
29174                    1.0000    500  18142                    1.0000    500
44595                    1.0000    500  48485                    1.0000    500
44763                    1.0000    500  47012                    1.0000    500
50047                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/167/167.seqs.fa -oc motifs/167 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.250 C 0.258 G 0.237 T 0.255
Background letter frequencies (from dataset with add-one prior applied):
A 0.250 C 0.258 G 0.237 T 0.255
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 2 llr = 58 E-value = 9.4e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::::::::::::a::
pos.-specific     C  :::::::::::::::::::::
probability       G  a:a:a:a:a:a:a:a:a::5a
matrix            T  :a:a:a:a:a:a:a:a:a:5:

         bits    2.1 ******************* *
                 1.9 ******************* *
                 1.7 ******************* *
                 1.5 ******************* *
Relative         1.2 ******************* *
Entropy          1.0 *********************
(41.5 bits)      0.8 *********************
                 0.6 *********************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GTGTGTGTGTGTGTGTGTAGG
consensus                               T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
17300                       384  1.51e-13 TGTGTGCTTC GTGTGTGTGTGTGTGTGTAGG AGAGTGAGTA
48485                       128  3.14e-13 GAGTGTGTGT GTGTGTGTGTGTGTGTGTATG TGTGTATGCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17300                             1.5e-13  383_[+1]_96
48485                             3.1e-13  127_[+1]_352
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=2
17300                    (  384) GTGTGTGTGTGTGTGTGTAGG  1
48485                    (  128) GTGTGTGTGTGTGTGTGTATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 11.3658 E= 9.4e+001
  -765   -765    207   -765
  -765   -765   -765    197
  -765   -765    207   -765
  -765   -765   -765    197
  -765   -765    207   -765
  -765   -765   -765    197
  -765   -765    207   -765
  -765   -765   -765    197
  -765   -765    207   -765
  -765   -765   -765    197
  -765   -765    207   -765
  -765   -765   -765    197
  -765   -765    207   -765
  -765   -765   -765    197
  -765   -765    207   -765
  -765   -765   -765    197
  -765   -765    207   -765
  -765   -765   -765    197
   200   -765   -765   -765
  -765   -765    107     97
  -765   -765    207   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 2 E= 9.4e+001
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
GTGTGTGTGTGTGTGTGTA[GT]G
--------------------------------------------------------------------------------

Time 1.09 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 11 llr = 106 E-value = 1.9e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :388:86419:8
pos.-specific     C  85::a::58:82
probability       G  22:2:21211::
matrix            T  :12:::3:::2:

         bits    2.1
                 1.9     *
                 1.7     *    *
                 1.5     *    *
Relative         1.2 * ****   ***
Entropy          1.0 * ****  ****
(13.9 bits)      0.8 * ***** ****
                 0.6 * ***** ****
                 0.4 * **********
                 0.2 ************
                 0.0 ------------

Multilevel           CCAACAACCACA
consensus             A    TA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value               Site
-------------             ----- ---------            ------------
44763                       466  7.20e-08 ACCAACGACA CCAACAACCACA CTCGCCAAAG
29174                       301  1.71e-06 CTTCGCAACC CCAACGAACACA AATCACTCTC
51017                        86  2.32e-06 ACGAGTAGAT CCAACAGCCACA GAGAGAAAGG
50047                       267  3.84e-06 GTTGCGTGTA CAAACAAACATA CCGTTCAAAC
36720                       415  7.16e-06 ATCATGAGTA GCAGCAACCACA GCTTGGATGG
46727                       406  1.48e-05 CATTGCGTGA CAAACATCGACA CCGGAAGGAT
48485                        76  2.06e-05 CGACCACCAG CCAGCAAGCATA TTCCGTCGGG
44595                        24  2.46e-05 AGTCTTGCAT CTTACATCCACA CGTTTACCGC
47012                       412  4.93e-05 ACGCAACGCA GGAACAAACGCA CCGCAAGTCT
18142                        48  6.53e-05 TGCTTTCCTT CATACATGCACC GAAATCTTGT
17300                       187  1.30e-04 TAACGGTCGA CGAACGAAAACC GGTACGTCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44763                             7.2e-08  465_[+2]_23
29174                             1.7e-06  300_[+2]_188
51017                             2.3e-06  85_[+2]_403
50047                             3.8e-06  266_[+2]_222
36720                             7.2e-06  414_[+2]_74
46727                             1.5e-05  405_[+2]_83
48485                             2.1e-05  75_[+2]_413
44595                             2.5e-05  23_[+2]_465
47012                             4.9e-05  411_[+2]_77
18142                             6.5e-05  47_[+2]_441
17300                             0.00013  186_[+2]_302
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=11
44763                    (  466) CCAACAACCACA  1
29174                    (  301) CCAACGAACACA  1
51017                    (   86) CCAACAGCCACA  1
50047                    (  267) CAAACAAACATA  1
36720                    (  415) GCAGCAACCACA  1
46727                    (  406) CAAACATCGACA  1
48485                    (   76) CCAGCAAGCATA  1
44595                    (   24) CTTACATCCACA  1
47012                    (  412) GGAACAAACGCA  1
18142                    (   48) CATACATGCACC  1
17300                    (  187) CGAACGAAAACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 9.28648 E= 1.9e+002
 -1010    166    -38  -1010
    13     82    -38   -148
   171  -1010  -1010    -48
   171  -1010    -38  -1010
 -1010    195  -1010  -1010
   171  -1010    -38  -1010
   135  -1010   -138     10
    54     82    -38  -1010
  -146    166   -138  -1010
   186  -1010   -138  -1010
 -1010    166  -1010    -48
   171    -51  -1010  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 1.9e+002
 0.000000  0.818182  0.181818  0.000000
 0.272727  0.454545  0.181818  0.090909
 0.818182  0.000000  0.000000  0.181818
 0.818182  0.000000  0.181818  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.818182  0.000000  0.181818  0.000000
 0.636364  0.000000  0.090909  0.272727
 0.363636  0.454545  0.181818  0.000000
 0.090909  0.818182  0.090909  0.000000
 0.909091  0.000000  0.090909  0.000000
 0.000000  0.818182  0.000000  0.181818
 0.818182  0.181818  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
C[CA]AACA[AT][CA]CACA
--------------------------------------------------------------------------------

Time 2.19 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 10 llr = 105 E-value = 1.4e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::329:1:3:::
pos.-specific     C  7:751:913:a5
probability       G  3::3:1:21::5
matrix            T  :a:::9:73a::

         bits    2.1  *       *
                 1.9  *       **
                 1.7  *       **
                 1.5  *  ***  **
Relative         1.2  *  ***  **
Entropy          1.0 *** ***  ***
(15.1 bits)      0.8 *** **** ***
                 0.6 *** **** ***
                 0.4 ******** ***
                 0.2 ******** ***
                 0.0 ------------

Multilevel           CTCCATCTATCC
consensus            G AG   GC  G
sequence                A    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value               Site
-------------             ----- ---------            ------------
29174                       398  3.76e-07 TTTTGGTTCC CTCCATCTTTCC CCCCTTTCGT
44595                       188  1.95e-06 TATCTCTCGG CTACATCTCTCG ACGGCGCTGC
46727                       289  1.95e-06 GGGGAGGAAT CTCAATCTCTCG TCCTTGCGGT
17300                        33  2.25e-06 TCGAGACGTA CTCCATCTGTCG CATCCTTTAA
51017                       167  4.18e-06 TTTTCCCTTA CTCGATCGATCG ACATCTCTCC
50047                       468  9.11e-06 ATCCCCGTCT CTCCCTCTATCC TTTTTCTCCA
47012                       245  1.16e-05 GGTACTAATA GTAAATCTCTCC CAACCACTGG
44763                       157  1.42e-05 GTTTGGGTCT GTCCATCCATCC ATCCATTTTA
18142                       309  2.47e-05 CTATGAAATT GTCGATATTTCC ACGATGCAAG
48485                       430  4.45e-05 CGTTGGCACA CTAGAGCGTTCG ACTGGATAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
29174                             3.8e-07  397_[+3]_91
44595                             1.9e-06  187_[+3]_301
46727                             1.9e-06  288_[+3]_200
17300                             2.2e-06  32_[+3]_456
51017                             4.2e-06  166_[+3]_322
50047                             9.1e-06  467_[+3]_21
47012                             1.2e-05  244_[+3]_244
44763                             1.4e-05  156_[+3]_332
18142                             2.5e-05  308_[+3]_180
48485                             4.4e-05  429_[+3]_59
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
29174                    (  398) CTCCATCTTTCC  1
44595                    (  188) CTACATCTCTCG  1
46727                    (  289) CTCAATCTCTCG  1
17300                    (   33) CTCCATCTGTCG  1
51017                    (  167) CTCGATCGATCG  1
50047                    (  468) CTCCCTCTATCC  1
47012                    (  245) GTAAATCTCTCC  1
44763                    (  157) GTCCATCCATCC  1
18142                    (  309) GTCGATATTTCC  1
48485                    (  430) CTAGAGCGTTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 9.32048 E= 1.4e+002
  -997    144     34   -997
  -997   -997   -997    197
    26    144   -997   -997
   -32     95     34   -997
   185   -137   -997   -997
  -997   -997   -125    182
  -132    180   -997   -997
  -997   -137    -25    146
    26     22   -125     24
  -997   -997   -997    197
  -997    195   -997   -997
  -997     95    107   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 1.4e+002
 0.000000  0.700000  0.300000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.300000  0.700000  0.000000  0.000000
 0.200000  0.500000  0.300000  0.000000
 0.900000  0.100000  0.000000  0.000000
 0.000000  0.000000  0.100000  0.900000
 0.100000  0.900000  0.000000  0.000000
 0.000000  0.100000  0.200000  0.700000
 0.300000  0.300000  0.100000  0.300000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[CG]T[CA][CGA]ATC[TG][ACT]TC[CG]
--------------------------------------------------------------------------------

Time 3.19 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17300                            2.83e-12  32_[+3(2.25e-06)]_339_\
    [+1(1.51e-13)]_1_[+1(2.91e-05)]_74
36720                            7.62e-02  414_[+2(7.16e-06)]_74
46727                            4.64e-04  288_[+3(1.95e-06)]_105_\
    [+2(1.48e-05)]_83
51017                            2.01e-04  85_[+2(2.32e-06)]_69_[+3(4.18e-06)]_\
    322
29174                            2.10e-05  300_[+2(1.71e-06)]_85_\
    [+3(3.76e-07)]_91
18142                            3.69e-03  47_[+2(6.53e-05)]_249_\
    [+3(2.47e-05)]_180
44595                            2.95e-04  23_[+2(2.46e-05)]_152_\
    [+3(1.95e-06)]_301
48485                            1.67e-11  75_[+2(2.06e-05)]_40_[+1(3.14e-13)]_\
    145_[+1(9.64e-05)]_115_[+3(4.45e-05)]_59
44763                            2.05e-05  156_[+3(1.42e-05)]_297_\
    [+2(7.20e-08)]_23
47012                            6.54e-03  244_[+3(1.16e-05)]_155_\
    [+2(4.93e-05)]_77
50047                            6.33e-04  266_[+2(3.84e-06)]_189_\
    [+3(9.11e-06)]_21
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************