********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/165/165.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9634                     1.0000    500  54065                    1.0000    500
13107                    1.0000    500  39133                    1.0000    500
30770                    1.0000    500  33530                    1.0000    500
41570                    1.0000    500  34592                    1.0000    500
32471                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/165/165.seqs.fa -oc motifs/165 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.252 C 0.266 G 0.232 T 0.250
Background letter frequencies (from dataset with add-one prior applied):
A 0.252 C 0.266 G 0.232 T 0.250
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  13  sites =   7  llr = 86  E-value = 8.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::9:::::3::::
pos.-specific     C  :::61::31:3:a
probability       G  a3149:a6:4:4:
matrix            T  :7:::a:16676:

         bits    2.1 *     *
                 1.9 *    **     *
                 1.7 *    **     *
                 1.5 * * ***     *
Relative         1.3 *** ***     *
Entropy          1.1 *******  ****
(17.7 bits)      0.8 *******  ****
                 0.6 *************
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           GTACGTGGTTTTC
consensus             G G   CAGCG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            -------------
32471                       242  1.00e-07 GACGTAGGAC GTACGTGGATTTC GGGCCGTACT
30770                        53  3.38e-07 ACAGACACAA GTACGTGGTGCTC GTACGTTGTG
33530                       116  4.60e-07 ATTGGAAGGA GTACGTGTTTTTC CTTTTCCCAC
34592                       343  1.25e-06 AGGTAGGAAG GTAGGTGCAGTGC CAGTACAGTC
54065                       276  1.41e-06 GAGGTACAAG GTAGCTGGTTTGC GTGTGTCAGG
9634                        182  1.41e-06 CGTCCGGCGT GGACGTGGTGCGC AAGAGCGGTG
13107                       267  8.53e-06 CAGTCGCCAC GGGGGTGCCTTTC TGGGTCGTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32471                               1e-07  241_[+1]_246
30770                             3.4e-07  52_[+1]_435
33530                             4.6e-07  115_[+1]_372
34592                             1.2e-06  342_[+1]_145
54065                             1.4e-06  275_[+1]_212
9634                              1.4e-06  181_[+1]_306
13107                             8.5e-06  266_[+1]_221
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=13 seqs=7
32471                    (  242) GTACGTGGATTTC  1
30770                    (   53) GTACGTGGTGCTC  1
33530                    (  116) GTACGTGTTTTTC  1
34592                    (  343) GTAGGTGCAGTGC  1
54065                    (  276) GTAGCTGGTTTGC  1
9634                     (  182) GGACGTGGTGCGC  1
13107                    (  267) GGGGGTGCCTTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 4392 bayes= 9.89752 E= 8.0e+001
  -945   -945    211   -945
  -945   -945     30    152
   176   -945    -70   -945
  -945    110     89   -945
  -945    -90    189   -945
  -945   -945   -945    200
  -945   -945    211   -945
  -945     10    130    -80
    18    -90   -945    119
  -945   -945     89    119
  -945     10   -945    152
  -945   -945     89    119
  -945    191   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 7 E= 8.0e+001
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.285714  0.714286
 0.857143  0.000000  0.142857  0.000000
 0.000000  0.571429  0.428571  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.285714  0.571429  0.142857
 0.285714  0.142857  0.000000  0.571429
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.285714  0.000000  0.714286
 0.000000  0.000000  0.428571  0.571429
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[TG]A[CG]GTG[GC][TA][TG][TC][TG]C
--------------------------------------------------------------------------------

Time  0.73 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   6  llr = 74  E-value = 1.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3::72::5:2::
pos.-specific     C  ::::::a::::5
probability       G  7:a:83:588a:
matrix            T  :a:3:7::2::5

         bits    2.1   *       *
                 1.9  **   *   *
                 1.7  **   *   *
                 1.5  ** * * ***
Relative         1.3  ** * * ***
Entropy          1.1 ************
(17.8 bits)      0.8 ************
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GTGAGTCAGGGC
consensus            A  T G G   T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
30770                        73  3.12e-07 CTCGTACGTT GTGTGTCGGGGT CCCACACGCA
32471                       268  6.06e-07 CCGTACTGGA ATGAGTCAGGGT TTCCGGCTGA
54065                       289  6.06e-07 GCTGGTTTGC GTGTGTCAGGGC ATGGCCTACC
33530                        54  1.27e-06 TCGGGAGTGG GTGAGTCGTGGT TTTATTTCTG
39133                        86  2.85e-06 GGACACAACC GTGAAGCAGGGC ATGGCACCAG
9634                        271  4.44e-06 ATTCATAGGA ATGAGGCGGAGC CAGAGCAAGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
30770                             3.1e-07  72_[+2]_416
32471                             6.1e-07  267_[+2]_221
54065                             6.1e-07  288_[+2]_200
33530                             1.3e-06  53_[+2]_435
39133                             2.8e-06  85_[+2]_403
9634                              4.4e-06  270_[+2]_218
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=6
30770                    (   73) GTGTGTCGGGGT  1
32471                    (  268) ATGAGTCAGGGT  1
54065                    (  289) GTGTGTCAGGGC  1
33530                    (   54) GTGAGTCGTGGT  1
39133                    (   86) GTGAAGCAGGGC  1
9634                     (  271) ATGAGGCGGAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 10.6173 E= 1.3e+002
    40   -923    152   -923
  -923   -923   -923    200
  -923   -923    211   -923
   140   -923   -923     42
   -60   -923    184   -923
  -923   -923     52    142
  -923    191   -923   -923
    98   -923    111   -923
  -923   -923    184    -58
   -60   -923    184   -923
  -923   -923    211   -923
  -923     91   -923    100
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 1.3e+002
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GA]TG[AT]G[TG]C[AG]GGG[CT]
--------------------------------------------------------------------------------

Time  1.46 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   9  llr = 105  E-value = 4.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  71:49:2:1124:23:
pos.-specific     C  3183:33::9:411:8
probability       G  :8:2:23a::81927:
matrix            T  ::2:141:9::::4:2

         bits    2.1        *
                 1.9        *
                 1.7        *    *
                 1.5     *  ***  *
Relative         1.3   * *  **** *  *
Entropy          1.1 *** *  **** * **
(16.8 bits)      0.8 *** *  **** * **
                 0.6 *** *  ****** **
                 0.4 ****** ****** **
                 0.2 ****************
                 0.0 ----------------

Multilevel           AGCAATCGTCGAGTGC
consensus            C TC CG   AC AAT
sequence                G GA      G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
39133                        35  2.04e-08 CAAGTTCTAT AGCAAGGGTCGCGGGC CCCGAGCCGG
32471                       373  4.01e-07 TCACTATATC AACAACAGTCGCGTGC CGACTACGAC
30770                       274  4.51e-07 GTTCTCGGGT CGCCATGGTCGAGTAT CCGGTACTGC
13107                       465  1.23e-06 CGAATGCGCC CGCGACGGACGAGTGC CGGCCGTTTC
33530                       418  1.49e-06 TATTATTCAC AGTCAGTGTCGCGTAC ACTTGCGTTC
9634                        310  1.49e-06 ACACAACCGA CGCAACCGTCACGAAC GAGAGTGCCC
54065                       199  3.13e-06 ATTTACTGTT AGTCATAGTCAAGCGC AGTGCGTACT
34592                       129  1.12e-05 GAATCCCTTG ACCAATCGTAGGGAGC CACCCTACTG
41570                        79  1.47e-05 GGTTTTACGA AGCGTTCGTCGACGGT GGGTTTGCCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39133                               2e-08  34_[+3]_450
32471                               4e-07  372_[+3]_112
30770                             4.5e-07  273_[+3]_211
13107                             1.2e-06  464_[+3]_20
33530                             1.5e-06  417_[+3]_67
9634                              1.5e-06  309_[+3]_175
54065                             3.1e-06  198_[+3]_286
34592                             1.1e-05  128_[+3]_356
41570                             1.5e-05  78_[+3]_406
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=9
39133                    (   35) AGCAAGGGTCGCGGGC  1
32471                    (  373) AACAACAGTCGCGTGC  1
30770                    (  274) CGCCATGGTCGAGTAT  1
13107                    (  465) CGCGACGGACGAGTGC  1
33530                    (  418) AGTCAGTGTCGCGTAC  1
9634                     (  310) CGCAACCGTCACGAAC  1
54065                    (  199) AGTCATAGTCAAGCGC  1
34592                    (  129) ACCAATCGTAGGGAGC  1
41570                    (   79) AGCGTTCGTCGACGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4365 bayes= 8.91886 E= 4.6e+002
   140     32   -982   -982
  -118   -126    175   -982
  -982    155   -982    -17
    82     32     -6   -982
   181   -982   -982   -117
  -982     32     -6     83
   -18     32     52   -117
  -982   -982    211   -982
  -118   -982   -982    183
  -118    174   -982   -982
   -18   -982    175   -982
    82     74   -106   -982
  -982   -126    194   -982
   -18   -126     -6     83
    40   -982    152   -982
  -982    155   -982    -17
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 4.6e+002
 0.666667  0.333333  0.000000  0.000000
 0.111111  0.111111  0.777778  0.000000
 0.000000  0.777778  0.000000  0.222222
 0.444444  0.333333  0.222222  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.000000  0.333333  0.222222  0.444444
 0.222222  0.333333  0.333333  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.000000  0.000000  0.888889
 0.111111  0.888889  0.000000  0.000000
 0.222222  0.000000  0.777778  0.000000
 0.444444  0.444444  0.111111  0.000000
 0.000000  0.111111  0.888889  0.000000
 0.222222  0.111111  0.222222  0.444444
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.777778  0.000000  0.222222
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AC]G[CT][ACG]A[TCG][CGA]GTC[GA][AC]G[TAG][GA][CT]
--------------------------------------------------------------------------------

Time  2.19 secs.
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9634                             2.52e-07  181_[+1(1.41e-06)]_76_\
    [+2(4.44e-06)]_27_[+3(1.49e-06)]_175
54065                            8.11e-08  198_[+3(3.13e-06)]_61_\
    [+1(1.41e-06)]_[+2(6.06e-07)]_200
13107                            2.27e-04  266_[+1(8.53e-06)]_185_\
    [+3(1.23e-06)]_20
39133                            2.41e-06  34_[+3(2.04e-08)]_35_[+2(2.85e-06)]_\
    403
30770                            1.99e-09  52_[+1(3.38e-07)]_7_[+2(3.12e-07)]_\
    189_[+3(4.51e-07)]_211
33530                            2.90e-08  53_[+2(1.27e-06)]_50_[+1(4.60e-07)]_\
    289_[+3(1.49e-06)]_67
41570                            3.78e-02  78_[+3(1.47e-05)]_406
34592                            3.56e-05  128_[+3(1.12e-05)]_198_\
    [+1(1.25e-06)]_145
32471                            1.08e-09  241_[+1(1.00e-07)]_13_\
    [+2(6.06e-07)]_93_[+3(4.01e-07)]_112
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************