********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/229/229.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42650                    1.0000    500  9255                     1.0000    500
54598                    1.0000    500  46643                    1.0000    500
21290                    1.0000    500  48145                    1.0000    500
7678                     1.0000    500  30636                    1.0000    500
41011                    1.0000    500  44438                    1.0000    500
44670                    1.0000    500  44999                    1.0000    500
12095                    1.0000    500  46142                    1.0000    500
12620                    1.0000    500  34894                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/229/229.seqs.fa -oc motifs/229 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.248 G 0.236 T 0.248
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.248 G 0.236 T 0.249
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   9  llr = 107  E-value = 4.0e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::86::
pos.-specific     C  98:::13::3::
probability       G  :261a:6a::a1
matrix            T  1:49:91:21:9

         bits    2.1     *  *  *
                 1.9     *  *  *
                 1.7     *  *  *
                 1.5 *  *** *  **
Relative         1.3 ** *** ** **
Entropy          1.0 ****** ** **
(17.1 bits)      0.8 ****** ** **
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CCGTGTGGAAGT
consensus             GT   C TC
sequence

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
44438                       458  4.99e-08 AAGTCTTATG CCGTGTGGAAGT TATGGCACAA
46643                        65  4.99e-08 TCGGTGAGAC CCGTGTGGAAGT GTTCTTTCTC
21290                       103  1.49e-07 TCACGGTTCG CCGTGTGGACGT GGAAACCCGG
54598                        51  9.32e-07 ACGGATGCTT CCTTGTGGATGT CTCTAAATGA
46142                        87  1.08e-06 TTCGGCTGTT CCTTGTTGAAGT AATACTGGAG
48145                       206  4.44e-06 AAAAACCCTC CGGTGTCGTCGT CCCCAAGTAG
42650                       375  4.44e-06 ATGGACAGTC TCTTGTCGACGT GTGCGAAAAA
41011                        46  1.07e-05 ACGGGGGACA CGTGGTGGTAGT AGAGTAGGAA
12620                       428  1.13e-05 TGTGCTTCTC CCGTGCCGAAGG GTGCACCGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44438                               5e-08  457_[+1]_31
46643                               5e-08  64_[+1]_424
21290                             1.5e-07  102_[+1]_386
54598                             9.3e-07  50_[+1]_438
46142                             1.1e-06  86_[+1]_402
48145                             4.4e-06  205_[+1]_283
42650                             4.4e-06  374_[+1]_114
41011                             1.1e-05  45_[+1]_443
12620                             1.1e-05  427_[+1]_61
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
44438                    (  458) CCGTGTGGAAGT  1
46643                    (   65) CCGTGTGGAAGT  1
21290                    (  103) CCGTGTGGACGT  1
54598                    (   51) CCTTGTGGATGT  1
46142                    (   87) CCTTGTTGAAGT  1
48145                    (  206) CGGTGTCGTCGT  1
42650                    (  375) TCTTGTCGACGT  1
41011                    (   46) CGTGGTGGTAGT  1
12620                    (  428) CCGTGCCGAAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 10.6108 E= 4.0e+000
  -982    184   -982   -116
  -982    165     -8   -982
  -982   -982    124     84
  -982   -982   -108    184
  -982   -982    208   -982
  -982   -115   -982    184
  -982     43    124   -116
  -982   -982    208   -982
   153   -982   -982    -16
   105     43   -982   -116
  -982   -982    208   -982
  -982   -982   -108    184
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 4.0e+000
 0.000000  0.888889  0.000000  0.111111
 0.000000  0.777778  0.222222  0.000000
 0.000000  0.000000  0.555556  0.444444
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.333333  0.555556  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.555556  0.333333  0.000000  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.111111  0.888889
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[CG][GT]TGT[GC]G[AT][AC]GT
--------------------------------------------------------------------------------




Time 2.44 secs.

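The regular expression reported for Motif 1 is a compact summary, not an exact description of the sites. The sketch below is a reader-side illustration, not part of MEME's output: it tests each of the nine Motif 1 sites listed above against that expression using Python's standard `re` module. The names `MOTIF1_REGEX`, `MOTIF1_SITES` and `matching_sites` are hypothetical and chosen only for this example.

```python
import re

# Regular expression copied from the "Motif 1 regular expression" section.
MOTIF1_REGEX = re.compile(r"C[CG][GT]TGT[GC]G[AT][AC]GT")

# Sites copied from the "Motif 1 in BLOCKS format" section (name -> site).
MOTIF1_SITES = {
    "44438": "CCGTGTGGAAGT",
    "46643": "CCGTGTGGAAGT",
    "21290": "CCGTGTGGACGT",
    "54598": "CCTTGTGGATGT",
    "46142": "CCTTGTTGAAGT",
    "48145": "CGGTGTCGTCGT",
    "42650": "TCTTGTCGACGT",
    "41011": "CGTGGTGGTAGT",
    "12620": "CCGTGCCGAAGG",
}


def matching_sites(pattern, sites):
    """Return the names of the sites that the pattern matches in full."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


matched = matching_sites(MOTIF1_REGEX, MOTIF1_SITES)
print(f"{len(matched)} of {len(MOTIF1_SITES)} sites match: {matched}")
```

Only four of the nine sites match the expression exactly, so scanning new sequences with it is likely to miss sites that the log-odds matrix would still score well.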
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  19  sites =   6  llr = 101  E-value = 8.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::8322:::::5:::
pos.-specific     C  ::22::22:335a8237:a
probability       G  a::2aa:2:22::23:3::
matrix            T  :a87:::38355::52:a:

         bits    2.1 **  **      *    **
                 1.9 **  **      *    **
                 1.7 **  **      *    **
                 1.5 *** **      **   **
Relative         1.3 *** *** *   **   **
Entropy          1.0 *** *** *  ***  ***
(24.3 bits)      0.8 ******* *  ***  ***
                 0.6 ******* * ***** ***
                 0.4 ******* * *********
                 0.2 ******* * *********
                 0.0 -------------------

Multilevel           GTTTGGAATCTCCCTACTC
consensus                   T TCT  GCG
sequence

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
54598                       335  5.15e-11 AAGAAACGTC GTTTGGATTTTCCCTCCTC ATTTTGGGTA
9255                        158  5.42e-09 AATGCACTTT GTTTGGAATCGTCCGCGTC ATCGCGATAC
41011                       220  9.87e-09 GACCAAGCCA GTTGGGAGTCCTCCTAGTC TCCTTCTTAT
7678                        255  2.57e-08 CTACAGTCTC GTCTGGACTATCCCCACTC TCGTAACGAC
21290                       415  3.14e-08 TCAACCCCCC GTTTGGATAGTCCGGACTC GACCAAAGTC
46142                       372  3.81e-08 CTTGACCAAC GTTCGGCATTCTCCTTCTC TCGCTTGGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54598                             5.1e-11  334_[+2]_147
9255                              5.4e-09  157_[+2]_324
41011                             9.9e-09  219_[+2]_262
7678                              2.6e-08  254_[+2]_227
21290                             3.1e-08  414_[+2]_67
46142                             3.8e-08  371_[+2]_110
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=6
54598                    (  335) GTTTGGATTTTCCCTCCTC  1
9255                     (  158) GTTTGGAATCGTCCGCGTC  1
41011                    (  220) GTTGGGAGTCCTCCTAGTC  1
7678                     (  255) GTCTGGACTATCCCCACTC  1
21290                    (  415) GTTTGGATAGTCCGGACTC  1
46142                    (  372) GTTCGGCATTCTCCTTCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 7712 bayes= 9.98547 E= 8.5e+001
  -923   -923    208   -923
  -923   -923   -923    201
  -923    -57   -923    174
  -923    -57    -50    142
  -923   -923    208   -923
  -923   -923    208   -923
   163    -57   -923   -923
    31    -57    -50     42
   -69   -923   -923    174
   -69     43    -50     42
  -923     43    -50    101
  -923    101   -923    101
  -923    201   -923   -923
  -923    175    -50   -923
  -923    -57     50    101
    90     43   -923    -58
  -923    143     50   -923
  -923   -923   -923    201
  -923    201   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 6 E= 8.5e+001
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.166667  0.166667  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.333333  0.166667  0.166667  0.333333
 0.166667  0.000000  0.000000  0.833333
 0.166667  0.333333  0.166667  0.333333
 0.000000  0.333333  0.166667  0.500000
 0.000000  0.500000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.166667  0.333333  0.500000
 0.500000  0.333333  0.000000  0.166667
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GTTTGGA[AT]T[CT][TC][CT]CC[TG][AC][CG]TC
--------------------------------------------------------------------------------




Time 4.91 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   6  llr = 91  E-value = 2.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :88a:22:::a2:3:8
pos.-specific     C  :22:::2::8:8:2:2
probability       G  a:::7258:2::a:8:
matrix            T  ::::3722a::::52:

         bits    2.1 *       *   *
                 1.9 *  *    * * *
                 1.7 *  *    * * *
                 1.5 *  *   **** * *
Relative         1.3 ****   ****** **
Entropy          1.0 *****  ****** **
(22.0 bits)      0.8 ****** ****** **
                 0.6 ****** ****** **
                 0.4 ****** *********
                 0.2 ****************
                 0.0 ----------------

Multilevel           GAAAGTGGTCACGTGA
consensus                T        A
sequence

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
34894                       115  1.31e-08 GATGTCCGTC GAAAGGTGTCACGTGA GTGGCGCAGC
44438                       296  1.31e-08 ATGTATTCGA GAAATTAGTCACGAGA AGGTAGGTAT
41011                        14  2.50e-08 GTGCGCATCC GCAAGTGGTCACGTGC TCACTTACGG
44670                       245  6.11e-08 AACACCGCAT GACAGTGTTCACGAGA AAGACCGGAC
12620                       264  9.06e-08 TCAGGTCTTG GAAATTGGTCAAGTTA TGGCGTTGGA
54598                       481  2.19e-07 CGAAGCTAAA GAAAGACGTGACGCGA CTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34894                             1.3e-08  114_[+3]_370
44438                             1.3e-08  295_[+3]_189
41011                             2.5e-08  13_[+3]_471
44670                             6.1e-08  244_[+3]_240
12620                             9.1e-08  263_[+3]_221
54598                             2.2e-07  480_[+3]_4
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=6
34894                    (  115) GAAAGGTGTCACGTGA  1
44438                    (  296) GAAATTAGTCACGAGA  1
41011                    (   14) GCAAGTGGTCACGTGC  1
44670                    (  245) GACAGTGTTCACGAGA  1
12620                    (  264) GAAATTGGTCAAGTTA  1
54598                    (  481) GAAAGACGTGACGCGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 9.99443 E= 2.0e+002
  -923   -923    208   -923
   163    -57   -923   -923
   163    -57   -923   -923
   190   -923   -923   -923
  -923   -923    150     42
   -69   -923    -50    142
   -69    -57    108    -58
  -923   -923    182    -58
  -923   -923   -923    201
  -923    175    -50   -923
   190   -923   -923   -923
   -69    175   -923   -923
  -923   -923    208   -923
    31    -57   -923    101
  -923   -923    182    -58
   163    -57   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 2.0e+002
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.166667  0.000000  0.166667  0.666667
 0.166667  0.166667  0.500000  0.166667
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.833333  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.166667  0.000000  0.500000
 0.000000  0.000000  0.833333  0.166667
 0.833333  0.166667  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GAAA[GT]TGGTCACG[TA]GA
--------------------------------------------------------------------------------




Time 7.15 secs.

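The "Relative Entropy (22.0 bits)" figure in the Motif 3 description can be approximately reproduced from the letter-probability matrix and the background letter frequencies reported above. The sketch below is a reader-side check, not MEME's own code: it assumes the value is the sum over positions of p * log2(p / background), with zero-probability letters contributing nothing. The names `BACKGROUND`, `MOTIF3_PROBS` and `relative_entropy` are hypothetical.

```python
import math

# Background letter frequencies (A, C, G, T) as printed in the command line summary.
BACKGROUND = (0.268, 0.248, 0.236, 0.249)

# Rows copied from the Motif 3 letter-probability matrix (columns A, C, G, T).
MOTIF3_PROBS = [
    (0.000000, 0.000000, 1.000000, 0.000000),
    (0.833333, 0.166667, 0.000000, 0.000000),
    (0.833333, 0.166667, 0.000000, 0.000000),
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.000000, 0.000000, 0.666667, 0.333333),
    (0.166667, 0.000000, 0.166667, 0.666667),
    (0.166667, 0.166667, 0.500000, 0.166667),
    (0.000000, 0.000000, 0.833333, 0.166667),
    (0.000000, 0.000000, 0.000000, 1.000000),
    (0.000000, 0.833333, 0.166667, 0.000000),
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.166667, 0.833333, 0.000000, 0.000000),
    (0.000000, 0.000000, 1.000000, 0.000000),
    (0.333333, 0.166667, 0.000000, 0.500000),
    (0.000000, 0.000000, 0.833333, 0.166667),
    (0.833333, 0.166667, 0.000000, 0.000000),
]


def relative_entropy(probs, background):
    """Sum p * log2(p / bg) over all positions and letters, skipping p == 0."""
    total = 0.0
    for row in probs:
        for p, bg in zip(row, background):
            if p > 0.0:
                total += p * math.log2(p / bg)
    return total


motif3_bits = relative_entropy(MOTIF3_PROBS, BACKGROUND)
print(f"Motif 3 relative entropy: {motif3_bits:.1f} bits")
```

Because the background frequencies are printed to only three decimals, the result agrees with the reported value only to about one decimal place.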
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42650                            1.00e-02  374_[+1(4.44e-06)]_114
9255                             9.37e-05  157_[+2(5.42e-09)]_324
54598                            7.51e-13  50_[+1(9.32e-07)]_272_\
    [+2(5.15e-11)]_127_[+3(2.19e-07)]_4
46643                            5.25e-04  64_[+1(4.99e-08)]_424
21290                            7.60e-08  102_[+1(1.49e-07)]_300_\
    [+2(3.14e-08)]_67
48145                            4.68e-03  205_[+1(4.44e-06)]_283
7678                             1.44e-04  254_[+2(2.57e-08)]_227
30636                            3.47e-02  203_[+2(2.21e-05)]_278
41011                            1.34e-10  13_[+3(2.50e-08)]_16_[+1(1.07e-05)]_\
    162_[+2(9.87e-09)]_262
44438                            1.86e-08  295_[+3(1.31e-08)]_146_\
    [+1(4.99e-08)]_31
44670                            1.30e-04  244_[+3(6.11e-08)]_240
44999                            9.47e-01  500
12095                            7.09e-01  500
46142                            1.50e-06  86_[+1(1.08e-06)]_273_\
    [+2(3.81e-08)]_110
12620                            1.15e-05  263_[+3(9.06e-08)]_148_\
    [+1(1.13e-05)]_61
34894                            4.18e-04  114_[+3(1.31e-08)]_370
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
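The combined block diagrams encode each sequence as alternating gap lengths and motif occurrences, so site start positions can be recovered by walking the diagram. The sketch below is a reader-side illustration, not a MEME or MAST API: it assumes the motif widths from the three motif headers (12, 19 and 16) and that a trailing backslash marks a line continuation. The names `MOTIF_WIDTHS` and `parse_diagram` are hypothetical.

```python
import re

# Motif widths taken from the "MOTIF n MEME width = ..." header lines.
MOTIF_WIDTHS = {1: 12, 2: 19, 3: 16}

# One diagram element: either "[+2(5.15e-11)]" or a plain gap length such as "272".
ELEMENT = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, strand, start, p_value), ...], total_length).

    Start positions are 1-based, as in the "sites sorted by position
    p-value" tables.
    """
    # Join continuation lines: drop backslashes and all whitespace.
    flat = "".join(diagram.replace("\\", "").split())
    position = 0
    sites = []
    for part in flat.split("_"):
        if not part:
            continue
        match = ELEMENT.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognised diagram element: {part!r}")
        if match.group(4) is not None:
            position += int(match.group(4))  # gap between sites
        else:
            motif = int(match.group(2))
            sites.append((motif, match.group(1), position + 1, float(match.group(3))))
            position += widths[motif]
    return sites, position


# Diagram for sequence 54598, including its line continuation.
sites, length = parse_diagram(
    "50_[+1(9.32e-07)]_272_\\\n    [+2(5.15e-11)]_127_[+3(2.19e-07)]_4"
)
for motif, strand, start, p_value in sites:
    print(f"motif {motif} ({strand}) starts at {start}, p = {p_value:.2e}")
print(f"sequence length: {length}")
```

For sequence 54598 the recovered starts (51, 335 and 481) agree with the per-motif site tables, and the total length equals the 500 listed in the training set.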