********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/59/59.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
32057                    1.0000    500  52173                    1.0000    500
14356                    1.0000    500  47841                    1.0000    500
39421                    1.0000    500  32491                    1.0000    500
40048                    1.0000    500  41282                    1.0000    500
18877                    1.0000    500  11003                    1.0000    500
41894                    1.0000    500  11615                    1.0000    500
12330                    1.0000    500  45995                    1.0000    500
45998                    1.0000    500  46242                    1.0000    500
36165                    1.0000    500  33189                    1.0000    500
42734                    1.0000    500  50474                    1.0000    500
42989                    1.0000    500  46468                    1.0000    500
49539                    1.0000    500  43750                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/59/59.seqs.fa -oc motifs/59 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       24    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           12000    N=              24
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.263 C 0.239 G 0.231 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.239 G 0.231 T 0.267
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   7  llr = 122  E-value = 5.7e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :7:aa6143::a:::a1::1
pos.-specific     C  :19:::66:::::79::6::
probability       G  9::::33:14a:3:1:3179
matrix            T  111::1::66::73::633:

         bits    2.1           *
                 1.9    **     **   *
                 1.7    **     **   *
                 1.5 * ***     **  **   *
Relative         1.3 * ***     **  **  **
Entropy          1.1 * ***  * *******  **
(25.1 bits)      0.8 *****  * *******  **
                 0.6 ********************
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GACAAACCTTGATCCATCGG
consensus                 GGAAG  GT  GTT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            --------------------
41282                       228  8.93e-13 AAGGGCGTGA GACAAACCTTGATCCATCGG TTTTGTGTTG
40048                       384  8.93e-13 GGGCGTGTGA GACAAACCTTGATCCATCGG TTTTGTGTGT
32057                        68  7.94e-09 CATCGCCGAT GATAATCCATGATTCATCGG TAATCAAATC
18877                       210  1.08e-08 GGCAACCCGG GACAAGGCGGGAGCGATCGG ATGGATTTCG
42989                         9  3.95e-08   CATGGATT GCCAAACATGGATTCAAGTG TCGCCAGTGC
46468                       446  4.62e-08 ACCTTCCGCA TACAAGGATGGATCCAGTGA TTGCTCTCTA
12330                        98  5.65e-08 CGGCATTACT GTCAAAAAATGAGCCAGTTG TTTCGAAGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41282                             8.9e-13  227_[+1]_253
40048                             8.9e-13  383_[+1]_97
32057                             7.9e-09  67_[+1]_413
18877                             1.1e-08  209_[+1]_271
42989                             3.9e-08  8_[+1]_472
46468                             4.6e-08  445_[+1]_35
12330                             5.7e-08  97_[+1]_383
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=7
41282                    (  228) GACAAACCTTGATCCATCGG  1
40048                    (  384) GACAAACCTTGATCCATCGG  1
32057                    (   68) GATAATCCATGATTCATCGG  1
18877                    (  210) GACAAGGCGGGAGCGATCGG  1
42989                    (    9) GCCAAACATGGATTCAAGTG  1
46468                    (  446) TACAAGGATGGATCCAGTGA  1
12330                    (   98) GTCAAAAAATGAGCCAGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 11544 bayes= 11.2926 E= 5.7e+000
  -945   -945    189    -90
   144    -74   -945    -90
  -945    184   -945    -90
   193   -945   -945   -945
   193   -945   -945   -945
   112   -945     31    -90
   -88    126     31   -945
    70    126   -945   -945
    12   -945    -69    109
  -945   -945     89    109
  -945   -945    212   -945
   193   -945   -945   -945
  -945   -945     31    142
  -945    158   -945     10
  -945    184    -69   -945
   193   -945   -945   -945
   -88   -945     31    109
  -945    126    -69     10
  -945   -945    163     10
   -88   -945    189   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 5.7e+000
 0.000000  0.000000  0.857143  0.142857
 0.714286  0.142857  0.000000  0.142857
 0.000000  0.857143  0.000000  0.142857
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.571429  0.000000  0.285714  0.142857
 0.142857  0.571429  0.285714  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.285714  0.000000  0.142857  0.571429
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.714286  0.000000  0.285714
 0.000000  0.857143  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.000000  0.285714  0.571429
 0.000000  0.571429  0.142857  0.285714
 0.000000  0.000000  0.714286  0.285714
 0.142857  0.000000  0.857143  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GACAA[AG][CG][CA][TA][TG]GA[TG][CT]CA[TG][CT][GT]G
--------------------------------------------------------------------------------




Time  4.65 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   7  llr = 94  E-value = 7.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::1::1:a::::
pos.-specific     C  :6:91:::a:a7
probability       G  1:9:::a::a:3
matrix            T  94:199::::::

         bits    2.1       * ***
                 1.9       *****
                 1.7       *****
                 1.5   **  *****
Relative         1.3 * **********
Entropy          1.1 ************
(19.5 bits)      0.8 ************
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCGCTTGACGCC
consensus             T         G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
41282                       177  1.02e-07 AAATAAATGA TTGCTTGACGCC ACAAAACCCA
40048                       331  1.02e-07 AAATAAATGA TTGCTTGACGCC ACAAAACCCA
45995                       154  2.42e-07 AAATTCATTG GCGCTTGACGCC TTCGAAATAT
18877                       255  3.32e-07 GAGCCGCGAT TCGCTAGACGCC AACGGCGGCG
42989                       259  4.41e-07 TTGCTTTTTG TCGTTTGACGCC TGGACTTCTT
11003                       185  7.92e-07 CCCCAGCAAG TCGCCTGACGCG ATTAGGGAAG
32491                       454  1.20e-06 CCGTCTGCGC TTACTTGACGCG ATCGTTGTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41282                               1e-07  176_[+2]_312
40048                               1e-07  330_[+2]_158
45995                             2.4e-07  153_[+2]_335
18877                             3.3e-07  254_[+2]_234
42989                             4.4e-07  258_[+2]_230
11003                             7.9e-07  184_[+2]_304
32491                             1.2e-06  453_[+2]_35
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=7
41282                    (  177) TTGCTTGACGCC  1
40048                    (  331) TTGCTTGACGCC  1
45995                    (  154) GCGCTTGACGCC  1
18877                    (  255) TCGCTAGACGCC  1
42989                    (  259) TCGTTTGACGCC  1
11003                    (  185) TCGCCTGACGCG  1
32491                    (  454) TTACTTGACGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 11736 bayes= 10.5542 E= 7.2e+000
  -945   -945    -69    168
  -945    126   -945     68
   -88   -945    189   -945
  -945    184   -945    -90
  -945    -74   -945    168
   -88   -945   -945    168
  -945   -945    212   -945
   193   -945   -945   -945
  -945    206   -945   -945
  -945   -945    212   -945
  -945    206   -945   -945
  -945    158     31   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 7.2e+000
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.571429  0.000000  0.428571
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.142857  0.000000  0.857143
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.714286  0.285714  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[CT]GCTTGACGC[CG]
--------------------------------------------------------------------------------




Time  9.44 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   6  llr = 116  E-value = 8.5e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :872:8:a:2822:a752a8:
pos.-specific     C  8:::::::27:22a:3:3:::
probability       G  22::a::::::55:::55:::
matrix            T  ::38:2a:82222::::::2a

         bits    2.1     *        *
                 1.9     * **     **   * *
                 1.7     * **     **   * *
                 1.5 *   * **     **   * *
Relative         1.3 ** ****** *  **   ***
Entropy          1.1 ********* *  **** ***
(27.8 bits)      0.8 ***********  **** ***
                 0.6 ***********  ********
                 0.4 ***********  ********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CAATGATATCAGGCAAAGAAT
consensus              T            CGC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
41282                        38  5.88e-13 TAAAATGTAA CAATGATATCAGGCAAAGAAT GGATTCCTGC
40048                       192  5.88e-13 TAAAATGTAA CAATGATATCAGGCAAAGAAT GGATTCCTGC
32057                       327  3.78e-09 GCAAGAGATT CAATGATATTTCGCAAAGATT CCTTCTGTCG
52173                        80  5.11e-09 GCAATTGCCA CATTGATATAATTCACGAAAT ACATGAGCGA
45998                       142  5.41e-09 AAGCCGTTGC CAATGTTACCAAACACGCAAT TTGAATTAGG
42989                       426  7.03e-09 AACTACGTTT GGTAGATATCAGCCAAGCAAT CAATTTTTGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41282                             5.9e-13  37_[+3]_442
40048                             5.9e-13  191_[+3]_288
32057                             3.8e-09  326_[+3]_153
52173                             5.1e-09  79_[+3]_400
45998                             5.4e-09  141_[+3]_338
42989                               7e-09  425_[+3]_54
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
41282                    (   38) CAATGATATCAGGCAAAGAAT  1
40048                    (  192) CAATGATATCAGGCAAAGAAT  1
32057                    (  327) CAATGATATTTCGCAAAGATT  1
52173                    (   80) CATTGATATAATTCACGAAAT  1
45998                    (  142) CAATGTTACCAAACACGCAAT  1
42989                    (  426) GGTAGATATCAGCCAAGCAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 11520 bayes= 11.3538 E= 8.5e+000
  -923    180    -47   -923
   166   -923    -47   -923
   134   -923   -923     32
   -66   -923   -923    164
  -923   -923    212   -923
   166   -923   -923    -68
  -923   -923   -923    190
   193   -923   -923   -923
  -923    -52   -923    164
   -66    148   -923    -68
   166   -923   -923    -68
   -66    -52    112    -68
   -66    -52    112    -68
  -923    206   -923   -923
   193   -923   -923   -923
   134     48   -923   -923
    93   -923    112   -923
   -66     48    112   -923
   193   -923   -923   -923
   166   -923   -923    -68
  -923   -923   -923    190
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 8.5e+000
 0.000000  0.833333  0.166667  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.666667  0.000000  0.166667
 0.833333  0.000000  0.000000  0.166667
 0.166667  0.166667  0.500000  0.166667
 0.166667  0.166667  0.500000  0.166667
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.166667  0.333333  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CA[AT]TGATATCAGGCA[AC][AG][GC]AAT
--------------------------------------------------------------------------------




Time 14.29 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32057                            1.25e-09  67_[+1(7.94e-09)]_239_\
    [+3(3.78e-09)]_153
52173                            4.49e-05  79_[+3(5.11e-09)]_400
14356                            3.14e-01  500
47841                            7.01e-01  500
39421                            9.61e-01  500
32491                            1.78e-02  453_[+2(1.20e-06)]_35
40048                            8.95e-21  191_[+3(5.88e-13)]_118_\
    [+2(1.02e-07)]_41_[+1(8.93e-13)]_97
41282                            8.95e-21  37_[+3(5.88e-13)]_118_\
    [+2(1.02e-07)]_39_[+1(8.93e-13)]_253
18877                            1.11e-07  209_[+1(1.08e-08)]_25_\
    [+2(3.32e-07)]_234
11003                            9.40e-04  184_[+2(7.92e-07)]_304
41894                            8.01e-01  500
11615                            3.57e-01  500
12330                            5.97e-05  97_[+1(5.65e-08)]_383
45995                            3.77e-03  153_[+2(2.42e-07)]_335
45998                            1.32e-04  141_[+3(5.41e-09)]_338
46242                            5.92e-01  500
36165                            3.30e-01  500
33189                            2.27e-01  500
42734                            5.58e-01  500
50474                            6.83e-01  500
42989                            7.48e-12  8_[+1(3.95e-08)]_230_[+2(4.41e-07)]_\
    155_[+3(7.03e-09)]_54
46468                            2.06e-04  445_[+1(4.62e-08)]_35
49539                            8.00e-01  500
43750                            7.18e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************