********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/27/27.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31898                    1.0000    500  42995                    1.0000    500
24886                    1.0000    500  37406                    1.0000    500
13944                    1.0000    500  54789                    1.0000    500
47885                    1.0000    500  38338                    1.0000    500
48046                    1.0000    500  48570                    1.0000    500
48573                    1.0000    500  48683                    1.0000    500
43413                    1.0000    500  39786                    1.0000    500
49160                    1.0000    500  39864                    1.0000    500
16021                    1.0000    500  49557                    1.0000    500
49566                    1.0000    500  24006                    1.0000    500
43821                    1.0000    500  43828                    1.0000    500
43944                    1.0000    500  44038                    1.0000    500
34252                    1.0000    500  45409                    1.0000    500
45546                    1.0000    500  45840                    1.0000    500
20588                    1.0000    500  41318                    1.0000    500
31511                    1.0000    500  42608                    1.0000    500
48404                    1.0000    500  31704                    1.0000    500
44302                    1.0000    500  47209                    1.0000    500
34905                    1.0000    500  47373                    1.0000    500
44712                    1.0000    500  44880                    1.0000    500
39717                    1.0000    500  47820                    1.0000    500
34994                    1.0000    500  49199                    1.0000    500
43983                    1.0000    500  49899                    1.0000    500
43373                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/27/27.seqs.fa -oc motifs/27 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       47    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           23500    N=              47
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.243 G 0.223 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.243 G 0.223 T 0.268
********************************************************************************


********************************************************************************
MOTIF  1 MEME  width =  21  sites =   6  llr = 142  E-value = 3.1e-010
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  2:aa:2:a:::7:::::2:::
pos.-specific     C  :a::2:2:::a2:::2:2:2:
probability       G  8:::88::7::2aa28::88a
matrix            T  ::::::8:3a::::8:a72::

Relative Entropy     (34.1 bits)  [bits-per-position bar diagram]

Multilevel           GCAAGGTAGTCAGGTGTTGGG
consensus                    T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                      Site
-------------             ----- ---------            ---------------------
34252                        67  1.24e-13 TAACTACCAT GCAAGGTAGTCAGGTGTTGGG CAAAGCATAG
38338                       163  1.24e-13 TGACTACCAT GCAAGGTAGTCAGGTGTTGGG CAAAGCATAG
37406                        67  1.24e-13 TAACTACCAT GCAAGGTAGTCAGGTGTTGGG CAAAGCATAG
31898                        67  1.24e-13 TAACTACCAT GCAAGGTAGTCAGGTGTTGGG CAAAGCATAG
41318                       171  5.66e-10 TCCGCACCGG ACAACGCATTCGGGGGTAGGG GTGGGGGTGG
48573                       117  7.65e-10 AACCAGGCAG GCAAGATATTCCGGTCTCTCG TCTAACGGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34252                             1.2e-13  66_[+1]_413
38338                             1.2e-13  162_[+1]_317
37406                             1.2e-13  66_[+1]_413
31898                             1.2e-13  66_[+1]_413
41318                             5.7e-10  170_[+1]_309
48573                             7.6e-10  116_[+1]_363
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
34252                    (   67) GCAAGGTAGTCAGGTGTTGGG  1
38338                    (  163) GCAAGGTAGTCAGGTGTTGGG  1
37406                    (   67) GCAAGGTAGTCAGGTGTTGGG  1
31898                    (   67) GCAAGGTAGTCAGGTGTTGGG  1
41318                    (  171) ACAACGCATTCGGGGGTAGGG  1
48573                    (  117) GCAAGATATTCCGGTCTCTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 22560 bayes= 12.3237 E= 3.1e-010
   -68   -923    190   -923
  -923    204   -923   -923
   191   -923   -923   -923
   191   -923   -923   -923
  -923    -54    190   -923
   -68   -923    190   -923
  -923    -54   -923    163
   191   -923   -923   -923
  -923   -923    158     31
  -923   -923   -923    190
  -923    204   -923   -923
   132    -54    -42   -923
  -923   -923    217   -923
  -923   -923    217   -923
  -923   -923    -42    163
  -923    -54    190   -923
  -923   -923   -923    190
   -68    -54   -923    131
  -923   -923    190    -68
  -923    -54    190   -923
  -923   -923    217   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 3.1e-010
 0.166667  0.000000  0.833333  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.166667  0.000000  0.833333
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.166667  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.166667  0.166667  0.000000  0.666667
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GCAAGGTA[GT]TCAGGTGTTGGG
--------------------------------------------------------------------------------

Time 18.01 secs.

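The log-odds matrix for Motif 1 looks derivable from its letter-probability matrix and the background frequencies listed in the command line summary. The Python sketch below assumes each score is 100 times log2 of a pseudocount-smoothed probability divided by the background frequency, with the pseudocount taken from `b= 0.01`. That formula is inferred from the numbers in this report rather than stated by it, and `log_odds_row` is a hypothetical helper name. Because the printed background frequencies are rounded to three decimals, recomputed scores may differ from the reported ones by about one unit.

```python
import math

# Background frequencies as printed in the command line summary (rounded).
BACKGROUND = {"A": 0.266, "C": 0.243, "G": 0.223, "T": 0.268}
# Assumption: this is the "b= 0.01" value from the "em:" line of the summary.
PSEUDOCOUNT = 0.01


def log_odds_row(probs, nsites, background=BACKGROUND, b=PSEUDOCOUNT):
    """Convert one A/C/G/T probability row into scaled log-odds scores.

    Assumed formula (not confirmed by the report):
        score = round(100 * log2(p_smoothed / bg))
        p_smoothed = (p * nsites + b * bg) / (nsites + b)
    """
    scores = []
    for letter, p in zip("ACGT", probs):
        bg = background[letter]
        smoothed = (p * nsites + b * bg) / (nsites + b)
        scores.append(round(100 * math.log2(smoothed / bg)))
    return scores


# First row of the Motif 1 letter-probability matrix (nsites= 6).
print(log_odds_row([0.166667, 0.0, 0.833333, 0.0], nsites=6))
```

Under this assumption a zero probability maps to -923 when `nsites` is 6 and to -1010 when it is 11, which are the floor values that appear in the matrices of this report.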
********************************************************************************


********************************************************************************
MOTIF  2 MEME  width =  21  sites =   6  llr = 142  E-value = 5.6e-010
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::2:78:a:::::a282a:
pos.-specific     C  ::2::2222::2::a::2::3
probability       G  aa8a8:2:8::82a::::8:7
matrix            T  :::::8::::a:8:::8::::

Relative Entropy     (34.1 bits)  [bits-per-position bar diagram]

Multilevel           GGGGGTAAGATGTGCATAGAG
consensus                                C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                      Site
-------------             ----- ---------            ---------------------
34252                       264  1.35e-13 TTTTCTTGGT GGGGGTAAGATGTGCATAGAG ACCACATTGC
38338                       360  1.35e-13 TTTTTTTGGT GGGGGTAAGATGTGCATAGAG ACCACATTGC
37406                       264  1.35e-13 TTTTTTTGGT GGGGGTAAGATGTGCATAGAG ACCACATTGC
31898                       264  1.35e-13 TTTTTTTGGT GGGGGTAAGATGTGCATAGAG ACCACATTGC
45840                        61  6.68e-10 GCTGGGGAAT GGGGGCCAGATCGGCAAAAAC GGTAGGTAAA
44038                       132  7.19e-10 TTGTCGTCAA GGCGATGCCATGTGCATCGAC AAGATTTCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34252                             1.4e-13  263_[+2]_216
38338                             1.4e-13  359_[+2]_120
37406                             1.4e-13  263_[+2]_216
31898                             1.4e-13  263_[+2]_216
45840                             6.7e-10  60_[+2]_419
44038                             7.2e-10  131_[+2]_348
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
34252                    (  264) GGGGGTAAGATGTGCATAGAG  1
38338                    (  360) GGGGGTAAGATGTGCATAGAG  1
37406                    (  264) GGGGGTAAGATGTGCATAGAG  1
31898                    (  264) GGGGGTAAGATGTGCATAGAG  1
45840                    (   61) GGGGGCCAGATCGGCAAAAAC  1
44038                    (  132) GGCGATGCCATGTGCATCGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 22560 bayes= 12.3237 E= 5.6e-010
  -923   -923    217   -923
  -923   -923    217   -923
  -923    -54    190   -923
  -923   -923    217   -923
   -68   -923    190   -923
  -923    -54   -923    163
   132    -54    -42   -923
   164    -54   -923   -923
  -923    -54    190   -923
   191   -923   -923   -923
  -923   -923   -923    190
  -923    -54    190   -923
  -923   -923    -42    163
  -923   -923    217   -923
  -923    204   -923   -923
   191   -923   -923   -923
   -68   -923   -923    163
   164    -54   -923   -923
   -68   -923    190   -923
   191   -923   -923   -923
  -923     46    158   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 5.6e-010
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.666667  0.166667  0.166667  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.833333  0.166667  0.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GGGGGTAAGATGTGCATAGA[GC]
--------------------------------------------------------------------------------

Time 35.95 secs.

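The motif regular expression can be used directly as a pattern in most regex engines. The Python sketch below, which relies only on the standard `re` module, scans two of the Motif 2 sites with their flanking sequence as listed in the sites table; `find_sites` is a hypothetical helper name. It illustrates that the regular expression is a compact summary of the dominant letters, so lower-scoring sites that MEME assigned to the motif will not necessarily match it.

```python
import re

# Motif 2 regular expression, copied from the report.
MOTIF2_REGEX = re.compile(r"GGGGGTAAGATGTGCATAGA[GC]")


def find_sites(sequence, pattern=MOTIF2_REGEX):
    """Return the 1-based start position of every non-overlapping match."""
    return [match.start() + 1 for match in pattern.finditer(sequence)]


# Site from sequence 34252 (p-value 1.35e-13) with its 10-letter flanks.
strong = "TTTTCTTGGT" + "GGGGGTAAGATGTGCATAGAG" + "ACCACATTGC"
# Site from sequence 45840 (p-value 6.68e-10) with its 10-letter flanks.
weak = "GCTGGGGAAT" + "GGGGGCCAGATCGGCAAAAAC" + "GGTAGGTAAA"

print(find_sites(strong))  # position within this 41-letter excerpt only
print(find_sites(weak))
```

For scoring weaker sites, the position-specific scoring matrix is the more faithful representation; the regular expression is best treated as a quick exact-match filter.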
********************************************************************************


********************************************************************************
MOTIF  3 MEME  width =  21  sites =  11  llr = 193  E-value = 2.6e-009
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :113:2:2:5471::a69:2:
pos.-specific     C  2793:1478:5:3:a:41919
probability       G  8::::65:14226a::::::1
matrix            T  :2:5a11111:1::::::17:

Relative Entropy     (25.3 bits)  [bits-per-position bar diagram]

Multilevel           GCCTTGGCCACAGGCAAACTC
consensus               A  C  GA C   C
sequence                C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                      Site
-------------             ----- ---------            ---------------------
37406                       354  5.04e-13 CTTAGTCCAG GCCTTGGCCAAAGGCAAACTC ACGGCGGCCA
38338                       450  1.33e-12 CCTAGTCCAG GCCTTGGCCGAAGGCAAACTC ACGGCGGCCG
31898                       354  1.33e-12 CCTAGTCCAG GCCTTGGCCGAAGGCAAACTC ACGGCGGCCG
34252                       354  6.15e-11 CCTAGTCCAG GCCTTGGCCGAAGGCAAACCC ACGGCGGCCG
43821                       372  2.54e-09 ACAAGTGGTC GTCCTTGCCACAGGCAAACAC TAGCATCGAA
39864                       361  9.12e-09 TTCTCCGTTA GACTTGCCTACGGGCACACTC ACAATTCGGC
43944                       139  6.31e-08 TAGGCCTCAA GCAATGGCCTCAAGCAACCTC GCTTTACAGA
24886                       309  6.69e-08 ATGCGACAAC GCCATGCAGAGACGCACACTG GCAGCTGCTG
42995                       387  7.98e-08 CCTTTTTACA CCCATACACACACGCACATTC ATAACAACTC
24006                       437  8.45e-08 TGCAACGATG GCCCTCTCCGGTGGCACACAC AACACAACAC
49566                       418  9.98e-08 ATTGTCATTC CTCCTACTCACGCGCAAACTC CTTCGCACAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37406                               5e-13  353_[+3]_126
38338                             1.3e-12  449_[+3]_30
31898                             1.3e-12  353_[+3]_126
34252                             6.2e-11  353_[+3]_126
43821                             2.5e-09  371_[+3]_108
39864                             9.1e-09  360_[+3]_119
43944                             6.3e-08  138_[+3]_341
24886                             6.7e-08  308_[+3]_171
42995                               8e-08  386_[+3]_93
24006                             8.4e-08  436_[+3]_43
49566                               1e-07  417_[+3]_62
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=11
37406                    (  354) GCCTTGGCCAAAGGCAAACTC  1
38338                    (  450) GCCTTGGCCGAAGGCAAACTC  1
31898                    (  354) GCCTTGGCCGAAGGCAAACTC  1
34252                    (  354) GCCTTGGCCGAAGGCAAACCC  1
43821                    (  372) GTCCTTGCCACAGGCAAACAC  1
39864                    (  361) GACTTGCCTACGGGCACACTC  1
43944                    (  139) GCAATGGCCTCAAGCAACCTC  1
24886                    (  309) GCCATGCAGAGACGCACACTG  1
42995                    (  387) CCCATACACACACGCACATTC  1
24006                    (  437) GCCCTCTCCGGTGGCACACAC  1
49566                    (  418) CTCCTACTCACGCGCAAACTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 22560 bayes= 11.3566 E= 2.6e-009
 -1010    -42    188  -1010
  -155    158  -1010    -56
  -155    190  -1010  -1010
     3     17  -1010     76
 -1010  -1010  -1010    190
   -55   -142    151   -156
 -1010     58    129   -156
   -55    158  -1010   -156
 -1010    175   -129   -156
   103  -1010     71   -156
    45     90    -29  -1010
   145  -1010    -29   -156
  -155     17    151  -1010
 -1010  -1010    217  -1010
 -1010    204  -1010  -1010
   191  -1010  -1010  -1010
   126     58  -1010  -1010
   177   -142  -1010  -1010
 -1010    190  -1010   -156
   -55   -142  -1010    144
 -1010    190   -129  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 11 E= 2.6e-009
 0.000000  0.181818  0.818182  0.000000
 0.090909  0.727273  0.000000  0.181818
 0.090909  0.909091  0.000000  0.000000
 0.272727  0.272727  0.000000  0.454545
 0.000000  0.000000  0.000000  1.000000
 0.181818  0.090909  0.636364  0.090909
 0.000000  0.363636  0.545455  0.090909
 0.181818  0.727273  0.000000  0.090909
 0.000000  0.818182  0.090909  0.090909
 0.545455  0.000000  0.363636  0.090909
 0.363636  0.454545  0.181818  0.000000
 0.727273  0.000000  0.181818  0.090909
 0.090909  0.272727  0.636364  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.636364  0.363636  0.000000  0.000000
 0.909091  0.090909  0.000000  0.000000
 0.000000  0.909091  0.000000  0.090909
 0.181818  0.090909  0.000000  0.727273
 0.000000  0.909091  0.090909  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GCC[TAC]TG[GC]CC[AG][CA]A[GC]GCA[AC]ACTC
--------------------------------------------------------------------------------

Time 55.18 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31898                            5.92e-27  66_[+1(1.24e-13)]_15_[+3(1.52e-05)]_\
    140_[+2(1.35e-13)]_69_[+3(1.33e-12)]_126
42995                            8.25e-04  386_[+3(7.98e-08)]_93
24886                            8.45e-04  308_[+3(6.69e-08)]_171
37406                            2.30e-27  66_[+1(1.24e-13)]_15_[+3(1.52e-05)]_\
    140_[+2(1.35e-13)]_69_[+3(5.04e-13)]_126
13944                            6.04e-01  500
54789                            8.26e-02  86_[+1(3.40e-05)]_393
47885                            4.12e-01  500
38338                            5.92e-27  162_[+1(1.24e-13)]_15_\
    [+3(1.52e-05)]_140_[+2(1.35e-13)]_69_[+3(1.33e-12)]_30
48046                            4.72e-01  500
48570                            4.81e-01  500
48573                            3.39e-05  116_[+1(7.65e-10)]_363
48683                            7.29e-01  500
43413                            2.66e-01  500
39786                            3.45e-01  500
49160                            6.68e-01  500
39864                            5.87e-05  360_[+3(9.12e-09)]_119
16021                            9.97e-01  500
49557                            3.41e-01  500
49566                            3.11e-04  417_[+3(9.98e-08)]_62
24006                            7.99e-04  436_[+3(8.45e-08)]_43
43821                            2.58e-05  371_[+3(2.54e-09)]_108
43828                            6.46e-01  500
43944                            4.18e-04  138_[+3(6.31e-08)]_341
44038                            7.69e-06  131_[+2(7.19e-10)]_348
34252                            2.44e-25  66_[+1(1.24e-13)]_15_[+3(1.52e-05)]_\
    140_[+2(1.35e-13)]_69_[+3(6.15e-11)]_126
45409                            7.61e-01  500
45546                            2.01e-01  500
45840                            7.42e-06  60_[+2(6.68e-10)]_419
20588                            3.88e-01  500
41318                            8.37e-06  170_[+1(5.66e-10)]_309
31511                            2.48e-02  104_[+1(1.69e-05)]_375
42608                            5.52e-01  500
48404                            5.97e-01  500
31704                            2.13e-01  500
44302                            8.38e-01  500
47209                            3.56e-01  500
34905                            6.35e-01  500
47373                            5.50e-01  500
44712                            6.48e-01  500
44880                            2.65e-01  500
39717                            1.63e-01  500
47820                            3.10e-02  219_[+2(4.31e-05)]_260
34994                            5.12e-01  500
49199                            9.75e-01  500
43983                            9.40e-01  500
49899                            3.40e-02  447_[+1(9.22e-05)]_32
43373                            2.74e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
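The motif diagram strings in the block diagram tables alternate spacer lengths with bracketed motif hits, so site start positions can be recovered by accumulating lengths from left to right. The Python sketch below is one way to do that; `parse_diagram` is a hypothetical helper, the motif width of 21 is taken from the three motif headers in this report, and a diagram wrapped with a trailing backslash is assumed to have been joined into a single string first.

```python
import re

MOTIF_WIDTH = 21  # every motif in this report has width = 21

# Matches "[+1]" as well as "[+1(1.24e-13)]"; the p-value part is optional.
HIT = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_diagram(diagram, width=MOTIF_WIDTH):
    """Return ([(motif, strand, start, p_value), ...], total_length).

    Start positions are 1-based, as in the "sites" tables.
    """
    position = 0
    sites = []
    for token in diagram.split("_"):
        hit = HIT.fullmatch(token)
        if hit:
            strand, motif, pvalue = hit.groups()
            p = float(pvalue) if pvalue else None
            sites.append((int(motif), strand, position + 1, p))
            position += width
        else:
            position += int(token)  # spacer of unmatched letters
    return sites, position


# Combined diagram for sequence 31898, with the wrapped line joined.
diagram = ("66_[+1(1.24e-13)]_15_[+3(1.52e-05)]_"
           "140_[+2(1.35e-13)]_69_[+3(1.33e-12)]_126")
sites, total = parse_diagram(diagram)
for motif, strand, start, p in sites:
    print(motif, strand, start, p)
print(total)
```

The recovered starts for motifs 1, 2 and the second motif 3 hit (67, 264 and 354) agree with the per-motif sites tables, and the accumulated length equals the sequence length of 500 given in the training set.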