********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/280/280.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11988                    1.0000    500  21611                    1.0000    500
23684                    1.0000    500  23685                    1.0000    500
25047                    1.0000    500  268793                   1.0000    500
3148                     1.0000    500  33220                    1.0000    500
40537                    1.0000    500  bd1747                   1.0000    500
bd1821                   1.0000    500  bd888                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/280/280.seqs.fa -oc motifs/280 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.233 G 0.233 T 0.273
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.233 G 0.233 T 0.273
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 20 sites = 6 llr = 111 E-value = 7.5e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  7:82:7::5a:57::a:7:8
pos.-specific     C  :7::8:38::a5:a5:a:2:
probability       G  33::232:3:::::5:::8:
matrix            T  ::28::522:::3::::3:2

         bits    2.1           *  *  *
                 1.9          **  * **
                 1.7          **  * **
                 1.5     *  * **  * ** *
Relative         1.3  ****  * **  * ** **
Entropy          1.1 ****** * ***********
(26.8 bits)      0.8 ****** * ***********
                 0.6 ******** ***********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           ACATCATCAACAACCACAGA
consensus            GG   GC G  CT G  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            --------------------
40537                       353  1.42e-11 CTCCATCATG GCATCATCAACCACCACAGA TACCTATCAA
25047                       357  1.17e-10 CTTTTGTGCC AGATCATCAACCTCCACAGA CACAACACCC
21611                       465  1.47e-09 CCAATTTGCA GCATCAGCAACCACCACACA ATTCTACCTA
bd1747                      205  4.77e-09 CTACTATTTG ACAACGTTGACAACGACAGA ATGAGATCGA
23685                       311  8.77e-09 CCGAAACACG AGATGGCCTACAACGACTGA CGTACGTGTA
268793                       32  1.02e-08 TGCGGTAGTA ACTTCACCGACATCGACTGT ATCATTTGCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40537                             1.4e-11  352_[+1]_128
25047                             1.2e-10  356_[+1]_124
21611                             1.5e-09  464_[+1]_16
bd1747                            4.8e-09  204_[+1]_276
23685                             8.8e-09  310_[+1]_170
268793                              1e-08  31_[+1]_449
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=6
40537                    (  353) GCATCATCAACCACCACAGA  1
25047                    (  357) AGATCATCAACCTCCACAGA  1
21611                    (  465) GCATCAGCAACCACCACACA  1
bd1747                   (  205) ACAACGTTGACAACGACAGA  1
23685                    (  311) AGATGGCCTACAACGACTGA  1
268793                   (   32) ACTTCACCGACATCGACTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5772 bayes= 10.3563 E= 7.5e-001
   135   -923     52   -923
  -923    152     52   -923
   167   -923   -923    -71
   -65   -923   -923    161
  -923    184    -48   -923
   135   -923     52   -923
  -923     52    -48     87
  -923    184   -923    -71
    94   -923     52    -71
   194   -923   -923   -923
  -923    210   -923   -923
    94    110   -923   -923
   135   -923   -923     29
  -923    210   -923   -923
  -923    110    110   -923
   194   -923   -923   -923
  -923    210   -923   -923
   135   -923   -923     29
  -923    -48    184   -923
   167   -923   -923    -71
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 7.5e-001
0.666667  0.000000  0.333333  0.000000
0.000000  0.666667  0.333333  0.000000
0.833333  0.000000  0.000000  0.166667
0.166667  0.000000  0.000000  0.833333
0.000000  0.833333  0.166667  0.000000
0.666667  0.000000  0.333333  0.000000
0.000000  0.333333  0.166667  0.500000
0.000000  0.833333  0.000000  0.166667
0.500000  0.000000  0.333333  0.166667
1.000000  0.000000  0.000000  0.000000
0.000000  1.000000  0.000000  0.000000
0.500000  0.500000  0.000000  0.000000
0.666667  0.000000  0.000000  0.333333
0.000000  1.000000  0.000000  0.000000
0.000000  0.500000  0.500000  0.000000
1.000000  0.000000  0.000000  0.000000
0.000000  1.000000  0.000000  0.000000
0.666667  0.000000  0.000000  0.333333
0.000000  0.166667  0.833333  0.000000
0.833333  0.000000  0.000000  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[AG][CG]ATC[AG][TC]C[AG]AC[AC][AT]C[CG]AC[AT]GA
--------------------------------------------------------------------------------

Time 1.39 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 15 sites = 9 llr = 117 E-value = 6.4e-001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  32:47:33:a6:41:
pos.-specific     C  :22:1::61::::::
probability       G  71862a7:9::a:7a
matrix            T  :4:::::1::4:62:

         bits    2.1      *     *  *
                 1.9      *   * *  *
                 1.7      *  ** *  *
                 1.5      *  ** *  *
Relative         1.3   *  *  ** *  *
Entropy          1.1   *  ** ** *  *
(18.7 bits)      0.8 * ***** *******
                 0.6 * *************
                 0.4 * *************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GTGGAGGCGAAGTGG
consensus            AACAG AA  T AT
sequence              C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------
23685                        24  1.14e-08 ACACGAATTG ATGGAGGCGAAGTGG TTGGTGTTGT
11988                       443  1.14e-08 ACTCAGCGAA GCGAAGGCGAAGTGG CTCAACATAG
bd1821                       93  1.72e-08 GAAGGCCGAG GCGGAGGCGATGAGG GAGAGAGCAA
3148                         31  4.78e-08 TTCTGTATTG GAGGAGGAGAAGAGG CAAGATGAGA
33220                       482  8.90e-07 GTGGCGTGGG GTCGAGGCGATGAAG AGGA
40537                       133  1.60e-06 GAACTCCGTG AAGAGGAAGAAGTGG TCGACTAGCA
21611                       139  2.18e-06 AACGTCAGGT ATGGCGGTGATGTGG ATGGTTGTGA
bd888                       377  3.76e-06 AGGACCACAT GGGAGGAAGATGTTG AAACTCTCTA
bd1747                       96  4.41e-06 TATAGACGGA GTCAAGACCAAGATG TGATCTCACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23685                             1.1e-08  23_[+2]_462
11988                             1.1e-08  442_[+2]_43
bd1821                            1.7e-08  92_[+2]_393
3148                              4.8e-08  30_[+2]_455
33220                             8.9e-07  481_[+2]_4
40537                             1.6e-06  132_[+2]_353
21611                             2.2e-06  138_[+2]_347
bd888                             3.8e-06  376_[+2]_109
bd1747                            4.4e-06  95_[+2]_390
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=9
23685                    (   24) ATGGAGGCGAAGTGG  1
11988                    (  443) GCGAAGGCGAAGTGG  1
bd1821                   (   93) GCGGAGGCGATGAGG  1
3148                     (   31) GAGGAGGAGAAGAGG  1
33220                    (  482) GTCGAGGCGATGAAG  1
40537                    (  133) AAGAGGAAGAAGTGG  1
21611                    (  139) ATGGCGGTGATGTGG  1
bd888                    (  377) GGGAGGAAGATGTTG  1
bd1747                   (   96) GTCAAGACCAAGATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 5832 bayes= 10.1866 E= 6.4e-001
    35   -982    152   -982
   -23     -7   -107     70
  -982     -7    174   -982
    77   -982    125   -982
   135   -106     -7   -982
  -982   -982    210   -982
    35   -982    152   -982
    35    125   -982   -130
  -982   -106    193   -982
   194   -982   -982   -982
   109   -982   -982     70
  -982   -982    210   -982
    77   -982   -982    102
  -123   -982    152    -30
  -982   -982    210   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 9 E= 6.4e-001
0.333333  0.000000  0.666667  0.000000
0.222222  0.222222  0.111111  0.444444
0.000000  0.222222  0.777778  0.000000
0.444444  0.000000  0.555556  0.000000
0.666667  0.111111  0.222222  0.000000
0.000000  0.000000  1.000000  0.000000
0.333333  0.000000  0.666667  0.000000
0.333333  0.555556  0.000000  0.111111
0.000000  0.111111  0.888889  0.000000
1.000000  0.000000  0.000000  0.000000
0.555556  0.000000  0.000000  0.444444
0.000000  0.000000  1.000000  0.000000
0.444444  0.000000  0.000000  0.555556
0.111111  0.000000  0.666667  0.222222
0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[GA][TAC][GC][GA][AG]G[GA][CA]GA[AT]G[TA][GT]G
--------------------------------------------------------------------------------

Time 2.76 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 12 llr = 117 E-value = 8.1e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :43:96:256:8
pos.-specific     C  :2::1::2::::
probability       G  837a:39714a1
matrix            T  31:::11:4::2

         bits    2.1    *      *
                 1.9    *      *
                 1.7    *  *   *
                 1.5    ** *   *
Relative         1.3 *  ** *   *
Entropy          1.1 * *** *  **
(14.1 bits)      0.8 * *** ** ***
                 0.6 * **********
                 0.4 * **********
                 0.2 ************
                 0.0 ------------

Multilevel           GAGGAAGGAAGA
consensus            TGA  G  TG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
23685                       200  4.96e-07 GATCATGGGA GAGGAGGGAGGA TGTGTTGGAG
bd1821                      434  5.80e-07 GTCCATTGGG GGGGAGGGAGGA CTCTTCAGGA
33220                       104  1.34e-06 CGGGTACGTC GCGGAGGGAAGA GGTAGAGATA
25047                         3  1.49e-06         CT GGAGAAGGTGGA ACGAAAGCTT
23684                       127  2.70e-06 TTTTGCATTG GAGGAAGCTAGA AGGGAGTTTC
3148                        327  6.39e-06 AGAGGAGGAT GGGGAGGATAGA GCCGCTCCTC
bd1747                      422  7.21e-06 AGCTACACAC GAAGAAGGAAGT GCGACAGTCC
21611                       204  1.66e-05 ACGGTGATGG TGGGAAGGGAGA GGTCTCTATC
268793                      225  4.31e-05 ATAAAATATG GTGGAATGAAGA CTGATTGACC
11988                        94  7.83e-05 GTAGGGCAAA TCAGAAGGAGGG TCAGGCCGTT
40537                       211  1.01e-04 GTAGGTGCAG TAAGAAGATGGT CCTCCGACAA
bd888                       313  1.27e-04 TCCGTTCTAC GAGGCTGCTAGA CAAGGAACGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23685                               5e-07  199_[+3]_289
bd1821                            5.8e-07  433_[+3]_55
33220                             1.3e-06  103_[+3]_385
25047                             1.5e-06  2_[+3]_486
23684                             2.7e-06  126_[+3]_362
3148                              6.4e-06  326_[+3]_162
bd1747                            7.2e-06  421_[+3]_67
21611                             1.7e-05  203_[+3]_285
268793                            4.3e-05  224_[+3]_264
11988                             7.8e-05  93_[+3]_395
40537                              0.0001  210_[+3]_278
bd888                             0.00013  312_[+3]_176
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=12
23685                    (  200) GAGGAGGGAGGA  1
bd1821                   (  434) GGGGAGGGAGGA  1
33220                    (  104) GCGGAGGGAAGA  1
25047                    (    3) GGAGAAGGTGGA  1
23684                    (  127) GAGGAAGCTAGA  1
3148                     (  327) GGGGAGGATAGA  1
bd1747                   (  422) GAAGAAGGAAGT  1
21611                    (  204) TGGGAAGGGAGA  1
268793                   (  225) GTGGAATGAAGA  1
11988                    (   94) TCAGAAGGAGGG  1
40537                    (  211) TAAGAAGATGGT  1
bd888                    (  313) GAGGCTGCTAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.37898 E= 8.1e+001
 -1023  -1023    169    -13
    67    -48     52   -171
    35  -1023    152  -1023
 -1023  -1023    210  -1023
   181   -148  -1023  -1023
   116  -1023     52   -171
 -1023  -1023    198   -171
   -65    -48    152  -1023
    94  -1023   -148     61
   116  -1023     84  -1023
 -1023  -1023    210  -1023
   152  -1023   -148    -71
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 8.1e+001
0.000000  0.000000  0.750000  0.250000
0.416667  0.166667  0.333333  0.083333
0.333333  0.000000  0.666667  0.000000
0.000000  0.000000  1.000000  0.000000
0.916667  0.083333  0.000000  0.000000
0.583333  0.000000  0.333333  0.083333
0.000000  0.000000  0.916667  0.083333
0.166667  0.166667  0.666667  0.000000
0.500000  0.000000  0.083333  0.416667
0.583333  0.000000  0.416667  0.000000
0.000000  0.000000  1.000000  0.000000
0.750000  0.000000  0.083333  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[GT][AG][GA]GA[AG]GG[AT][AG]GA
--------------------------------------------------------------------------------

Time 3.90 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11988                            1.27e-05  93_[+3(7.83e-05)]_337_\
    [+2(1.14e-08)]_43
21611                            2.17e-09  138_[+2(2.18e-06)]_50_\
    [+3(1.66e-05)]_249_[+1(1.47e-09)]_16
23684                            2.35e-02  126_[+3(2.70e-06)]_362
23685                            3.26e-12  23_[+2(1.14e-08)]_161_\
    [+3(4.96e-07)]_99_[+1(8.77e-09)]_170
25047                            1.24e-08  2_[+3(1.49e-06)]_342_[+1(1.17e-10)]_\
    124
268793                           8.40e-06  31_[+1(1.02e-08)]_173_\
    [+3(4.31e-05)]_264
3148                             1.72e-06  30_[+2(4.78e-08)]_281_\
    [+3(6.39e-06)]_162
33220                            1.89e-05  103_[+3(1.34e-06)]_366_\
    [+2(8.90e-07)]_4
40537                            1.15e-10  132_[+2(1.60e-06)]_205_\
    [+1(1.42e-11)]_46_[+1(2.09e-05)]_62
bd1747                           5.77e-09  95_[+2(4.41e-06)]_94_[+1(4.77e-09)]_\
    197_[+3(7.21e-06)]_67
bd1821                           5.10e-07  92_[+2(1.72e-08)]_326_\
    [+3(5.80e-07)]_55
bd888                            7.55e-04  376_[+2(3.76e-06)]_109
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************