********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/254/254.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9007                     1.0000    500  36620                    1.0000    500
47184                    1.0000    500  37520                    1.0000    500
5651                     1.0000    500  14783                    1.0000    500
15145                    1.0000    500  48815                    1.0000    500
3940                     1.0000    500  42364                    1.0000    500
45989                    1.0000    500  2762                     1.0000    500
49542                    1.0000    500  38786                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/254/254.seqs.fa -oc motifs/254 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 14 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 7000 N= 14 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.261 C 0.245 G 0.240 T 0.254 Background letter frequencies (from dataset with add-one prior applied): A 0.261 C 0.245 G 0.240 T 0.254 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 21 sites = 8 llr = 129 E-value = 2.1e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A a358:18::4:1:15141a:1 pos.-specific C :55:4::3:366:31::::89 probability G :1:3393:a:4:a:3954:3: matrix T :1::4::8:4:3:61:15::: bits 2.1 * * 1.9 * * * * 1.6 * * * * 1.4 * * * * * * * Relative 1.2 * * **** * * *** Entropy 1.0 * ** **** * * * *** (23.2 bits) 0.8 * ** **** * * * *** 0.6 * ** **** **** ****** 0.4 * ************ ****** 0.2 ********************* 0.0 --------------------- Multilevel ACAACGATGACCGTAGGTACC consensus ACGT GC TGT CG AG G sequence G C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- 
--------------------- 45989 285 2.40e-10 GACGTGAGGA AACACGATGTGCGTGGGTACC CAACATCCTC 15145 66 2.47e-09 TAACCGTTTT AGAGGGATGACCGTAGGGACC CACGCGAGTC 9007 107 3.53e-09 CTCACAGTCA ATAATGATGCGTGTAGGTACC GGTGTTCTCT 48815 242 6.23e-09 GCTGTCTCCC ACCATGACGCGCGCGGATACC AAGGACCAAA 42364 362 3.88e-08 CAACTTCGCG ACCATGATGACAGCAGTGAGC GTGACCGCCG 14783 402 4.52e-08 ACCGTTAGCG ACAACGGCGTCCGTAAATAGC TCTACACTCC 36620 243 8.98e-08 CAAGGACACA AACACGATGACCGATGAGACA CAATTGTCGA 5651 180 2.99e-07 AATTCGTTGG ACAGGAGTGTCTGTCGGAACC CCTTGGAATT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 45989 2.4e-10 284_[+1]_195 15145 2.5e-09 65_[+1]_414 9007 3.5e-09 106_[+1]_373 48815 6.2e-09 241_[+1]_238 42364 3.9e-08 361_[+1]_118 14783 4.5e-08 401_[+1]_78 36620 9e-08 242_[+1]_237 5651 3e-07 179_[+1]_300 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=8 45989 ( 285) AACACGATGTGCGTGGGTACC 1 15145 ( 66) AGAGGGATGACCGTAGGGACC 1 9007 ( 107) ATAATGATGCGTGTAGGTACC 1 48815 ( 242) ACCATGACGCGCGCGGATACC 1 42364 ( 362) ACCATGATGACAGCAGTGAGC 1 14783 ( 402) ACAACGGCGTCCGTAAATAGC 1 36620 ( 243) AACACGATGACCGATGAGACA 1 5651 ( 180) ACAGGAGTGTCTGTCGGAACC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 
6720 bayes= 10.4502 E= 2.1e+000 194 -965 -965 -965 -6 103 -94 -102 94 103 -965 -965 152 -965 6 -965 -965 62 6 56 -106 -965 186 -965 152 -965 6 -965 -965 3 -965 156 -965 -965 206 -965 52 3 -965 56 -965 135 64 -965 -106 135 -965 -2 -965 -965 206 -965 -106 3 -965 130 94 -97 6 -102 -106 -965 186 -965 52 -965 106 -102 -106 -965 64 98 194 -965 -965 -965 -965 161 6 -965 -106 184 -965 -965 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 2.1e+000 1.000000 0.000000 0.000000 0.000000 0.250000 0.500000 0.125000 0.125000 0.500000 0.500000 0.000000 0.000000 0.750000 0.000000 0.250000 0.000000 0.000000 0.375000 0.250000 0.375000 0.125000 0.000000 0.875000 0.000000 0.750000 0.000000 0.250000 0.000000 0.000000 0.250000 0.000000 0.750000 0.000000 0.000000 1.000000 0.000000 0.375000 0.250000 0.000000 0.375000 0.000000 0.625000 0.375000 0.000000 0.125000 0.625000 0.000000 0.250000 0.000000 0.000000 1.000000 0.000000 0.125000 0.250000 0.000000 0.625000 0.500000 0.125000 0.250000 0.125000 0.125000 0.000000 0.875000 0.000000 0.375000 0.000000 0.500000 0.125000 0.125000 0.000000 0.375000 0.500000 1.000000 0.000000 0.000000 0.000000 0.000000 0.750000 0.250000 0.000000 0.125000 0.875000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- A[CA][AC][AG][CTG]G[AG][TC]G[ATC][CG][CT]G[TC][AG]G[GA][TG]A[CG]C -------------------------------------------------------------------------------- Time 1.71 secs. 
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 3 llr = 66 E-value = 4.2e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::::37:::::::::33:::: pos.-specific C ::3:7::a::a::a:::3333 probability G a37a:3a:3a:aa:a377777 matrix T :7::::::7::::::3::::: bits 2.1 * * ** ****** 1.9 * * ** ****** 1.6 * * ** ****** 1.4 * * ** ****** Relative 1.2 * * ** ****** Entropy 1.0 *************** ***** (32.0 bits) 0.8 *************** ***** 0.6 *************** ***** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel GTGGCAGCTGCGGCGAGGGGG consensus GC AG G GACCCC sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 5651 89 2.18e-12 ACTTGGACTC GTGGCAGCTGCGGCGGGGGGC TGGCGGGAAT 9007 367 9.70e-11 TGTTCCGCCA GTGGCGGCGGCGGCGTAGCGG ATCCCGACCA 14783 171 2.50e-10 ACACACATTT GGCGAAGCTGCGGCGAGCGCG CCCTTCCATG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 5651 2.2e-12 88_[+2]_391 9007 9.7e-11 366_[+2]_113 14783 2.5e-10 170_[+2]_309 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=3 5651 ( 89) GTGGCAGCTGCGGCGGGGGGC 1 9007 ( 367) GTGGCGGCGGCGGCGTAGCGG 1 14783 ( 171) GGCGAAGCTGCGGCGAGCGCG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 6720 bayes= 10.7874 E= 4.2e+002 -823 -823 206 -823 -823 -823 47 139 -823 44 147 -823 -823 -823 206 -823 35 144 -823 -823 135 -823 47 -823 -823 -823 206 -823 -823 203 -823 -823 -823 -823 47 139 -823 -823 206 -823 -823 203 -823 -823 -823 -823 206 -823 -823 -823 206 -823 -823 203 -823 -823 -823 -823 206 -823 35 -823 47 39 35 -823 147 -823 -823 44 147 -823 -823 44 147 -823 -823 44 147 -823 -823 44 147 -823 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 3 E= 4.2e+002 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.333333 0.666667 0.000000 0.333333 0.666667 0.000000 0.000000 0.000000 1.000000 0.000000 0.333333 0.666667 0.000000 0.000000 0.666667 0.000000 0.333333 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.333333 0.666667 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 
0.000000 1.000000 0.000000 0.333333 0.000000 0.333333 0.333333 0.333333 0.000000 0.666667 0.000000 0.000000 0.333333 0.666667 0.000000 0.000000 0.333333 0.666667 0.000000 0.000000 0.333333 0.666667 0.000000 0.000000 0.333333 0.666667 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- G[TG][GC]G[CA][AG]GC[TG]GCGGCG[AGT][GA][GC][GC][GC][GC] -------------------------------------------------------------------------------- Time 3.32 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 12 sites = 14 llr = 131 E-value = 5.0e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :::::2:41::: pos.-specific C :2:::1::::2a probability G 411:367:234: matrix T 669a7:36674: bits 2.1 * * 1.9 * * 1.6 * * 1.4 ** * Relative 1.2 *** * * * Entropy 1.0 * *** ** * * (13.5 bits) 0.8 * ****** * * 0.6 ********** * 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel TTTTTGGTTTGC consensus GC GATAGGT sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 37520 287 5.87e-08 AACCTTGAGG TTTTTGGTTTGC TCTTCCAATA 48815 87 2.95e-07 TCAGACGCGT GTTTTGGTTTTC CCACGCTGAG 38786 6 3.70e-06 GTGGA 
GCTTTGGATTGC GACTATGAGA 3940 277 6.32e-06 ACCGTCCAAT TTTTTGGAATTC CAGCAAGAGA 9007 179 7.23e-06 GTGGAACGAT GTTTTGGAATGC ACCAATGCCT 42364 98 1.88e-05 ATCGAGCAGA TTTTTCGTGTTC AGAAAACATC 5651 469 2.10e-05 TCCAACTTTC TTTTGGTATTCC CAAGCACCGA 47184 344 2.31e-05 GTTTAATTTC TCTTTAGTTTCC GGGTTATTTA 14783 108 3.65e-05 GCGCGGTGAG GGTTGGGTTGGC AGGGGTCGAG 49542 162 5.06e-05 TTACATTGGC GTTTGAGATGTC AAAGCTTTCT 45989 317 6.75e-05 AACATCCTCA TTTTGGTAGGGC GAACCCTATC 36620 48 7.21e-05 CTTGCTCATG GTGTTGTTGTTC GGCTTTTGTG 15145 294 1.03e-04 CCTGGTTTGT TGGTTCGTTTGC GTTCCCGCGT 2762 484 1.08e-04 TCTTCTTGCT TCTTTATTTGCC TCAGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 37520 5.9e-08 286_[+3]_202 48815 3e-07 86_[+3]_402 38786 3.7e-06 5_[+3]_483 3940 6.3e-06 276_[+3]_212 9007 7.2e-06 178_[+3]_310 42364 1.9e-05 97_[+3]_391 5651 2.1e-05 468_[+3]_20 47184 2.3e-05 343_[+3]_145 14783 3.7e-05 107_[+3]_381 49542 5.1e-05 161_[+3]_327 45989 6.7e-05 316_[+3]_172 36620 7.2e-05 47_[+3]_441 15145 0.0001 293_[+3]_195 2762 0.00011 483_[+3]_5 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=12 seqs=14 37520 ( 287) TTTTTGGTTTGC 1 48815 ( 87) GTTTTGGTTTTC 1 38786 ( 6) GCTTTGGATTGC 1 3940 ( 277) TTTTTGGAATTC 1 9007 ( 179) GTTTTGGAATGC 1 42364 ( 98) TTTTTCGTGTTC 1 5651 ( 469) TTTTGGTATTCC 1 47184 ( 344) TCTTTAGTTTCC 1 14783 ( 108) GGTTGGGTTGGC 1 49542 ( 162) GTTTGAGATGTC 1 45989 ( 317) TTTTGGTAGGGC 1 36620 ( 48) GTGTTGTTGTTC 1 15145 ( 294) TGGTTCGTTTGC 1 2762 ( 484) 
TCTTTATTTGCC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 8.93074 E= 5.0e+002 -1045 -1045 84 117 -1045 -19 -75 134 -1045 -1045 -75 175 -1045 -1045 -1045 198 -1045 -1045 25 149 -29 -78 142 -1045 -1045 -1045 157 17 71 -1045 -1045 117 -87 -1045 -16 134 -1045 -1045 25 149 -1045 -19 84 49 -1045 203 -1045 -1045 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 5.0e+002 0.000000 0.000000 0.428571 0.571429 0.000000 0.214286 0.142857 0.642857 0.000000 0.000000 0.142857 0.857143 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.285714 0.714286 0.214286 0.142857 0.642857 0.000000 0.000000 0.000000 0.714286 0.285714 0.428571 0.000000 0.000000 0.571429 0.142857 0.000000 0.214286 0.642857 0.000000 0.000000 0.285714 0.714286 0.000000 0.214286 0.428571 0.357143 0.000000 1.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [TG][TC]TT[TG][GA][GT][TA][TG][TG][GTC]C -------------------------------------------------------------------------------- Time 4.92 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9007                             1.89e-13  106_[+1(3.53e-09)]_51_\
    [+3(7.23e-06)]_176_[+2(9.70e-11)]_113
36620                            1.18e-04  47_[+3(7.21e-05)]_183_\
    [+1(8.98e-08)]_237
47184                            1.04e-02  343_[+3(2.31e-05)]_145
37520                            1.45e-03  286_[+3(5.87e-08)]_202
5651                             9.45e-13  88_[+2(2.18e-12)]_70_[+1(2.99e-07)]_\
    268_[+3(2.10e-05)]_20
14783                            2.31e-11  107_[+3(3.65e-05)]_51_\
    [+2(2.50e-10)]_210_[+1(4.52e-08)]_78
15145                            6.86e-07  65_[+1(2.47e-09)]_414
48815                            4.39e-08  86_[+3(2.95e-07)]_143_\
    [+1(6.23e-09)]_238
3940                             2.92e-02  83_[+3(9.21e-05)]_181_\
    [+3(6.32e-06)]_212
42364                            5.90e-06  97_[+3(1.88e-05)]_252_\
    [+1(3.88e-08)]_118
45989                            7.79e-07  284_[+1(2.40e-10)]_11_\
    [+3(6.75e-05)]_172
2762                             2.45e-01  500
49542                            1.92e-01  161_[+3(5.06e-05)]_327
38786                            1.19e-02  5_[+3(3.70e-06)]_483
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
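
The "Motif 1 regular expression" reported above can be applied directly to a sequence to locate candidate sites. The following is a minimal Python sketch, not part of the MEME output. It assumes forward-strand scanning only (the run used strands: +) and uses the site row for sequence 45989 (left flank, site, right flank) as test input. The helper name find_sites is hypothetical.

```python
import re

# Motif 1 regular expression, copied verbatim from the MEME report.
MOTIF_1 = re.compile(
    r"A[CA][AC][AG][CTG]G[AG][TC]G[ATC][CG][CT]G[TC][AG]G[GA][TG]A[CG]C"
)

# Site row for sequence 45989 from the report: left flank + site + right flank.
fragment = "GACGTGAGGA" + "AACACGATGTGCGTGGGTACC" + "CAACATCCTC"


def find_sites(seq, pattern=MOTIF_1):
    """Return (0-based offset, matched text) for each non-overlapping
    forward-strand match of the pattern in seq (hypothetical helper)."""
    return [(m.start(), m.group()) for m in pattern.finditer(seq)]


hits = find_sites(fragment)
print(hits)  # [(10, 'AACACGATGTGCGTGGGTACC')]
```

The offset 10 is relative to this 41-letter fragment, not to the full 500-letter sequence. A regular expression only accepts or rejects a site; it does not reproduce the position p-values, which MEME derives from the position-specific scoring matrix.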