********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/80/80.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17794                    1.0000    500  46877                    1.0000    500
47011                    1.0000    500  43724                    1.0000    500
43826                    1.0000    500  50578                    1.0000    500
45310                    1.0000    500  38587                    1.0000    500
46373                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/80/80.seqs.fa -oc motifs/80 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.288 C 0.229 G 0.217 T 0.266
Background letter frequencies (from dataset with add-one prior applied):
A 0.288 C 0.229 G 0.217 T 0.266
********************************************************************************

********************************************************************************
MOTIF  1 MEME    width =  15  sites =   4  llr = 65  E-value = 1.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1
Description
--------------------------------------------------------------------------------
Simplified        A  :::3a::533:::::
pos.-specific     C  :58:::a::5:::8a
probability       G  a338:::5::aa:3:
matrix            T  :3:::a::83::a::

         bits    2.2 *     *   **  *
                 2.0 *    **   *** *
                 1.8 *   ***   *** *
                 1.5 *   ***   *** *
Relative         1.3 * *****   *****
Entropy          1.1 * ******* *****
(23.6 bits)      0.9 * ******* *****
                 0.7 ********* *****
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GCCGATCATCGGTCC
consensus             GGA   GAA   G
sequence              T       T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
46877                       257  1.23e-09 TCCAATGTTG GGCGATCGTCGGTCC GAAAGATGAT
50578                       272  1.21e-08 AAGTCCAAGT GTCGATCGTAGGTCC GGTGTCAGGA
43724                       181  4.59e-08 AAATGAGAAA GCCAATCAACGGTCC CGGACGCAAT
17794                       436  6.12e-08 ACACTTTCAG GCGGATCATTGGTGC TGCAGCTTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46877                             1.2e-09  256_[+1]_229
50578                             1.2e-08  271_[+1]_214
43724                             4.6e-08  180_[+1]_305
17794                             6.1e-08  435_[+1]_50
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=4
46877                    (  257) GGCGATCGTCGGTCC  1
50578                    (  272) GTCGATCGTAGGTCC  1
43724                    (  181) GCCAATCAACGGTCC  1
17794                    (  436) GCGGATCATTGGTGC  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4374 bayes= 10.8309 E= 1.9e+002
  -865   -865    220   -865
  -865    112     20     -9
  -865    171     20   -865
   -20   -865    179   -865
   179   -865   -865   -865
  -865   -865   -865    191
  -865    212   -865   -865
    80   -865    120   -865
   -20   -865   -865    149
   -20    112   -865     -9
  -865   -865    220   -865
  -865   -865    220   -865
  -865   -865   -865    191
  -865    171     20   -865
  -865    212   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 4 E= 1.9e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.000000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.250000  0.500000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[CGT][CG][GA]ATC[AG][TA][CAT]GGT[CG]C
--------------------------------------------------------------------------------

Time  0.76 secs.
********************************************************************************

********************************************************************************
MOTIF  2 MEME    width =  15  sites =   3  llr = 55  E-value = 3.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::aaa3a::3:7::
pos.-specific     C  a:3:::7::::::a:
probability       G  :a7:::::aa:a::a
matrix            T  ::::::::::7:3::

         bits    2.2 **      ** * **
                 2.0 **      ** * **
                 1.8 ** *** *** * **
                 1.5 ** *** *** * **
Relative         1.3 ****** *** * **
Entropy          1.1 ********** * **
(26.7 bits)      0.9 ***************
                 0.7 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CGGAAACAGGTGACG
consensus              C   A   A T
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
17794                       460  6.61e-10 CTGCAGCTTT CGGAAACAGGTGACG AAGCCGATAA
46877                       414  4.82e-09 ATTACGGATC CGGAAACAGGAGTCG GACGGTGACC
38587                        38  7.22e-09 CAAAAAATTG CGCAAAAAGGTGACG CGGTCGGACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17794                             6.6e-10  459_[+2]_26
46877                             4.8e-09  413_[+2]_72
38587                             7.2e-09  37_[+2]_448
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=3
17794                    (  460) CGGAAACAGGTGACG  1
46877                    (  414) CGGAAACAGGAGTCG  1
38587                    (   38) CGCAAAAAGGTGACG  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4374 bayes= 10.1675 E= 3.3e+002
  -823    212   -823   -823
  -823   -823    220   -823
  -823     54    162   -823
   179   -823   -823   -823
   179   -823   -823   -823
   179   -823   -823   -823
    21    154   -823   -823
   179   -823   -823   -823
  -823   -823    220   -823
  -823   -823    220   -823
    21   -823   -823    132
  -823   -823    220   -823
   121   -823   -823     33
  -823    212   -823   -823
  -823   -823    220   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 3 E= 3.3e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CG[GC]AAA[CA]AGG[TA]G[AT]CG
--------------------------------------------------------------------------------

Time  1.49 secs.

********************************************************************************

********************************************************************************
MOTIF  3 MEME    width =  19  sites =   4  llr = 75  E-value = 5.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::8::::::::::::
pos.-specific     C  :a3:8:::3a:3:53:38:
probability       G  ::3:3a::::a38:3853:
matrix            T  a:5a::3a8::535533:a

         bits    2.2  *   *   **
                 2.0 ** * * * **       *
                 1.8 ** * * * **       *
                 1.5 ** * * * **       *
Relative         1.3 ** *** * ** *  * **
Entropy          1.1 ** ******** ** * **
(27.0 bits)      0.9 ** ******** ** * **
                 0.7 ** ******** ** ****
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           TCTTCGATTCGTGCTGGCT
consensus              C G T C  CTTCTCG
sequence               G        G  G T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
50578                       224  1.03e-09 GGAAGGGTGT TCTTCGATTCGGGTCTGCT ATCACTTCCG
46877                       334  1.68e-09 ATCGATGGTA TCCTCGATCCGTGTGGCCT GTCGTTTCCA
47011                       456  1.90e-09 TGCAAGACTT TCGTCGTTTCGCGCTGTCT ACTAGAGCGA
17794                        65  2.87e-09 TGAAGAAAGA TCTTGGATTCGTTCTGGGT CACGACTAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50578                               1e-09  223_[+3]_258
46877                             1.7e-09  333_[+3]_148
47011                             1.9e-09  455_[+3]_26
17794                             2.9e-09  64_[+3]_417
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=4
50578                    (  224) TCTTCGATTCGGGTCTGCT  1
46877                    (  334) TCCTCGATCCGTGTGGCCT  1
47011                    (  456) TCGTCGTTTCGCGCTGTCT  1
17794                    (   65) TCTTGGATTCGTTCTGGGT  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 4338 bayes= 10.0815 E= 5.6e+002
  -865   -865   -865    191
  -865    212   -865   -865
  -865     12     20     91
  -865   -865   -865    191
  -865    171     20   -865
  -865   -865    220   -865
   138   -865   -865     -9
  -865   -865   -865    191
  -865     12   -865    149
  -865    212   -865   -865
  -865   -865    220   -865
  -865     12     20     91
  -865   -865    179     -9
  -865    112   -865     91
  -865     12     20     91
  -865   -865    179     -9
  -865     12    120     -9
  -865    171     20   -865
  -865   -865   -865    191
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 5.6e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.250000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.250000  0.500000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.250000  0.250000  0.500000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TC[TCG]T[CG]G[AT]T[TC]CG[TCG][GT][CT][TCG][GT][GCT][CG]T
--------------------------------------------------------------------------------

Time  2.26 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************
--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17794                            1.05e-14  64_[+3(2.87e-09)]_352_\
    [+1(6.12e-08)]_9_[+2(6.61e-10)]_26
46877                            1.02e-15  256_[+1(1.23e-09)]_62_\
    [+3(1.68e-09)]_61_[+2(4.82e-09)]_72
47011                            1.64e-05  455_[+3(1.90e-09)]_26
43724                            1.19e-03  180_[+1(4.59e-08)]_305
43826                            8.29e-01  500
50578                            2.09e-10  223_[+3(1.03e-09)]_29_\
    [+1(1.21e-08)]_214
45310                            8.68e-01  500
38587                            7.92e-05  37_[+2(7.22e-09)]_448
46373                            5.73e-01  500
--------------------------------------------------------------------------------
********************************************************************************
********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************