********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/326/326.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
28222                    1.0000    500  13356                    1.0000    500
52138                    1.0000    500  54778                    1.0000    500
49657                    1.0000    500  16069                    1.0000    500
44546                    1.0000    500  51830                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/326/326.seqs.fa -oc motifs/326 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.280 C 0.239 G 0.227 T 0.254
Background letter frequencies (from dataset with add-one prior applied):
A 0.279 C 0.240 G 0.227 T 0.254
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =   20  sites =   4  llr = 77  E-value = 4.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::53:::5:53::a:3::::
pos.-specific     C  ::3:8::5a355:::8:83:
probability       G  8:3:33a:::3:a:a:3:8a
matrix            T  3a:8:8:::3:5::::83::

         bits    2.1       * *   * *    *
                 1.9  *    * *   ***    *
                 1.7  *    * *   ***    *
                 1.5  *    * *   ***    *
Relative         1.3 **  *** *   ********
Entropy          1.1 ** **** *  *********
(27.7 bits)      0.9 ** ******  *********
                 0.6 ** ******  *********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GTATCTGACACCGAGCTCGG
consensus            T CAGG C CAT   AGTC
sequence               G      TG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
54778                       291  1.89e-11 GCCGTATCCA GTCTCTGCCCCCGAGCTCGG AAACCTCAGT
52138                       274  5.96e-11 CAACCTACCG GTATCTGACTGTGAGCTCGG ACCTGCTAGC
13356                         7  3.70e-09     AGCTTG GTATGTGCCACTGAGATTCG TTTTTTAATC
16069                       327  8.88e-09 TTTCCGCTTC TTGACGGACAACGAGCGCGG TGGATTGTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54778                             1.9e-11  290_[+1]_190
52138                               6e-11  273_[+1]_207
13356                             3.7e-09  6_[+1]_474
16069                             8.9e-09  326_[+1]_154
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=4
54778                    (  291) GTCTCTGCCCCCGAGCTCGG  1
52138                    (  274) GTATCTGACTGTGAGCTCGG  1
13356                    (    7) GTATGTGCCACTGAGATTCG  1
16069                    (  327) TTGACGGACAACGAGCGCGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3848 bayes= 10.646 E= 4.2e+002
  -865   -865    172     -2
  -865   -865   -865    197
    84      6     14   -865
   -16   -865   -865    156
  -865    164     14   -865
  -865   -865     14    156
  -865   -865    214   -865
    84    106   -865   -865
  -865    206   -865   -865
    84      6   -865     -2
   -16    106     14   -865
  -865    106   -865     97
  -865   -865    214   -865
   184   -865   -865   -865
  -865   -865    214   -865
   -16    164   -865   -865
  -865   -865     14    156
  -865    164   -865     -2
  -865      6    172   -865
  -865   -865    214   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 4.2e+002
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.500000  0.250000  0.250000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.250000  0.000000  0.250000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GT]T[ACG][TA][CG][TG]G[AC]C[ACT][CAG][CT]GAG[CA][TG][CT][GC]G
--------------------------------------------------------------------------------




Time  0.53 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =   20  sites =   3  llr = 65  E-value = 7.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a::::7a:::a3:3::3::
pos.-specific     C  ::::a:::737::a3a:3a:
probability       G  a::a:a3::73:3:3:7::a
matrix            T  ::a:::::3:::3:::33::

         bits    2.1 *  ***       * *  **
                 1.9 ****** *   * * *  **
                 1.7 ****** *   * * *  **
                 1.5 ****** *   * * *  **
Relative         1.3 ****** * * * * *  **
Entropy          1.1 ************ * ** **
(31.2 bits)      0.9 ************ * ** **
                 0.6 ************ * ** **
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GATGCGAACGCAACACGACG
consensus                  G TCG G C TC
sequence                         T G  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
49657                       364  6.79e-11 TCTTTCGCAC GATGCGAACGGATCACGCCG ACCACACGAG
13356                        92  1.37e-10 AATATACAGT GATGCGGATGCAGCCCGACG AAGCAGTTTA
51830                        87  2.37e-10 CCTTGGAGCG GATGCGAACCCAACGCTTCG GAAACGAATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49657                             6.8e-11  363_[+2]_117
13356                             1.4e-10  91_[+2]_389
51830                             2.4e-10  86_[+2]_394
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=3
49657                    (  364) GATGCGAACGGATCACGCCG  1
13356                    (   92) GATGCGGATGCAGCCCGACG  1
51830                    (   87) GATGCGAACCCAACGCTTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3848 bayes= 10.7716 E= 7.0e+002
  -823   -823    214   -823
   184   -823   -823   -823
  -823   -823   -823    197
  -823   -823    214   -823
  -823    206   -823   -823
  -823   -823    214   -823
   125   -823     55   -823
   184   -823   -823   -823
  -823    147   -823     39
  -823     48    155   -823
  -823    147     55   -823
   184   -823   -823   -823
    25   -823     55     39
  -823    206   -823   -823
    25     48     55   -823
  -823    206   -823   -823
  -823   -823    155     39
    25     48   -823     39
  -823    206   -823   -823
  -823   -823    214   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 3 E= 7.0e+002
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.666667  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.333333  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.333333  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.333333  0.333333  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GATGCG[AG]A[CT][GC][CG]A[AGT]C[ACG]C[GT][ACT]CG
--------------------------------------------------------------------------------




Time  1.13 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =   21  sites =   5  llr = 89  E-value = 5.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :84::2:8:::4:a:6:a:4:
pos.-specific     C  ::2a44::6:22::626:::a
probability       G  a:2:62a2::6:8:224:66:
matrix            T  :22::2::4a242:2:::4::

         bits    2.1 *  *  *             *
                 1.9 *  *  *  *   *   *  *
                 1.7 *  *  *  *   *   *  *
                 1.5 *  *  *  *   *   *  *
Relative         1.3 *  *  ** *  **   *  *
Entropy          1.1 ** ** ****  **  *****
(25.8 bits)      0.9 ** ** ****  **  *****
                 0.6 ** ** ***** *********
                 0.4 ** ** ***************
                 0.2 ** ** ***************
                 0.0 ---------------------

Multilevel           GAACGCGACTGAGACACAGGC
consensus             TC CA GT CTT GCG TA
sequence               G  G    TC  TG
                       T  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
51830                         1  1.43e-10          . GAGCCCGATTGTGACACAGAC AAAAGGACAC
28222                         1  2.50e-09          . GAACGAGACTGATATACAGGC GCTCATGCTG
16069                        41  4.51e-09 GTACAGCAAT GTCCCCGACTCCGACACAGGC CCACGATGAA
54778                       125  4.51e-09 AAATCGCAGC GAACGGGATTGAGAGGGATGC CAGCTCCAAT
49657                       241  2.37e-08 CGAACTTTTG GATCGTGGCTTTGACCGATAC AACATGCCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
51830                             1.4e-10  [+3]_479
28222                             2.5e-09  [+3]_479
16069                             4.5e-09  40_[+3]_439
54778                             4.5e-09  124_[+3]_355
49657                             2.4e-08  240_[+3]_239
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
51830                    (    1) GAGCCCGATTGTGACACAGAC  1
28222                    (    1) GAACGAGACTGATATACAGGC  1
16069                    (   41) GTCCCCGACTCCGACACAGGC  1
54778                    (  125) GAACGGGATTGAGAGGGATGC  1
49657                    (  241) GATCGTGGCTTTGACCGATAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3840 bayes= 9.83492 E= 5.5e+002
  -897   -897    214   -897
   152   -897   -897    -35
    52    -26    -18    -35
  -897    206   -897   -897
  -897     74    140   -897
   -48     74    -18    -35
  -897   -897    214   -897
   152   -897    -18   -897
  -897    132   -897     65
  -897   -897   -897    197
  -897    -26    140    -35
    52    -26   -897     65
  -897   -897    182    -35
   184   -897   -897   -897
  -897    132    -18    -35
   110    -26    -18   -897
  -897    132     82   -897
   184   -897   -897   -897
  -897   -897    140     65
    52   -897    140   -897
  -897    206   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 5.5e+002
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.400000  0.200000  0.200000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
 0.200000  0.400000  0.200000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.600000  0.200000
 0.400000  0.200000  0.000000  0.400000
 0.000000  0.000000  0.800000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.600000  0.200000  0.200000
 0.600000  0.200000  0.200000  0.000000
 0.000000  0.600000  0.400000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.600000  0.400000
 0.400000  0.000000  0.600000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[AT][ACGT]C[GC][CAGT]G[AG][CT]T[GCT][ATC][GT]A[CGT][ACG][CG]A[GT][GA]C
--------------------------------------------------------------------------------




Time  1.63 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
28222                            6.49e-05  [+3(2.50e-09)]_479
13356                            1.53e-11  6_[+1(3.70e-09)]_65_[+2(1.37e-10)]_\
    389
52138                            7.64e-07  273_[+1(5.96e-11)]_207
54778                            4.44e-12  124_[+3(4.51e-09)]_145_\
    [+1(1.89e-11)]_190
49657                            7.64e-11  240_[+3(2.37e-08)]_102_\
    [+2(6.79e-11)]_117
16069                            9.91e-10  40_[+3(4.51e-09)]_265_\
    [+1(8.88e-09)]_154
44546                            4.04e-01  500
51830                            3.69e-12  [+3(1.43e-10)]_65_[+2(2.37e-10)]_\
    394
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************