********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/267/267.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37215                    1.0000    500  48354                    1.0000    500
43562                    1.0000    500  39515                    1.0000    500
49496                    1.0000    500  33458                    1.0000    500
35830                    1.0000    500  44703                    1.0000    500
33952                    1.0000    500  44324                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/267/267.seqs.fa -oc motifs/267 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.224 G 0.218 T 0.288
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.224 G 0.218 T 0.288
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   8  llr = 95  E-value = 6.7e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a391413a::89
pos.-specific     C  :6:::9:::a:1
probability       G  :1184:8:a:::
matrix            T  :::13:::::3:

         bits    2.2         **
                 2.0 *      ***
                 1.8 *      ***
                 1.5 *    * ***
Relative         1.3 * *  ***** *
Entropy          1.1 * ** *******
(17.1 bits)      0.9 **** *******
                 0.7 **** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ACAGACGAGCAA
consensus             A  G A   T
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
37215                        22  3.18e-07 CCAGGGTTTT AAAGACGAGCAA CAAAGTCGAG
35830                       447  6.78e-07 CTGGTGCTTT ACAGTCGAGCTA TTTACTGTTG
33952                        22  8.22e-07 CAGACATTGG ACAAGCGAGCAA TTTCCACGTC
44703                       207  8.22e-07 CTGTCGAAAG ACGGACGAGCAA CATCTACACC
48354                       450  2.48e-06 TACATCACCA ACAGGCAAGCAC TTACTGTCGA
43562                       108  3.18e-06 GTTGTTAACG ACATGCAAGCAA ACACAATATT
49496                       150  3.45e-06 GAAAACCCGA AGAGTCGAGCTA CCCTACCCTA
39515                       303  3.82e-06 GACCTAAGAA AAAGAAGAGCAA AGAGTAGGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37215                             3.2e-07  21_[+1]_467
35830                             6.8e-07  446_[+1]_42
33952                             8.2e-07  21_[+1]_467
44703                             8.2e-07  206_[+1]_282
48354                             2.5e-06  449_[+1]_39
43562                             3.2e-06  107_[+1]_381
49496                             3.5e-06  149_[+1]_339
39515                             3.8e-06  302_[+1]_186
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=8
37215                    (   22) AAAGACGAGCAA  1
35830                    (  447) ACAGTCGAGCTA  1
33952                    (   22) ACAAGCGAGCAA  1
44703                    (  207) ACGGACGAGCAA  1
48354                    (  450) ACAGGCAAGCAC  1
43562                    (  108) ACATGCAAGCAA  1
49496                    (  150) AGAGTCGAGCTA  1
39515                    (  303) AAAGAAGAGCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 9.25326 E= 6.7e-001
   188  -965  -965  -965
   -11   148   -80  -965
   169  -965   -80  -965
  -111  -965   178  -120
    47  -965    78   -20
  -111   196  -965  -965
   -11  -965   178  -965
   188  -965  -965  -965
  -965  -965   220  -965
  -965   216  -965  -965
   147  -965  -965   -20
   169   -84  -965  -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 6.7e-001
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.625000  0.125000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.125000  0.000000  0.750000  0.125000
 0.375000  0.000000  0.375000  0.250000
 0.125000  0.875000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.875000  0.125000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
A[CA]AG[AGT]C[GA]AGC[AT]A
--------------------------------------------------------------------------------

Time  1.04 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  14  sites =  10  llr = 111  E-value = 3.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  4531:a::281822
pos.-specific     C  6:::7:1:319:::
probability       G  :::53:1:4:::8:
matrix            T  :574::8a11:2:8

         bits    2.2
                 2.0      *
                 1.8      * *  *
                 1.5      * *  *
Relative         1.3     ** *  * *
Entropy          1.1 *   ** *  ****
(16.0 bits)      0.9 *** **** *****
                 0.7 ******** *****
                 0.4 ******** *****
                 0.2 **************
                 0.0 --------------

Multilevel           CATGCATTGACAGT
consensus            ATATG   C  TAA
sequence                     A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
39515                       152  8.91e-09 ATCAAGGCAT CTTGCATTGACAGT ACTGCCGCAC
44703                       456  1.03e-07 ACCACCCGTC AATTCATTGACAGT GTGAGTCCCT
48354                       424  2.06e-06 GCTATAGAGT CTAGGATTGACTGT TTTACATCAC
37215                       365  3.18e-06 ACTACCTTAC CTATCACTCACAGT CACATGCGCG
49496                       278  3.55e-06 GTTCACAAAA AATGGATTCCCAGT AGCCGTCATG
33952                       439  4.90e-06 AATTTGTGAT CTTGCATTAACAAA GTGATTGACT
44324                       229  6.25e-06 AACTGGGATC CATTCATTGTCAAT CTTCCCTTCT
43562                       279  6.25e-06 ACCTTTTACA ATTACATTTACAGT TACGGAAGTT
35830                         9  1.03e-05   TTCAACGA AATGGATTAAAAGT TAGACGCAAT
33458                       300  2.83e-05 CTCAGGCCTT CAATCAGTCACTGA AAAAGATTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39515                             8.9e-09  151_[+2]_335
44703                               1e-07  455_[+2]_31
48354                             2.1e-06  423_[+2]_63
37215                             3.2e-06  364_[+2]_122
49496                             3.6e-06  277_[+2]_209
33952                             4.9e-06  438_[+2]_48
44324                             6.2e-06  228_[+2]_258
43562                             6.2e-06  278_[+2]_208
35830                               1e-05  8_[+2]_478
33458                             2.8e-05  299_[+2]_187
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=10
39515                    (  152) CTTGCATTGACAGT  1
44703                    (  456) AATTCATTGACAGT  1
48354                    (  424) CTAGGATTGACTGT  1
37215                    (  365) CTATCACTCACAGT  1
49496                    (  278) AATGGATTCCCAGT  1
33952                    (  439) CTTGCATTAACAAA  1
44324                    (  229) CATTCATTGTCAAT  1
43562                    (  279) ATTACATTTACAGT  1
35830                    (    9) AATGGATTAAAAGT  1
33458                    (  300) CAATCAGTCACTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 4870 bayes= 8.92481 E= 3.8e+001
    56   142  -997  -997
    89  -997  -997    80
    15  -997  -997   128
  -143  -997   120    48
  -997   164    46  -997
   188  -997  -997  -997
  -997  -116  -112   148
  -997  -997  -997   180
   -44    42    88  -152
   156  -116  -997  -152
  -143   200  -997  -997
   156  -997  -997   -52
   -44  -997   188  -997
   -44  -997  -997   148
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 10 E= 3.8e+001
 0.400000  0.600000  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.300000  0.000000  0.000000  0.700000
 0.100000  0.000000  0.500000  0.400000
 0.000000  0.700000  0.300000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.100000  0.100000  0.800000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.300000  0.400000  0.100000
 0.800000  0.100000  0.000000  0.100000
 0.100000  0.900000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.000000  0.000000  0.800000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CA][AT][TA][GT][CG]ATT[GCA]AC[AT][GA][TA]
--------------------------------------------------------------------------------

Time  2.06 secs.

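Reader's note (not part of the MEME output): the Motif 2 log-odds and letter-probability matrices appear to be related by roughly 100 * log2(p / background). The sketch below checks that relation on the first four matrix rows. It is an approximation under stated assumptions: it uses the background frequencies as printed (rounded to three digits) and ignores any prior pseudocounts, so it is only expected to agree with the reported scores to within about one unit. The function names are hypothetical and are not part of MEME.

```python
import math

# Background frequencies as printed in the COMMAND LINE SUMMARY (3-digit rounding).
BACKGROUND = {"A": 0.271, "C": 0.224, "G": 0.218, "T": 0.288}

# First four rows of the Motif 2 letter-probability matrix (columns A, C, G, T).
PROBS = [
    [0.4, 0.6, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.5],
    [0.3, 0.0, 0.0, 0.7],
    [0.1, 0.0, 0.5, 0.4],
]

# The same four rows of the Motif 2 log-odds matrix, as reported above.
REPORTED = [
    [56, 142, -997, -997],
    [89, -997, -997, 80],
    [15, -997, -997, 128],
    [-143, -997, 120, 48],
]


def approx_log_odds(p, bg):
    """Return round(100 * log2(p / bg)), or None when p is zero.

    Zero probabilities are skipped because the large negative scores in the
    report depend on pseudocounts that this sketch does not model.
    """
    if p == 0.0:
        return None
    return round(100 * math.log2(p / bg))


def max_discrepancy(probs, reported):
    """Largest absolute gap between estimated and reported nonzero entries."""
    worst = 0
    for prob_row, score_row in zip(probs, reported):
        for letter, p, score in zip("ACGT", prob_row, score_row):
            estimate = approx_log_odds(p, BACKGROUND[letter])
            if estimate is not None:
                worst = max(worst, abs(estimate - score))
    return worst


if __name__ == "__main__":
    print("largest gap:", max_discrepancy(PROBS, REPORTED))
```

The comparison is kept as a maximum gap rather than an exact match because the printed background values are rounded.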
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   6  llr = 76  E-value = 9.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  32a:a7::7a::
pos.-specific     C  :8:8:2a73:a8
probability       G  :::::::3:::2
matrix            T  7::2:2::::::

         bits    2.2       *   *
                 2.0   * * *  **
                 1.8   * * *  **
                 1.5  **** *  ***
Relative         1.3  **** ** ***
Entropy          1.1  **** ******
(18.3 bits)      0.9 ***** ******
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCACAACCAACC
consensus            A      GC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44324                        44  5.32e-08 GTCTTTCCCG TCACAACCAACC TTTCCGGATT
44703                       386  1.99e-07 ACAAGGAACA TCACAACGAACC TCCTAGCTAC
37215                       218  3.27e-07 AGCTGAACCT TCACAACGCACC TTTTGCCGAG
33458                       121  1.16e-06 CCAAAACTCG ACACACCCCACC AAAAACGCTT
35830                       307  2.61e-06 TCGCCTGTTC TCATATCCAACC AAGTCTTCCT
49496                       183  4.11e-06 TTCCCCGTCA AAACAACCAACG CCGTTTCGGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44324                             5.3e-08  43_[+3]_445
44703                               2e-07  385_[+3]_103
37215                             3.3e-07  217_[+3]_271
33458                             1.2e-06  120_[+3]_368
35830                             2.6e-06  306_[+3]_182
49496                             4.1e-06  182_[+3]_306
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=6
44324                    (   44) TCACAACCAACC  1
44703                    (  386) TCACAACGAACC  1
37215                    (  218) TCACAACGCACC  1
33458                    (  121) ACACACCCCACC  1
35830                    (  307) TCATATCCAACC  1
49496                    (  183) AAACAACCAACG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 10.1168 E= 9.3e+001
    30  -923  -923   121
   -70   189  -923  -923
   188  -923  -923  -923
  -923   189  -923   -79
   188  -923  -923  -923
   130   -43  -923   -79
  -923   216  -923  -923
  -923   157    61  -923
   130    57  -923  -923
   188  -923  -923  -923
  -923   216  -923  -923
  -923   189   -38  -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 9.3e+001
 0.333333  0.000000  0.000000  0.666667
 0.166667  0.833333  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.166667  0.000000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.666667  0.333333  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TA]CACAAC[CG][AC]ACC
--------------------------------------------------------------------------------

Time  2.94 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37215                            1.21e-08  21_[+1(3.18e-07)]_184_\
    [+3(3.27e-07)]_135_[+2(3.18e-06)]_122
48354                            4.39e-05  423_[+2(2.06e-06)]_12_\
    [+1(2.48e-06)]_39
43562                            3.90e-04  107_[+1(3.18e-06)]_159_\
    [+2(6.25e-06)]_208
39515                            1.07e-06  151_[+2(8.91e-09)]_137_\
    [+1(3.82e-06)]_186
49496                            1.17e-06  149_[+1(3.45e-06)]_21_\
    [+3(4.11e-06)]_83_[+2(3.55e-06)]_53_[+2(3.95e-05)]_142
33458                            2.99e-04  120_[+3(1.16e-06)]_167_\
    [+2(2.83e-05)]_187
35830                            4.67e-07  8_[+2(1.03e-05)]_225_[+3(6.98e-05)]_\
    47_[+3(2.61e-06)]_128_[+1(6.78e-07)]_42
44703                            7.68e-10  206_[+1(8.22e-07)]_167_\
    [+3(1.99e-07)]_58_[+2(1.03e-07)]_31
33952                            1.05e-04  21_[+1(8.22e-07)]_405_\
    [+2(4.90e-06)]_48
44324                            3.08e-06  43_[+3(5.32e-08)]_173_\
    [+2(6.25e-06)]_258
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
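
Reader's note (not part of the MEME output): a motif diagram such as 21_[+1(3.18e-07)]_184_[+3(3.27e-07)]_135_[+2(3.18e-06)]_122 alternates spacer lengths with motif occurrences. The sketch below is a minimal parser, written on the assumption that each bracketed token has the form [strand motif-number (optional p-value)] and that continuation lines ending in a backslash have already been joined. It recovers the 1-based start of each site and the total sequence length, which can be compared with the Start columns and the 500-letter sequence lengths reported above. The names parse_diagram and MOTIF_WIDTHS are hypothetical helpers, not part of MEME or MAST.

```python
import re

# Motif widths taken from the three MOTIF header lines of this report.
MOTIF_WIDTHS = {1: 12, 2: 14, 3: 12}

# A token is either a bracketed motif occurrence, e.g. [+1] or [+1(3.18e-07)],
# or a bare integer spacer. The bracketed alternative is tried first so that
# digits inside a p-value are never mistaken for a spacer.
TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]|(\d+)")


def parse_diagram(diagram, widths):
    """Return (sites, total_length) for one motif diagram string.

    sites is a list of (motif_number, one_based_start) pairs in order of
    appearance; total_length is the sum of all spacers and motif widths.
    """
    position = 0
    sites = []
    for match in TOKEN.finditer(diagram):
        _strand, motif, _pvalue, spacer = match.groups()
        if spacer is not None:
            position += int(spacer)
        else:
            motif_number = int(motif)
            sites.append((motif_number, position + 1))
            position += widths[motif_number]
    return sites, position


if __name__ == "__main__":
    # Combined diagram for sequence 37215, with its continuation line joined.
    diagram = "21_[+1(3.18e-07)]_184_[+3(3.27e-07)]_135_[+2(3.18e-06)]_122"
    sites, total = parse_diagram(diagram, MOTIF_WIDTHS)
    print(sites, total)
```

For sequence 37215 the recovered starts (22, 218, 365) match the Start values listed in the three per-motif site tables, and the spacers plus motif widths sum to the sequence length of 500.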