********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/199/199.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
13779                    1.0000    500  14208                    1.0000    500
14180                    1.0000    500  14845                    1.0000    500
25334                    1.0000    500  30471                    1.0000    500
44089                    1.0000    500  50476                    1.0000    500
6282                     1.0000    500  10693                    1.0000    500
6254                     1.0000    500  45072                    1.0000    500
12088                    1.0000    500  7969                     1.0000    500
47461                    1.0000    500  44659                    1.0000    500
46077                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/199/199.seqs.fa -oc motifs/199 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.237 C 0.276 G 0.237 T 0.249
Background letter frequencies (from dataset with add-one prior applied):
A 0.237 C 0.276 G 0.237 T 0.249
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 15 sites = 13 llr = 154 E-value = 1.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::51::72:3::7
pos.-specific     C  2511:11::851:::
probability       G  2488::1a1::6:63
matrix            T  7212588:215:a4:

         bits    2.1        *    *
                 1.9        *    *
                 1.7        *    *
                 1.5        *    *
Relative         1.2   *  ***    * *
Entropy          1.0   ******    ***
(17.1 bits)      0.8 * *************
                 0.6 * *************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           TCGGTTTGACCGTGA
consensus             G  A   T TA TG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------
12088                        48  5.83e-09 GAGGAATGGT TGGGATTGACTGTGA CTGCGAACGG
6282                         97  5.83e-09 GTACGAGGAG TGGGATTGACTGTGA TTACGAACGG
50476                       424  3.08e-07 ACACGTTCGC TCCGATTGACTGTGA GACAACCCAA
7969                        109  5.40e-07 ACCTCGCCCG TCGGTTCGACCATGA CGGTGGCGCT
30471                       434  1.12e-06 ACAATCACAC GCGGATTGGCTGTGA AAACATACCA
45072                       402  1.47e-06 CATTCGTCTG TTGGTTTGTACGTTA CTCGCTGTTA
44089                       313  1.57e-06 TGTCTGCGCT TGGTTCTGACTGTGA GTACCAGACG
25334                       335  2.16e-06 CCAAACCTGT CCGGTATGACCATGA GCACAACCAA
13779                       367  3.78e-06 ATAACGAGTC TGGCATTGTCCGTTG ATGCTGCCTA
6254                        259  4.79e-06 AACACTCTTT GGTGTTTGACCATGG TAGTAGAGTA
14845                       165  5.52e-06 ATCCTCCAAT TCGGTTGGTATGTTA CAGCTAGAAG
46077                        63  6.26e-06 TCATTTCGTT CCGGTTTGATCATTG TCGTGGCCAG
44659                       443  9.35e-06 AGCAAGCAGT TTGTATTGACCCTTG TGGATTCGTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12088                             5.8e-09  47_[+1]_438
6282                              5.8e-09  96_[+1]_389
50476                             3.1e-07  423_[+1]_62
7969                              5.4e-07  108_[+1]_377
30471                             1.1e-06  433_[+1]_52
45072                             1.5e-06  401_[+1]_84
44089                             1.6e-06  312_[+1]_173
25334                             2.2e-06  334_[+1]_151
13779                             3.8e-06  366_[+1]_119
6254                              4.8e-06  258_[+1]_227
14845                             5.5e-06  164_[+1]_321
46077                             6.3e-06  62_[+1]_423
44659                             9.4e-06  442_[+1]_43
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=13
12088                    (   48) TGGGATTGACTGTGA  1
6282                     (   97) TGGGATTGACTGTGA  1
50476                    (  424) TCCGATTGACTGTGA  1
7969                     (  109) TCGGTTCGACCATGA  1
30471                    (  434) GCGGATTGGCTGTGA  1
45072                    (  402) TTGGTTTGTACGTTA  1
44089                    (  313) TGGTTCTGACTGTGA  1
25334                    (  335) CCGGTATGACCATGA  1
13779                    (  367) TGGCATTGTCCGTTG  1
6254                     (  259) GGTGTTTGACCATGG  1
14845                    (  165) TCGGTTGGTATGTTA  1
46077                    (   63) CCGGTTTGATCATTG  1
44659                    (  443) TTGTATTGACCCTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8262 bayes= 9.84078 E= 1.4e-001
 -1035    -84    -62    147
 -1035     74     70    -70
 -1035   -184    183   -169
 -1035   -184    170    -70
    96  -1035  -1035    111
  -162   -184  -1035    176
 -1035   -184   -162    176
 -1035  -1035    208  -1035
   154  -1035   -162    -11
   -62    148  -1035   -169
 -1035     96  -1035     89
    37   -184    137  -1035
 -1035  -1035  -1035    200
 -1035  -1035    137     63
   154  -1035     38  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 13 E= 1.4e-001
 0.000000  0.153846  0.153846  0.692308
 0.000000  0.461538  0.384615  0.153846
 0.000000  0.076923  0.846154  0.076923
 0.000000  0.076923  0.769231  0.153846
 0.461538  0.000000  0.000000  0.538462
 0.076923  0.076923  0.000000  0.846154
 0.000000  0.076923  0.076923  0.846154
 0.000000  0.000000  1.000000  0.000000
 0.692308  0.000000  0.076923  0.230769
 0.153846  0.769231  0.000000  0.076923
 0.000000  0.538462  0.000000  0.461538
 0.307692  0.076923  0.615385  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.615385  0.384615
 0.692308  0.000000  0.307692  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[CG]GG[TA]TTG[AT]C[CT][GA]T[GT][AG]
--------------------------------------------------------------------------------

Time 2.97 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 10 llr = 115 E-value = 1.0e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a88:3::791:1
pos.-specific     C  ::::4::1:5::
probability       G  :1:::9a:::89
matrix            T  :12a31:2142:

         bits    2.1 *  *  *
                 1.9 *  *  *
                 1.7 *  * ** *  *
                 1.5 *  * ** *  *
Relative         1.2 **** ** * **
Entropy          1.0 **** ** * **
(16.6 bits)      0.8 **** **** **
                 0.6 **** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AAATCGGAACGG
consensus              T A  T TT
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
6282                         76  4.47e-08 GTCACGAGTA AAATCGGAACGG TACGAGGAGT
12088                        23  1.64e-07 GTCACGAGTA AAATCGGAATGG AAGGAGGAAT
44659                         1  8.94e-07          . AAATTGGAAAGG ACCAATCGGT
14208                       486  1.08e-06 GTGCGGAGGC AATTTGGAATGG GCG
44089                        22  2.38e-06 ACAAACCCGG AAATTTGAACGG GACCTTGCCA
25334                       382  3.52e-06 TGCCGAAACG AAATAGGTATTG CGCTGGAGGA
30471                        40  3.95e-06 CGTCACTCCT AGTTCGGAACGG GTAGCGGTGT
7969                        405  5.38e-06 GCTTGGAAAG AAATAGGTACGA CCGTGGATGC
47461                       239  7.48e-06 AGATCGTACC ATATAGGAATTG TGCGCGTGAG
6254                         41  1.04e-05 CTTTCATACA AAATCGGCTCGG GTACGCCAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6282                              4.5e-08  75_[+2]_413
12088                             1.6e-07  22_[+2]_466
44659                             8.9e-07  [+2]_488
14208                             1.1e-06  485_[+2]_3
44089                             2.4e-06  21_[+2]_467
25334                             3.5e-06  381_[+2]_107
30471                             3.9e-06  39_[+2]_449
7969                              5.4e-06  404_[+2]_84
47461                             7.5e-06  238_[+2]_250
6254                                1e-05  40_[+2]_448
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=10
6282                     (   76) AAATCGGAACGG  1
12088                    (   23) AAATCGGAATGG  1
44659                    (    1) AAATTGGAAAGG  1
14208                    (  486) AATTTGGAATGG  1
44089                    (   22) AAATTTGAACGG  1
25334                    (  382) AAATAGGTATTG  1
30471                    (   40) AGTTCGGAACGG  1
7969                     (  405) AAATAGGTACGA  1
47461                    (  239) ATATAGGAATTG  1
6254                     (   41) AAATCGGCTCGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 10.6417 E= 1.0e+000
   207   -997   -997   -997
   175   -997   -124   -132
   175   -997   -997    -32
  -997   -997   -997    200
    34     53   -997     27
  -997   -997    192   -132
  -997   -997    207   -997
   156   -146   -997    -32
   192   -997   -997   -132
  -124     86   -997     68
  -997   -997    175    -32
  -124   -997    192   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 1.0e+000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.000000  0.100000  0.100000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.300000  0.400000  0.000000  0.300000
 0.000000  0.000000  0.900000  0.100000
 0.000000  0.000000  1.000000  0.000000
 0.700000  0.100000  0.000000  0.200000
 0.900000  0.000000  0.000000  0.100000
 0.100000  0.500000  0.000000  0.400000
 0.000000  0.000000  0.800000  0.200000
 0.100000  0.000000  0.900000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
AA[AT]T[CAT]GG[AT]A[CT][GT]G
--------------------------------------------------------------------------------

Time 5.75 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 12 llr = 143 E-value = 7.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  172::468:a2::33:
pos.-specific     C  9:4:843:a:3:81:6
probability       G  :13::2::::2116:3
matrix            T  :32a3:22::492:81

         bits    2.1    *     *
                 1.9    *    **
                 1.7    *    ** *
                 1.5 *  *   *** *
Relative         1.2 *  *   *** *  *
Entropy          1.0 *  **  *** *  *
(17.2 bits)      0.8 ** **  *** ****
                 0.6 ** ** **** *****
                 0.4 ** ******* *****
                 0.2 ** ******* *****
                 0.0 ----------------

Multilevel           CACTCAAACATTCGTC
consensus             TG TCC   C  AAG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ----------------
6282                        151  5.74e-08 ACAACCCCTA CAGTTCAACATTCGTG TAGGGTCTTG
12088                       100  1.20e-07 TTAACCCCTA CAATTCAACATTCGTG GAGGGTCTTG
14180                       136  3.33e-07 GCGCAAGTTG CGCTCCAACAGTCGTC GCTCGGGCCC
45072                       384  3.65e-07 TTGGTTCGCA CATTCGTACATTCGTC TGTTGGTTTG
44659                       357  5.16e-07 GAACACTTTC CAGTCAAACATTGGAC CAGTTTCCTG
14845                       409  8.62e-07 GCTCGTGAAT CTCTCCAACACTTATC CCAACTGCAA
30471                       245  1.50e-06 CACAATCACA CATTCACACACTTGTG CGACACCCAT
44089                       251  1.79e-06 AAATAGAATG CTCTCGAACATTCCTG ATTGAGACCC
50476                       372  2.69e-06 CCCTCTCTAC CAATCAATCAATCAAC GTACCTTTTA
47461                       429  5.47e-06 TGTGGTCACT CACTCACTCACTCATT ACCTCATCGT
10693                       111  7.06e-06 AGTTGGCTCA CAGTCACACAGGCAAC GGCTCTCGCT
7969                        350  9.95e-06 GCTGGTGGAC ATCTTCTACAATCGTC CCGATGCCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6282                              5.7e-08  150_[+3]_334
12088                             1.2e-07  99_[+3]_385
14180                             3.3e-07  135_[+3]_349
45072                             3.6e-07  383_[+3]_101
44659                             5.2e-07  356_[+3]_128
14845                             8.6e-07  408_[+3]_76
30471                             1.5e-06  244_[+3]_240
44089                             1.8e-06  250_[+3]_234
50476                             2.7e-06  371_[+3]_113
47461                             5.5e-06  428_[+3]_56
10693                             7.1e-06  110_[+3]_374
7969                                1e-05  349_[+3]_135
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=12
6282                     (  151) CAGTTCAACATTCGTG  1
12088                    (  100) CAATTCAACATTCGTG  1
14180                    (  136) CGCTCCAACAGTCGTC  1
45072                    (  384) CATTCGTACATTCGTC  1
44659                    (  357) CAGTCAAACATTGGAC  1
14845                    (  409) CTCTCCAACACTTATC  1
30471                    (  245) CATTCACACACTTGTG  1
44089                    (  251) CTCTCGAACATTCCTG  1
50476                    (  372) CAATCAATCAATCAAC  1
47461                    (  429) CACTCACTCACTCATT  1
10693                    (  111) CAGTCACACAGGCAAC  1
7969                     (  350) ATCTTCTACAATCGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8245 bayes= 9.87026 E= 7.9e+000
  -151    173  -1023  -1023
   149  -1023   -151      0
   -51     59      8    -58
 -1023  -1023  -1023    200
 -1023    144  -1023      0
    81     59    -51  -1023
   130    -14  -1023    -58
   181  -1023  -1023    -58
 -1023    186  -1023  -1023
   207  -1023  -1023  -1023
   -51    -14    -51     74
 -1023  -1023   -151    188
 -1023    144   -151    -58
    49   -173    130  -1023
     8  -1023  -1023    159
 -1023    108     49   -158
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 7.9e+000
 0.083333  0.916667  0.000000  0.000000
 0.666667  0.000000  0.083333  0.250000
 0.166667  0.416667  0.250000  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.000000  0.250000
 0.416667  0.416667  0.166667  0.000000
 0.583333  0.250000  0.000000  0.166667
 0.833333  0.000000  0.000000  0.166667
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.250000  0.166667  0.416667
 0.000000  0.000000  0.083333  0.916667
 0.000000  0.750000  0.083333  0.166667
 0.333333  0.083333  0.583333  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.583333  0.333333  0.083333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[AT][CG]T[CT][AC][AC]ACA[TC]TC[GA][TA][CG]
--------------------------------------------------------------------------------

Time 8.32 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
13779                            8.37e-03  366_[+1(3.78e-06)]_119
14208                            4.33e-03  485_[+2(1.08e-06)]_3
14180                            2.60e-03  135_[+3(3.33e-07)]_349
14845                            1.06e-04  164_[+1(5.52e-06)]_229_\
                                           [+3(8.62e-07)]_76
25334                            7.75e-05  334_[+1(2.16e-06)]_32_\
                                           [+2(3.52e-06)]_107
30471                            1.86e-07  39_[+2(3.95e-06)]_193_\
                                           [+3(1.50e-06)]_173_[+1(1.12e-06)]_52
44089                            1.87e-07  21_[+2(2.38e-06)]_217_\
                                           [+3(1.79e-06)]_46_[+1(1.57e-06)]_173
50476                            2.32e-05  371_[+3(2.69e-06)]_36_\
                                           [+1(3.08e-07)]_62
6282                             1.06e-12  75_[+2(4.47e-08)]_9_[+1(5.83e-09)]_\
                                           39_[+3(5.74e-08)]_334
10693                            5.18e-02  110_[+3(7.06e-06)]_374
6254                             2.61e-04  40_[+2(1.04e-05)]_206_\
                                           [+1(4.79e-06)]_227
45072                            1.25e-05  383_[+3(3.65e-07)]_2_[+1(1.47e-06)]_\
                                           84
12088                            7.18e-12  22_[+2(1.64e-07)]_13_[+1(5.83e-09)]_\
                                           37_[+3(1.20e-07)]_385
7969                             7.02e-07  108_[+1(5.40e-07)]_226_\
                                           [+3(9.95e-06)]_39_[+2(5.38e-06)]_84
47461                            1.61e-04  238_[+2(7.48e-06)]_178_\
                                           [+3(5.47e-06)]_37_[+3(5.60e-05)]_3
44659                            1.25e-07  [+2(8.94e-07)]_344_[+3(5.16e-07)]_\
                                           70_[+1(9.35e-06)]_43
46077                            6.02e-02  62_[+1(6.26e-06)]_423
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************