********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/118/118.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
6528                     1.0000    500  8835                     1.0000    500
9606                     1.0000    500  43088                    1.0000    500
43223                    1.0000    500  46668                    1.0000    500
46790                    1.0000    500  21201                    1.0000    500
47282                    1.0000    500  7649                     1.0000    500
14552                    1.0000    500  33014                    1.0000    500
23224                    1.0000    500  33118                    1.0000    500
25581                    1.0000    500  7088                     1.0000    500
41811                    1.0000    500  44671                    1.0000    500
45660                    1.0000    500  12267                    1.0000    500
35781                    1.0000    500  7237                     1.0000    500
43339                    1.0000    500  46068                    1.0000    500
43555                    1.0000    500  49131                    1.0000    500
45743                    1.0000    500  48162                    1.0000    500
37549                    1.0000    500  44530                    1.0000    500
33164                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/118/118.seqs.fa -oc motifs/118 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       31    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           15500    N=              31
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.240 G 0.238 T 0.257
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.240 G 0.238 T 0.257
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  15  sites =  10  llr = 135  E-value = 2.4e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::9:8::1a42:1:
pos.-specific     C  8:21a:::::4::41
probability       G  113:::a:3:2:6:9
matrix            T  195::2:a6::845:

         bits    2.1     * *
                 1.9     * ** *
                 1.7     * ** *    *
                 1.4  * ** ** *    *
Relative         1.2  * ***** * *  *
Entropy          1.0 ** ***** * ** *
(19.5 bits)      0.8 ** ***** * ** *
                 0.6 ********** ****
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CTTACAGTTAATGTG
consensus              G  T  G CATC
sequence               C       G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
43223                       273  9.99e-10 CCTTCTTTTA CTTACAGTTACTGTG TAAACATCGT
9606                        103  4.07e-09 TACAAATATG CTTACAGTTAATGCG AATCTCACCT
43339                       382  2.85e-08 TGTAGTCCTG CTGACAGTGAATGTG AATCCACCAT
48162                       296  2.12e-07 GAACCGGAAC CTTACTGTTAGTTCG GATTTTTCCA
33014                       141  2.67e-07 GTTCCTTTAC CTTACAGTTAGTTAG CGCACTATTT
44530                       190  5.77e-07 TAGGTTCGAA CTGACAGTAACAGTG ACAGTAAACT
23224                         7  6.55e-07     TCTGCA CTCCCAGTGAATGCG TACGAAAGCC
8835                        117  1.05e-06 TGGAACGTGA CGGACTGTTACTTTG GTGGATCGGT
7237                         49  1.34e-06 AACGATCTCA TTTACAGTGACATCG TCTCACCTCT
46668                       310  1.88e-06 AAATGTCGAT GTCACAGTTAATGTC ACAGTCCGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43223                               1e-09  272_[+1]_213
9606                              4.1e-09  102_[+1]_383
43339                             2.8e-08  381_[+1]_104
48162                             2.1e-07  295_[+1]_190
33014                             2.7e-07  140_[+1]_345
44530                             5.8e-07  189_[+1]_296
23224                             6.6e-07  6_[+1]_479
8835                              1.1e-06  116_[+1]_369
7237                              1.3e-06  48_[+1]_437
46668                             1.9e-06  309_[+1]_176
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=10
43223                    (  273) CTTACAGTTACTGTG  1
9606                     (  103) CTTACAGTTAATGCG  1
43339                    (  382) CTGACAGTGAATGTG  1
48162                    (  296) CTTACTGTTAGTTCG  1
33014                    (  141) CTTACAGTTAGTTAG  1
44530                    (  190) CTGACAGTAACAGTG  1
23224                    (    7) CTCCCAGTGAATGCG  1
8835                     (  117) CGGACTGTTACTTTG  1
7237                     (   49) TTTACAGTGACATCG  1
46668                    (  310) GTCACAGTTAATGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 15066 bayes= 10.8078 E= 2.4e+000
  -997    173   -125   -136
  -997   -997   -125    181
  -997    -26     33     96
   177   -126   -997   -997
  -997    206   -997   -997
   160   -997   -997    -36
  -997   -997    207   -997
  -997   -997   -997    196
  -140   -997     33    122
   192   -997   -997   -997
    60     74    -25   -997
   -40   -997   -997    164
  -997   -997    133     64
  -140     74   -997     96
  -997   -126    192   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 10 E= 2.4e+000
 0.000000  0.800000  0.100000  0.100000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.200000  0.300000  0.500000
 0.900000  0.100000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.100000  0.000000  0.300000  0.600000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.400000  0.200000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.000000  0.600000  0.400000
 0.100000  0.400000  0.000000  0.500000
 0.000000  0.100000  0.900000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
CT[TGC]AC[AT]GT[TG]A[ACG][TA][GT][TC]G
--------------------------------------------------------------------------------




Time  7.18 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  20  sites =  13  llr = 173  E-value = 1.4e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::22:13:2:4:1:::41
pos.-specific     C  :2:::4552:3:2:5:31:8
probability       G  :1:::252:42511311222
matrix            T  a7aa83:256454919674:

         bits    2.1
                 1.9 * **
                 1.7 * **         * *
                 1.4 * **         * *
Relative         1.2 * ***        * *
Entropy          1.0 * *** *  * * * *   *
(19.2 bits)      0.8 ***** *  * * * *** *
                 0.6 ***** *  * * * *** *
                 0.4 ***** **** * *******
                 0.2 ********************
                 0.0 --------------------

Multilevel           TTTTTCCCTTTGATCTTTAC
consensus             C   TGGAGCTT G CGT
sequence                     C         G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
25581                       292  1.78e-10 ATGATATCGT TTTTTCCCCTCGATCTTTTC CTAGTACGGT
33164                        18  7.86e-09 GATTTTGCGC TTTTTACCTTTTTTCTTGAC ACTCTGAGCC
49131                       218  5.25e-08 TTACTTTATG TTTTTGCTTTTTTTCTTGTC CTGTATATAG
46668                       361  7.44e-08 TTCGGTATTG TTTTTCGGATCGGTCTCTAC AAGTTGGAGG
46790                       151  1.57e-07 ATGTGAATTT TTTTTTCGTTTTCTATTTAC AGTTAACAGT
14552                       259  3.05e-07 GAACTCGAAA TCTTTCCCAGAGATGTCGTC CAACAACTGT
33014                       163  3.33e-07 TAGCGCACTA TTTTTACCCGTTATCGTTGC ACCTGGTAGT
41811                        54  6.51e-07 CCTTGGTGGC TTTTATGCTGCTCTCTGTTC GCTAGGAACC
33118                        13  7.61e-07 TGGATCCCTA TTTTTCCCATAGTGGTTTTG AGGGGACACA
44530                       358  1.55e-06 ATTCCTAGCT TGTTTCGGTGGTTTCTTCAC AAACACCGAT
6528                        125  1.65e-06 GTTCTTTCCT TTTTATGCAGCGATGTCTAA GTTTCGGGGT
7237                         71  1.88e-06 TCGTCTCACC TCTTTTGATTGGTTGTTTGG ACAGTTGGAA
45660                       194  2.13e-06 GGGTTCTTCG TCTTTGGTCTTGATTTCTGC TTGGACGAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25581                             1.8e-10  291_[+2]_189
33164                             7.9e-09  17_[+2]_463
49131                             5.3e-08  217_[+2]_263
46668                             7.4e-08  360_[+2]_120
46790                             1.6e-07  150_[+2]_330
14552                               3e-07  258_[+2]_222
33014                             3.3e-07  162_[+2]_318
41811                             6.5e-07  53_[+2]_427
33118                             7.6e-07  12_[+2]_468
44530                             1.5e-06  357_[+2]_123
6528                              1.7e-06  124_[+2]_356
7237                              1.9e-06  70_[+2]_410
45660                             2.1e-06  193_[+2]_287
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=13
25581                    (  292) TTTTTCCCCTCGATCTTTTC  1
33164                    (   18) TTTTTACCTTTTTTCTTGAC  1
49131                    (  218) TTTTTGCTTTTTTTCTTGTC  1
46668                    (  361) TTTTTCGGATCGGTCTCTAC  1
46790                    (  151) TTTTTTCGTTTTCTATTTAC  1
14552                    (  259) TCTTTCCCAGAGATGTCGTC  1
33014                    (  163) TTTTTACCCGTTATCGTTGC  1
41811                    (   54) TTTTATGCTGCTCTCTGTTC  1
33118                    (   13) TTTTTCCCATAGTGGTTTTG  1
44530                    (  358) TGTTTCGGTGGTTTCTTCAC  1
6528                     (  125) TTTTATGCAGCGATGTCTAA  1
7237                     (   71) TCTTTTGATTGGTTGTTTGG  1
45660                    (  194) TCTTTGGTCTTGATTTCTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 14911 bayes= 10.6933 E= 1.4e+002
 -1035  -1035  -1035    196
 -1035     -6   -163    143
 -1035  -1035  -1035    196
 -1035  -1035  -1035    196
   -78  -1035  -1035    172
   -78     68    -63     26
 -1035    116     95  -1035
  -178    116     -4    -74
    22     -6  -1035     84
 -1035  -1035     69    126
   -78     36    -63     58
 -1035  -1035    118     84
    54    -64   -163     58
 -1035  -1035   -163    184
  -178    116     37   -174
 -1035  -1035   -163    184
 -1035     36   -163    126
 -1035   -164     -4    143
    54  -1035     -4     58
  -178    168    -63  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 13 E= 1.4e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.230769  0.076923  0.692308
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.153846  0.000000  0.000000  0.846154
 0.153846  0.384615  0.153846  0.307692
 0.000000  0.538462  0.461538  0.000000
 0.076923  0.538462  0.230769  0.153846
 0.307692  0.230769  0.000000  0.461538
 0.000000  0.000000  0.384615  0.615385
 0.153846  0.307692  0.153846  0.384615
 0.000000  0.000000  0.538462  0.461538
 0.384615  0.153846  0.076923  0.384615
 0.000000  0.000000  0.076923  0.923077
 0.076923  0.538462  0.307692  0.076923
 0.000000  0.000000  0.076923  0.923077
 0.000000  0.307692  0.076923  0.615385
 0.000000  0.076923  0.230769  0.692308
 0.384615  0.000000  0.230769  0.384615
 0.076923  0.769231  0.153846  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
T[TC]TTT[CT][CG][CG][TAC][TG][TC][GT][AT]T[CG]T[TC][TG][ATG]C
--------------------------------------------------------------------------------




Time 15.24 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  21  sites =   5  llr = 104  E-value = 1.8e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::82::a:2:a::28::28a:
pos.-specific     C  a:2:8:::8::48:2::2::4
probability       G  :::82::8:2:6:::a222:6
matrix            T  :a:::a:2:8::28::84:::

         bits    2.1 *              *
                 1.9 **   **   *    *   *
                 1.7 **   **   *    *   *
                 1.4 **   **   *    *   *
Relative         1.2 *********** ***** **
Entropy          1.0 ***************** ***
(30.1 bits)      0.8 ***************** ***
                 0.6 ***************** ***
                 0.4 ***************** ***
                 0.2 ***************** ***
                 0.0 ---------------------

Multilevel           CTAGCTAGCTAGCTAGTTAAG
consensus              CAG  TAG CTAC GAG C
sequence                              C
                                      G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
43223                       131  1.75e-12 AAGGACGGTT CTAGCTAGCTACCTAGTTAAC TAACTAGTTA
48162                       141  9.06e-12 GCCTTCAGAC CTAGCTAGCTAGCTAGGTAAC CGTACAGTGT
44671                       141  4.61e-10 GAGAACGAAA CTCACTAGCTAGCTAGTAGAG GTATTACCTA
49131                       271  4.99e-10 AGACTAGACA CTAGCTAGATAGTTCGTGAAG TCCGCTATCG
47282                       411  1.75e-09 ATATACAAGC CTAGGTATCGACCAAGTCAAG ACAGTTAACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43223                             1.8e-12  130_[+3]_349
48162                             9.1e-12  140_[+3]_339
44671                             4.6e-10  140_[+3]_339
49131                               5e-10  270_[+3]_209
47282                             1.7e-09  410_[+3]_69
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
43223                    (  131) CTAGCTAGCTACCTAGTTAAC  1
48162                    (  141) CTAGCTAGCTAGCTAGGTAAC  1
44671                    (  141) CTCACTAGCTAGCTAGTAGAG  1
49131                    (  271) CTAGCTAGATAGTTCGTGAAG  1
47282                    (  411) CTAGGTATCGACCAAGTCAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 14880 bayes= 11.7903 E= 1.8e+002
  -897    206   -897   -897
  -897   -897   -897    196
   159    -26   -897   -897
   -40   -897    175   -897
  -897    173    -25   -897
  -897   -897   -897    196
   192   -897   -897   -897
  -897   -897    175    -36
   -40    173   -897   -897
  -897   -897    -25    164
   192   -897   -897   -897
  -897     73    133   -897
  -897    173   -897    -36
   -40   -897   -897    164
   159    -26   -897   -897
  -897   -897    207   -897
  -897   -897    -25    164
   -40    -26    -25     64
   159   -897    -25   -897
   192   -897   -897   -897
  -897     73    133   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 1.8e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.800000  0.200000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.000000  0.200000  0.800000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.000000  0.000000  0.800000
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.200000  0.200000  0.400000
 0.800000  0.000000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
CT[AC][GA][CG]TA[GT][CA][TG]A[GC][CT][TA][AC]G[TG][TACG][AG]A[GC]
--------------------------------------------------------------------------------




Time 24.06 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6528                             1.94e-03  124_[+2(1.65e-06)]_356
8835                             2.02e-03  116_[+1(1.05e-06)]_369
9606                             7.68e-05  102_[+1(4.07e-09)]_383
43088                            1.85e-02  191_[+1(8.54e-05)]_294
43223                            2.15e-13  130_[+3(1.75e-12)]_96_\
    [+1(4.44e-05)]_10_[+1(9.99e-10)]_213
46668                            2.74e-06  264_[+1(9.78e-05)]_30_\
    [+1(1.88e-06)]_36_[+2(7.44e-08)]_85_[+2(6.21e-05)]_15
46790                            1.81e-03  150_[+2(1.57e-07)]_330
21201                            7.47e-01  500
47282                            6.14e-06  4_[+3(2.14e-05)]_385_[+3(1.75e-09)]_\
    69
7649                             7.85e-01  500
14552                            6.39e-03  258_[+2(3.05e-07)]_222
33014                            6.88e-07  140_[+1(2.67e-07)]_7_[+2(3.33e-07)]_\
    318
23224                            7.20e-03  6_[+1(6.55e-07)]_479
33118                            3.82e-03  12_[+2(7.61e-07)]_32_[+2(9.10e-05)]_\
    136_[+2(9.10e-05)]_260
25581                            8.91e-06  291_[+2(1.78e-10)]_189
7088                             8.94e-01  500
41811                            9.87e-03  53_[+2(6.51e-07)]_427
44671                            2.92e-06  140_[+3(4.61e-10)]_339
45660                            2.01e-02  193_[+2(2.13e-06)]_287
12267                            6.03e-01  500
35781                            9.40e-02  455_[+2(1.34e-05)]_25
7237                             7.13e-06  48_[+1(1.34e-06)]_7_[+2(1.88e-06)]_\
    410
43339                            1.17e-05  101_[+1(5.65e-05)]_265_\
    [+1(2.85e-08)]_74_[+3(2.23e-05)]_9
46068                            3.99e-02  380_[+1(9.32e-06)]_105
43555                            1.36e-01  500
49131                            1.67e-09  217_[+2(5.25e-08)]_33_\
    [+3(4.99e-10)]_209
45743                            3.95e-01  500
48162                            1.03e-10  81_[+3(1.40e-05)]_38_[+3(9.06e-12)]_\
    134_[+1(2.12e-07)]_190
37549                            8.19e-01  500
44530                            2.08e-05  189_[+1(5.77e-07)]_153_\
    [+2(1.55e-06)]_123
33164                            2.12e-04  17_[+2(7.86e-09)]_463
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************