********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/338/338.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46423                    1.0000    500  46484                    1.0000    500
46752                    1.0000    500  47080                    1.0000    500
47538                    1.0000    500  48600                    1.0000    500
48766                    1.0000    500  39708                    1.0000    500
49119                    1.0000    500  50303                    1.0000    500
43899                    1.0000    500  1905                     1.0000    500
45953                    1.0000    500  35818                    1.0000    500
42812                    1.0000    500  43085                    1.0000    500
43794                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/338/338.seqs.fa -oc motifs/338 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.228 G 0.225 T 0.273
Background letter frequencies (from dataset with add-one prior applied):
A 0.274 C 0.228 G 0.225 T 0.273
********************************************************************************

********************************************************************************
MOTIF  1 MEME    width =  15  sites =  17  llr = 163  E-value = 7.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  256522:96811699
pos.-specific     C  322313::1::91:1
probability       G  :22:6:91219:2::
matrix            T  51:2151::2::11:

         bits    2.2
                 1.9
                 1.7
                 1.5       *   **
Relative         1.3       **  ** **
Entropy          1.1       **  ** **
(13.8 bits)      0.9       ** *** **
                 0.6   *   *********
                 0.4 * *************
                 0.2 ***************
                 0.0 ---------------

Multilevel           TAAAGTGAAAGCAAA
consensus            CCCCAC  G   G
sequence                T A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
43899                        66  9.21e-08 TACAAACCGA TAAAGTGACAGCAAA CTTGCAAGTC
49119                       195  4.97e-07 CCAATGTTCG TAACGAGAATGCAAA ACATCGAGGC
47538                       484  6.66e-07 AACGAAGCAT CAACATGAGAGCAAA TC
43794                       190  8.55e-07 ACCTCTATTC TCATGAGAGAGCAAA CTATTGGCCA
46484                       280  5.00e-06 TAATTTGGTT TCGAGAGAAAGCAAC CACCATATGC
50303                       152  6.16e-06 CGCTATTTGG AACCACGAGAGCAAA GTTGTTTAGG
42812                        28  1.01e-05 TTTTTTTGTG TCAATTGAAGGCAAA GAATCATAGA
1905                         31  1.01e-05 CCAACTGTTT TACCGAGAGAGAAAA ACACACGACC
48600                         7  1.20e-05     GCGAAG CAAACCGGAAGCAAA CATATCGTTG
45953                       384  1.58e-05 GGTCAGTCGA CCGAGCGAAAGCGTA GGAAAGAAAC
48766                       206  1.71e-05 GAGAGCCTTG TGACTTGACAGCGAA TTAACAGTAA
46423                       289  1.87e-05 GGAACAGAGA CAAAGCGAAAGATAA CAACCAACTG
46752                        20  3.59e-05 GTGGATGATT AACTGTTAATGCAAA CCGTTACAGG
47080                       393  5.61e-05 ATGCATATGA AAATGTTGAAGCGAA CAGAGCTTCG
43085                       316  7.90e-05 GACGGACCTA CGATACGAATACAAA TGACGTTACC
39708                       103  9.63e-05 TTCGTGACTT TTCAGTGAAAACCAA CTACCTATAA
35818                       437  1.16e-04 AAACGCCCAC TGGAATGAAAGCGTC GCCCACTAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43899                             9.2e-08  65_[+1]_420
49119                               5e-07  194_[+1]_291
47538                             6.7e-07  483_[+1]_2
43794                             8.6e-07  189_[+1]_296
46484                               5e-06  279_[+1]_206
50303                             6.2e-06  151_[+1]_334
42812                               1e-05  27_[+1]_458
1905                                1e-05  30_[+1]_455
48600                             1.2e-05  6_[+1]_479
45953                             1.6e-05  383_[+1]_102
48766                             1.7e-05  205_[+1]_280
46423                             1.9e-05  288_[+1]_197
46752                             3.6e-05  19_[+1]_466
47080                             5.6e-05  392_[+1]_93
43085                             7.9e-05  315_[+1]_170
39708                             9.6e-05  102_[+1]_383
35818                             0.00012  436_[+1]_49
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=17
43899                    (   66) TAAAGTGACAGCAAA  1
49119                    (  195) TAACGAGAATGCAAA  1
47538                    (  484) CAACATGAGAGCAAA  1
43794                    (  190) TCATGAGAGAGCAAA  1
46484                    (  280) TCGAGAGAAAGCAAC  1
50303                    (  152) AACCACGAGAGCAAA  1
42812                    (   28) TCAATTGAAGGCAAA  1
1905                     (   31) TACCGAGAGAGAAAA  1
48600                    (    7) CAAACCGGAAGCAAA  1
45953                    (  384) CCGAGCGAAAGCGTA  1
48766                    (  206) TGACTTGACAGCGAA  1
46423                    (  289) CAAAGCGAAAGATAA  1
46752                    (   20) AACTGTTAATGCAAA  1
47080                    (  393) AAATGTTGAAGCGAA  1
43085                    (  316) CGATACGAATACAAA  1
39708                    (  103) TTCAGTGAAAACCAA  1
35818                    (  437) TGGAATGAAAGCGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8262 bayes= 8.99152 E= 7.7e+001
   -63     37  -1073     95
    95      4    -35   -221
   110      4    -35  -1073
    78     37  -1073    -22
   -22   -195    139   -121
   -22     37  -1073     78
 -1073  -1073    197   -121
   169  -1073    -93  -1073
   124    -95      7  -1073
   148  -1073   -193    -63
  -122  -1073    197  -1073
  -122    195  -1073  -1073
   124   -195      7   -221
   169  -1073  -1073   -121
   169    -95  -1073  -1073
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 17 E= 7.7e+001
 0.176471  0.294118  0.000000  0.529412
 0.529412  0.235294  0.176471  0.058824
 0.588235  0.235294  0.176471  0.000000
 0.470588  0.294118  0.000000  0.235294
 0.235294  0.058824  0.588235  0.117647
 0.235294  0.294118  0.000000  0.470588
 0.000000  0.000000  0.882353  0.117647
 0.882353  0.000000  0.117647  0.000000
 0.647059  0.117647  0.235294  0.000000
 0.764706  0.000000  0.058824  0.176471
 0.117647  0.000000  0.882353  0.000000
 0.117647  0.882353  0.000000  0.000000
 0.647059  0.058824  0.235294  0.058824
 0.882353  0.000000  0.000000  0.117647
 0.882353  0.117647  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TC][AC][AC][ACT][GA][TCA]GA[AG]AGC[AG]AA
--------------------------------------------------------------------------------

Time  2.76 secs.

********************************************************************************

********************************************************************************
MOTIF  2 MEME    width =  16  sites =   7  llr = 100  E-value = 3.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3::::4:1::993::
pos.-specific     C  :1::1:4a::3:::1:
probability       G  71a1::::3a6::1:a
matrix            T  34:99a1:6:11169:

         bits    2.2   *    * *     *
                 1.9   *  * * *     *
                 1.7   *  * * *     *
                 1.5   *  * * *     *
Relative         1.3 * **** * * ** **
Entropy          1.1 * **** * * ** **
(20.7 bits)      0.9 * **** * * ** **
                 0.6 * **** ****** **
                 0.4 * **************
                 0.2 * **************
                 0.0 ----------------

Multilevel           GTGTTTACTGGAATTG
consensus            TA    C G C  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
43085                        61  1.85e-08 GTGGTTTCTG GTGTTTTCGGGAATTG GGCGAAGAGG
35818                       268  2.32e-08 TAGCTGGCTA GAGGTTCCTGGAATTG ACAGATGACA
46423                       137  5.06e-08 GACAGTGACT TGGTTTCCGGGAATTG GGTCATTCCT
46752                       293  1.08e-07 CAGTCAGTTC GCGTTTCCTGTAAATG TTGGTACCGG
47538                       377  2.10e-07 ACGCGACCCA GAGTTTACTGCTAATG CAATTCCGGT
49119                       134  4.77e-07 ATCTTCGACT TTGTTTACTGCAAGCG TTCGTTGTGC
42812                       224  5.79e-07 GTTTTGTACC GTGTCTACAGGATTTG GCCCATTTGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43085                             1.8e-08  60_[+2]_424
35818                             2.3e-08  267_[+2]_217
46423                             5.1e-08  136_[+2]_348
46752                             1.1e-07  292_[+2]_192
47538                             2.1e-07  376_[+2]_108
49119                             4.8e-07  133_[+2]_351
42812                             5.8e-07  223_[+2]_261
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=7
43085                    (   61) GTGTTTTCGGGAATTG  1
35818                    (  268) GAGGTTCCTGGAATTG  1
46423                    (  137) TGGTTTCCGGGAATTG  1
46752                    (  293) GCGTTTCCTGTAAATG  1
47538                    (  377) GAGTTTACTGCTAATG  1
49119                    (  134) TTGTTTACTGCAAGCG  1
42812                    (  224) GTGTCTACAGGATTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8245 bayes= 10.8069 E= 3.4e+002
  -945   -945    167      6
     6    -67    -65     65
  -945   -945    215   -945
  -945   -945    -65    165
  -945    -67   -945    165
  -945   -945   -945    187
    64     91   -945    -93
  -945    213   -945   -945
   -94   -945     35    106
  -945   -945    215   -945
  -945     32    135    -93
   164   -945   -945    -93
   164   -945   -945    -93
     6   -945    -65    106
  -945    -67   -945    165
  -945   -945    215   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 3.4e+002
 0.000000  0.000000  0.714286  0.285714
 0.285714  0.142857  0.142857  0.428571
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.428571  0.428571  0.000000  0.142857
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.000000  0.285714  0.571429
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.285714  0.571429  0.142857
 0.857143  0.000000  0.000000  0.142857
 0.857143  0.000000  0.000000  0.142857
 0.285714  0.000000  0.142857  0.571429
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GT][TA]GTTT[AC]C[TG]G[GC]AA[TA]TG
--------------------------------------------------------------------------------

Time  5.41 secs.

********************************************************************************

********************************************************************************
MOTIF  3 MEME    width =  12  sites =  16  llr = 147  E-value = 1.9e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1:1::::::41:
pos.-specific     C  41:43:1:34:6
probability       G  6::3:::661:3
matrix            T  :9937a941:92

         bits    2.2
                 1.9      *
                 1.7      *
                 1.5      **   *
Relative         1.3  **  **   *
Entropy          1.1  ** ****  *
(13.2 bits)      0.9 *** ***** *
                 0.6 *** ********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GTTCTTTGGATC
consensus            C  TC  TCC G
sequence                G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
45953                       296  3.14e-07 GAAGGCTTTT GTTCTTTTGCTC GCTTGCTCGT
49119                       220  1.02e-06 ACATCGAGGC CTTTTTTGGCTC TTGATACGAT
47538                        98  1.02e-06 TATGCCTCAA CTTCTTTTGCTC TTACAGCTTC
39708                        69  6.18e-06 CGTAGGCTTG GTTTCTTGGCTG AAACATGCGT
1905                        455  9.68e-06 TTTCATTTGG CTTCCTTGCATC ACCAGGTATC
43085                       127  1.10e-05 ACAACAGACG GTTTTTTTGGTC CAGATGGGGC
46484                       464  1.36e-05 AGGTCTCATT GCTCTTTTGATC GCTTCAAGCG
43899                       451  1.65e-05 CTAAGTATTC GTTCTTTTCCTT TTGAACCTTG
50303                        24  1.65e-05 TGTACTGCAG GTTGTTTGTATC AACACTTACA
35818                       117  5.29e-05 CGTTGTGACT GTTTCTCGGATC TCGCAAGGGG
43794                       254  5.68e-05 TAATACCATA GTTTTTTGGCAG TCCAGAATAA
42812                        98  7.48e-05 CAACACCTGA CTACTTTGCATG GTAGTTAAAG
47080                       423  7.98e-05 CTTCGGGTTT CTTCTTTTCGTT GGTAGACTCA
46423                       431  7.98e-05 ACTGACGGTG ATTGCTTGGCTG TGACCCAATA
48766                       438  1.08e-04 AGTAACCCTC CCTGTTTGTATC TCTGTCTGTC
48600                       400  1.27e-04 CCTACTACTC GTAGCTTTGATT TCGAAACAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45953                             3.1e-07  295_[+3]_193
49119                               1e-06  219_[+3]_269
47538                               1e-06  97_[+3]_391
39708                             6.2e-06  68_[+3]_420
1905                              9.7e-06  454_[+3]_34
43085                             1.1e-05  126_[+3]_362
46484                             1.4e-05  463_[+3]_25
43899                             1.7e-05  450_[+3]_38
50303                             1.7e-05  23_[+3]_465
35818                             5.3e-05  116_[+3]_372
43794                             5.7e-05  253_[+3]_235
42812                             7.5e-05  97_[+3]_391
47080                               8e-05  422_[+3]_66
46423                               8e-05  430_[+3]_58
48766                             0.00011  437_[+3]_51
48600                             0.00013  399_[+3]_89
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=16
45953                    (  296) GTTCTTTTGCTC  1
49119                    (  220) CTTTTTTGGCTC  1
47538                    (   98) CTTCTTTTGCTC  1
39708                    (   69) GTTTCTTGGCTG  1
1905                     (  455) CTTCCTTGCATC  1
43085                    (  127) GTTTTTTTGGTC  1
46484                    (  464) GCTCTTTTGATC  1
43899                    (  451) GTTCTTTTCCTT  1
50303                    (   24) GTTGTTTGTATC  1
35818                    (  117) GTTTCTCGGATC  1
43794                    (  254) GTTTTTTGGCAG  1
42812                    (   98) CTACTTTGCATG  1
47080                    (  423) CTTCTTTTCGTT  1
46423                    (  431) ATTGCTTGGCTG  1
48766                    (  438) CCTGTTTGTATC  1
48600                    (  400) GTAGCTTTGATT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.75645 E= 1.9e+003
  -213     72    132  -1064
 -1064    -87  -1064    168
  -113  -1064  -1064    168
 -1064     94     15     19
 -1064     45  -1064    133
 -1064  -1064  -1064    187
 -1064   -187  -1064    178
 -1064  -1064    132     68
 -1064     13    148   -113
    67     94    -84  -1064
  -213  -1064  -1064    178
 -1064    130     15    -54
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 1.9e+003
 0.062500  0.375000  0.562500  0.000000
 0.000000  0.125000  0.000000  0.875000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.437500  0.250000  0.312500
 0.000000  0.312500  0.000000  0.687500
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.062500  0.000000  0.937500
 0.000000  0.000000  0.562500  0.437500
 0.000000  0.250000  0.625000  0.125000
 0.437500  0.437500  0.125000  0.000000
 0.062500  0.000000  0.000000  0.937500
 0.000000  0.562500  0.250000  0.187500
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC]TT[CTG][TC]TT[GT][GC][AC]T[CG]
--------------------------------------------------------------------------------

Time  8.07 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46423                            1.63e-06  136_[+2(5.06e-08)]_136_\
    [+1(1.87e-05)]_127_[+3(7.98e-05)]_58
46484                            6.07e-04  279_[+1(5.00e-06)]_169_\
    [+3(1.36e-05)]_25
46752                            9.22e-05  19_[+1(3.59e-05)]_258_\
    [+2(1.08e-07)]_192
47080                            1.35e-02  392_[+1(5.61e-05)]_15_\
    [+3(7.98e-05)]_66
47538                            5.47e-09  97_[+3(1.02e-06)]_152_\
    [+2(5.92e-05)]_17_[+2(2.86e-05)]_66_[+2(2.10e-07)]_91_[+1(6.66e-07)]_2
48600                            7.46e-03  6_[+1(1.20e-05)]_479
48766                            1.28e-02  205_[+1(1.71e-05)]_280
39708                            5.09e-03  68_[+3(6.18e-06)]_22_[+1(9.63e-05)]_\
    383
49119                            8.90e-09  133_[+2(4.77e-07)]_45_\
    [+1(4.97e-07)]_10_[+3(1.02e-06)]_269
50303                            6.76e-04  23_[+3(1.65e-05)]_116_\
    [+1(6.16e-06)]_334
43899                            4.07e-05  65_[+1(9.21e-08)]_370_\
    [+3(1.65e-05)]_38
1905                             5.54e-04  30_[+1(1.01e-05)]_409_\
    [+3(9.68e-06)]_34
45953                            1.15e-04  295_[+3(3.14e-07)]_76_\
    [+1(1.58e-05)]_102
35818                            2.84e-06  116_[+3(5.29e-05)]_139_\
    [+2(2.32e-08)]_217
42812                            7.85e-06  27_[+1(1.01e-05)]_55_[+3(7.48e-05)]_\
    114_[+2(5.79e-07)]_261
43085                            4.04e-07  60_[+2(1.85e-08)]_50_[+3(1.10e-05)]_\
    177_[+1(7.90e-05)]_170
43794                            5.43e-04  189_[+1(8.55e-07)]_49_\
    [+3(5.68e-05)]_56_[+1(4.50e-05)]_164
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************