********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/66/66.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
50671                    1.0000    500  9309                     1.0000    500
46561                    1.0000    500  46753                    1.0000    500
13838                    1.0000    500  48647                    1.0000    500
15140                    1.0000    500  49206                    1.0000    500
49405                    1.0000    500  49480                    1.0000    500
50271                    1.0000    500  3755                     1.0000    500
33624                    1.0000    500  50510                    1.0000    500
44432                    1.0000    500  34974                    1.0000    500
45325                    1.0000    500  35101                    1.0000    500
43093                    1.0000    500  44119                    1.0000    500
35605                    1.0000    500  43946                    1.0000    500
35753                    1.0000    500  49308                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/66/66.seqs.fa -oc motifs/66 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 24 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 12000 N= 24 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.278 C 0.228 G 0.220 T 0.273 Background letter frequencies (from dataset with add-one prior applied): A 0.278 C 0.228 G 0.220 T 0.273 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 12 sites = 20 llr = 200 E-value = 2.4e-006 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A ::1a:4:::956 pos.-specific C 5:6:a:124:2: probability G :14::1a:6:32 matrix T 59:::6:811:3 bits 2.2 * 2.0 * * 1.7 ** * 1.5 * ** * Relative 1.3 * ** ** * Entropy 1.1 ** ** ** * (14.4 bits) 0.9 ***** **** 0.7 ********** 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel CTCACTGTGAAA consensus T G A CC GT sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 44119 132 8.25e-07 GATACAAGAA CTCACTGTGAAT TACTTCACTC 43093 382 8.25e-07 AGACATGTGA CTGACTGTGAGA AATGCCATTC 33624 109 1.25e-06 CGTTGGTGCG 
TTCACTGTGAAT GCGGAAGAGA 35753 301 1.85e-06 ATATTCATGT CTCACAGTCAAA CATGATGCAA 34974 247 2.31e-06 TTCGACGAAG TTGACTGTGAAT TATTAACGGT 49308 100 3.18e-06 CGTTTGTGTT TTGACTGTGACA TAGACTGTTA 49480 378 4.13e-06 TCGGCAGACT CTGACAGTGAGT CTCGTTTGCC 50671 268 4.13e-06 GTGAGTCACG TTCACTGTCAAT CAGTTCATAG 45325 263 5.32e-06 ACTGTTGTGT CTGACAGTGAAG GGCTGGGATA 46561 316 5.80e-06 ATGAAAGGAC TTCACTGTCAGT AGAGCCAGGA 43946 224 6.52e-06 ATGTTATGGG TTGACTGCGAAA GGAAAGATTA 35101 454 7.03e-06 CGGGAATCGA TTCACAGTCACA ACCCGCACGG 49206 212 1.03e-05 AGTAAGTAGA CGCACTGTGAGA ACGACAAATT 50510 362 2.24e-05 TTTATTTTGA TTCACAGTTAAA TGTGGAACAC 49405 433 2.84e-05 AATAAAATCT TGCACTGTGAGG CCTTCAGCGA 15140 450 2.84e-05 GCGACCCCCG CTCACGGCGAAA TCCAAGAATC 50271 52 3.96e-05 GAAGGTAACA CTAACTGTCACA GTTTGGTATG 46753 208 5.01e-05 ATTCCGATGC CTGACAGCGTGA GACGCAAACA 35605 181 5.25e-05 AGGGAAAACA TTGACACTCAAA ATATTTACTA 13838 477 1.21e-04 GATCTACTAC CTCACAGCCTCG GCACGGACGG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 44119 8.2e-07 131_[+1]_357 43093 8.2e-07 381_[+1]_107 33624 1.2e-06 108_[+1]_380 35753 1.8e-06 300_[+1]_188 34974 2.3e-06 246_[+1]_242 49308 3.2e-06 99_[+1]_389 49480 4.1e-06 377_[+1]_111 50671 4.1e-06 267_[+1]_221 45325 5.3e-06 262_[+1]_226 46561 5.8e-06 315_[+1]_173 43946 6.5e-06 223_[+1]_265 35101 7e-06 453_[+1]_35 49206 1e-05 211_[+1]_277 50510 2.2e-05 361_[+1]_127 49405 2.8e-05 432_[+1]_56 15140 2.8e-05 449_[+1]_39 50271 4e-05 51_[+1]_437 46753 5e-05 207_[+1]_281 35605 5.3e-05 180_[+1]_308 13838 0.00012 476_[+1]_12 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in 
BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=12 seqs=20 44119 ( 132) CTCACTGTGAAT 1 43093 ( 382) CTGACTGTGAGA 1 33624 ( 109) TTCACTGTGAAT 1 35753 ( 301) CTCACAGTCAAA 1 34974 ( 247) TTGACTGTGAAT 1 49308 ( 100) TTGACTGTGACA 1 49480 ( 378) CTGACAGTGAGT 1 50671 ( 268) TTCACTGTCAAT 1 45325 ( 263) CTGACAGTGAAG 1 46561 ( 316) TTCACTGTCAGT 1 43946 ( 224) TTGACTGCGAAA 1 35101 ( 454) TTCACAGTCACA 1 49206 ( 212) CGCACTGTGAGA 1 50510 ( 362) TTCACAGTTAAA 1 49405 ( 433) TGCACTGTGAGG 1 15140 ( 450) CTCACGGCGAAA 1 50271 ( 52) CTAACTGTCACA 1 46753 ( 208) CTGACAGCGTGA 1 35605 ( 181) TTGACACTCAAA 1 13838 ( 477) CTCACAGCCTCG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 11736 bayes= 10.1389 E= 2.4e-006 -1097 113 -1097 87 -1097 -1097 -114 172 -247 127 86 -1097 185 -1097 -1097 -1097 -1097 213 -1097 -1097 52 -1097 -214 101 -1097 -219 211 -1097 -1097 -19 -1097 155 -1097 62 144 -245 169 -1097 -1097 -145 85 -19 45 -1097 98 -1097 -55 13 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 20 E= 2.4e-006 0.000000 0.500000 0.000000 0.500000 0.000000 0.000000 0.100000 0.900000 0.050000 0.550000 0.400000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.400000 0.000000 0.050000 0.550000 0.000000 0.050000 0.950000 0.000000 0.000000 0.200000 0.000000 0.800000 0.000000 0.350000 0.600000 0.050000 0.900000 0.000000 0.000000 0.100000 0.500000 
0.200000 0.300000 0.000000 0.550000 0.000000 0.150000 0.300000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [CT]T[CG]AC[TA]G[TC][GC]A[AGC][AT] -------------------------------------------------------------------------------- Time 5.02 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 10 llr = 146 E-value = 1.4e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::3:1:7::12::1121:8:1 pos.-specific C 25:34:21:18112636:15: probability G 142::8:::2::9212:a119 matrix T 71575219a6:9:5233::4: bits 2.2 * 2.0 * * 1.7 * * * * 1.5 ** ** * * Relative 1.3 * ** *** * * Entropy 1.1 * * ** *** ** * (21.0 bits) 0.9 ** * * ** *** ** * 0.7 ** ****** *** ***** 0.4 ************* * ***** 0.2 *************** ***** 0.0 --------------------- Multilevel TCTTTGATTTCTGTCCCGACG consensus CGACCTC GA CTTT T sequence G G A G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 9309 379 3.11e-11 CTTCAAAACC TGTCCGATTTCTGTCCCGATG TTTCCGCATT 34974 284 2.32e-08 AGGCCGATTT TGGTTGATTTCCGTTTTGACG TTGACGACGA 49308 164 5.04e-08 TGTTCACAAC TGTTTGATTTATCCCTTGACG TCACCCGAGT 44432 272 5.51e-08 TTCATAAACC 
TCTTATCTTTCTGTTCCGATG AAAAGTGTAC 3755 476 6.61e-08 TCGCTTGCTT TCGTCGATTGCTGCAGCGAGG CGAA 33624 17 1.21e-07 GTTGCGTACT TGATCGATTTCTGGCAAGATA ATGGTGCAAT 13838 166 1.81e-07 GATCGTGGCA CCACCGATTCCTGGCACGCCG TCGGGCAGGG 46561 259 2.46e-07 GAAAGGCTTG GCTTTGACTAATGTCGCGACG ACAGTTAGTT 48647 382 2.64e-07 AAAAACTTTG TCTCTGTTTTCTGTGTTGGTG AGATTTGTGA 49480 30 6.09e-07 AGCGTAGGTT CTATTTCTTGCTGACCCGACG AGAAAGATGC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 9309 3.1e-11 378_[+2]_101 34974 2.3e-08 283_[+2]_196 49308 5e-08 163_[+2]_316 44432 5.5e-08 271_[+2]_208 3755 6.6e-08 475_[+2]_4 33624 1.2e-07 16_[+2]_463 13838 1.8e-07 165_[+2]_314 46561 2.5e-07 258_[+2]_221 48647 2.6e-07 381_[+2]_98 49480 6.1e-07 29_[+2]_450 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=10 9309 ( 379) TGTCCGATTTCTGTCCCGATG 1 34974 ( 284) TGGTTGATTTCCGTTTTGACG 1 49308 ( 164) TGTTTGATTTATCCCTTGACG 1 44432 ( 272) TCTTATCTTTCTGTTCCGATG 1 3755 ( 476) TCGTCGATTGCTGCAGCGAGG 1 33624 ( 17) TGATCGATTTCTGGCAAGATA 1 13838 ( 166) CCACCGATTCCTGGCACGCCG 1 46561 ( 259) GCTTTGACTAATGTCGCGACG 1 48647 ( 382) TCTCTGTTTTCTGTGTTGGTG 1 49480 ( 30) CTATTTCTTGCTGACCCGACG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 
21 n= 11520 bayes= 10.4204 E= 1.4e+002 -997 -19 -114 136 -997 113 86 -145 11 -997 -14 87 -997 40 -997 136 -147 81 -997 87 -997 -997 186 -45 133 -19 -997 -145 -997 -119 -997 172 -997 -997 -997 187 -147 -119 -14 113 -48 181 -997 -997 -997 -119 -997 172 -997 -119 203 -997 -147 -19 -14 87 -147 139 -114 -45 -48 40 -14 13 -147 139 -997 13 -997 -997 218 -997 152 -119 -114 -997 -997 113 -114 55 -147 -997 203 -997 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 1.4e+002 0.000000 0.200000 0.100000 0.700000 0.000000 0.500000 0.400000 0.100000 0.300000 0.000000 0.200000 0.500000 0.000000 0.300000 0.000000 0.700000 0.100000 0.400000 0.000000 0.500000 0.000000 0.000000 0.800000 0.200000 0.700000 0.200000 0.000000 0.100000 0.000000 0.100000 0.000000 0.900000 0.000000 0.000000 0.000000 1.000000 0.100000 0.100000 0.200000 0.600000 0.200000 0.800000 0.000000 0.000000 0.000000 0.100000 0.000000 0.900000 0.000000 0.100000 0.900000 0.000000 0.100000 0.200000 0.200000 0.500000 0.100000 0.600000 0.100000 0.200000 0.200000 0.300000 0.200000 0.300000 0.100000 0.600000 0.000000 0.300000 0.000000 0.000000 1.000000 0.000000 0.800000 0.100000 0.100000 0.000000 0.000000 0.500000 0.100000 0.400000 0.100000 0.000000 0.900000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [TC][CG][TAG][TC][TC][GT][AC]TT[TG][CA]TG[TCG][CT][CTAG][CT]GA[CT]G -------------------------------------------------------------------------------- Time 10.53 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 6 llr = 107 E-value = 1.1e+003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 3:8:8:a:2a:323:::3::8 pos.-specific C :::8:3::5:522:::2228: probability G 78:227:a2:5552:a8:5:2 matrix T :22:::::2:::25a::532: bits 2.2 * * 2.0 * ** 1.7 ** * ** 1.5 * * ** * *** * Relative 1.3 * ***** * *** ** Entropy 1.1 ******** ** *** ** (25.8 bits) 0.9 ******** ** *** ** 0.7 ******** *** *** *** 0.4 ******** *** ******** 0.2 ********************* 0.0 --------------------- Multilevel GGACAGAGCACGGTTGGTGCA consensus A C GA A AT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 9309 13 3.37e-11 CAACCTGGTG GGACAGAGCAGAGGTGGAGCA TACGGCTTGC 43093 285 6.92e-10 AGCTCTCTAC GGACAGAGGACAGATGCTGCA CTTCAATATT 48647 425 1.17e-09 CTAAGTGTGC AGAGAGAGCAGCGTTGGTTCA ATTCCTCCTC 15140 173 8.36e-09 GAGGAATGGA GGACGCAGTACGCTTGGACCA CTGGAATGCA 3755 6 1.62e-08 GGTTC GTTCAGAGCACGTATGGTGTA CTGTATACTT 49405 135 2.29e-08 ATTGTCGATC AGACACAGAAGGATTGGCTCG CTACTGATGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM 
------------- ---------------- ------------- 9309 3.4e-11 12_[+3]_467 43093 6.9e-10 284_[+3]_195 48647 1.2e-09 424_[+3]_55 15140 8.4e-09 172_[+3]_307 3755 1.6e-08 5_[+3]_474 49405 2.3e-08 134_[+3]_345 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=6 9309 ( 13) GGACAGAGCAGAGGTGGAGCA 1 43093 ( 285) GGACAGAGGACAGATGCTGCA 1 48647 ( 425) AGAGAGAGCAGCGTTGGTTCA 1 15140 ( 173) GGACGCAGTACGCTTGGACCA 1 3755 ( 6) GTTCAGAGCACGTATGGTGTA 1 49405 ( 135) AGACACAGAAGGATTGGCTCG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 11520 bayes= 11.3538 E= 1.1e+003 26 -923 160 -923 -923 -923 192 -71 158 -923 -923 -71 -923 187 -40 -923 158 -923 -40 -923 -923 55 160 -923 184 -923 -923 -923 -923 -923 218 -923 -74 113 -40 -71 184 -923 -923 -923 -923 113 118 -923 26 -45 118 -923 -74 -45 118 -71 26 -923 -40 87 -923 -923 -923 187 -923 -923 218 -923 -923 -45 192 -923 26 -45 -923 87 -923 -45 118 29 -923 187 -923 -71 158 -923 -40 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.1e+003 0.333333 0.000000 0.666667 0.000000 0.000000 0.000000 0.833333 0.166667 0.833333 0.000000 0.000000 0.166667 0.000000 0.833333 0.166667 0.000000 0.833333 0.000000 0.166667 0.000000 0.000000 0.333333 0.666667 
0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.166667 0.500000 0.166667 0.166667 1.000000 0.000000 0.000000 0.000000 0.000000 0.500000 0.500000 0.000000 0.333333 0.166667 0.500000 0.000000 0.166667 0.166667 0.500000 0.166667 0.333333 0.000000 0.166667 0.500000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.166667 0.833333 0.000000 0.333333 0.166667 0.000000 0.500000 0.000000 0.166667 0.500000 0.333333 0.000000 0.833333 0.000000 0.166667 0.833333 0.000000 0.166667 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [GA]GACA[GC]AGCA[CG][GA]G[TA]TGG[TA][GT]CA -------------------------------------------------------------------------------- Time 15.24 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 50671 8.79e-03 245_[+1(6.52e-06)]_10_\ [+1(4.13e-06)]_60_[+1(3.51e-05)]_149 9309 1.22e-13 12_[+3(3.37e-11)]_345_\ [+2(3.11e-11)]_101 46561 8.52e-06 218_[+1(3.96e-05)]_28_\ [+2(2.46e-07)]_36_[+1(5.80e-06)]_173 46753 3.02e-02 207_[+1(5.01e-05)]_281 13838 7.23e-05 165_[+2(1.81e-07)]_314 48647 1.35e-08 381_[+2(2.64e-07)]_22_\ [+3(1.17e-09)]_55 15140 8.37e-06 172_[+3(8.36e-09)]_256_\ [+1(2.84e-05)]_39 49206 2.04e-02 211_[+1(1.03e-05)]_277 49405 1.61e-05 
134_[+3(2.29e-08)]_277_\ [+1(2.84e-05)]_56 49480 6.25e-05 29_[+2(6.09e-07)]_327_\ [+1(4.13e-06)]_111 50271 2.29e-01 51_[+1(3.96e-05)]_437 3755 5.79e-08 5_[+3(1.62e-08)]_449_[+2(6.61e-08)]_\ 4 33624 6.46e-07 16_[+2(1.21e-07)]_71_[+1(1.25e-06)]_\ 380 50510 1.02e-01 361_[+1(2.24e-05)]_127 44432 1.26e-03 271_[+2(5.51e-08)]_208 34974 7.12e-07 246_[+1(2.31e-06)]_2_[+1(6.98e-05)]_\ 11_[+2(2.32e-08)]_196 45325 1.83e-02 262_[+1(5.32e-06)]_226 35101 2.62e-02 453_[+1(7.03e-06)]_35 43093 4.59e-09 284_[+3(6.92e-10)]_76_\ [+1(8.25e-07)]_107 44119 5.73e-03 131_[+1(8.25e-07)]_357 35605 2.54e-01 180_[+1(5.25e-05)]_308 43946 4.12e-02 223_[+1(6.52e-06)]_265 35753 1.48e-02 300_[+1(1.85e-06)]_188 49308 5.49e-06 99_[+1(3.18e-06)]_52_[+2(5.04e-08)]_\ 77_[+1(8.69e-05)]_215_[+1(3.04e-05)] -------------------------------------------------------------------------------- ******************************************************************************** ******************************************************************************** Stopped because nmotifs = 3 reached. ******************************************************************************** CPU: seaotter.hsd1.wa.comcast.net ********************************************************************************