********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/153/153.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31870                    1.0000    500  31999                    1.0000    500
41702                    1.0000    500  46363                    1.0000    500
48534                    1.0000    500  522                      1.0000    500
39457                    1.0000    500  43429                    1.0000    500
5236                     1.0000    500  9697                     1.0000    500
23083                    1.0000    500  23429                    1.0000    500
50480                    1.0000    500  24161                    1.0000    500
11009                    1.0000    500  1874                     1.0000    500
45277                    1.0000    500  12355                    1.0000    500
35766                    1.0000    500  38866                    1.0000    500
44997                    1.0000    500  37087                    1.0000    500
33907                    1.0000    500  45424                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/153/153.seqs.fa -oc motifs/153 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 24 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 12000 N= 24 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.257 C 0.248 G 0.235 T 0.260 Background letter frequencies (from dataset with add-one prior applied): A 0.257 C 0.248 G 0.235 T 0.260 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 14 sites = 23 llr = 208 E-value = 6.4e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A a5:3719363426: pos.-specific C ::a:25::22:6:: probability G :4:6:317:55338 matrix T :1::11::2::::1 bits 2.1 1.9 1.7 * * 1.5 * * * Relative 1.3 * * ** Entropy 1.0 * * ** * (13.0 bits) 0.8 * ** ** * ** 0.6 * *** ******** 0.4 ***** ******** 0.2 ************** 0.0 -------------- Multilevel AACGACAGAGGCAG consensus G A G ATAAGG sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------- 44997 109 1.85e-07 TCCTCTCGAC AACGAGAGAGAGAG ATTCTTGGTT 522 480 3.31e-07 GGTTGAAATG AGCGACAGCGACAG TTCCGAA 50480 78 1.51e-06 
GTCCAGCTCA AGCGATAGAGGGAG ATGGAGAGAT 33907 132 2.56e-06 TGCTACTCCG AACGAAAGAAACGG GATTAGCTTT 39457 284 2.87e-06 CAGTTACTGT AGCAACAGTAACAG GAAACGCTTA 31999 400 3.66e-06 CGGCATGCGG AACAAGAAACGCAG AGACGGCCAT 9697 277 4.13e-06 TACCCGATGC AACGAAAAAGACGG GGTAGTTACT 37087 363 1.21e-05 ACGATATCGT AACAATAAAGGCGG GCCAGCTATC 45424 411 2.13e-05 AGAAACCCAA AGCGACAGAATAAG AAAGGTTCAC 23429 477 2.54e-05 CGTCCTCATC GACGCGAGAGACAG ACTCTACAGC 31870 14 2.54e-05 ACAACCAAAA AACGAGGGTGGAAG CGGGAGGATG 38866 261 3.31e-05 GATCAGTGCT AGCGCGAGCCGGGG CTCTCAGCAC 1874 274 3.31e-05 CGTAAGACGT ATCGCCAGCCGCAG AAGGGTGAAG 35766 87 4.25e-05 ATTAGCAAAA AGCAGAAGAAGCAG TGTCAAGATG 24161 345 4.25e-05 GAAATCGTAT ATCGACAGTGACAT TGGTACGGAC 5236 129 4.25e-05 TCTGCCTAAC AGCATCAGTAGAAG GATGGATCAA 23083 381 4.62e-05 TTCCGCAACC AACAAGAGACGCGA GTACTGTTCT 48534 338 5.86e-05 ACTCGAGTCC AACGACGATGGGGG ACACCGCTGC 12355 165 6.33e-05 GATTAGTAAC AGCAACAGCGACGC ACCCAAAGAC 46363 196 6.33e-05 ACCGGGAGTG AACGTCAGAGAAGT TTTTCGTTTG 43429 279 1.14e-04 TCTGCGGAAA ACCACGAAAGGGAG ACATCGTGAC 11009 398 1.30e-04 GTGCCAATGA AGGTACAGACGCAG TAATCGACCA 45277 395 1.81e-04 GTACTGTGCC AACGTCAAAAAGTG TACAAAACGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 44997 1.9e-07 108_[+1]_378 522 3.3e-07 479_[+1]_7 50480 1.5e-06 77_[+1]_409 33907 2.6e-06 131_[+1]_355 39457 2.9e-06 283_[+1]_203 31999 3.7e-06 399_[+1]_87 9697 4.1e-06 276_[+1]_210 37087 1.2e-05 362_[+1]_124 45424 2.1e-05 410_[+1]_76 23429 2.5e-05 476_[+1]_10 31870 2.5e-05 13_[+1]_473 38866 3.3e-05 260_[+1]_226 1874 3.3e-05 273_[+1]_213 35766 4.3e-05 86_[+1]_400 24161 4.3e-05 344_[+1]_142 5236 4.3e-05 128_[+1]_358 23083 4.6e-05 380_[+1]_106 48534 5.9e-05 337_[+1]_149 12355 6.3e-05 
164_[+1]_322 46363 6.3e-05 195_[+1]_291 43429 0.00011 278_[+1]_208 11009 0.00013 397_[+1]_89 45277 0.00018 394_[+1]_92 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=14 seqs=23 44997 ( 109) AACGAGAGAGAGAG 1 522 ( 480) AGCGACAGCGACAG 1 50480 ( 78) AGCGATAGAGGGAG 1 33907 ( 132) AACGAAAGAAACGG 1 39457 ( 284) AGCAACAGTAACAG 1 31999 ( 400) AACAAGAAACGCAG 1 9697 ( 277) AACGAAAAAGACGG 1 37087 ( 363) AACAATAAAGGCGG 1 45424 ( 411) AGCGACAGAATAAG 1 23429 ( 477) GACGCGAGAGACAG 1 31870 ( 14) AACGAGGGTGGAAG 1 38866 ( 261) AGCGCGAGCCGGGG 1 1874 ( 274) ATCGCCAGCCGCAG 1 35766 ( 87) AGCAGAAGAAGCAG 1 24161 ( 345) ATCGACAGTGACAT 1 5236 ( 129) AGCATCAGTAGAAG 1 23083 ( 381) AACAAGAGACGCGA 1 48534 ( 338) AACGACGATGGGGG 1 12355 ( 165) AGCAACAGCGACGC 1 46363 ( 196) AACGTCAGAGAAGT 1 43429 ( 279) ACCACGAAAGGGAG 1 11009 ( 398) AGGTACAGACGCAG 1 45277 ( 395) AACGTCAAAAAGTG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 14 n= 11688 bayes= 9.38958 E= 6.4e+000 190 -1117 -243 -1117 90 -251 74 -158 -1117 195 -243 -1117 44 -1117 137 -258 134 -51 -243 -100 -98 95 37 -158 183 -1117 -143 -1117 2 -1117 165 -1117 124 -51 -1117 -26 2 -19 115 -1117 76 -1117 115 -258 -56 119 15 -1117 124 -1117 57 -258 -256 -251 182 -158 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- 
letter-probability matrix: alength= 4 w= 14 nsites= 23 E= 6.4e+000 0.956522 0.000000 0.043478 0.000000 0.478261 0.043478 0.391304 0.086957 0.000000 0.956522 0.043478 0.000000 0.347826 0.000000 0.608696 0.043478 0.652174 0.173913 0.043478 0.130435 0.130435 0.478261 0.304348 0.086957 0.913043 0.000000 0.086957 0.000000 0.260870 0.000000 0.739130 0.000000 0.608696 0.173913 0.000000 0.217391 0.260870 0.217391 0.521739 0.000000 0.434783 0.000000 0.521739 0.043478 0.173913 0.565217 0.260870 0.000000 0.608696 0.000000 0.347826 0.043478 0.043478 0.043478 0.826087 0.086957 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- A[AG]C[GA]A[CG]A[GA][AT][GAC][GA][CG][AG]G -------------------------------------------------------------------------------- Time 5.57 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 10 llr = 112 E-value = 3.8e+003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :::78::13::3 pos.-specific C 5::3:179:::: probability G 59a::6::72a7 matrix T :1::233::8:: bits 2.1 * * 1.9 * * 1.7 ** * 1.5 ** * * Relative 1.3 ** * ***** Entropy 1.0 ***** ****** (16.2 bits) 0.8 ************ 0.6 ************ 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel CGGAAGCCGTGG consensus G CTTT AG A sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted 
by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 37087 142 3.08e-07 AAAAAGATAC CGGCAGCCGTGG TAAACAGTTT 12355 193 3.08e-07 AAAGACCCTG CGGCAGCCGTGG ACACCCCCAC 41702 44 1.22e-06 GGGTAGTCCA GGGAATCCGTGA CTTAAGCTTC 44997 469 1.71e-06 CGGGGATTTG CGGAAGTCATGG CGATTTTGCA 35766 11 2.68e-06 GCCGCGCGCT GTGAAGCCGTGG TGAATCTTTC 23429 135 4.29e-06 GTTAGTAAGA CGGAATCCATGA TGGGTGTTCC 45277 428 5.92e-06 AGTTGACCAT GGGATTCCATGG CAGCAGCTCA 522 367 7.39e-06 TCTCGATTCT GGGCTGCCGGGG ATTCCTCGGA 31999 52 9.21e-06 CCGCTGGCGG CGGAACTCGTGA CACAGCACAA 50480 282 1.50e-05 TAGTTGTTTT GGGAAGTAGGGG TAACTAATTC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 37087 3.1e-07 141_[+2]_347 12355 3.1e-07 192_[+2]_296 41702 1.2e-06 43_[+2]_445 44997 1.7e-06 468_[+2]_20 35766 2.7e-06 10_[+2]_478 23429 4.3e-06 134_[+2]_354 45277 5.9e-06 427_[+2]_61 522 7.4e-06 366_[+2]_122 31999 9.2e-06 51_[+2]_437 50480 1.5e-05 281_[+2]_207 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=10 37087 ( 142) CGGCAGCCGTGG 1 12355 ( 193) CGGCAGCCGTGG 1 41702 ( 44) GGGAATCCGTGA 1 44997 ( 469) CGGAAGTCATGG 1 35766 ( 11) GTGAAGCCGTGG 1 23429 ( 135) CGGAATCCATGA 1 45277 ( 428) GGGATTCCATGG 1 522 ( 367) GGGCTGCCGGGG 1 31999 ( 52) CGGAACTCGTGA 1 50480 ( 282) GGGAAGTAGGGG 1 // -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 11736 bayes= 11.6702 E= 3.8e+003 -997 101 109 -997 -997 -997 194 -138 -997 -997 209 -997 144 27 -997 -997 164 -997 -997 -38 -997 -131 135 20 -997 150 -997 20 -136 186 -997 -997 22 -997 158 -997 -997 -997 -23 162 -997 -997 209 -997 22 -997 158 -997 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 3.8e+003 0.000000 0.500000 0.500000 0.000000 0.000000 0.000000 0.900000 0.100000 0.000000 0.000000 1.000000 0.000000 0.700000 0.300000 0.000000 0.000000 0.800000 0.000000 0.000000 0.200000 0.000000 0.100000 0.600000 0.300000 0.000000 0.700000 0.000000 0.300000 0.100000 0.900000 0.000000 0.000000 0.300000 0.000000 0.700000 0.000000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 1.000000 0.000000 0.300000 0.000000 0.700000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [CG]GG[AC][AT][GT][CT]C[GA][TG]G[GA] -------------------------------------------------------------------------------- Time 10.84 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 18 sites = 4 llr = 77 E-value = 2.3e+003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :3:::::::::5:::::: pos.-specific C a::83:a:3:3:5::::a probability G :8:38::a8383:::3:: matrix T ::a::a:::8:35aa8a: bits 2.1 * ** * 1.9 * * *** ** ** 1.7 * * *** ** ** 1.5 * * *** ** ** Relative 1.3 *********** ***** Entropy 1.0 *********** ****** (27.9 bits) 0.8 *********** ****** 0.6 *********** ****** 0.4 ****************** 0.2 ****************** 0.0 ------------------ Multilevel CGTCGTCGGTGACTTTTC consensus A GC CGCGT G sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------ 31999 361 1.21e-10 TGGTGGAATA CGTCGTCGGTGACTTGTC CGGCATTTAG 45277 235 5.74e-10 AACGAACAGC CGTCGTCGCTGGTTTTTC TTTTGCCATC 44997 233 2.42e-09 CTCGGAAAGT CATCGTCGGTCTTTTTTC TCCCTGTTCG 12355 54 2.55e-09 CGTTCCGCCG CGTGCTCGGGGACTTTTC CTTACATTCC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 31999 1.2e-10 360_[+3]_122 45277 5.7e-10 234_[+3]_248 44997 2.4e-09 232_[+3]_250 12355 2.6e-09 
53_[+3]_429 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=18 seqs=4 31999 ( 361) CGTCGTCGGTGACTTGTC 1 45277 ( 235) CGTCGTCGCTGGTTTTTC 1 44997 ( 233) CATCGTCGGTCTTTTTTC 1 12355 ( 54) CGTGCTCGGGGACTTTTC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 18 n= 11592 bayes= 11.5003 E= 2.3e+003 -865 201 -865 -865 -4 -865 167 -865 -865 -865 -865 194 -865 159 9 -865 -865 1 167 -865 -865 -865 -865 194 -865 201 -865 -865 -865 -865 209 -865 -865 1 167 -865 -865 -865 9 152 -865 1 167 -865 96 -865 9 -6 -865 101 -865 94 -865 -865 -865 194 -865 -865 -865 194 -865 -865 9 152 -865 -865 -865 194 -865 201 -865 -865 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 18 nsites= 4 E= 2.3e+003 0.000000 1.000000 0.000000 0.000000 0.250000 0.000000 0.750000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.750000 0.250000 0.000000 0.000000 0.250000 0.750000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.250000 0.750000 0.000000 0.000000 0.000000 0.250000 0.750000 0.000000 0.250000 0.750000 0.000000 0.500000 0.000000 0.250000 0.250000 0.000000 0.500000 0.000000 0.500000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 
1.000000 0.000000 0.000000 0.250000 0.750000 0.000000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- C[GA]T[CG][GC]TCG[GC][TG][GC][AGT][CT]TT[TG]TC -------------------------------------------------------------------------------- Time 15.64 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 31870 6.83e-02 13_[+1(2.54e-05)]_473 31999 2.03e-10 51_[+2(9.21e-06)]_297_\ [+3(1.21e-10)]_21_[+1(3.66e-06)]_87 41702 1.63e-03 43_[+2(1.22e-06)]_445 46363 1.85e-01 195_[+1(6.33e-05)]_291 48534 2.11e-01 337_[+1(5.86e-05)]_149 522 5.65e-05 366_[+2(7.39e-06)]_101_\ [+1(3.31e-07)]_7 39457 1.81e-03 283_[+1(2.87e-06)]_203 43429 5.71e-02 500 5236 1.73e-01 128_[+1(4.25e-05)]_358 9697 3.40e-02 276_[+1(4.13e-06)]_210 23083 1.47e-01 380_[+1(4.62e-05)]_106 23429 1.39e-03 134_[+2(4.29e-06)]_330_\ [+1(2.54e-05)]_10 50480 1.10e-04 77_[+1(1.51e-06)]_190_\ [+2(1.50e-05)]_207 24161 5.21e-02 344_[+1(4.25e-05)]_142 11009 1.72e-01 500 1874 2.19e-02 273_[+1(3.31e-05)]_213 45277 2.02e-08 234_[+3(5.74e-10)]_175_\ [+2(5.92e-06)]_61 12355 2.04e-09 53_[+3(2.55e-09)]_93_[+1(6.33e-05)]_\ 14_[+2(3.08e-07)]_296 35766 1.24e-03 10_[+2(2.68e-06)]_64_[+1(4.25e-05)]_\ 400 38866 3.93e-02 
260_[+1(3.31e-05)]_226 44997 4.26e-11 108_[+1(1.85e-07)]_20_\ [+3(4.07e-05)]_72_[+3(2.42e-09)]_218_[+2(1.71e-06)]_20 37087 1.14e-05 141_[+2(3.08e-07)]_209_\ [+1(1.21e-05)]_124 33907 3.48e-02 131_[+1(2.56e-06)]_355 45424 7.75e-02 410_[+1(2.13e-05)]_76 -------------------------------------------------------------------------------- ******************************************************************************** ******************************************************************************** Stopped because nmotifs = 3 reached. ******************************************************************************** CPU: seaotter.hsd1.wa.comcast.net ********************************************************************************