********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/74/74.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10098                    1.0000    500  11173                    1.0000    500
11225                    1.0000    500  20964                    1.0000    500
23344                    1.0000    500  24498                    1.0000    500
25494                    1.0000    500  6805                     1.0000    500
bd1945                   1.0000    500  bd396                    1.0000    500
bd696                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/74/74.seqs.fa -oc motifs/74 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.282 C 0.233 G 0.235 T 0.250
Background letter frequencies (from dataset with add-one prior applied):
A 0.282 C 0.233 G 0.235 T 0.250
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 6 llr = 105 E-value = 7.7e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :823:3:82735:8:7::7:a
pos.-specific     C  8277:27::225a2:2:::7:
probability       G  2:::8:328:5:::a::822:
matrix            T  ::2:25:::2:::::2a222:

         bits    2.1             * * *
                 1.9             * * *   *
                 1.7             * * *   *
                 1.5 *   *   *   * * **  *
Relative         1.3 **  * ***   *** **  *
Entropy          1.1 ** ** ***  **** **  *
(25.2 bits)      0.8 ***** ***  **** ** **
                 0.6 ***** ***************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CACCGTCAGAGACAGATGACA
consensus               A AG   AC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
6805                         28  1.88e-11 GTTGATACAT CACCGTCAGAACCAGCTGACA CATGCCCGCG
20964                        96  4.15e-09 GAACCTCCTC CACCGCCAATGCCAGTTGACA CTACGAACAT
bd396                       147  5.79e-09 TAATATGACT CATCGAGAGAGCCAGATTTCA TTCAGCTCAT
bd1945                      274  6.27e-09 GTGCCCGTGC CACCGACGGACACCGATGGCA AGATGGAATA
bd696                        13  1.33e-08 TACCGACGTC GACAGTGAGCAACAGATGAGA TGGATTGACA
24498                        79  2.41e-08 TTTGAAGCGG CCAATTCAGAGACAGATGATA TGAATTTGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6805                              1.9e-11  27_[+1]_452
20964                             4.1e-09  95_[+1]_384
bd396                             5.8e-09  146_[+1]_333
bd1945                            6.3e-09  273_[+1]_206
bd696                             1.3e-08  12_[+1]_467
24498                             2.4e-08  78_[+1]_401
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
6805                     (   28) CACCGTCAGAACCAGCTGACA  1
20964                    (   96) CACCGCCAATGCCAGTTGACA  1
bd396                    (  147) CATCGAGAGAGCCAGATTTCA  1
bd1945                   (  274) CACCGACGGACACCGATGGCA  1
bd696                    (   13) GACAGTGAGCAACAGATGAGA  1
24498                    (   79) CCAATTCAGAGACAGATGATA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5280 bayes= 9.43824 E= 7.7e+001
  -923    184    -50   -923
   156    -48   -923   -923
   -76    151   -923    -58
    24    151   -923   -923
  -923   -923    182    -58
    24    -48   -923    100
  -923    151     50   -923
   156   -923    -50   -923
   -76   -923    182   -923
   124    -48   -923    -58
    24    -48    109   -923
    83    110   -923   -923
  -923    210   -923   -923
   156    -48   -923   -923
  -923   -923    209   -923
   124    -48   -923    -58
  -923   -923   -923    200
  -923   -923    182    -58
   124   -923    -50    -58
  -923    151    -50    -58
   182   -923   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 7.7e+001
 0.000000  0.833333  0.166667  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.166667  0.666667  0.000000  0.166667
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.333333  0.166667  0.000000  0.500000
 0.000000  0.666667  0.333333  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.666667  0.166667  0.000000  0.166667
 0.333333  0.166667  0.500000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.166667  0.000000  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.833333  0.166667
 0.666667  0.000000  0.166667  0.166667
 0.000000  0.666667  0.166667  0.166667
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
CAC[CA]G[TA][CG]AGA[GA][AC]CAGATGACA
--------------------------------------------------------------------------------

Time 1.15 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 11 llr = 108 E-value = 1.3e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a6::a:53:3:
pos.-specific     C  ::::3::2::::
probability       G  a:447:525729
matrix            T  :::6::513351

         bits    2.1 *
                 1.9 **   *
                 1.7 **   *     *
                 1.5 **   *     *
Relative         1.3 **  **   * *
Entropy          1.1 *******  * *
(14.2 bits)      0.8 *******  * *
                 0.6 *******  ***
                 0.4 ******* ****
                 0.2 ************
                 0.0 ------------

Multilevel           GAATGATAGGTG
consensus              GGC G ATA
sequence                     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
23344                       351  7.08e-08 AGGAAGAAGT GAATGATAGGTG GGTTGGTTGG
bd696                       132  2.77e-06 CTGAAAGTGC GAAGGAGAAGTG TGCAGGCATG
11173                       256  7.15e-06 GCACAAAAAT GAGGCAGAGGTG AGACGCATGG
20964                       402  8.27e-06 TTAATCATCT GAATCATAGTTG AGAGGTACCT
10098                       204  1.04e-05 ACTTTCTTGG GAGTGATCGGAG TTGTGATGAC
24498                       266  1.40e-05 GAACTGTGGA GAGGGAGATGAG CTCGCTTTCT
6805                          7  1.73e-05     GATTGG GAATGATGTTTG TTGATACATC
25494                       333  2.08e-05 ACTCCTCAGA GAATGATCAGGG GAAAACATGA
11225                       402  4.28e-05 CCATGATTTA GAATGAGTATTG AGTTGCTTCG
bd1945                      404  4.53e-05 GGGATGCTCA GAGGCAGGGGAG TACACACATG
bd396                        94  5.50e-05 CAAATGTCGA GAATGATATGGT GGTATGATAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23344                             7.1e-08  350_[+2]_138
bd696                             2.8e-06  131_[+2]_357
11173                             7.2e-06  255_[+2]_233
20964                             8.3e-06  401_[+2]_87
10098                               1e-05  203_[+2]_285
24498                             1.4e-05  265_[+2]_223
6805                              1.7e-05  6_[+2]_482
25494                             2.1e-05  332_[+2]_156
11225                             4.3e-05  401_[+2]_87
bd1945                            4.5e-05  403_[+2]_85
bd396                             5.5e-05  93_[+2]_395
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=11
23344                    (  351) GAATGATAGGTG  1
bd696                    (  132) GAAGGAGAAGTG  1
11173                    (  256) GAGGCAGAGGTG  1
20964                    (  402) GAATCATAGTTG  1
10098                    (  204) GAGTGATCGGAG  1
24498                    (  266) GAGGGAGATGAG  1
6805                     (    7) GAATGATGTTTG  1
25494                    (  333) GAATGATCAGGG  1
11225                    (  402) GAATGAGTATTG  1
bd1945                   (  404) GAGGCAGGGGAG  1
bd396                    (   94) GAATGATATGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 8.93074 E= 1.3e+002
 -1010  -1010    209  -1010
   183  -1010  -1010  -1010
   117  -1010     63  -1010
 -1010  -1010     63    135
 -1010     23    163  -1010
   183  -1010  -1010  -1010
 -1010  -1010     95    113
    95    -36    -37   -146
    -5  -1010     95     13
 -1010  -1010    163     13
    -5  -1010    -37    113
 -1010  -1010    195   -146
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 1.3e+002
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.636364  0.000000  0.363636  0.000000
 0.000000  0.000000  0.363636  0.636364
 0.000000  0.272727  0.727273  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.454545  0.545455
 0.545455  0.181818  0.181818  0.090909
 0.272727  0.000000  0.454545  0.272727
 0.000000  0.000000  0.727273  0.272727
 0.272727  0.000000  0.181818  0.545455
 0.000000  0.000000  0.909091  0.090909
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
GA[AG][TG][GC]A[TG]A[GAT][GT][TA]G
--------------------------------------------------------------------------------

Time 2.33 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 11 llr = 108 E-value = 3.0e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :2::2::21:27
pos.-specific     C  9:2a::81285:
probability       G  :4::4:17::4:
matrix            T  158:5a1:72:3

         bits    2.1    * *
                 1.9    * *
                 1.7 *  * *
                 1.5 *  * *   *
Relative         1.3 * ** **  *
Entropy          1.1 * ** *** * *
(14.2 bits)      0.8 * ** ***** *
                 0.6 * ** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CTTCTTCGTCCA
consensus             G  G     GT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
20964                         3  1.28e-07         GA CGTCTTCGTCCA CTGCACTAGG
11173                       138  1.39e-06 TCCCATCCAT CGTCGTCGTCGT TAGACATCGT
bd396                        23  1.64e-06 GCTTTATACT CTTCTTCGTTCA GCATTTGTCG
bd1945                      218  4.44e-06 CAAACAAGCC CTTCGTGGTCCA GACGGGGTAG
6805                        153  9.34e-06 CACTCCATGA TGTCGTCGTCGA TGCAACTTTG
23344                        74  1.00e-05 CATATTTTGT CATCTTCCTCCA ACGCTGAGAA
bd696                       455  1.28e-05 ATAGCCATCA CTTCTTCGACAA ATCACCCTGA
11225                       326  2.93e-05 CCCCGATCAA CGCCATCGCCGA GCTAATAATA
25494                        74  3.62e-05 CTGGAGTACA CTTCTTCACTCA GCTGAAGACA
10098                       480  3.86e-05 GAATTACACA CTTCATCATCAT CATATCATC
24498                       296  1.11e-04 CTTGCTGAGA CACCGTTGTCGT ATTTGCGACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20964                             1.3e-07  2_[+3]_486
11173                             1.4e-06  137_[+3]_351
bd396                             1.6e-06  22_[+3]_466
bd1945                            4.4e-06  217_[+3]_271
6805                              9.3e-06  152_[+3]_336
23344                               1e-05  73_[+3]_415
bd696                             1.3e-05  454_[+3]_34
11225                             2.9e-05  325_[+3]_163
25494                             3.6e-05  73_[+3]_415
10098                             3.9e-05  479_[+3]_9
24498                             0.00011  295_[+3]_193
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=11
20964                    (    3) CGTCTTCGTCCA  1
11173                    (  138) CGTCGTCGTCGT  1
bd396                    (   23) CTTCTTCGTTCA  1
bd1945                   (  218) CTTCGTGGTCCA  1
6805                     (  153) TGTCGTCGTCGA  1
23344                    (   74) CATCTTCCTCCA  1
bd696                    (  455) CTTCTTCGACAA  1
11225                    (  326) CGCCATCGCCGA  1
25494                    (   74) CTTCTTCACTCA  1
10098                    (  480) CTTCATCATCAT  1
24498                    (  296) CACCGTTGTCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 9.28648 E= 3.0e+002
 -1010    196  -1010   -146
   -63  -1010     63     86
 -1010    -36  -1010    171
 -1010    210  -1010  -1010
   -63  -1010     63     86
 -1010  -1010  -1010    200
 -1010    181   -137   -146
   -63   -136    163  -1010
  -163    -36  -1010    154
 -1010    181  -1010    -46
   -63     96     63  -1010
   137  -1010  -1010     13
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 3.0e+002
 0.000000  0.909091  0.000000  0.090909
 0.181818  0.000000  0.363636  0.454545
 0.000000  0.181818  0.000000  0.818182
 0.000000  1.000000  0.000000  0.000000
 0.181818  0.000000  0.363636  0.454545
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.818182  0.090909  0.090909
 0.181818  0.090909  0.727273  0.000000
 0.090909  0.181818  0.000000  0.727273
 0.000000  0.818182  0.000000  0.181818
 0.181818  0.454545  0.363636  0.000000
 0.727273  0.000000  0.000000  0.272727
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
C[TG]TC[TG]TCGTC[CG][AT]
--------------------------------------------------------------------------------

Time 3.39 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10098                            5.04e-03  203_[+2(1.04e-05)]_264_\
    [+3(3.86e-05)]_9
11173                            1.73e-04  65_[+3(4.04e-06)]_60_[+3(1.39e-06)]_\
    106_[+2(7.15e-06)]_233
11225                            9.12e-03  325_[+3(2.93e-05)]_64_\
    [+2(4.28e-05)]_87
20964                            2.16e-10  2_[+3(1.28e-07)]_81_[+1(4.15e-09)]_\
    285_[+2(8.27e-06)]_87
23344                            6.24e-06  73_[+3(1.00e-05)]_265_\
    [+2(7.08e-08)]_138
24498                            8.61e-07  78_[+1(2.41e-08)]_166_\
    [+2(1.40e-05)]_223
25494                            3.99e-03  73_[+3(3.62e-05)]_247_\
    [+2(2.08e-05)]_156
6805                             1.53e-10  6_[+2(1.73e-05)]_9_[+1(1.88e-11)]_\
    104_[+3(9.34e-06)]_336
bd1945                           4.01e-08  217_[+3(4.44e-06)]_44_\
    [+1(6.27e-09)]_109_[+2(4.53e-05)]_85
bd396                            1.79e-08  22_[+3(1.64e-06)]_59_[+2(5.50e-05)]_\
    41_[+1(5.79e-09)]_333
bd696                            1.64e-08  12_[+1(1.33e-08)]_98_[+2(2.77e-06)]_\
    47_[+3(9.46e-05)]_252_[+3(1.28e-05)]_34
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
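
The log-odds matrices in this file appear to follow from the letter-probability
matrices as score = 100 * log2(p / background), rounded to an integer. The
sketch below checks that relation for rows copied from this file, using the
background letter frequencies listed in the COMMAND LINE SUMMARY. The function
name log_odds_row is hypothetical, and the formula is an observation about the
numbers in this file, not a statement of how MEME computes them. Zero
probabilities are returned as None (the file prints large negative sentinels
such as -923 and -1010 there), and a recomputed score may differ from the
printed one by a unit (for example 183 here versus the 182 printed for A in the
last column of Motif 1).

```python
import math

# Background letter frequencies copied from this file (order A, C, G, T).
BACKGROUND = {"A": 0.282, "C": 0.233, "G": 0.235, "T": 0.250}


def log_odds_row(probs, background=BACKGROUND):
    """Approximate one log-odds row from one letter-probability row.

    probs holds four probabilities in the order A, C, G, T.
    Assumption: score = round(100 * log2(p / background)).
    A zero probability has no finite log-odds, so None is returned for it.
    """
    scores = []
    for letter, p in zip("ACGT", probs):
        if p == 0.0:
            scores.append(None)
        else:
            scores.append(round(100 * math.log2(p / background[letter])))
    return scores


# First column of Motif 1; the file prints -923 184 -50 -923 for this row.
print(log_odds_row([0.000000, 0.833333, 0.166667, 0.000000]))

# First column of Motif 2; the file prints -1010 -1010 209 -1010.
print(log_odds_row([0.000000, 0.000000, 1.000000, 0.000000]))
```

Comparing only the finite entries avoids having to guess how the sentinel for a
zero probability is chosen, which seems to vary with the number of sites.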