********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/114/114.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1056                     1.0000    500  11383                    1.0000    500
12066                    1.0000    500  14899                    1.0000    500
21972                    1.0000    500  22474                    1.0000    500
22683                    1.0000    500  23212                    1.0000    500
2435                     1.0000    500  260914                   1.0000    500
262305                   1.0000    500  263154                   1.0000    500
264664                   1.0000    500  268435                   1.0000    500
2720                     1.0000    500  3330                     1.0000    500
4863                     1.0000    500  8134                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/114/114.seqs.fa -oc motifs/114 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.229 G 0.240 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.229 G 0.240 T 0.265
********************************************************************************


********************************************************************************
MOTIF 1 MEME    width = 21  sites = 8  llr = 133  E-value = 8.6e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3:9::431561:91635::5:
pos.-specific     C  :a:39:59:16a:943539:a
probability       G  :::411:::11:1::4:::::
matrix            T  8:14:53:511::::1:815:

         bits    2.1  *         *        *
                 1.9  *         *        *
                 1.7  *         *        *
                 1.5  *  *  *   ***    * *
Relative         1.3  ** *  *   ***    * *
Entropy          1.1 *** *  *   **** *** *
(24.1 bits)      0.8 *** *  **  **** *****
                 0.6 *** *  **  **** *****
                 0.4 *************** *****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TCAGCTCCAACCACAGATCAC
consensus            A  T AA T     CACC T
sequence                C  T        C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
268435                      439  1.02e-11 TAGACTCTCT TCAGCTCCAACCACCGATCTC TCTTCACGCC
21972                       400  1.09e-09 CTCCAGCTGT TCAGCTTCAGCCACACATCAC AGATCGGATC
22474                       167  2.99e-09 GAGCCGACCA ACATCACCTTCCACAACTCTC GGTAAAAAGG
262305                      260  8.61e-09 CTTCCTCCAT TCATCTACAACCACACACTAC CACAACACCC
260914                      407  1.71e-08 CTTCTCCATT TCACCACCTCGCACCTCTCTC TCTACACACT
22683                       410  5.02e-08 ACAACCCACC ACAGCTCCAACCGACGCCCAC GCAGGCCTTC
4863                        378  6.48e-08 CTCTGAATTT TCTCCGTCTAACACAACTCAC TCTGAACTTC
8134                        315  9.25e-08 TTTCTGCACT TCATGAAATATCACAGATCTC ATTTGCTGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
268435                              1e-11  438_[+1]_41
21972                             1.1e-09  399_[+1]_80
22474                               3e-09  166_[+1]_313
262305                            8.6e-09  259_[+1]_220
260914                            1.7e-08  406_[+1]_73
22683                               5e-08  409_[+1]_70
4863                              6.5e-08  377_[+1]_102
8134                              9.2e-08  314_[+1]_165
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=8
268435                   (  439) TCAGCTCCAACCACCGATCTC  1
21972                    (  400) TCAGCTTCAGCCACACATCAC  1
22474                    (  167) ACATCACCTTCCACAACTCTC  1
262305                   (  260) TCATCTACAACCACACACTAC  1
260914                   (  407) TCACCACCTCGCACCTCTCTC  1
22683                    (  410) ACAGCTCCAACCGACGCCCAC  1
4863                     (  378) TCTCCGTCTAACACAACTCAC  1
8134                     (  315) TCATGAAATATCACAGATCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 10.0755 E= 8.6e-002
    -9   -965   -965    150
  -965    212   -965   -965
   172   -965   -965   -108
  -965     12     64     50
  -965    193    -94   -965
    50   -965    -94     92
    -9    112   -965     -8
  -109    193   -965   -965
    91   -965   -965     92
   123    -87    -94   -108
  -109    145    -94   -108
  -965    212   -965   -965
   172   -965    -94   -965
  -109    193   -965   -965
   123     71   -965   -965
    -9     12     64   -108
    91    112   -965   -965
  -965     12   -965    150
  -965    193   -965   -108
    91   -965   -965     92
  -965    212   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 8.6e-002
 0.250000  0.000000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.000000  0.250000  0.375000  0.375000
 0.000000  0.875000  0.125000  0.000000
 0.375000  0.000000  0.125000  0.500000
 0.250000  0.500000  0.000000  0.250000
 0.125000  0.875000  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.625000  0.125000  0.125000  0.125000
 0.125000  0.625000  0.125000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.625000  0.375000  0.000000  0.000000
 0.250000  0.250000  0.375000  0.125000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.875000  0.000000  0.125000
 0.500000  0.000000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TA]CA[GTC]C[TA][CAT]C[AT]ACCAC[AC][GAC][AC][TC]C[AT]C
--------------------------------------------------------------------------------




Time 3.60 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME    width = 21  sites = 8  llr = 131  E-value = 4.8e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :5:a::65::11:313::658
pos.-specific     C  :1:::::1:3:::3::1:1::
probability       G  91a:9a1:9645a486:a351
matrix            T  13::1:341154:1119:::1

         bits    2.1   *  *      *    *
                 1.9   ** *      *    *
                 1.7   ** *      *    *
                 1.5 * ****  *   *   **
Relative         1.3 * ****  *   *   **
Entropy          1.1 * ****  *   * * ** *
(23.6 bits)      0.8 * ****  **  * * ** **
                 0.6 * *********** *******
                 0.4 * *********** *******
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GAGAGGAAGGTGGGGGTGAAA
consensus             T    TT CGT A A  GG
sequence                          C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
4863                         37  9.93e-10 ATGCATAGAT GAGAGGAAGTTGGAGATGAAA GAGCTGTTAG
8134                        236  1.12e-09 TGTTTCGTTT GCGAGGATGGAGGAGGTGAGA GGAAGGAATG
22683                       309  1.27e-09 TGGTTGGGGG GAGAGGGAGGTGGGAGTGAGA GGGACGCAAA
22474                        70  6.88e-09 AAGAGAGAGA GAGAGGTATGGTGCGGTGGAA TTGTCAGACA
264664                      119  2.14e-08 TACAAAAAGC GGGAGGATGCGGGTTGTGAAA TGCAGACTCT
21972                       131  4.24e-08 GTCCGGAGGA GTGAGGTTGGTTGGGTTGGAT GGTAGGTTGG
263154                       90  5.23e-08 CAGGGATCAG GAGAGGACGCGTGGGATGCGG TGCCATCTCT
3330                        251  1.76e-07 CAGCATTATA TTGATGAAGGTAGCGGCGAGA CTTGGCGGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4863                              9.9e-10  36_[+2]_443
8134                              1.1e-09  235_[+2]_244
22683                             1.3e-09  308_[+2]_171
22474                             6.9e-09  69_[+2]_410
264664                            2.1e-08  118_[+2]_361
21972                             4.2e-08  130_[+2]_349
263154                            5.2e-08  89_[+2]_390
3330                              1.8e-07  250_[+2]_229
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
4863                     (   37) GAGAGGAAGTTGGAGATGAAA  1
8134                     (  236) GCGAGGATGGAGGAGGTGAGA  1
22683                    (  309) GAGAGGGAGGTGGGAGTGAGA  1
22474                    (   70) GAGAGGTATGGTGCGGTGGAA  1
264664                   (  119) GGGAGGATGCGGGTTGTGAAA  1
21972                    (  131) GTGAGGTTGGTTGGGTTGGAT  1
263154                   (   90) GAGAGGACGCGTGGGATGCGG  1
3330                     (  251) TTGATGAAGGTAGCGGCGAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 10.0755 E= 4.8e+000
  -965   -965    186   -108
    91    -87    -94     -8
  -965   -965    206   -965
   191   -965   -965   -965
  -965   -965    186   -108
  -965   -965    206   -965
   123   -965    -94     -8
    91    -87   -965     50
  -965   -965    186   -108
  -965     12    138   -108
  -109   -965     64     92
  -109   -965    106     50
  -965   -965    206   -965
    -9     12     64   -108
  -109   -965    164   -108
    -9   -965    138   -108
  -965    -87   -965    172
  -965   -965    206   -965
   123    -87      6   -965
    91   -965    106   -965
   149   -965    -94   -108
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 4.8e+000
 0.000000  0.000000  0.875000  0.125000
 0.500000  0.125000  0.125000  0.250000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.875000  0.125000
 0.000000  0.000000  1.000000  0.000000
 0.625000  0.000000  0.125000  0.250000
 0.500000  0.125000  0.000000  0.375000
 0.000000  0.000000  0.875000  0.125000
 0.000000  0.250000  0.625000  0.125000
 0.125000  0.000000  0.375000  0.500000
 0.125000  0.000000  0.500000  0.375000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.250000  0.375000  0.125000
 0.125000  0.000000  0.750000  0.125000
 0.250000  0.000000  0.625000  0.125000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.625000  0.125000  0.250000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.750000  0.000000  0.125000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[AT]GAGG[AT][AT]G[GC][TG][GT]G[GAC]G[GA]TG[AG][AG]A
--------------------------------------------------------------------------------




Time 7.08 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME    width = 20  sites = 14  llr = 170  E-value = 3.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1::131414496:9721421
pos.-specific     C  :12:1:11111:::2::13:
probability       G  9118394436:191185:38
matrix            T  :9614:143::21:::4521

         bits    2.1
                 1.9
                 1.7 *           *
                 1.5 *    *    * **
Relative         1.3 **   *    * ** *
Entropy          1.1 ** * *    * ** *   *
(17.5 bits)      0.8 ** * *   ** ****   *
                 0.6 **** *   ********* *
                 0.4 **** * * ********* *
                 0.2 ****************** *
                 0.0 --------------------

Multilevel           GTTGTGAGAGAAGAAGGTCG
consensus              C A GTGA T  CATAG
sequence                 G   T         A
                                       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
12066                         2  1.29e-09          T GTTGTGTGAGAAGAAGGTAG CATCAGCTCT
2435                        105  3.43e-08 GTCGTCAGAT GTTGAGGATAAAGAAGTTCG CAGCGAAGAC
22683                       268  5.58e-08 TGCTACCATA GTTGGGGTTGAGGACGGATG CTGTTTGTTG
263154                      155  6.27e-08 AGAGCGTGTG GTCGTGCTGGATGAAGTTGG ACCAAGTCAA
4863                        280  4.52e-07 TTTTGAGATA GTTTCGAGAAAAGAAAGACG AAGTATCGGG
14899                       267  5.94e-07 GAAAAGCGGC GGTATGGTTGAAGAAGGTTG AAGAAAGACT
22474                       354  7.76e-07 CGCCAACAAG ATTGAGCGGGAAGAAATACG GAACCAGGCG
268435                      209  1.01e-06 TGGTTGGTAA GTTGGGAGAAATGGAGGTCT GATGAATCCT
21972                        11  2.09e-06 GTCAGAATGC GTCGGGAGGCAGGAAGGAGT TGTGAGGATA
1056                        273  2.44e-06 ACCTCTGTGT GCTGAGATTGAATAAGATGG AAACATGGCT
262305                      137  3.06e-06 GTGTGCTCTA GTTGTAGTAGAAGAGGTAGA AAATGATGGA
2720                        391  6.20e-06 GTCCGTCGGA GTGGTAATGAATGACAGTAG CACCCTAGAG
8134                        470  8.62e-06 AAACCAAAGC GTCGAGTCCAAAGAAGTCAG ATTCCAAGCC
264664                       38  1.04e-05 TATAGCCGTT GTGTGGGGAGCAGACGAATG ACGCAGGCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12066                             1.3e-09  1_[+3]_479
2435                              3.4e-08  104_[+3]_376
22683                             5.6e-08  267_[+3]_213
263154                            6.3e-08  154_[+3]_326
4863                              4.5e-07  279_[+3]_201
14899                             5.9e-07  266_[+3]_214
22474                             7.8e-07  353_[+3]_127
268435                              1e-06  208_[+3]_272
21972                             2.1e-06  10_[+3]_470
1056                              2.4e-06  272_[+3]_208
262305                            3.1e-06  136_[+3]_344
2720                              6.2e-06  390_[+3]_90
8134                              8.6e-06  469_[+3]_11
264664                              1e-05  37_[+3]_443
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=14
12066                    (    2) GTTGTGTGAGAAGAAGGTAG  1
2435                     (  105) GTTGAGGATAAAGAAGTTCG  1
22683                    (  268) GTTGGGGTTGAGGACGGATG  1
263154                   (  155) GTCGTGCTGGATGAAGTTGG  1
4863                     (  280) GTTTCGAGAAAAGAAAGACG  1
14899                    (  267) GGTATGGTTGAAGAAGGTTG  1
22474                    (  354) ATTGAGCGGGAAGAAATACG  1
268435                   (  209) GTTGGGAGAAATGGAGGTCT  1
21972                    (   11) GTCGGGAGGCAGGAAGGAGT  1
1056                     (  273) GCTGAGATTGAATAAGATGG  1
262305                   (  137) GTTGTAGTAGAAGAGGTAGA  1
2720                     (  391) GTGGTAATGAATGACAGTAG  1
8134                     (  470) GTCGAGTCCAAAGAAGTCAG  1
264664                   (   38) GTGTGGGGAGCAGACGAATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 8658 bayes= 9.11374 E= 3.2e+000
  -189  -1045    195  -1045
 -1045   -168   -175    170
 -1045    -10    -75    128
  -189  -1045    171    -89
    10   -168     25     43
   -90  -1045    183  -1045
    43    -68     57    -89
  -189   -168     83     70
    43   -168     25     11
    43   -168    125  -1045
   180   -168  -1045  -1045
   127  -1045    -75    -30
 -1045  -1045    195   -189
   180  -1045   -175  -1045
   143    -10   -175  -1045
   -31  -1045    171  -1045
   -90  -1045    106     43
    69   -168  -1045     92
   -31     32     25    -30
  -189  -1045    171    -89
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 14 E= 3.2e+000
 0.071429  0.000000  0.928571  0.000000
 0.000000  0.071429  0.071429  0.857143
 0.000000  0.214286  0.142857  0.642857
 0.071429  0.000000  0.785714  0.142857
 0.285714  0.071429  0.285714  0.357143
 0.142857  0.000000  0.857143  0.000000
 0.357143  0.142857  0.357143  0.142857
 0.071429  0.071429  0.428571  0.428571
 0.357143  0.071429  0.285714  0.285714
 0.357143  0.071429  0.571429  0.000000
 0.928571  0.071429  0.000000  0.000000
 0.642857  0.000000  0.142857  0.214286
 0.000000  0.000000  0.928571  0.071429
 0.928571  0.000000  0.071429  0.000000
 0.714286  0.214286  0.071429  0.000000
 0.214286  0.000000  0.785714  0.000000
 0.142857  0.000000  0.500000  0.357143
 0.428571  0.071429  0.000000  0.500000
 0.214286  0.285714  0.285714  0.214286
 0.071429  0.000000  0.785714  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GT[TC]G[TAG]G[AG][GT][AGT][GA]A[AT]GA[AC][GA][GT][TA][CGAT]G
--------------------------------------------------------------------------------




Time 10.53 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1056                             2.17e-02  272_[+3(2.44e-06)]_208
11383                            9.98e-01  500
12066                            1.97e-05  1_[+3(1.29e-09)]_479
14899                            2.33e-03  266_[+3(5.94e-07)]_214
21972                            5.88e-12  10_[+3(2.09e-06)]_100_\
    [+2(4.24e-08)]_248_[+1(1.09e-09)]_80
22474                            1.08e-12  69_[+2(6.88e-09)]_76_[+1(2.99e-09)]_\
    52_[+2(3.22e-05)]_93_[+3(7.76e-07)]_127
22683                            2.62e-13  115_[+2(5.71e-05)]_131_\
    [+3(5.58e-08)]_21_[+2(1.27e-09)]_80_[+1(5.02e-08)]_70
23212                            8.18e-01  500
2435                             8.17e-06  104_[+3(3.43e-08)]_87_\
    [+3(4.00e-05)]_110_[+3(7.71e-05)]_106_[+1(2.49e-05)]_12
260914                           2.79e-04  364_[+1(3.37e-05)]_21_\
    [+1(1.71e-08)]_73
262305                           5.34e-07  136_[+3(3.06e-06)]_103_\
    [+1(8.61e-09)]_220
263154                           1.41e-07  89_[+2(5.23e-08)]_44_[+3(6.27e-08)]_\
    326
264664                           2.94e-06  37_[+3(1.04e-05)]_61_[+2(2.14e-08)]_\
    90_[+2(9.75e-05)]_250
268435                           1.03e-10  208_[+3(1.01e-06)]_210_\
    [+1(1.02e-11)]_41
2720                             2.17e-02  390_[+3(6.20e-06)]_90
3330                             1.10e-03  250_[+2(1.76e-07)]_229
4863                             1.90e-12  36_[+2(9.93e-10)]_222_\
    [+3(4.52e-07)]_78_[+1(6.48e-08)]_102
8134                             4.74e-11  157_[+2(5.84e-06)]_57_\
    [+2(1.12e-09)]_58_[+1(9.25e-08)]_134_[+3(8.62e-06)]_11
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************