********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/450/450.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31533                    1.0000    500  32459                    1.0000    500
43748                    1.0000    500  55035                    1.0000    500
44478                    1.0000    500  34423                    1.0000    500
45367                    1.0000    500  44539                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/450/450.seqs.fa -oc motifs/450 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.269 C 0.251 G 0.219 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.250 G 0.219 T 0.262
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   5  llr = 88  E-value = 1.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::2:2:::4::2:::442::
pos.-specific     C  :::22::::::62:224:a:
probability       G  22286a44:8:28:8:::::
matrix            T  886:::6662a::a:428:a

         bits    2.2      *
                 2.0      *    *  *    **
                 1.8      *    *  *    **
                 1.5    * *    * ***   **
Relative         1.3 ** * *   ** ***   **
Entropy          1.1 ** * *** ** ***  ***
(25.3 bits)      0.9 ** * ****** ***  ***
                 0.7 ***************  ***
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           TTTGGGTTTGTCGTGAATCT
consensus            GGACA GGAT AC CTCA
sequence               G C      G   CT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
43748                       208  4.48e-10 ACATGCTCAC TTTGGGGTTGTCGTGCAACT TTAGGCGCTC
44478                       239  8.45e-10 GTTCGAGTTT TTTGGGTTTGTACTGACTCT ACGTGAAACA
55035                       214  1.56e-09 CTTCCTGCAT TTAGGGTTAGTCGTCTATCT GTTGTTGCTC
32459                        12  1.86e-08 TATTCTGGAT TTGGAGGGATTCGTGATTCT CCGGCATCAC
45367                        12  3.68e-08 CGAGAGGTCG GGTCCGTGTGTGGTGTCTCT GGCCTTTTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43748                             4.5e-10  207_[+1]_273
44478                             8.4e-10  238_[+1]_242
55035                             1.6e-09  213_[+1]_267
32459                             1.9e-08  11_[+1]_469
45367                             3.7e-08  11_[+1]_469
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=5
43748                    (  208) TTTGGGGTTGTCGTGCAACT  1
44478                    (  239) TTTGGGTTTGTACTGACTCT  1
55035                    (  214) TTAGGGTTAGTCGTCTATCT  1
32459                    (   12) TTGGAGGGATTCGTGATTCT  1
45367                    (   12) GGTCCGTGTGTGGTGTCTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3848 bayes= 9.83793 E= 1.7e+002
  -897   -897    -13    161
  -897   -897    -13    161
   -42   -897    -13    119
  -897    -32    187   -897
   -42    -32    145   -897
  -897   -897    219   -897
  -897   -897     87    119
  -897   -897     87    119
    57   -897   -897    119
  -897   -897    187    -39
  -897   -897   -897    193
   -42    126    -13   -897
  -897    -32    187   -897
  -897   -897   -897    193
  -897    -32    187   -897
    57    -32   -897     61
    57     67   -897    -39
   -42   -897   -897    161
  -897    199   -897   -897
  -897   -897   -897    193
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 1.7e+002
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.200000  0.600000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.000000  0.400000  0.600000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.600000  0.200000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.800000  0.000000
 0.400000  0.200000  0.000000  0.400000
 0.400000  0.400000  0.000000  0.200000
 0.200000  0.000000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TG][TG][TAG][GC][GAC]G[TG][TG][TA][GT]T[CAG][GC]T[GC][ATC][ACT][TA]CT
--------------------------------------------------------------------------------

Time  0.52 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  14  sites =   4  llr = 61  E-value = 6.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  8:3:::::::3:::
pos.-specific     C  :::3:38::3:a:a
probability       G  3a88:83aa83:3:
matrix            T  ::::a:::::5:8:

         bits    2.2  *     **
                 2.0  *  *  **  * *
                 1.8  *  *  **  * *
                 1.5  *  *  **  * *
Relative         1.3  ********* * *
Entropy          1.1 ********** ***
(21.9 bits)      0.9 ********** ***
                 0.7 ********** ***
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           AGGGTGCGGGTCTC
consensus            G AC CG  CA G
sequence                       G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
55035                        40  3.20e-08 ATGAATGGAT AGGGTCCGGGGCTC CTGGGGGACT
34423                       112  4.64e-08 GGAACAGACA AGAGTGCGGGACTC AGAATATATG
44539                       148  5.15e-08 CACAAACCCT GGGCTGCGGGTCTC GAACAGAAGT
44478                       214  1.59e-07 CCTTGTAGAC AGGGTGGGGCTCGC TGTTCGAGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
55035                             3.2e-08  39_[+2]_447
34423                             4.6e-08  111_[+2]_375
44539                             5.2e-08  147_[+2]_339
44478                             1.6e-07  213_[+2]_273
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=4
55035                    (   40) AGGGTCCGGGGCTC  1
34423                    (  112) AGAGTGCGGGACTC  1
44539                    (  148) GGGCTGCGGGTCTC  1
44478                    (  214) AGGGTGGGGCTCGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 3896 bayes= 9.9263 E= 6.9e+001
   148   -865     19   -865
  -865   -865    219   -865
   -10   -865    177   -865
  -865      0    177   -865
  -865   -865   -865    193
  -865      0    177   -865
  -865    158     19   -865
  -865   -865    219   -865
  -865   -865    219   -865
  -865      0    177   -865
   -10   -865     19     93
  -865    199   -865   -865
  -865   -865     19    152
  -865    199   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 4 E= 6.9e+001
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.000000  0.250000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AG]G[GA][GC]T[GC][CG]GG[GC][TAG]C[TG]C
--------------------------------------------------------------------------------

Time  1.09 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   8  llr = 97  E-value = 4.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :34:1491981:3::4
pos.-specific     C  93151114:::94:9:
probability       G  :::58::513::3a16
matrix            T  155::5::::911:::

         bits    2.2              *
                 2.0              *
                 1.8              *
                 1.5 *          * **
Relative         1.3 *     * * ** **
Entropy          1.1 *  ** * **** ***
(17.4 bits)      0.9 *  ** * **** ***
                 0.7 *  ** ****** ***
                 0.4 ************ ***
                 0.2 ************ ***
                 0.0 ----------------

Multilevel           CTTCGTAGAATCCGCG
consensus             AAG A C G  A  A
sequence              C          G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
32459                       227  6.36e-08 CTGATTGCGT CTTCGTAGGATCCGCG TTTCATCTGG
44539                       218  3.38e-07 GGTTCCATTC CTCGGCAGAATCAGCG AATCAGAACT
34423                       390  3.79e-07 CCGAGATAGG CCACGACGAATCCGCG ACCCGGTACG
44478                       445  4.66e-07 GAATCCAACT CATCGTACAAACCGCG ATCCAATAAC
31533                       114  5.18e-07 TACCGCAAAT CAAGGTACAGTCAGCA AAGGTAAATC
45367                       473  1.48e-06 AAAGAAAGTC CTTGGAAGAATTGGGG AGTGGACACA
43748                        90  5.69e-06 TTTTCTCTAG CTAGAAAAAATCTGCA GCGTAACATT
55035                       414  1.16e-05 GCAACAACTG TCTCCTACAGTCGGCA GTCGTCTATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32459                             6.4e-08  226_[+3]_258
44539                             3.4e-07  217_[+3]_267
34423                             3.8e-07  389_[+3]_95
44478                             4.7e-07  444_[+3]_40
31533                             5.2e-07  113_[+3]_371
45367                             1.5e-06  472_[+3]_12
43748                             5.7e-06  89_[+3]_395
55035                             1.2e-05  413_[+3]_71
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
32459                    (  227) CTTCGTAGGATCCGCG  1
44539                    (  218) CTCGGCAGAATCAGCG  1
34423                    (  390) CCACGACGAATCCGCG  1
44478                    (  445) CATCGTACAAACCGCG  1
31533                    (  114) CAAGGTACAGTCAGCA  1
45367                    (  473) CTTGGAAGAATTGGGG  1
43748                    (   90) CTAGAAAAAATCTGCA  1
55035                    (  414) TCTCCTACAGTCGGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 3880 bayes= 9.65702 E= 4.9e+002
  -965    180   -965   -107
   -10      0   -965     93
    48   -100   -965     93
  -965    100    119   -965
  -110   -100    177   -965
    48   -100   -965     93
   170   -100   -965   -965
  -110     58    119   -965
   170   -965    -81   -965
   148   -965     19   -965
  -110   -965   -965    174
  -965    180   -965   -107
   -10     58     19   -107
  -965   -965    219   -965
  -965    180    -81   -965
    48   -965    151   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 4.9e+002
 0.000000  0.875000  0.000000  0.125000
 0.250000  0.250000  0.000000  0.500000
 0.375000  0.125000  0.000000  0.500000
 0.000000  0.500000  0.500000  0.000000
 0.125000  0.125000  0.750000  0.000000
 0.375000  0.125000  0.000000  0.500000
 0.875000  0.125000  0.000000  0.000000
 0.125000  0.375000  0.500000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.875000  0.000000  0.125000
 0.250000  0.375000  0.250000  0.125000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.875000  0.125000  0.000000
 0.375000  0.000000  0.625000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[TAC][TA][CG]G[TA]A[GC]A[AG]TC[CAG]GC[GA]
--------------------------------------------------------------------------------

Time  1.61 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31533                            1.37e-03  113_[+3(5.18e-07)]_371
32459                            6.26e-09  11_[+1(1.86e-08)]_195_\
    [+3(6.36e-08)]_258
43748                            1.28e-07  89_[+3(5.69e-06)]_102_\
    [+1(4.48e-10)]_273
55035                            3.25e-11  39_[+2(3.20e-08)]_160_\
    [+1(1.56e-09)]_180_[+3(1.16e-05)]_71
44478                            4.02e-12  213_[+2(1.59e-07)]_11_\
    [+1(8.45e-10)]_186_[+3(4.66e-07)]_40
34423                            5.82e-07  111_[+2(4.64e-08)]_92_\
    [+3(6.23e-05)]_156_[+3(3.79e-07)]_95
45367                            8.98e-07  11_[+1(3.68e-08)]_441_\
    [+3(1.48e-06)]_12
44539                            6.93e-07  147_[+2(5.15e-08)]_56_\
    [+3(3.38e-07)]_267
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************