********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/272/272.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
50160                    1.0000    500  34280                    1.0000    500
45516                    1.0000    500  12051                    1.0000    500
42883                    1.0000    500  32198                    1.0000    500
39497                    1.0000    500  45819                    1.0000    500
46096                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/272/272.seqs.fa -oc motifs/272 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.253 G 0.216 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.253 G 0.216 T 0.260
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 9 llr = 98 E-value = 8.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  2221::8:::::
pos.-specific     C  :::1:8:2:a:1
probability       G  782:12:::::1
matrix            T  1:689:28a:a8

         bits    2.2
                 2.0         ***
                 1.8         ***
                 1.5     *   ***
Relative         1.3  *  **  ***
Entropy          1.1  *  *******
(15.7 bits)      0.9 ** *********
                 0.7 ** *********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GGTTTCATTCTT
consensus            AAA  GTC
sequence               G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
39497                       360  1.19e-07 CGGCCGTAAA GGGTTCATTCTT GATCGAGGAC
12051                        11  1.87e-07 CGGCGACGAC GGATTCATTCTT CCATCACATT
46096                       100  1.12e-06 CTCTTCGTCG GGTTTCATTCTC CATCGTTGCT
34280                       312  2.55e-06 GGAAGGGTAG GATTTCACTCTT GTACGTTACA
45516                       353  4.56e-06 TAGTTGGGTG GGTCTCTTTCTT TTGTTTTGGG
42883                       262  9.77e-06 AAGAAGATCG AGGATCATTCTT GCCACGTGCG
50160                       151  1.19e-05 ACAGGAAATC GAATGCATTCTT CTGGTTCGCG
45819                       304  1.82e-05 CAAAAGAAAA AGTTTGTCTCTT TAAGTACAAA
32198                       234  1.82e-05 TCCTTCCACG TGTTTGATTCTG CGGACTCCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39497                             1.2e-07  359_[+1]_129
12051                             1.9e-07  10_[+1]_478
46096                             1.1e-06  99_[+1]_389
34280                             2.6e-06  311_[+1]_177
45516                             4.6e-06  352_[+1]_136
42883                             9.8e-06  261_[+1]_227
50160                             1.2e-05  150_[+1]_338
45819                             1.8e-05  303_[+1]_185
32198                             1.8e-05  233_[+1]_255
--------------------------------------------------------------------------------
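
The block diagrams above encode each site as left flank, motif, right flank. A minimal sketch of how such a string can be decoded is given here; the function name `parse_block_diagram` and the `widths` mapping are illustrative assumptions, not part of MEME. It handles only the simple `N_[+M]_N` form used in the per-motif tables, not the combined form that embeds p-values in the brackets.

```python
import re

def parse_block_diagram(diagram, widths):
    """Decode a simple MEME block diagram such as '359_[+1]_129'.

    widths maps motif number -> motif width (taken from the report).
    Returns (sites, total_length); each site is (motif, strand, start)
    with a 1-based start position.
    """
    position = 0          # number of sequence letters consumed so far
    sites = []
    for token in diagram.split("_"):
        match = re.fullmatch(r"\[([+-])(\d+)\]", token)
        if match:
            motif = int(match.group(2))
            sites.append((motif, match.group(1), position + 1))
            position += widths[motif]
        else:
            position += int(token)   # spacer of unmatched letters
    return sites, position

sites, length = parse_block_diagram("359_[+1]_129", {1: 12})
print(sites, length)
```

For sequence 39497 this gives a start of 360 and a total of 359 + 12 + 129 = 500 letters, which agrees with the Start column and the sequence length listed in the training set.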

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
39497                    (  360) GGGTTCATTCTT  1
12051                    (   11) GGATTCATTCTT  1
46096                    (  100) GGTTTCATTCTC  1
34280                    (  312) GATTTCACTCTT  1
45516                    (  353) GGTCTCTTTCTT  1
42883                    (  262) AGGATCATTCTT  1
50160                    (  151) GAATGCATTCTT  1
45819                    (  304) AGTTTGTCTCTT  1
32198                    (  234) TGTTTGATTCTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 9.0653 E= 8.2e+000
   -29   -982    163   -123
   -29   -982    185   -982
   -29   -982      4    109
  -128   -118   -982    158
  -982   -982    -96    177
  -982    162      4   -982
   152   -982   -982    -23
  -982    -18   -982    158
  -982   -982   -982    194
  -982    198   -982   -982
  -982   -982   -982    194
  -982   -118    -96    158
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 8.2e+000
 0.222222  0.000000  0.666667  0.111111
 0.222222  0.000000  0.777778  0.000000
 0.222222  0.000000  0.222222  0.555556
 0.111111  0.111111  0.000000  0.777778
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.777778  0.222222  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.111111  0.111111  0.777778
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA][GA][TAG]TT[CG][AT][TC]TCTT
--------------------------------------------------------------------------------

Time 0.77 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 15 sites = 7 llr = 91 E-value = 2.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::3446::::::43
pos.-specific     C  :46::::::a:::1:
probability       G  :61::4::1:97947
matrix            T  a:37614a9:131::

         bits    2.2
                 2.0 *      * *
                 1.8 *      * *
                 1.5 *      * ** *
Relative         1.3 *      ****** *
Entropy          1.1 ** *   ****** *
(18.8 bits)      0.9 ** ** ******* *
                 0.7 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           TGCTTAATTCGGGAG
consensus             CTAAGT    T GA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------
34280                       136  9.44e-09 ATTTATTTGC TCCTTGTTTCGGGGG ACACTTGGTG
46096                        56  2.93e-07 CTGAGTATTG TGCTTATTTCTGGAG GTCGGCGCTG
12051                       175  2.93e-07 CGAACGAACT TGCTAATTGCGGGAG GAGGGAACGA
45819                       212  4.60e-07 TAAAATCATT TGTTTTATTCGTGGG TGTCTTACAT
42883                       129  5.54e-07 AACGAATACG TCGAAGATTCGGGAG GATTCTCGAT
32198                       172  1.61e-06 ATGATTCCAC TGCTTGATTCGGTCA ATGCAATTTT
45516                        34  2.38e-06 AGACCGAGCC TCTAAAATTCGTGGA ATGCATCTGC
--------------------------------------------------------------------------------
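
The Motif 1 regular expression reported above can be applied directly to a DNA string. The following is a small sketch using Python's standard `re` module; the helper name `find_sites` is an assumption for illustration, and the test string is the 39497 site with its two 10-letter flanks as printed in the sites table. Note that the expression lists only the more frequent letters at each position, so it is stricter than the probability matrix: the 46096 site ends in C, while the expression requires T there.

```python
import re

# Motif 1 regular expression, copied verbatim from the report.
MOTIF1 = re.compile(r"[GA][GA][TAG]TT[CG][AT][TC]TCTT")

def find_sites(sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in MOTIF1.finditer(sequence)]

# Site of sequence 39497 with the flanking letters shown in the report.
window = "CGGCCGTAAA" + "GGGTTCATTCTT" + "GATCGAGGAC"
print(find_sites(window))

# The 46096 site (GGTTTCATTCTC) is a reported site but fails the regex.
print(find_sites("GGTTTCATTCTC"))
```

A scan of this kind only covers the given strand and ignores the position-specific scores, so it is best read as a quick filter rather than a substitute for MAST.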

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34280                             9.4e-09  135_[+2]_350
46096                             2.9e-07  55_[+2]_430
12051                             2.9e-07  174_[+2]_311
45819                             4.6e-07  211_[+2]_274
42883                             5.5e-07  128_[+2]_357
32198                             1.6e-06  171_[+2]_314
45516                             2.4e-06  33_[+2]_452
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=7
34280                    (  136) TCCTTGTTTCGGGGG  1
46096                    (   56) TGCTTATTTCTGGAG  1
12051                    (  175) TGCTAATTGCGGGAG  1
45819                    (  212) TGTTTTATTCGTGGG  1
42883                    (  129) TCGAAGATTCGGGAG  1
32198                    (  172) TGCTTGATTCGGTCA  1
45516                    (   34) TCTAAAATTCGTGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4374 bayes= 9.12869 E= 2.4e+002
  -945   -945   -945    194
  -945     76    140   -945
  -945    118    -59     13
     8   -945   -945    145
    66   -945   -945    113
    66   -945     99    -86
   107   -945   -945     72
  -945   -945   -945    194
  -945   -945    -59    172
  -945    198   -945   -945
  -945   -945    199    -86
  -945   -945    173     13
  -945   -945    199    -86
    66    -82     99   -945
     8   -945    173   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 2.4e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.428571  0.571429  0.000000
 0.000000  0.571429  0.142857  0.285714
 0.285714  0.000000  0.000000  0.714286
 0.428571  0.000000  0.000000  0.571429
 0.428571  0.000000  0.428571  0.142857
 0.571429  0.000000  0.000000  0.428571
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.857143  0.142857
 0.000000  0.000000  0.714286  0.285714
 0.000000  0.000000  0.857143  0.142857
 0.428571  0.142857  0.428571  0.000000
 0.285714  0.000000  0.714286  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[GC][CT][TA][TA][AG][AT]TTCG[GT]G[AG][GA]
--------------------------------------------------------------------------------

Time 1.74 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 3 llr = 57 E-value = 3.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::3::::::3:::3:
pos.-specific     C  aa::::a:::733a:a
probability       G  ::a7:::aa::7::3:
matrix            T  ::::aa:::a::7:3:

         bits    2.2   *    **
                 2.0 *** ******   * *
                 1.8 *** ******   * *
                 1.5 *** ******   * *
Relative         1.3 *** ****** * * *
Entropy          1.1 ************** *
(27.3 bits)      0.9 ************** *
                 0.7 ************** *
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CCGGTTCGGTCGTCAC
consensus               A      ACC G
sequence                           T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ----------------
39497                       407  6.70e-10 AAAGGCGTTA CCGGTTCGGTAGTCGC CGATTCGTCG
12051                        80  2.30e-09 GCATTGTGGT CCGATTCGGTCGTCTC TCAATAACAA
50160                        18  3.86e-09 GTCAGTTTCA CCGGTTCGGTCCCCAC CCTATACAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39497                             6.7e-10  406_[+3]_78
12051                             2.3e-09  79_[+3]_405
50160                             3.9e-09  17_[+3]_467
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=3
39497                    (  407) CCGGTTCGGTAGTCGC  1
12051                    (   80) CCGATTCGGTCGTCTC  1
50160                    (   18) CCGGTTCGGTCCCCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4365 bayes= 10.1645 E= 3.6e+002
  -823    198   -823   -823
  -823    198   -823   -823
  -823   -823    221   -823
    30   -823    162   -823
  -823   -823   -823    194
  -823   -823   -823    194
  -823    198   -823   -823
  -823   -823    221   -823
  -823   -823    221   -823
  -823   -823   -823    194
    30    140   -823   -823
  -823     40    162   -823
  -823     40   -823    135
  -823    198   -823   -823
    30   -823     63     35
  -823    198   -823   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 3 E= 3.6e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.333333  0.333333
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CCG[GA]TTCGGT[CA][GC][TC]C[AGT]C
--------------------------------------------------------------------------------

Time 2.52 secs.

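
The nonzero entries of the log-odds matrices in this report appear to be close to 100 * log2(p / background), where p comes from the letter-probability matrix and the background is the letter frequency line in the command line summary. The sketch below checks that relationship for row 11 of Motif 3; the function name `log_odds_row` is hypothetical, and the handling of zero probabilities is an assumption: the report prints large negative values there (-823 for this motif) that depend on MEME's prior, which this sketch does not model.

```python
import math

# Background letter frequencies as printed in the report (3 decimals).
BACKGROUND = {"A": 0.271, "C": 0.253, "G": 0.216, "T": 0.260}

def log_odds_row(probabilities, background=BACKGROUND):
    """Approximate one log-odds row from one letter-probability row.

    Returns round(100 * log2(p / background)) per letter, or None where
    p is zero (the report's value there depends on MEME's prior).
    """
    scores = {}
    for letter, p in probabilities.items():
        if p > 0:
            scores[letter] = round(100 * math.log2(p / background[letter]))
        else:
            scores[letter] = None
    return scores

# Row 11 of the Motif 3 letter-probability matrix.
row = {"A": 0.333333, "C": 0.666667, "G": 0.0, "T": 0.0}
print(log_odds_row(row))
```

For this row the sketch yields 30 for A and 140 for C, matching the reported `30 140`. Some other entries differ by one unit (row 4 reports 162 for G where this formula gives 163), presumably because the printed background is rounded and the prior shifts the probabilities slightly.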
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50160                            5.78e-07  17_[+3(3.86e-09)]_117_\
    [+1(1.19e-05)]_338
34280                            9.27e-07  135_[+2(9.44e-09)]_161_\
    [+1(2.55e-06)]_177
45516                            1.66e-04  33_[+2(2.38e-06)]_304_\
    [+1(4.56e-06)]_136
12051                            7.86e-12  10_[+1(1.87e-07)]_57_[+3(2.30e-09)]_\
    79_[+2(2.93e-07)]_311
42883                            1.07e-04  128_[+2(5.54e-07)]_118_\
    [+1(9.77e-06)]_227
32198                            2.28e-04  171_[+2(1.61e-06)]_47_\
    [+1(1.82e-05)]_255
39497                            4.61e-09  359_[+1(1.19e-07)]_35_\
    [+3(6.70e-10)]_78
45819                            1.98e-04  211_[+2(4.60e-07)]_77_\
    [+1(1.82e-05)]_185
46096                            9.74e-06  36_[+1(4.59e-05)]_7_[+2(2.93e-07)]_\
    29_[+1(1.12e-06)]_389
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************