********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/451/451.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
45165                    1.0000    500  36990                    1.0000    500
35092                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/451/451.seqs.fa -oc motifs/451 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        3    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            1500    N=               3
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.263 C 0.227 G 0.223 T 0.288
Background letter frequencies (from dataset with add-one prior applied):
A 0.263 C 0.227 G 0.223 T 0.288
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 14 sites = 3 llr = 46 E-value = 1.5e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3:3aaa::3:37:3
pos.-specific     C  7a::::aa::7377
probability       G  ::7:::::7:::3:
matrix            T  :::::::::a::::

         bits    2.2  *    **
                 2.0  * *****
                 1.7  * ***** *
                 1.5  * ***** *
Relative         1.3  * ***** *  *
Entropy          1.1 **************
(22.1 bits)      0.9 **************
                 0.7 **************
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           CCGAAACCGTCACC
consensus            A A     A ACGA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
35092                       143  6.92e-08 GGGTATGACA CCGAAACCATAACC AATTTATCGG
36990                       175  6.92e-08 TATGAGTACA ACAAAACCGTCACC GTAGGTAGTT
45165                       478  7.84e-08 TGGGTCTCTG CCGAAACCGTCCGA AAAACCCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35092                             6.9e-08  142_[+1]_344
36990                             6.9e-08  174_[+1]_312
45165                             7.8e-08  477_[+1]_9
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=14 seqs=3
35092                    (  143) CCGAAACCATAACC  1
36990                    (  175) ACAAAACCGTCACC  1
45165                    (  478) CCGAAACCGTCCGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 1461 bayes= 8.92481 E= 1.5e+003
    34    155   -823   -823
  -823    214   -823   -823
    34   -823    158   -823
   192   -823   -823   -823
   192   -823   -823   -823
   192   -823   -823   -823
  -823    214   -823   -823
  -823    214   -823   -823
    34   -823    158   -823
  -823   -823   -823    179
    34    155   -823   -823
   134     56   -823   -823
  -823    155     58   -823
    34    155   -823   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 3 E= 1.5e+003
  0.333333  0.666667  0.000000  0.000000
  0.000000  1.000000  0.000000  0.000000
  0.333333  0.000000  0.666667  0.000000
  1.000000  0.000000  0.000000  0.000000
  1.000000  0.000000  0.000000  0.000000
  1.000000  0.000000  0.000000  0.000000
  0.000000  1.000000  0.000000  0.000000
  0.000000  1.000000  0.000000  0.000000
  0.333333  0.000000  0.666667  0.000000
  0.000000  0.000000  0.000000  1.000000
  0.333333  0.666667  0.000000  0.000000
  0.666667  0.333333  0.000000  0.000000
  0.000000  0.666667  0.333333  0.000000
  0.333333  0.666667  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[CA]C[GA]AAACC[GA]T[CA][AC][CG][CA]
--------------------------------------------------------------------------------

Time 0.11 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 16 sites = 3 llr = 50 E-value = 2.7e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a::::333:7::a::a
pos.-specific     C  :::3a3::3:3:::::
probability       G  ::a::3:37:7:::a:
matrix            T  :a:7::73:3:a:a::

         bits    2.2   * *         *
                 2.0 * * *       * **
                 1.7 *** *      *****
                 1.5 *** *      *****
Relative         1.3 *** *   * ******
Entropy          1.1 *****   * ******
(23.9 bits)      0.9 ***** * ********
                 0.7 ***** * ********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           ATGTCATAGAGTATGA
consensus               C CAGCTC
sequence                  G T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
36990                       154  9.87e-09 GGTCAAAACG ATGTCGTACAGTATGA GTACAACAAA
35092                       228  2.26e-08 TAGAAAAGTT ATGTCATTGTGTATGA CTCCCTTTTA
45165                       430  4.20e-08 ATCACCCGAA ATGCCCAGGACTATGA AACACTTCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36990                             9.9e-09  153_[+2]_331
35092                             2.3e-08  227_[+2]_257
45165                             4.2e-08  429_[+2]_55
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=3
36990                    (  154) ATGTCGTACAGTATGA  1
35092                    (  228) ATGTCATTGTGTATGA  1
45165                    (  430) ATGCCCAGGACTATGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 1455 bayes= 8.91886 E= 2.7e+003
   192   -823   -823   -823
  -823   -823   -823    179
  -823   -823    216   -823
  -823     56   -823    121
  -823    214   -823   -823
    34     56     58   -823
    34   -823   -823    121
    34   -823     58     21
  -823     56    158   -823
   134   -823   -823     21
  -823     56    158   -823
  -823   -823   -823    179
   192   -823   -823   -823
  -823   -823   -823    179
  -823   -823    216   -823
   192   -823   -823   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 3 E= 2.7e+003
  1.000000  0.000000  0.000000  0.000000
  0.000000  0.000000  0.000000  1.000000
  0.000000  0.000000  1.000000  0.000000
  0.000000  0.333333  0.000000  0.666667
  0.000000  1.000000  0.000000  0.000000
  0.333333  0.333333  0.333333  0.000000
  0.333333  0.000000  0.000000  0.666667
  0.333333  0.000000  0.333333  0.333333
  0.000000  0.333333  0.666667  0.000000
  0.666667  0.000000  0.000000  0.333333
  0.000000  0.333333  0.666667  0.000000
  0.000000  0.000000  0.000000  1.000000
  1.000000  0.000000  0.000000  0.000000
  0.000000  0.000000  0.000000  1.000000
  0.000000  0.000000  1.000000  0.000000
  1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
ATG[TC]C[ACG][TA][AGT][GC][AT][GC]TATGA
--------------------------------------------------------------------------------

Time 0.21 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 13 sites = 3 llr = 42 E-value = 2.3e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :a:a3::::a:3:
pos.-specific     C  3:::77:a7:3::
probability       G  7:7::3a:3:::a
matrix            T  ::3:::::::77:

         bits    2.2       **    *
                 2.0  * *  ** *  *
                 1.7  * *  ** *  *
                 1.5  * *  ** *  *
Relative         1.3 ** * *****  *
Entropy          1.1 *********** *
(20.2 bits)      0.9 *************
                 0.7 *************
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           GAGACCGCCATTG
consensus            C T AG  G CA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            -------------
36990                        38  1.11e-07 ACACACGTTC GAGACCGCCACAG ACAGGAGCTT
35092                        71  1.76e-07 TGTCTTACAA CAGACGGCCATTG TGTATGGTTA
45165                       378  6.88e-07 TGATTCTCGT GATAACGCGATTG ACCGTATTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36990                             1.1e-07  37_[+3]_450
35092                             1.8e-07  70_[+3]_417
45165                             6.9e-07  377_[+3]_110
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=13 seqs=3
36990                    (   38) GAGACCGCCACAG  1
35092                    (   71) CAGACGGCCATTG  1
45165                    (  378) GATAACGCGATTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 1464 bayes= 8.92778 E= 2.3e+003
  -823     56    158   -823
   192   -823   -823   -823
  -823   -823    158     21
   192   -823   -823   -823
    34    155   -823   -823
  -823    155     58   -823
  -823   -823    216   -823
  -823    214   -823   -823
  -823    155     58   -823
   192   -823   -823   -823
  -823     56   -823    121
    34   -823   -823    121
  -823   -823    216   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 3 E= 2.3e+003
  0.000000  0.333333  0.666667  0.000000
  1.000000  0.000000  0.000000  0.000000
  0.000000  0.000000  0.666667  0.333333
  1.000000  0.000000  0.000000  0.000000
  0.333333  0.666667  0.000000  0.000000
  0.000000  0.666667  0.333333  0.000000
  0.000000  0.000000  1.000000  0.000000
  0.000000  1.000000  0.000000  0.000000
  0.000000  0.666667  0.333333  0.000000
  1.000000  0.000000  0.000000  0.000000
  0.000000  0.333333  0.000000  0.666667
  0.333333  0.000000  0.000000  0.666667
  0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[GC]A[GT]A[CA][CG]GC[CG]A[TC][TA]G
--------------------------------------------------------------------------------

Time 0.32 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45165                            1.18e-10  377_[+3(6.88e-07)]_39_\
    [+2(4.20e-08)]_32_[+1(7.84e-08)]_9
36990                            4.87e-12  37_[+3(1.11e-07)]_103_\
    [+2(9.87e-09)]_5_[+1(6.92e-08)]_312
35092                            1.64e-11  70_[+3(1.76e-07)]_41_[+2(3.75e-06)]_\
    2_[+1(6.92e-08)]_71_[+2(2.26e-08)]_257
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************