********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/459/459.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
15297                    1.0000    500  23709                    1.0000    500
7666                     1.0000    500  45551                    1.0000    500
36449                    1.0000    500  47008                    1.0000    500
48077                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/459/459.seqs.fa -oc motifs/459 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.245 C 0.262 G 0.229 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.245 C 0.262 G 0.229 T 0.263
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  21  sites =   4  llr = 80  E-value = 1.9e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  88::a3:8:3:53:a:5:3:8
pos.-specific     C  :3a::5::3:3:8::538:::
probability       G  3::a:3a338:::a::338a:
matrix            T  ::::::::5:85:::5::::3

         bits    2.1    ** *      **    *
                 1.9   *** *      **    *
                 1.7   *** *      **    *
                 1.5   *** *      **    *
Relative         1.3 ***** ** *   **  ****
Entropy          1.1 ***** ** ******  ****
(28.8 bits)      0.8 ***** ** ******* ****
                 0.6 ***** ** ******* ****
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AACGACGATGTACGACACGGA
consensus            GC   A GCACTA  TCGA T
sequence                  G  G       G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             -----  --------            ---------------------
45551                       285  2.69e-10 TGTGAGCAAG AACGACGACATACGACACAGA CACGCATGCT
7666                         26  4.35e-10 GCGTTGGACC GACGAAGATGTTCGATGGGGA AACGGTTCGA
48077                       357  6.11e-10 CCTGTAAGGT AACGACGGGGTTCGATCCGGT TTCGACCTCT
23709                       114  6.75e-10 CACCCTTCCG ACCGAGGATGCAAGACACGGA GTCACGTCGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45551                             2.7e-10  284_[+1]_195
7666                              4.4e-10  25_[+1]_454
48077                             6.1e-10  356_[+1]_123
23709                             6.8e-10  113_[+1]_366
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=4
45551                    (  285) AACGACGACATACGACACAGA  1
7666                     (   26) GACGAAGATGTTCGATGGGGA  1
48077                    (  357) AACGACGGGGTTCGATCCGGT  1
23709                    (  114) ACCGAGGATGCAAGACACGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 9.71253 E= 1.9e+002
   161   -865     12   -865
   161     -7   -865   -865
  -865    193   -865   -865
  -865   -865    212   -865
   202   -865   -865   -865
     3     93     12   -865
  -865   -865    212   -865
   161   -865     12   -865
  -865     -7     12     92
     3   -865    171   -865
  -865     -7   -865    151
   102   -865   -865     92
     3    152   -865   -865
  -865   -865    212   -865
   202   -865   -865   -865
  -865     93   -865     92
   102     -7     12   -865
  -865    152     12   -865
     3   -865    171   -865
  -865   -865    212   -865
   161   -865   -865     -8
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 1.9e+002
 0.750000  0.000000  0.250000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.250000  0.250000  0.500000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.500000  0.000000  0.000000  0.500000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.500000  0.250000  0.250000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[AG][AC]CGA[CAG]G[AG][TCG][GA][TC][AT][CA]GA[CT][ACG][CG][GA]G[AT]
--------------------------------------------------------------------------------

Time  0.54 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  13  sites =   4  llr = 60  E-value = 2.5e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :5:::8:a:::aa
pos.-specific     C  ::8:a:a::5a::
probability       G  a3::::::a::::
matrix            T  :33a:3:::5:::

         bits    2.1 *      **  **
                 1.9 *  ** *** ***
                 1.7 *  ** *** ***
                 1.5 *  ** *** ***
Relative         1.3 *  ****** ***
Entropy          1.1 * ******* ***
(21.8 bits)      0.8 * ***********
                 0.6 * ***********
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           GACTCACAGCCAA
consensus             GT  T   T
sequence              T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             -----  --------            -------------
36449                       428  3.01e-08 GCGTGTTTTC GACTCACAGTCAA CCGGCAGAAA
7666                        465  5.83e-08 ACACCACAAC GGCTCACAGCCAA TTCCACACGC
45551                       233  9.06e-08 CCGTTTGTTT GTCTCACAGTCAA TACTTCGAAG
47008                       455  3.11e-07 GTCGCCTTTT GATTCTCAGCCAA GTAATCGCAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36449                               3e-08  427_[+2]_60
7666                              5.8e-08  464_[+2]_23
45551                             9.1e-08  232_[+2]_255
47008                             3.1e-07  454_[+2]_33
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=13 seqs=4
36449                    (  428) GACTCACAGTCAA  1
7666                     (  465) GGCTCACAGCCAA  1
45551                    (  233) GTCTCACAGTCAA  1
47008                    (  455) GATTCTCAGCCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 3416 bayes= 9.7364 E= 2.5e+002
  -865   -865    212   -865
   102   -865     12     -8
  -865    152   -865     -8
  -865   -865   -865    192
  -865    193   -865   -865
   161   -865   -865     -8
  -865    193   -865   -865
   202   -865   -865   -865
  -865   -865    212   -865
  -865     93   -865     92
  -865    193   -865   -865
   202   -865   -865   -865
   202   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 4 E= 2.5e+002
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.250000  0.250000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
G[AGT][CT]TC[AT]CAG[CT]CAA
--------------------------------------------------------------------------------

Time  1.05 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  12  sites =   7  llr = 79  E-value = 6.3e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1a:::3:::a::
pos.-specific     C  6:6::::9::6:
probability       G  1:4494:1a:3:
matrix            T  1::613a:::1a

         bits    2.1  *      **
                 1.9  *    * ** *
                 1.7  *    * ** *
                 1.5  *  * * ** *
Relative         1.3  *  * **** *
Entropy          1.1  **** **** *
(16.4 bits)      0.8  **** **** *
                 0.6  **** ******
                 0.4  ***********
                 0.2 ************
                 0.0 ------------

Multilevel           CACTGGTCGACT
consensus              GG A    G
sequence                  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             -----  --------            ------------
47008                       314  6.47e-07 TTTTCCTTGG CAGTGTTCGACT GGATTTGCTT
36449                       178  1.11e-06 ATTCTCTGTT CAGTGATCGAGT TGTGCTGACC
45551                       111  1.27e-06 TAGCTAGCTA GACTGGTCGACT CATCCGTGCT
15297                       413  1.99e-06 TATTGAAATG CACGGGTCGATT GCCAGAACCA
7666                         47  3.86e-06 TCGATGGGGA AACGGTTCGACT CGGAATCGGT
48077                        17  5.16e-06 GCCTCTGCCC CAGTGATGGACT TGGGATACTC
23709                       387  1.74e-05 TTCGTATCGA TACGTGTCGAGT ACCTTCGCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47008                             6.5e-07  313_[+3]_175
36449                             1.1e-06  177_[+3]_311
45551                             1.3e-06  110_[+3]_378
15297                               2e-06  412_[+3]_76
7666                              3.9e-06  46_[+3]_442
48077                             5.2e-06  16_[+3]_472
23709                             1.7e-05  386_[+3]_102
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=7
47008                    (  314) CAGTGTTCGACT  1
36449                    (  178) CAGTGATCGAGT  1
45551                    (  111) GACTGGTCGACT  1
15297                    (  413) CACGGGTCGATT  1
7666                     (   47) AACGGTTCGACT  1
48077                    (   17) CAGTGATGGACT  1
23709                    (  387) TACGTGTCGAGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 9.53747 E= 6.3e+001
   -78    113    -68    -88
   203   -945   -945   -945
  -945    113     90   -945
  -945   -945     90    112
  -945   -945    190    -88
    22   -945     90     12
  -945   -945   -945    192
  -945    171    -68   -945
  -945   -945    212   -945
   203   -945   -945   -945
  -945    113     32    -88
  -945   -945   -945    192
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 6.3e+001
 0.142857  0.571429  0.142857  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.571429  0.428571  0.000000
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.000000  0.857143  0.142857
 0.285714  0.000000  0.428571  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.857143  0.142857  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.571429  0.285714  0.142857
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
CA[CG][TG]G[GAT]TCGA[CG]T
--------------------------------------------------------------------------------

Time  1.51 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
15297                            2.07e-02  412_[+3(1.99e-06)]_76
23709                            3.72e-07  113_[+1(6.75e-10)]_252_\
    [+3(1.74e-05)]_102
7666                             6.15e-12  25_[+1(4.35e-10)]_[+3(3.86e-06)]_\
    406_[+2(5.83e-08)]_23
45551                            2.08e-12  110_[+3(1.27e-06)]_110_\
    [+2(9.06e-08)]_39_[+1(2.69e-10)]_195
36449                            1.33e-06  177_[+3(1.11e-06)]_238_\
    [+2(3.01e-08)]_60
47008                            6.52e-06  313_[+3(6.47e-07)]_129_\
    [+2(3.11e-07)]_33
48077                            5.45e-08  16_[+3(5.16e-06)]_328_\
    [+1(6.11e-10)]_123
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
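
The MOTIF DIAGRAM strings in the combined block diagrams can be read as alternating spacer lengths and bracketed motif occurrences. The following is a minimal sketch, not part of MEME itself, of how such a string might be decoded into 1-based start positions. The names parse_diagram and MOTIF_WIDTHS are hypothetical helpers introduced here; the widths are copied from the three "MOTIF n ... width =" lines of this report, and the sketch assumes each bare number is an unmatched stretch of sequence.

```python
import re

# Motif widths copied from the "MOTIF n ... width = w" lines of this report.
MOTIF_WIDTHS = {1: 21, 2: 13, 3: 12}

# One bracketed occurrence, e.g. "[+3(1.99e-06)]" or "[+1]".
_SITE = re.compile(r"\[([+-])(\d+)(?:\(([^)]*)\))?\]")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Decode a motif diagram string (hypothetical helper, not a MEME API).

    Returns (sites, total_length), where sites is a list of
    (motif_number, strand, start, p_value) tuples with 1-based starts
    and p_value is None when the diagram carries no p-value.
    """
    # Drop the "\" line-continuation marks and any whitespace.
    text = re.sub(r"[\\\s]", "", diagram)
    sites = []
    offset = 0  # number of sequence positions consumed so far
    for token in text.split("_"):
        if not token:
            continue
        match = _SITE.fullmatch(token)
        if match:
            strand = match.group(1)
            motif = int(match.group(2))
            pval = float(match.group(3)) if match.group(3) else None
            sites.append((motif, strand, offset + 1, pval))
            offset += widths[motif]
        else:
            offset += int(token)  # assumed: a bare number is a spacer length
    return sites, offset


if __name__ == "__main__":
    # Diagram for sequence 7666, including its line continuation.
    sites, total = parse_diagram(
        "25_[+1(4.35e-10)]_[+3(3.86e-06)]_\\\n    406_[+2(5.83e-08)]_23"
    )
    for motif, strand, start, pval in sites:
        print(f"motif {motif} ({strand}) starts at {start}, p = {pval}")
    print("total length:", total)
```

For sequence 7666, the decoded starts can be compared with the Start columns of the per-motif site tables, and the total with the Length column of the training set.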