********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/173/173.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
24610                    1.0000    500  43240                    1.0000    500
15393                    1.0000    500  25308                    1.0000    500
32747                    1.0000    500  49261                    1.0000    500
10567                    1.0000    500  33304                    1.0000    500
18793                    1.0000    500  5902                     1.0000    500
12605                    1.0000    500  46317                    1.0000    500
44472                    1.0000    500  34355                    1.0000    500
48612                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/173/173.seqs.fa -oc motifs/173 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.272 C 0.257 G 0.220 T 0.250
Background letter frequencies (from dataset with add-one prior applied):
A 0.272 C 0.257 G 0.220 T 0.250
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 4 llr = 85 E-value = 3.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::3a3:::585::::::::
pos.-specific     C  ::3a:::a::3:33583::aa
probability       G  :a::8:::aa:335::3:8::
matrix            T  a:8:::8:::3::3535a3::

         bits    2.2  *      **
                 2.0 ** * * ***       * **
                 1.7 ** * * ***       * **
                 1.5 ** * * ***       * **
Relative         1.3 ** *** ***       ****
Entropy          1.1 ********** *   * ****
(30.5 bits)      0.9 ********** *  ** ****
                 0.7 ********** * *** ****
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TGTCGATCGGAAAGCCTTGCC
consensus              C A A   CGCCTTC T
sequence                       T GT  G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
49261                       218  4.72e-11 TCTCTCGGGT TGTCAATCGGAAACCCTTGCC ATTGTTGAAA
12605                       338  1.53e-10 GAAGACCCTT TGTCGATCGGTACGTCGTTCC TTCACGTGCC
25308                       252  2.07e-10 CTCGAACAAG TGCCGATCGGCGAGTCCTGCC CTCGTCATCA
46317                       110  2.80e-10 GCACCAATCA TGTCGAACGGAAGTCTTTGCC TGCAAGGAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49261                             4.7e-11  217_[+1]_262
12605                             1.5e-10  337_[+1]_142
25308                             2.1e-10  251_[+1]_228
46317                             2.8e-10  109_[+1]_370
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=4
49261                    (  218) TGTCAATCGGAAACCCTTGCC  1
12605                    (  338) TGTCGATCGGTACGTCGTTCC  1
25308                    (  252) TGCCGATCGGCGAGTCCTGCC  1
46317                    (  110) TGTCGAACGGAAGTCTTTGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 10.813 E= 3.3e+002
  -865   -865   -865    199
  -865   -865    218   -865
  -865     -4   -865    158
  -865    196   -865   -865
   -12   -865    177   -865
   187   -865   -865   -865
   -12   -865   -865    158
  -865    196   -865   -865
  -865   -865    218   -865
  -865   -865    218   -865
    87     -4   -865      0
   146   -865     18   -865
    87     -4     18   -865
  -865     -4    118      0
  -865     96   -865    100
  -865    154   -865      0
  -865     -4     18    100
  -865   -865   -865    199
  -865   -865    177      0
  -865    196   -865   -865
  -865    196   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 3.3e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.250000  0.000000  0.250000
 0.750000  0.000000  0.250000  0.000000
 0.500000  0.250000  0.250000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.250000  0.250000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TG[TC]C[GA]A[TA]CGG[ACT][AG][ACG][GCT][CT][CT][TCG]T[GT]CC
--------------------------------------------------------------------------------




Time  1.89 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 6 llr = 102 E-value = 6.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :7:273:a237732::2::::
pos.-specific     C  ::::23::22:2::2::8:::
probability       G  a:58:2a:73222:7a:2278
matrix            T  :35:22:::22:582:8:832

         bits    2.2 *     *        *
                 2.0 *     **       *
                 1.7 *     **       *
                 1.5 *  *  **       *    *
Relative         1.3 *  *  **     * ******
Entropy          1.1 ****  **     * ******
(24.5 bits)      0.9 ****  ***    ********
                 0.7 ***** *** ** ********
                 0.4 ***** *** ***********
                 0.2 ***** *** ***********
                 0.0 ---------------------

Multilevel           GAGGAAGAGAAATTGGTCTGG
consensus             TT  C   G  A      T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
25308                       203  3.95e-10 AAAAATAAAG GAGGACGAGAAATTTGTGTGG CCGCCAAACG
46317                       467  2.55e-09 TCGCATCTCC GATACGGAGTAATTGGTCTGG TATCGCGACC
5902                        195  4.92e-09 TCCGGTTAAC GATGACGACAAGATGGTCGGG ACCTGATGAG
12605                       420  1.05e-08 TCCAGATATC GTGGAAGAAGAAGTGGACTTG TCGCAATTCA
34355                       224  2.08e-08 CCAAACAAAT GTGGAAGAGCGAAACGTCTGG GATTCGACCG
48612                       318  3.57e-08 TTGACGTCAC GATGTTGAGGTCTTGGTCTTT TAAATGCGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25308                             3.9e-10  202_[+2]_277
46317                             2.6e-09  466_[+2]_13
5902                              4.9e-09  194_[+2]_285
12605                             1.1e-08  419_[+2]_60
34355                             2.1e-08  223_[+2]_256
48612                             3.6e-08  317_[+2]_162
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
25308                    (  203) GAGGACGAGAAATTTGTGTGG  1
46317                    (  467) GATACGGAGTAATTGGTCTGG  1
5902                     (  195) GATGACGACAAGATGGTCGGG  1
12605                    (  420) GTGGAAGAAGAAGTGGACTTG  1
34355                    (  224) GTGGAAGAGCGAAACGTCTGG  1
48612                    (  318) GATGTTGAGGTCTTGGTCTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 10.6754 E= 6.3e+002
  -923   -923    218   -923
   129   -923   -923     41
  -923   -923    118    100
   -71   -923    192   -923
   129    -62   -923    -59
    29     37    -40    -59
  -923   -923    218   -923
   187   -923   -923   -923
   -71    -62    160   -923
    29    -62     60    -59
   129   -923    -40    -59
   129    -62    -40   -923
    29   -923    -40    100
   -71   -923   -923    173
  -923    -62    160    -59
  -923   -923    218   -923
   -71   -923   -923    173
  -923    170    -40   -923
  -923   -923    -40    173
  -923   -923    160     41
  -923   -923    192    -59
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 6.3e+002
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.000000  0.500000  0.500000
 0.166667  0.000000  0.833333  0.000000
 0.666667  0.166667  0.000000  0.166667
 0.333333  0.333333  0.166667  0.166667
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.166667  0.666667  0.000000
 0.333333  0.166667  0.333333  0.166667
 0.666667  0.000000  0.166667  0.166667
 0.666667  0.166667  0.166667  0.000000
 0.333333  0.000000  0.166667  0.500000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.166667  0.666667  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  0.833333  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[AT][GT]GA[AC]GAG[AG]AA[TA]TGGTCT[GT]G
--------------------------------------------------------------------------------




Time  3.77 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 4 llr = 58 E-value = 6.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :3::3::::a:a
pos.-specific     C  :5:8::8a::::
probability       G  a3a38a3:a:a:
matrix            T  ::::::::::::

         bits    2.2 * *  *  * *
                 2.0 * *  * *****
                 1.7 * *  * *****
                 1.5 * *  * *****
Relative         1.3 * **********
Entropy          1.1 * **********
(20.8 bits)      0.9 * **********
                 0.7 * **********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GCGCGGCCGAGA
consensus             A GA G
sequence              G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
10567                       451  6.79e-08 CCGGTGGCAC GGGCGGCCGAGA CGGACACCAA
5902                          1  1.69e-07          . GCGGGGCCGAGA ACAGCACGCC
46317                       148  2.15e-07 AAGCCGAAGA GCGCAGCCGAGA ATTGCTCCAA
25308                       364  3.34e-07 CGTACCAAGC GAGCGGGCGAGA TAGTACCTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10567                             6.8e-08  450_[+3]_38
5902                              1.7e-07  [+3]_488
46317                             2.1e-07  147_[+3]_341
25308                             3.3e-07  363_[+3]_125
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=4
10567                    (  451) GGGCGGCCGAGA  1
5902                     (    1) GCGGGGCCGAGA  1
46317                    (  148) GCGCAGCCGAGA  1
25308                    (  364) GAGCGGGCGAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 10.8398 E= 6.3e+002
  -865   -865    218   -865
   -12     96     18   -865
  -865   -865    218   -865
  -865    154     18   -865
   -12   -865    177   -865
  -865   -865    218   -865
  -865    154     18   -865
  -865    196   -865   -865
  -865   -865    218   -865
   187   -865   -865   -865
  -865   -865    218   -865
   187   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 4 E= 6.3e+002
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[CAG]G[CG][GA]G[CG]CGAGA
--------------------------------------------------------------------------------




Time  5.62 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
24610                            6.49e-01  500
43240                            9.05e-01  500
15393                            9.54e-01  500
25308                            2.63e-15  202_[+2(3.95e-10)]_28_\
    [+1(2.07e-10)]_91_[+3(3.34e-07)]_125
32747                            2.65e-01  500
49261                            3.83e-07  217_[+1(4.72e-11)]_262
10567                            3.26e-04  450_[+3(6.79e-08)]_38
33304                            2.88e-01  500
18793                            9.83e-01  500
5902                             4.43e-08  [+3(1.69e-07)]_182_[+2(4.92e-09)]_\
    285
12605                            5.53e-11  337_[+1(1.53e-10)]_61_\
    [+2(1.05e-08)]_60
46317                            1.36e-14  109_[+1(2.80e-10)]_17_\
    [+3(2.15e-07)]_307_[+2(2.55e-09)]_13
44472                            9.56e-01  500
34355                            4.44e-04  223_[+2(2.08e-08)]_256
48612                            3.16e-04  317_[+2(3.57e-08)]_162
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************