********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/445/445.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
36390                    1.0000    500  46823                    1.0000    500
48701                    1.0000    500  54954                    1.0000    500
49163                    1.0000    500  50012                    1.0000    500
44387                    1.0000    500  39465                    1.0000    500
49887                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/445/445.seqs.fa -oc motifs/445 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.282 C 0.238 G 0.195 T 0.285
Background letter frequencies (from dataset with add-one prior applied):
A 0.282 C 0.238 G 0.195 T 0.285
********************************************************************************

********************************************************************************
MOTIF  1 MEME    width =  16  sites =   6  llr = 89  E-value = 1.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :8::5:::a778::::
pos.-specific     C  2:37::a7:3::a3:8
probability       G  8:::33:3::32:::2
matrix            T  :27327:::::::7a:

         bits    2.4
                 2.1       *     *
                 1.9       * *   * *
                 1.6 *     * *   * *
Relative         1.4 *     * *   * **
Entropy          1.2 ** * **** *** **
(21.3 bits)      0.9 **** ***********
                 0.7 **** ***********
                 0.5 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GATCATCCAAAACTTC
consensus              CTGG G CG  C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
50012                        21  1.87e-08 GTTGTCACTT GATTGGCCAAAACTTC CCCCGACTAC
39465                       105  3.22e-08 GCGCATCATT GATCTTCCACAACTTC CATAATCTCA
49887                       377  3.67e-08 CGACGGTCTG GATCATCCAAAGCCTC AATCCTACAT
49163                       105  1.04e-07 AAATAGAACG GACCATCGAAAACTTG CTTCTAGTGA
48701                       359  1.81e-07 CCCGAAAGGA GACTGGCCACGACCTC GAACGCCAAT
44387                       105  4.61e-07 AGAATGTCTC CTTCATCGAAGACTTC ACAGCTTGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50012                             1.9e-08  20_[+1]_464
39465                             3.2e-08  104_[+1]_380
49887                             3.7e-08  376_[+1]_108
49163                               1e-07  104_[+1]_380
48701                             1.8e-07  358_[+1]_126
44387                             4.6e-07  104_[+1]_380
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=6
50012                    (   21) GATTGGCCAAAACTTC  1
39465                    (  105) GATCTTCCACAACTTC  1
49887                    (  377) GATCATCCAAAGCCTC  1
49163                    (  105) GACCATCGAAAACTTG  1
48701                    (  359) GACTGGCCACGACCTC  1
44387                    (  105) CTTCATCGAAGACTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4365 bayes= 9.0186 E= 1.1e+002
  -923    -51    209   -923
   156   -923   -923    -77
  -923     49   -923    122
  -923    149   -923     23
    83   -923     77    -77
  -923   -923     77    122
  -923    207   -923   -923
  -923    149     77   -923
   183   -923   -923   -923
   124     49   -923   -923
   124   -923     77   -923
   156   -923    -23   -923
  -923    207   -923   -923
  -923     49   -923    122
  -923   -923   -923    181
  -923    181    -23   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 1.1e+002
 0.000000  0.166667  0.833333  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.666667  0.000000  0.333333
 0.500000  0.000000  0.333333  0.166667
 0.000000  0.000000  0.333333  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.833333  0.166667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GA[TC][CT][AG][TG]C[CG]A[AC][AG]AC[TC]TC
--------------------------------------------------------------------------------

Time  0.85 secs.

********************************************************************************

********************************************************************************
MOTIF  2 MEME    width =  20  sites =   4  llr = 79  E-value = 1.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::aa::::5835:3a388a:
pos.-specific     C  a3::a:::3:55:3:5:::a
probability       G  ::::::a:33::a3:3::::
matrix            T  :8:::a:a::3::3::33::

         bits    2.4       *     *
                 2.1 *   * *     *      *
                 1.9 * ******    * *   **
                 1.6 * ******    * *   **
Relative         1.4 * ******    * *   **
Entropy          1.2 ******** *  * *   **
(28.6 bits)      0.9 ******** * ** * ****
                 0.7 ******** * ** * ****
                 0.5 ************* ******
                 0.2 ************* ******
                 0.0 --------------------

Multilevel           CTAACTGTAACAGAACAAAC
consensus             C      CGAC C ATT
sequence                     G T  G G
                                  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
36390                       288  1.67e-10 CTTCGATGGA CTAACTGTAACAGTAAAAAC TTCCCAGGCA
49163                       400  2.54e-10 TGTATAACAT CTAACTGTAACAGAACTAAC AGACGTTTTC
44387                       174  1.19e-09 GCAGTGTTGG CTAACTGTGAACGGAGATAC AAAACCAATT
49887                       441  1.94e-09 TAATTGGGCT CCAACTGTCGTCGCACAAAC TGATCGACCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36390                             1.7e-10  287_[+2]_193
49163                             2.5e-10  399_[+2]_81
44387                             1.2e-09  173_[+2]_307
49887                             1.9e-09  440_[+2]_40
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=4
36390                    (  288) CTAACTGTAACAGTAAAAAC  1
49163                    (  400) CTAACTGTAACAGAACTAAC  1
44387                    (  174) CTAACTGTGAACGGAGATAC  1
49887                    (  441) CCAACTGTCGTCGCACAAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4329 bayes= 10.0785 E= 1.8e+002
  -865    207   -865   -865
  -865      7   -865    139
   182   -865   -865   -865
   182   -865   -865   -865
  -865    207   -865   -865
  -865   -865   -865    181
  -865   -865    235   -865
  -865   -865   -865    181
    83      7     35   -865
   141   -865     35   -865
   -17    107   -865    -19
    83    107   -865   -865
  -865   -865    235   -865
   -17      7     35    -19
   182   -865   -865   -865
   -17    107     35   -865
   141   -865   -865    -19
   141   -865   -865    -19
   182   -865   -865   -865
  -865    207   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 1.8e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.500000  0.250000  0.250000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.250000  0.500000  0.000000  0.250000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.250000  0.250000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.500000  0.250000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.750000  0.000000  0.000000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[TC]AACTGT[ACG][AG][CAT][AC]G[ACGT]A[CAG][AT][AT]AC
--------------------------------------------------------------------------------

Time  1.59 secs.

********************************************************************************

********************************************************************************
MOTIF  3 MEME    width =  19  sites =   6  llr = 96  E-value = 1.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::2722:255:7:::::2:
pos.-specific     C  :77:33:::28::::a3::
probability       G  a3:::57352::::::28:
matrix            T  ::235:35:223aaa:5:a

         bits    2.4 *
                 2.1 *              *
                 1.9 *           ****  *
                 1.6 *           **** **
Relative         1.4 *         * **** **
Entropy          1.2 **    * * * **** **
(23.2 bits)      0.9 ** *  * * ****** **
                 0.7 **** ** * ****** **
                 0.5 ********* *********
                 0.2 *******************
                 0.0 -------------------

Multilevel           GCCATGGTAACATTTCTGT
consensus             G TCCTGG  T    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
49887                        88  7.74e-09 AAACGGAGAC GGCACGTTGCCATTTCCGT TCGAGGGTAC
39465                       403  1.76e-08 ACGGAACAAG GCTTTGGTATCATTTCTGT GCGTGCCTCC
44387                       474  1.76e-08 ATCCATTACG GCCATCGGAGCATTTCTAT CTGCTCTG
46823                       161  1.93e-08 GGGAGAGCGA GCAATGGAAACATTTCGGT GAAGTGGTTG
54954                       166  2.53e-08 ATTGCTTTCA GCCTACTTGACTTTTCTGT TCCTTAAAAA
36390                       384  6.37e-08 AACCGGCATT GGCACAGGGATTTTTCCGT GTAGACTCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49887                             7.7e-09  87_[+3]_394
39465                             1.8e-08  402_[+3]_79
44387                             1.8e-08  473_[+3]_8
46823                             1.9e-08  160_[+3]_321
54954                             2.5e-08  165_[+3]_316
36390                             6.4e-08  383_[+3]_98
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=6
49887                    (   88) GGCACGTTGCCATTTCCGT  1
39465                    (  403) GCTTTGGTATCATTTCTGT  1
44387                    (  474) GCCATCGGAGCATTTCTAT  1
46823                    (  161) GCAATGGAAACATTTCGGT  1
54954                    (  166) GCCTACTTGACTTTTCTGT  1
36390                    (  384) GGCACAGGGATTTTTCCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 4338 bayes= 9.94385 E= 1.3e+002
  -923   -923    235   -923
  -923    149     77   -923
   -76    149   -923    -77
   124   -923   -923     23
   -76     49   -923     81
   -76     49    135   -923
  -923   -923    177     23
   -76   -923     77     81
    83   -923    135   -923
    83    -51    -23    -77
  -923    181   -923    -77
   124   -923   -923     23
  -923   -923   -923    181
  -923   -923   -923    181
  -923   -923   -923    181
  -923    207   -923   -923
  -923     49    -23     81
   -76   -923    209   -923
  -923   -923   -923    181
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 6 E= 1.3e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.166667  0.666667  0.000000  0.166667
 0.666667  0.000000  0.000000  0.333333
 0.166667  0.333333  0.000000  0.500000
 0.166667  0.333333  0.500000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.166667  0.000000  0.333333  0.500000
 0.500000  0.000000  0.500000  0.000000
 0.500000  0.166667  0.166667  0.166667
 0.000000  0.833333  0.000000  0.166667
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.333333  0.166667  0.500000
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[CG]C[AT][TC][GC][GT][TG][AG]AC[AT]TTTC[TC]GT
--------------------------------------------------------------------------------

Time  2.37 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36390                            6.31e-10  287_[+2(1.67e-10)]_76_\
    [+3(6.37e-08)]_98
46823                            2.59e-04  160_[+3(1.93e-08)]_321
48701                            5.79e-04  358_[+1(1.81e-07)]_126
54954                            1.19e-04  165_[+3(2.53e-08)]_316
49163                            1.23e-09  104_[+1(1.04e-07)]_279_\
    [+2(2.54e-10)]_81
50012                            1.48e-04  20_[+1(1.87e-08)]_464
44387                            6.81e-13  104_[+1(4.61e-07)]_53_\
    [+2(1.19e-09)]_280_[+3(1.76e-08)]_8
39465                            3.30e-08  104_[+1(3.22e-08)]_282_\
    [+3(1.76e-08)]_79
49887                            4.55e-14  87_[+3(7.74e-09)]_270_\
    [+1(3.67e-08)]_48_[+2(1.94e-09)]_40
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************