********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/127/127.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9400                     1.0000    500  46623                    1.0000    500
47154                    1.0000    500  21873                    1.0000    500
47779                    1.0000    500  47791                    1.0000    500
48834                    1.0000    500  54101                    1.0000    500
49678                    1.0000    500  49919                    1.0000    500
40792                    1.0000    500  50373                    1.0000    500
45751                    1.0000    500  45880                    1.0000    500
46524                    1.0000    500  39243                    1.0000     71
49287                    1.0000    500  48121                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/127/127.seqs.fa -oc motifs/127 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8571    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.234 G 0.216 T 0.277
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.234 G 0.216 T 0.277
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 15 sites = 4 llr = 66 E-value = 1.4e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :3:::3:::3::3::
pos.-specific     C  a::::8:::8::3::
probability       G  ::3:3:a8a::a3aa
matrix            T  :88a8::3::a:3::

         bits    2.2       * *  * **
                 2.0 *     * *  * **
                 1.8 *  *  * * ** **
                 1.5 *  *  * * ** **
Relative         1.3 *  * ******* **
Entropy          1.1 ************ **
(23.9 bits)      0.9 ************ **
                 0.7 ************ **
                 0.4 ************ **
                 0.2 ************ **
                 0.0 ---------------

Multilevel           CTTTTCGGGCTGAGG
consensus             AG GA T A  C
sequence                         G
                                 T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value               Site
-------------             ----- ---------            ---------------
48834                       330  2.12e-09 TGATGGGAAG CTTTTCGGGCTGAGG TACGAGCTGA
40792                        16  1.55e-08 TCTTCGAACA CTTTTCGTGCTGTGG CACTCGCCGA
49919                        50  2.23e-08 CTGATTGGCT CTTTGCGGGATGCGG TCAAGTGATA
9400                        212  5.52e-08 GCTTTCTTTA CAGTTAGGGCTGGGG TGGGATCCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48834                             2.1e-09  329_[+1]_156
40792                             1.5e-08  15_[+1]_470
49919                             2.2e-08  49_[+1]_436
9400                              5.5e-08  211_[+1]_274
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=4
48834                    (  330) CTTTTCGGGCTGAGG  1
40792                    (   16) CTTTTCGTGCTGTGG  1
49919                    (   50) CTTTGCGGGATGCGG  1
9400                     (  212) CAGTTAGGGCTGGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8319 bayes= 11.0215 E= 1.4e+003
  -865    210   -865   -865
   -13   -865   -865    144
  -865   -865     21    144
  -865   -865   -865    185
  -865   -865     21    144
   -13    168   -865   -865
  -865   -865    221   -865
  -865   -865    179    -15
  -865   -865    221   -865
   -13    168   -865   -865
  -865   -865   -865    185
  -865   -865    221   -865
   -13     10     21    -15
  -865   -865    221   -865
  -865   -865    221   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 4 E= 1.4e+003
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.250000  0.750000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.250000  0.250000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
C[TA][TG]T[TG][CA]G[GT]G[CA]TG[ACGT]GG
--------------------------------------------------------------------------------

Time  3.58 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 6 llr = 76 E-value = 2.0e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::a::25:aa
pos.-specific     C  3:82:72:2a::
probability       G  2a28::883:::
matrix            T  5::::3::::::

         bits    2.2  *
                 2.0  *       *
                 1.8  *  *    ***
                 1.5  **** ** ***
Relative         1.3  **** ** ***
Entropy          1.1  ******* ***
(18.2 bits)      0.9  ******* ***
                 0.7  ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGCGACGGACAA
consensus            C    T  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            ------------
46623                       341  1.13e-07 TTATCATGTT CGCGACGGACAA CTCAACGTCA
47154                       155  2.26e-07 GTGGGGAGGT TGCGATGGACAA GAACGAACTG
46524                       424  3.30e-07 AGCGTTCAAA TGCGATGGGCAA TCAGCCACTA
21873                       367  1.01e-06 GTGAAAATAA CGCCACGGACAA GGTCCAACTT
39243                        53  3.03e-06 TGTCGAATAC TGGGACCGGCAA TCGTGTC
47791                       161  3.84e-06 TGCTTTTTGC GGCGACGACCAA CAATAACGCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46623                             1.1e-07  340_[+2]_148
47154                             2.3e-07  154_[+2]_334
46524                             3.3e-07  423_[+2]_65
21873                               1e-06  366_[+2]_122
39243                               3e-06  52_[+2]_7
47791                             3.8e-06  160_[+2]_328
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=6
46623                    (  341) CGCGACGGACAA  1
47154                    (  155) TGCGATGGACAA  1
46524                    (  424) TGCGATGGGCAA  1
21873                    (  367) CGCCACGGACAA  1
39243                    (   53) TGGGACCGGCAA  1
47791                    (  161) GGCGACGACCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8373 bayes= 10.8933 E= 2.0e+003
  -923     51    -38     85
  -923   -923    221   -923
  -923    183    -38   -923
  -923    -49    194   -923
   187   -923   -923   -923
  -923    151   -923     27
  -923    -49    194   -923
   -71   -923    194   -923
    87    -49     62   -923
  -923    210   -923   -923
   187   -923   -923   -923
   187   -923   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 2.0e+003
 0.000000  0.333333  0.166667  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.166667  0.833333  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.500000  0.166667  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[TC]GCGA[CT]GG[AG]CAA
--------------------------------------------------------------------------------

Time  7.30 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 17 sites = 2 llr = 44 E-value = 5.5e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::a::a:a:55a::::
pos.-specific     C  :5:::::5::5::a::a
probability       G  a5a:aa:5:a:5::aa:
matrix            T  :::::::::::::::::

         bits    2.2 * * **   *    **
                 2.0 * * **   *   ****
                 1.8 * ***** **  *****
                 1.5 * ***** **  *****
Relative         1.3 * ***** **  *****
Entropy          1.1 ********** ******
(31.5 bits)      0.9 *****************
                 0.7 *****************
                 0.4 *****************
                 0.2 *****************
                 0.0 -----------------

Multilevel           GCGAGGACAGAAACGGC
consensus             G     G  CG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            -----------------
50373                       197  1.46e-10 TGACAAGACT GGGAGGACAGAGACGGC ATCGCTTTAG
49287                        36  2.11e-10 ACCGAGGATG GCGAGGAGAGCAACGGC AGAAATCTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50373                             1.5e-10  196_[+3]_287
49287                             2.1e-10  35_[+3]_448
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=17 seqs=2
50373                    (  197) GGGAGGACAGAGACGGC  1
49287                    (   36) GCGAGGAGAGCAACGGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 8283 bayes= 12.0156 E= 5.5e+003
  -765   -765    220   -765
  -765    109    120   -765
  -765   -765    220   -765
   187   -765   -765   -765
  -765   -765    220   -765
  -765   -765    220   -765
   187   -765   -765   -765
  -765    109    120   -765
   187   -765   -765   -765
  -765   -765    220   -765
    87    109   -765   -765
    87   -765    120   -765
   187   -765   -765   -765
  -765    209   -765   -765
  -765   -765    220   -765
  -765   -765    220   -765
  -765    209   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 2 E= 5.5e+003
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
G[CG]GAGGA[CG]AG[AC][AG]ACGGC
--------------------------------------------------------------------------------

Time 10.77 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9400                             7.27e-04  211_[+1(5.52e-08)]_274
46623                            1.56e-03  340_[+2(1.13e-07)]_148
47154                            3.08e-03  154_[+2(2.26e-07)]_334
21873                            4.60e-03  366_[+2(1.01e-06)]_122
47779                            3.36e-01  500
47791                            3.06e-02  160_[+2(3.84e-06)]_328
48834                            4.98e-05  329_[+1(2.12e-09)]_156
54101                            9.46e-01  500
49678                            7.04e-01  500
49919                            6.28e-04  49_[+1(2.23e-08)]_202_\
    [+1(2.39e-05)]_219
40792                            3.79e-05  15_[+1(1.55e-08)]_470
50373                            3.90e-06  196_[+3(1.46e-10)]_287
45751                            6.16e-01  500
45880                            4.88e-01  500
46524                            4.64e-03  423_[+2(3.30e-07)]_65
39243                            6.91e-04  52_[+2(3.03e-06)]_7
49287                            5.39e-06  35_[+3(2.11e-10)]_448
48121                            6.83e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************