********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/210/210.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46479                    1.0000    500  20893                    1.0000    500
47235                    1.0000    500  37681                    1.0000    500
48512                    1.0000    500  43491                    1.0000    500
43745                    1.0000    500  33186                    1.0000    500
44183                    1.0000    500  45498                    1.0000    500
51921                    1.0000    500  45711                    1.0000    500
45881                    1.0000    500  45863                    1.0000    500
45289                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/210/210.seqs.fa -oc motifs/210 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.279 C 0.230 G 0.220 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.279 C 0.230 G 0.220 T 0.271
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  19  sites =   4  llr = 76  E-value = 4.7e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::3:::38::a:583a:
pos.-specific     C  :::::8:3:::5:a5:3::
probability       G  aa3a33a853a3::::3:a
matrix            T  ::8:5:::3::3:::33::

         bits    2.2 ** *  *   *  *    *
                 2.0 ** *  *   *  *    *
                 1.7 ** *  *   * **   **
                 1.5 ** *  *   * **   **
Relative         1.3 ** * ***  * **   **
Entropy          1.1 **** *** ** **** **
(27.4 bits)      0.9 **** *** ** **** **
                 0.7 **** *** ******* **
                 0.4 **************** **
                 0.2 **************** **
                 0.0 -------------------

Multilevel           GGTGTCGGGAGCACAAAAG
consensus              G AG CAG G  CTC
sequence                 G   T  T    G
                                     T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
45498                       143  4.90e-12 GATCCGACAT GGTGTCGGGAGCACAAGAG TGGATATTGT
44183                       134  2.83e-09 AGGTAGTCCG GGGGTCGCTAGCACCAAAG ATCTGTCGTA
45289                       292  3.64e-09 CCAGTCATTG GGTGACGGAAGGACCTCAG TTTATGTCAT
45881                       154  3.87e-09 CAACCATCAC GGTGGGGGGGGTACAATAG TCGAAAAAGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45498                             4.9e-12  142_[+1]_339
44183                             2.8e-09  133_[+1]_348
45289                             3.6e-09  291_[+1]_190
45881                             3.9e-09  153_[+1]_328
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=4
45498                    (  143) GGTGTCGGGAGCACAAGAG  1
44183                    (  134) GGGGTCGCTAGCACCAAAG  1
45289                    (  292) GGTGACGGAAGGACCTCAG  1
45881                    (  154) GGTGGGGGGGGTACAATAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 7230 bayes= 11.5563 E= 4.7e+002
  -865   -865    218   -865
  -865   -865    218   -865
  -865   -865     18    147
  -865   -865    218   -865
   -16   -865     18     88
  -865    170     18   -865
  -865   -865    218   -865
  -865     12    176   -865
   -16   -865    118    -12
   143   -865     18   -865
  -865   -865    218   -865
  -865    112     18    -12
   184   -865   -865   -865
  -865    212   -865   -865
    84    112   -865   -865
   143   -865   -865    -12
   -16     12     18    -12
   184   -865   -865   -865
  -865   -865    218   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 4.7e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.250000  0.500000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.250000  0.250000  0.250000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
GG[TG]G[TAG][CG]G[GC][GAT][AG]G[CGT]AC[AC][AT][ACGT]AG
--------------------------------------------------------------------------------

Time  1.94 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  19  sites =   6  llr = 101  E-value = 4.7e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::32375:::::3::875:
pos.-specific     C  7:::::37::::3::::3a
probability       G  377::3::2a:a27a:32:
matrix            T  :3:87:238:a:23:2:::

         bits    2.2          * *  *   *
                 2.0          ***  *   *
                 1.7          ***  *   *
                 1.5          ***  *   *
Relative         1.3 *  *    ****  **  *
Entropy          1.1 **** * ***** **** *
(24.2 bits)      0.9 ****** ***** **** *
                 0.7 ****** ***** **** *
                 0.4 ************ ******
                 0.2 ************ ******
                 0.0 -------------------

Multilevel           CGGTTAACTGTGAGGAAAC
consensus            GTA AGCT    CT  GC
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
43745                        52  1.63e-09 CTTCTCACGG CGGTTGCCTGTGCTGAGCC ATCCCGGGTT
44183                        60  1.88e-09 GCCGTTAGTA CGGTTATCTGTGGGGAGAC CTCGTTTTGT
20893                       230  3.69e-09 GTAAACGCAA GTGTTAACTGTGAGGAAGC CAAGAAAGTC
45711                       305  2.85e-08 GCCACTATCA GTGTTGATTGTGCGGTAAC TGTATTTCGA
45881                       256  3.25e-08 GGTCTGTATA CGAAAACCTGTGTGGAACC ATCGCATCAC
33186                        87  4.72e-08 CGGACTGTAT CGATAAATGGTGATGAAAC TGTGCCCTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43745                             1.6e-09  51_[+2]_430
44183                             1.9e-09  59_[+2]_422
20893                             3.7e-09  229_[+2]_252
45711                             2.8e-08  304_[+2]_177
45881                             3.2e-08  255_[+2]_226
33186                             4.7e-08  86_[+2]_395
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=6
43745                    (   52) CGGTTGCCTGTGCTGAGCC  1
44183                    (   60) CGGTTATCTGTGGGGAGAC  1
20893                    (  230) GTGTTAACTGTGAGGAAGC  1
45711                    (  305) GTGTTGATTGTGCGGTAAC  1
45881                    (  256) CGAAAACCTGTGTGGAACC  1
33186                    (   87) CGATAAATGGTGATGAAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 7230 bayes= 10.6814 E= 4.7e+002
  -923    154     60   -923
  -923   -923    160     30
    26   -923    160   -923
   -74   -923   -923    162
    26   -923   -923    130
   126   -923     60   -923
    84     54   -923    -70
  -923    154   -923     30
  -923   -923    -40    162
  -923   -923    218   -923
  -923   -923   -923    188
  -923   -923    218   -923
    26     54    -40    -70
  -923   -923    160     30
  -923   -923    218   -923
   158   -923   -923    -70
   126   -923     60   -923
    84     54    -40   -923
  -923    212   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 6 E= 4.7e+002
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.333333  0.000000  0.666667  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.333333  0.000000  0.000000  0.666667
 0.666667  0.000000  0.333333  0.000000
 0.500000  0.333333  0.000000  0.166667
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.333333  0.166667  0.166667
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.666667  0.000000  0.333333  0.000000
 0.500000  0.333333  0.166667  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[CG][GT][GA]T[TA][AG][AC][CT]TGTG[AC][GT]GA[AG][AC]C
--------------------------------------------------------------------------------

Time  3.74 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  13  sites =   5  llr = 72  E-value = 3.1e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::2::2a
pos.-specific     C  :6:::::6:4:::
probability       G  :2a:a:a:8:28:
matrix            T  a2:a:a:4:68::

         bits    2.2   * * *
                 2.0 * *****
                 1.7 * *****     *
                 1.5 * *****     *
Relative         1.3 * ***** * ***
Entropy          1.1 * ***********
(20.8 bits)      0.9 * ***********
                 0.7 *************
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           TCGTGTGCGTTGA
consensus             G     TACGA
sequence              T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            -------------
43491                        80  4.48e-08 ATCGTGTGTA TCGTGTGTGCTGA CGACCGAAAA
45881                       112  7.75e-08 TCGAGCCCGT TTGTGTGCGTTGA TCGCGAAAAC
48512                       150  1.34e-07 ATGTTGACGA TCGTGTGCGTTAA AACAATCATC
44183                       454  2.16e-07 CAGAGTTGAC TCGTGTGTGCGGA TTGGCGTTCG
43745                       128  3.53e-07 CACGGAGAGA TGGTGTGCATTGA TGCCTTACAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43491                             4.5e-08  79_[+3]_408
45881                             7.8e-08  111_[+3]_376
48512                             1.3e-07  149_[+3]_338
44183                             2.2e-07  453_[+3]_34
43745                             3.5e-07  127_[+3]_360
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=13 seqs=5
43491                    (   80) TCGTGTGTGCTGA  1
45881                    (  112) TTGTGTGCGTTGA  1
48512                    (  150) TCGTGTGCGTTAA  1
44183                    (  454) TCGTGTGTGCGGA  1
43745                    (  128) TGGTGTGCATTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 7320 bayes= 10.7664 E= 3.1e+002
  -897   -897   -897    188
  -897    138    -14    -44
  -897   -897    218   -897
  -897   -897   -897    188
  -897   -897    218   -897
  -897   -897   -897    188
  -897   -897    218   -897
  -897    138   -897     56
   -48   -897    186   -897
  -897     80   -897    114
  -897   -897    -14    156
   -48   -897    186   -897
   184   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 5 E= 3.1e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.200000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.000000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
T[CGT]GTGTG[CT][GA][TC][TG][GA]A
--------------------------------------------------------------------------------

Time  5.67 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46479                            2.41e-01  500
20893                            1.01e-04  229_[+2(3.69e-09)]_252
47235                            5.87e-01  500
37681                            1.73e-01  165_[+3(6.27e-05)]_322
48512                            1.96e-03  149_[+3(1.34e-07)]_338
43491                            1.97e-04  79_[+3(4.48e-08)]_408
43745                            2.26e-08  51_[+2(1.63e-09)]_57_[+3(3.53e-07)]_\
    360
33186                            5.03e-04  86_[+2(4.72e-08)]_395
44183                            9.21e-14  59_[+2(1.88e-09)]_55_[+1(2.83e-09)]_\
    301_[+3(2.16e-07)]_34
45498                            2.28e-07  142_[+1(4.90e-12)]_339
51921                            5.81e-01  500
45711                            8.36e-05  304_[+2(2.85e-08)]_177
45881                            6.95e-13  111_[+3(7.75e-08)]_29_\
    [+1(3.87e-09)]_83_[+2(3.25e-08)]_109_[+3(5.26e-06)]_104
45863                            7.88e-01  500
45289                            1.41e-04  291_[+1(3.64e-09)]_190
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************