********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/279/279.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
17895                    1.0000    500  54656                    1.0000    500
14355                    1.0000    500  15224                    1.0000    500
29824                    1.0000    500  41766                    1.0000    500
45171                    1.0000    500  45485                    1.0000    500
45664                    1.0000    500  48509                    1.0000    500
45094                    1.0000    500  36648                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/279/279.seqs.fa -oc motifs/279 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.270 C 0.232 G 0.227 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.269 C 0.232 G 0.227 T 0.271
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   8  llr = 90  E-value = 1.3e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  4::::61::::8
pos.-specific     C  :::aa:633:1:
probability       G  6:3::4351a93
matrix            T  :a8::::36:::

         bits    2.1    **    *  
                 1.9  * **    *  
                 1.7  * **    *  
                 1.5  * **    ** 
Relative         1.3  * **    ** 
Entropy          1.1 ******   ***
(16.2 bits)      0.9 *******  ***
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GTTCCACGTGGA
consensus            A G  GGCC  G
sequence                    T    

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site    
-------------             ----- ---------            ------------
41766                       298  8.80e-08 CGTCGATGTC GTTCCGCGTGGA TCTAGAAGGA
54656                       358  1.45e-07 ACATGAAAGA ATTCCACGTGGA ATTGTTTTGT
15224                       321  8.85e-07 TTCAGCATGG GTTCCACCCGGA CTGCGGCATG
29824                       213  3.29e-06 CAATGAATGC GTGCCACGGGGA GTATCACAGA
17895                       158  4.34e-06 CATTTCGTCC GTTCCAACTGGA TTTGCGTTGT
45094                       161  6.26e-06 GGGGTCAAAA GTTCCGGGCGGG CACCAACGTG
36648                       260  1.16e-05 CTCAGCCCTG ATTCCGGTTGGG ACGGATTTAC
14355                       211  1.77e-05 GGTTGCGGAA ATGCCACTTGCA CCTCGTCTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41766                             8.8e-08  297_[+1]_191
54656                             1.4e-07  357_[+1]_131
15224                             8.9e-07  320_[+1]_168
29824                             3.3e-06  212_[+1]_276
17895                             4.3e-06  157_[+1]_331
45094                             6.3e-06  160_[+1]_328
36648                             1.2e-05  259_[+1]_229
14355                             1.8e-05  210_[+1]_278
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=8
41766                    (  298) GTTCCGCGTGGA  1
54656                    (  358) ATTCCACGTGGA  1
15224                    (  321) GTTCCACCCGGA  1
29824                    (  213) GTGCCACGGGGA  1
17895                    (  158) GTTCCAACTGGA  1
45094                    (  161) GTTCCGGGCGGG  1
36648                    (  260) ATTCCGGTTGGG  1
14355                    (  211) ATGCCACTTGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.51668 E= 1.3e+003
    48   -965    146   -965
  -965   -965   -965    188
  -965   -965     14    147
  -965    211   -965   -965
  -965    211   -965   -965
   121   -965     72   -965
  -111    143     14   -965
  -965     11    114    -12
  -965     11    -86    120
  -965   -965    214   -965
  -965    -89    194   -965
   148   -965     14   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 1.3e+003
 0.375000  0.000000  0.625000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.625000  0.000000  0.375000  0.000000
 0.125000  0.625000  0.250000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.250000  0.125000  0.625000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.750000  0.000000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA]T[TG]CC[AG][CG][GCT][TC]GG[AG]
--------------------------------------------------------------------------------

Time  1.25 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =   2  llr = 42  E-value = 7.3e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  aa:::::a::::::::
pos.-specific     C  :::a:5::::::a5a:
probability       G  ::a:a5a:5:aa:5:a
matrix            T  ::::::::5a::::::

         bits    2.1   *** *   *** **
                 1.9 ***** ** **** **
                 1.7 ***** ** **** **
                 1.5 ***** ** **** **
Relative         1.3 ***** ** **** **
Entropy          1.1 ****************
(30.0 bits)      0.9 ****************
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           AAGCGCGAGTGGCCCG
consensus                 G  T    G  
sequence                             

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site      
-------------             ----- ---------            ----------------
14355                       146  3.21e-10 TGGACATTGA AAGCGCGAGTGGCGCG ATCCAGAATC
54656                       406  8.14e-10 GGCCGGAGAC AAGCGGGATTGGCCCG TTCTGTAATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14355                             3.2e-10  145_[+2]_339
54656                             8.1e-10  405_[+2]_79
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=2
14355                    (  146) AAGCGCGAGTGGCGCG  1
54656                    (  406) AAGCGGGATTGGCCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 11.5063 E= 7.3e+003
   189   -765   -765   -765
   189   -765   -765   -765
  -765   -765    213   -765
  -765    210   -765   -765
  -765   -765    213   -765
  -765    110    113   -765
  -765   -765    213   -765
   189   -765   -765   -765
  -765   -765    113     88
  -765   -765   -765    188
  -765   -765    213   -765
  -765   -765    213   -765
  -765    210   -765   -765
  -765    110    113   -765
  -765    210   -765   -765
  -765   -765    213   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 2 E= 7.3e+003
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
AAGCG[CG]GA[GT]TGGC[CG]CG
--------------------------------------------------------------------------------

Time  2.49 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =  11  llr = 107  E-value = 2.2e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1:948:::2:5:
pos.-specific     C  92:52157545:
probability       G  :8:2:91:36:9
matrix            T  ::1:::431::1

         bits    2.1             
                 1.9             
                 1.7 *    *     *
                 1.5 ***  *     *
Relative         1.3 *** ** * * *
Entropy          1.1 *** ** * ***
(14.1 bits)      0.9 *** ** * ***
                 0.6 ******** ***
                 0.4 ******** ***
                 0.2 ************
                 0.0 ------------

Multilevel           CGACAGCCCGCG
consensus               A  TTGCA 
sequence                         

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site    
-------------             ----- ---------            ------------
54656                         8  1.10e-06    GTCACAC CGACAGCCGCAG CTCTTTTGCG
48509                       378  2.02e-06 GTGAGTCTTG CGACAGCCTGCG GTGTCGCAAT
45485                       468  5.67e-06 ATAGTATTAT CGACAGCCCGCT GAGCACTTTT
15224                       233  6.83e-06 GGTAAAAATT CCAGAGCCCGAG CAACTGGGAT
29824                        13  1.07e-05 ACCCGAACCC CGAGCGTCCGAG CGACATAAAA
45664                       334  1.16e-05 GTGGAGTTTA CGAAAGGTCGCG ACTAAGACGT
45094                        40  1.29e-05 ATGGCGACCA AGAAAGCCGGCG GCTCGTCGGG
36648                       327  1.64e-05 ATCAAACCCT CGTCAGTCCCCG AGGACACTAC
17895                       194  2.57e-05 GATTTCACTC CCACAGTCACAG TCCATGAGCT
41766                        97  2.73e-05 GGAAATTCTG CGAACGCTAGAG CATTTAAAAA
14355                       379  7.07e-05 GCGAAGCGAA CGAAACTTGCCG AGCGCGCTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54656                             1.1e-06  7_[+3]_481
48509                               2e-06  377_[+3]_111
45485                             5.7e-06  467_[+3]_21
15224                             6.8e-06  232_[+3]_256
29824                             1.1e-05  12_[+3]_476
45664                             1.2e-05  333_[+3]_155
45094                             1.3e-05  39_[+3]_449
36648                             1.6e-05  326_[+3]_162
17895                             2.6e-05  193_[+3]_295
41766                             2.7e-05  96_[+3]_392
14355                             7.1e-05  378_[+3]_110
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=11
54656                    (    8) CGACAGCCGCAG  1
48509                    (  378) CGACAGCCTGCG  1
45485                    (  468) CGACAGCCCGCT  1
15224                    (  233) CCAGAGCCCGAG  1
29824                    (   13) CGAGCGTCCGAG  1
45664                    (  334) CGAAAGGTCGCG  1
45094                    (   40) AGAAAGCCGGCG  1
36648                    (  327) CGTCAGTCCCCG  1
17895                    (  194) CCACAGTCACAG  1
41766                    (   97) CGAACGCTAGAG  1
14355                    (  379) CGAAACTTGCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 10.0844 E= 2.2e+003
  -157    197  -1010  -1010
 -1010    -35    185  -1010
   175  -1010  -1010   -157
    43     97    -32  -1010
   160    -35  -1010  -1010
 -1010   -135    200  -1010
 -1010    123   -132     42
 -1010    165  -1010      1
   -57     97     26   -157
 -1010     65    148  -1010
    75    123  -1010  -1010
 -1010  -1010    200   -157
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 2.2e+003
 0.090909  0.909091  0.000000  0.000000
 0.000000  0.181818  0.818182  0.000000
 0.909091  0.000000  0.000000  0.090909
 0.363636  0.454545  0.181818  0.000000
 0.818182  0.181818  0.000000  0.000000
 0.000000  0.090909  0.909091  0.000000
 0.000000  0.545455  0.090909  0.363636
 0.000000  0.727273  0.000000  0.272727
 0.181818  0.454545  0.272727  0.090909
 0.000000  0.363636  0.636364  0.000000
 0.454545  0.545455  0.000000  0.000000
 0.000000  0.000000  0.909091  0.090909
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CGA[CA]AG[CT][CT][CG][GC][CA]G
--------------------------------------------------------------------------------

Time  3.69 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
17895                            1.15e-03  157_[+1(4.34e-06)]_24_\
    [+3(2.57e-05)]_295
54656                            8.10e-12  7_[+3(1.10e-06)]_338_[+1(1.45e-07)]_\
    36_[+2(8.14e-10)]_79
14355                            1.41e-08  145_[+2(3.21e-10)]_49_\
    [+1(1.77e-05)]_156_[+3(7.07e-05)]_110
15224                            3.99e-05  232_[+3(6.83e-06)]_76_\
    [+1(8.85e-07)]_168
29824                            5.60e-04  12_[+3(1.07e-05)]_188_\
    [+1(3.29e-06)]_276
41766                            5.68e-05  96_[+3(2.73e-05)]_189_\
    [+1(8.80e-08)]_191
45171                            7.22e-01  500
45485                            1.14e-02  467_[+3(5.67e-06)]_21
45664                            5.04e-02  333_[+3(1.16e-05)]_155
48509                            5.93e-03  377_[+3(2.02e-06)]_111
45094                            1.25e-03  39_[+3(1.29e-05)]_109_\
    [+1(6.26e-06)]_328
36648                            1.02e-03  259_[+1(1.16e-05)]_55_\
    [+3(1.64e-05)]_162
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************