********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/250/250.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42687                    1.0000    500  36582                    1.0000    500
36881                    1.0000    500  47406                    1.0000    500
48469                    1.0000    500  9987                     1.0000    500
10104                    1.0000    500  16157                    1.0000    500
33462                    1.0000    500  44885                    1.0000    500
54442                    1.0000    500  27375                    1.0000    500
34484                    1.0000    500  35960                    1.0000    500
45952                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/250/250.seqs.fa -oc motifs/250 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 15 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 7500 N= 15 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.281 C 0.240 G 0.229 T 0.250 Background letter frequencies (from dataset with add-one prior applied): A 0.281 C 0.240 G 0.229 T 0.250 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 11 llr = 141 E-value = 4.8e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 14:82:28:5:::67: pos.-specific C 3:525:7:9::1a1:6 probability G 6:2:::11::a::221 matrix T :63:3a:115:9:113 bits 2.1 * * 1.9 * * * 1.7 * * * * 1.5 * * *** Relative 1.3 * * * *** Entropy 1.1 * * * ** *** (18.5 bits) 0.9 ** * ******** ** 0.6 ************* ** 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel GTCACTCACAGTCAAC consensus CAT T T T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 27375 266 2.72e-09 GCAATGACAG GTCACTCACTGTCAAT GCATCCAAGA 34484 271 1.82e-08 
CCCCCCAAAG GTGACTCACAGTCAAT CGTCCTCCGT 45952 21 8.67e-08 GTCCCGTGTA GTGACTCACAGTCAAG CTAGGTAGGT 36582 88 4.13e-07 CCATGCAGAG CTCCTTCACTGTCAGC TGTCGACCGC 33462 38 7.03e-07 TATTGACACA AACACTCACAGTCCAC CTGAACTTGG 9987 26 7.03e-07 TGTCACACTC GATAATCTCAGTCAAC GAAGGTTGAC 36881 178 9.14e-07 TACTGTTAGT CTCATTCATTGTCGAC GTTCTTGGAA 44885 399 1.15e-06 CATCTTTCGT CACAATCACAGCCAAC ACCATACTGA 10104 78 2.22e-06 GGTAACGCCG GATACTAACAGTCGTC GACCTTTCGA 47406 484 2.37e-06 TATTTTGATG GTTACTAGCTGTCTAC G 48469 265 2.54e-06 GCAGTGAAAA GTCCTTGACTGTCAGT TTAAAAAGTA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 27375 2.7e-09 265_[+1]_219 34484 1.8e-08 270_[+1]_214 45952 8.7e-08 20_[+1]_464 36582 4.1e-07 87_[+1]_397 33462 7e-07 37_[+1]_447 9987 7e-07 25_[+1]_459 36881 9.1e-07 177_[+1]_307 44885 1.2e-06 398_[+1]_86 10104 2.2e-06 77_[+1]_407 47406 2.4e-06 483_[+1]_1 48469 2.5e-06 264_[+1]_220 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=11 27375 ( 266) GTCACTCACTGTCAAT 1 34484 ( 271) GTGACTCACAGTCAAT 1 45952 ( 21) GTGACTCACAGTCAAG 1 36582 ( 88) CTCCTTCACTGTCAGC 1 33462 ( 38) AACACTCACAGTCCAC 1 9987 ( 26) GATAATCTCAGTCAAC 1 36881 ( 178) CTCATTCATTGTCGAC 1 44885 ( 399) CACAATCACAGCCAAC 1 10104 ( 78) GATACTAACAGTCGTC 1 47406 ( 484) GTTACTAGCTGTCTAC 1 48469 ( 265) GTCCTTGACTGTCAGT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 
1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 9.72269 E= 4.8e-002 -163 19 147 -1010 37 -1010 -1010 135 -1010 118 -33 12 154 -40 -1010 -1010 -63 118 -1010 12 -1010 -1010 -1010 200 -63 160 -133 -1010 154 -1010 -133 -146 -1010 192 -1010 -146 96 -1010 -1010 86 -1010 -1010 213 -1010 -1010 -140 -1010 186 -1010 206 -1010 -1010 118 -140 -33 -146 137 -1010 -33 -146 -1010 141 -133 12 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 4.8e-002 0.090909 0.272727 0.636364 0.000000 0.363636 0.000000 0.000000 0.636364 0.000000 0.545455 0.181818 0.272727 0.818182 0.181818 0.000000 0.000000 0.181818 0.545455 0.000000 0.272727 0.000000 0.000000 0.000000 1.000000 0.181818 0.727273 0.090909 0.000000 0.818182 0.000000 0.090909 0.090909 0.000000 0.909091 0.000000 0.090909 0.545455 0.000000 0.000000 0.454545 0.000000 0.000000 1.000000 0.000000 0.000000 0.090909 0.000000 0.909091 0.000000 1.000000 0.000000 0.000000 0.636364 0.090909 0.181818 0.090909 0.727273 0.000000 0.181818 0.090909 0.000000 0.636364 0.090909 0.272727 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [GC][TA][CT]A[CT]TCAC[AT]GTCAA[CT] -------------------------------------------------------------------------------- Time 1.91 secs. 
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 6 llr = 88 E-value = 3.3e+003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::72:523:5::3::: pos.-specific C 22::a:873:a:3:82 probability G 88:5:5::55:a3:28 matrix T ::33::::2::::a:: bits 2.1 * ** 1.9 * ** * 1.7 * ** * 1.5 ** * ** *** Relative 1.3 ** * * ** *** Entropy 1.1 *** **** *** *** (21.0 bits) 0.9 *** **** *** *** 0.6 ************ *** 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel GGAGCACCGACGATCG consensus TT G ACG C sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 54442 373 4.32e-09 AATATGACGA GGAGCACCCACGGTCG GAAAACCAAA 35960 86 3.77e-08 TTACACACAT GGTACGCCGACGCTCG CTGCCGTGGC 9987 208 5.21e-08 ACATGCGCGT GGATCACCGGCGCTGG ACCCGTTATT 44885 92 2.40e-07 GCGTTTCCCT CGAGCGCCGACGATCC CTTTTGTCAT 10104 157 3.33e-07 CCCGAGCCCA GGTTCGAACGCGGTCG TCGGCACGAA 36582 271 3.33e-07 CGATCCTTGT GCAGCACATGCGATCG GCCAGAACCG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 54442 4.3e-09 372_[+2]_112 35960 3.8e-08 
85_[+2]_399 9987 5.2e-08 207_[+2]_277 44885 2.4e-07 91_[+2]_393 10104 3.3e-07 156_[+2]_328 36582 3.3e-07 270_[+2]_214 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=16 seqs=6 54442 ( 373) GGAGCACCCACGGTCG 1 35960 ( 86) GGTACGCCGACGCTCG 1 9987 ( 208) GGATCACCGGCGCTGG 1 44885 ( 92) CGAGCGCCGACGATCC 1 10104 ( 157) GGTTCGAACGCGGTCG 1 36582 ( 271) GCAGCACATGCGATCG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 10.6904 E= 3.3e+003 -923 -52 186 -923 -923 -52 186 -923 124 -923 -923 41 -75 -923 113 41 -923 206 -923 -923 83 -923 113 -923 -75 179 -923 -923 25 147 -923 -923 -923 47 113 -58 83 -923 113 -923 -923 206 -923 -923 -923 -923 213 -923 25 47 54 -923 -923 -923 -923 200 -923 179 -46 -923 -923 -52 186 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 3.3e+003 0.000000 0.166667 0.833333 0.000000 0.000000 0.166667 0.833333 0.000000 0.666667 0.000000 0.000000 0.333333 0.166667 0.000000 0.500000 0.333333 0.000000 1.000000 0.000000 0.000000 0.500000 0.000000 0.500000 0.000000 0.166667 0.833333 0.000000 0.000000 0.333333 0.666667 0.000000 0.000000 0.000000 0.333333 0.500000 0.166667 0.500000 0.000000 0.500000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 
0.000000 1.000000 0.000000 0.333333 0.333333 0.333333 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.833333 0.166667 0.000000 0.000000 0.166667 0.833333 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- GG[AT][GT]C[AG]C[CA][GC][AG]CG[ACG]TCG -------------------------------------------------------------------------------- Time 3.86 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 4 llr = 82 E-value = 4.4e+003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 8aa::::::3:::::55::a: pos.-specific C :::5:a:a:5::::53:3::8 probability G 3::3::a:::588:3:55a:3 matrix T :::3a:::a3533a33:3::: bits 2.1 *** * 1.9 ** ***** * ** 1.7 ** ***** * ** 1.5 ** ***** * ** Relative 1.3 ** ***** *** *** Entropy 1.1 *** ***** **** * *** (29.5 bits) 0.9 *** ***** **** * *** 0.6 ********* ***** ***** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel AAACTCGCTCGGGTCAAGGAC consensus G G ATTT GCGC G sequence T T TT T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 47406 349 1.56e-10 TTTACATGTT AAAGTCGCTCGTGTCTGGGAC TTGTCGGGAT 45952 75 
1.93e-10 CACGAAGGCA AAACTCGCTTGGGTTAGGGAG GGTACTGATG 27375 175 2.16e-10 ATTGTAGAGA AAATTCGCTATGGTCAACGAC AGCTTACTGT 16157 303 1.25e-09 ATCACCCACC GAACTCGCTCTGTTGCATGAC GTTCCGTCCA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 47406 1.6e-10 348_[+3]_131 45952 1.9e-10 74_[+3]_405 27375 2.2e-10 174_[+3]_305 16157 1.2e-09 302_[+3]_177 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=4 47406 ( 349) AAAGTCGCTCGTGTCTGGGAC 1 45952 ( 75) AAACTCGCTTGGGTTAGGGAG 1 27375 ( 175) AAATTCGCTATGGTCAACGAC 1 16157 ( 303) GAACTCGCTCTGTTGCATGAC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 10.813 E= 4.4e+003 141 -865 13 -865 183 -865 -865 -865 183 -865 -865 -865 -865 106 13 0 -865 -865 -865 200 -865 206 -865 -865 -865 -865 212 -865 -865 206 -865 -865 -865 -865 -865 200 -17 106 -865 0 -865 -865 112 100 -865 -865 171 0 -865 -865 171 0 -865 -865 -865 200 -865 106 13 0 83 6 -865 0 83 -865 112 -865 -865 6 112 0 -865 -865 212 -865 183 -865 -865 -865 -865 164 13 -865 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability 
matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 4.4e+003 0.750000 0.000000 0.250000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.500000 0.250000 0.250000 0.000000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.250000 0.500000 0.000000 0.250000 0.000000 0.000000 0.500000 0.500000 0.000000 0.000000 0.750000 0.250000 0.000000 0.000000 0.750000 0.250000 0.000000 0.000000 0.000000 1.000000 0.000000 0.500000 0.250000 0.250000 0.500000 0.250000 0.000000 0.250000 0.500000 0.000000 0.500000 0.000000 0.000000 0.250000 0.500000 0.250000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.750000 0.250000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [AG]AA[CGT]TCGCT[CAT][GT][GT][GT]T[CGT][ACT][AG][GCT]GA[CG] -------------------------------------------------------------------------------- Time 5.76 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42687                            7.29e-01  500
36582                            2.01e-06  87_[+1(4.13e-07)]_167_\
                                           [+2(3.33e-07)]_214
36881                            5.17e-03  177_[+1(9.14e-07)]_307
47406                            5.58e-09  348_[+3(1.56e-10)]_114_\
                                           [+1(2.37e-06)]_1
48469                            1.29e-02  264_[+1(2.54e-06)]_220
9987                             9.16e-07  25_[+1(7.03e-07)]_166_\
                                           [+2(5.21e-08)]_277
10104                            8.27e-06  77_[+1(2.22e-06)]_63_[+2(3.33e-07)]_\
                                           328
16157                            3.27e-06  302_[+3(1.25e-09)]_177
33462                            1.40e-03  37_[+1(7.03e-07)]_447
44885                            6.54e-06  91_[+2(2.40e-07)]_291_\
                                           [+1(1.15e-06)]_86
54442                            8.92e-05  372_[+2(4.32e-09)]_112
27375                            5.04e-11  11_[+3(4.64e-05)]_142_\
                                           [+3(2.16e-10)]_70_[+1(2.72e-09)]_219
34484                            1.07e-04  270_[+1(1.82e-08)]_214
35960                            1.78e-04  85_[+2(3.77e-08)]_399
45952                            1.05e-09  20_[+1(8.67e-08)]_38_[+3(1.93e-10)]_\
                                           405
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************