********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/498/498.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42978                    1.0000    500  54064                    1.0000    500
46607                    1.0000    500  36726                    1.0000    500
13994                    1.0000    500  43465                    1.0000    500
18404                    1.0000    500  49724                    1.0000    500
50050                    1.0000    500  41501                    1.0000    500
44258                    1.0000    500  11792                    1.0000    500
45902                    1.0000    500  36253                    1.0000    500
42473                    1.0000    500  36502                    1.0000    500
31835                    1.0000    500  42977                    1.0000    500
47941                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/498/498.seqs.fa -oc motifs/498 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 19 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 9500 N= 19 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.266 C 0.240 G 0.222 T 0.272 Background letter frequencies (from dataset with add-one prior applied): A 0.266 C 0.240 G 0.222 T 0.272 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 15 sites = 6 llr = 92 E-value = 3.9e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A ::::758a:a7:::: pos.-specific C :3a8:52:8:::3a: probability G :3:22:::2::a5:a matrix T a3::2:::::3:2:: bits 2.2 * * 2.0 * * * * * ** 1.7 * * * * * ** 1.5 * ** *** * ** Relative 1.3 * ** **** * ** Entropy 1.1 * ** ******* ** (22.0 bits) 0.9 * ** ******* ** 0.7 * ************* 0.4 *************** 0.2 *************** 0.0 --------------- Multilevel TCCCAAAACAAGGCG consensus G C T C sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 45902 450 7.01e-10 TTTGTTGTAT TGCCACAACAAGGCG AGTGAAGAGT 54064 299 
9.13e-09 CGTTGTCCTC TTCCACAACAAGCCG CTACTCTACA 43465 128 9.66e-08 GAGAATTGAT TGCGAAAACATGGCG TGGTCGGGAT 49724 56 1.01e-07 GTTCTCGTAG TCCCAAAAGATGGCG ACTTCGCTAG 46607 416 1.09e-07 TCCATGGAAT TCCCGCAACAAGTCG CGTACTTGGT 42978 196 2.50e-07 AGACGGAGCT TTCCTACACAAGCCG ACCACTGTCA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 45902 7e-10 449_[+1]_36 54064 9.1e-09 298_[+1]_187 43465 9.7e-08 127_[+1]_358 49724 1e-07 55_[+1]_430 46607 1.1e-07 415_[+1]_70 42978 2.5e-07 195_[+1]_290 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=15 seqs=6 45902 ( 450) TGCCACAACAAGGCG 1 54064 ( 299) TTCCACAACAAGCCG 1 43465 ( 128) TGCGAAAACATGGCG 1 49724 ( 56) TCCCAAAAGATGGCG 1 46607 ( 416) TCCCGCAACAAGTCG 1 42978 ( 196) TTCCTACACAAGCCG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 15 n= 9234 bayes= 11.0345 E= 3.9e+001 -923 -923 -923 187 -923 47 59 29 -923 205 -923 -923 -923 179 -41 -923 133 -923 -41 -71 91 106 -923 -923 165 -53 -923 -923 191 -923 -923 -923 -923 179 -41 -923 191 -923 -923 -923 133 -923 -923 29 -923 -923 217 -923 -923 47 117 -71 -923 205 -923 -923 -923 -923 217 -923 -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 3.9e+001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.333333  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.666667  0.000000  0.166667  0.166667
 0.500000  0.500000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.333333  0.500000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[CGT]CCA[AC]AACA[AT]G[GC]CG
--------------------------------------------------------------------------------

Time  3.39 secs.

******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 5 llr = 71 E-value = 1.5e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :::8:6:::a:a pos.-specific C ::::a2::2:2: probability G ::a:::a:8:8: matrix T aa:2:2:a:::: bits 2.2 * * 2.0 *** * ** * * 1.7 *** * ** * * 1.5 *** * ****** Relative 1.3 *** * ****** Entropy 1.1 ***** ****** (20.5 bits) 0.9 ***** ****** 0.7 ************ 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel TTGACAGTGAGA consensus T C C C sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 43465 237 5.87e-08 CCAGACAGAG TTGACAGTGAGA CAAGGCAGCG 31835 260 1.12e-07 CCAGAGAGCA TTGACCGTGAGA TTGACTGGGA 42978 352 1.72e-07 TTGGAATTGT TTGACTGTGAGA TTCTGCTGGT 36726 437 2.32e-07 TTGACGGACG TTGTCAGTGAGA CCTCTCAGAA 36253 134 9.17e-07 GAACCGACAG TTGACAGTCACA ATAAAAGCGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 43465 5.9e-08 236_[+2]_252 31835 1.1e-07 259_[+2]_229 42978 1.7e-07 351_[+2]_137 36726 2.3e-07 436_[+2]_52 36253 9.2e-07 133_[+2]_355 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=5 43465 ( 237) TTGACAGTGAGA 1 31835 ( 260) TTGACCGTGAGA 1 42978 ( 352) TTGACTGTGAGA 1 36726 ( 437) TTGTCAGTGAGA 1 36253 ( 134) TTGACAGTCACA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 11.1106 E= 1.5e+002 -897 -897 -897 187 -897 -897 -897 187 -897 -897 217 -897 159 -897 -897 -44 -897 205 -897 -897 117 -27 -897 -44 -897 -897 217 -897 -897 -897 -897 187 -897 -27 185 -897 191 -897 -897 -897 -897 -27 185 -897 191 -897 -897 -897 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 5 E= 1.5e+002 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.800000 0.000000 0.000000 0.200000 0.000000 1.000000 0.000000 0.000000 0.600000 0.200000 0.000000 0.200000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.200000 0.800000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.200000 0.800000 0.000000 1.000000 0.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression 
-------------------------------------------------------------------------------- TTG[AT]C[ACT]GT[GC]A[GC]A -------------------------------------------------------------------------------- Time 6.64 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 12 sites = 11 llr = 118 E-value = 2.4e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::9:1:2:8544 pos.-specific C 1a:a1::9:2:6 probability G 3:::1a:122:: matrix T 6:1:7:8::16: bits 2.2 * 2.0 * * * 1.7 * * * 1.5 *** * * Relative 1.3 *** **** Entropy 1.1 *** **** * (15.5 bits) 0.9 *** **** ** 0.7 ********* ** 0.4 ********* ** 0.2 ************ 0.0 ------------ Multilevel TCACTGTCAATC consensus G AA sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 36502 158 2.12e-07 CGGATATGGT GCACTGTCAATC GACATCATGC 42473 348 8.39e-07 GTGCTTATAC TCACTGACAATC TCGGACGGAC 36253 237 9.57e-07 AGCTCTTCCC TCACTGTCACAC AATTTCCATT 41501 21 1.94e-06 CGTACCAATG TCACTGTCGATA TACGATTTCA 42977 400 2.53e-06 TGTCCGGTTT TCACTGTCGGTC GAGTCACTCT 18404 407 4.59e-06 GTTAAAAAAT TCACGGTCAGTC AAAAAGGGAG 44258 136 9.70e-06 GATAACAACT TCTCTGTCAAAA TGGCCTGTCT 42978 212 1.06e-05 CACAAGCCGA CCACTGTCATTC GCCGGGTCCT 43465 258 1.21e-05 ACAAGGCAGC GCACAGTCAAAA GGACAGGAAA 54064 239 1.40e-05 CGTCCCATGA GCACCGACAATC GTATCCGTCG 36726 284 1.95e-05 TCTCCTGATG TCACTGTGACAA CCAATGCGAC 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36502                             2.1e-07  157_[+3]_331
42473                             8.4e-07  347_[+3]_141
36253                             9.6e-07  236_[+3]_252
41501                             1.9e-06  20_[+3]_468
42977                             2.5e-06  399_[+3]_89
18404                             4.6e-06  406_[+3]_82
44258                             9.7e-06  135_[+3]_353
42978                             1.1e-05  211_[+3]_277
43465                             1.2e-05  257_[+3]_231
54064                             1.4e-05  238_[+3]_250
36726                             1.9e-05  283_[+3]_205
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=11
36502                    (  158) GCACTGTCAATC  1
42473                    (  348) TCACTGACAATC  1
36253                    (  237) TCACTGTCACAC  1
41501                    (   21) TCACTGTCGATA  1
42977                    (  400) TCACTGTCGGTC  1
18404                    (  407) TCACGGTCAGTC  1
44258                    (  136) TCTCTGTCAAAA  1
42978                    (  212) CCACTGTCATTC  1
43465                    (  258) GCACAGTCAAAA  1
54064                    (  239) GCACCGACAATC  1
36726                    (  284) TCACTGTGACAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 10.0759 E= 2.4e+002
 -1010   -140     30    122
 -1010    206  -1010  -1010
   177  -1010  -1010   -158
 -1010    206  -1010  -1010
  -154   -140   -128    142
 -1010  -1010    217  -1010
   -55  -1010  -1010    159
 -1010    192   -128  -1010
   162  -1010    -29  -1010
   104    -40    -29   -158
    45  -1010  -1010    122
    45    140  -1010  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 2.4e+002
 0.000000  0.090909  0.272727  0.636364
 0.000000  1.000000  0.000000  0.000000
 0.909091  0.000000  0.000000  0.090909
 0.000000  1.000000  0.000000  0.000000
 0.090909  0.090909  0.090909  0.727273
 0.000000  0.000000  1.000000  0.000000
 0.181818  0.000000  0.000000  0.818182
 0.000000  0.909091  0.090909  0.000000
 0.818182  0.000000  0.181818  0.000000
 0.545455  0.181818  0.181818  0.090909
 0.363636  0.000000  0.000000  0.636364
 0.363636  0.636364  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[TG]CACTGTCAA[TA][CA]
--------------------------------------------------------------------------------

Time  9.89 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42978                            1.60e-08  195_[+1(2.50e-07)]_1_[+3(1.06e-05)]_\
                                           128_[+2(1.72e-07)]_137
54064                            4.26e-06  238_[+3(1.40e-05)]_48_\
                                           [+1(9.13e-09)]_187
46607                            7.64e-04  415_[+1(1.09e-07)]_70
36726                            8.76e-05  283_[+3(1.95e-05)]_141_\
                                           [+2(2.32e-07)]_52
13994                            6.62e-01  500
43465                            2.82e-09  127_[+1(9.66e-08)]_94_\
                                           [+2(5.87e-08)]_9_[+3(1.21e-05)]_231
18404                            1.45e-03  406_[+3(4.59e-06)]_33_\
                                           [+2(4.24e-05)]_37
49724                            2.22e-04  55_[+1(1.01e-07)]_430
50050                            2.03e-01  500
41501                            7.34e-03  20_[+3(1.94e-06)]_468
44258                            1.68e-02  135_[+3(9.70e-06)]_353
11792                            3.68e-01  500
45902                            3.32e-06  449_[+1(7.01e-10)]_36
36253                            1.91e-05  133_[+2(9.17e-07)]_91_\
                                           [+3(9.57e-07)]_252
42473                            7.74e-03  347_[+3(8.39e-07)]_141
36502                            3.15e-03  157_[+3(2.12e-07)]_331
31835                            2.42e-04  259_[+2(1.12e-07)]_127_\
                                           [+2(9.57e-05)]_90
42977                            8.27e-04  399_[+3(2.53e-06)]_89
47941                            7.06e-01  500
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************