********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/419/419.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
48282                    1.0000    500  36118                    1.0000    500
39266                    1.0000    500  32906                    1.0000    500
40470                    1.0000    500  35498                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/419/419.seqs.fa -oc motifs/419 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        6    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3000    N=               6
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.232 G 0.221 T 0.276
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.232 G 0.221 T 0.276
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  16  sites =   5  llr = 75  E-value = 7.5e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :68a:::2:6::::68
pos.-specific     C  :4::8a828::4::2:
probability       G  8:2:2:24::::aa:2
matrix            T  2::::::224a6::2:

         bits    2.2      *      **
                 2.0    * *    * **
                 1.7    * *    * **
                 1.5    * *    * **
Relative         1.3 * ***** * * ** *
Entropy          1.1 ******* * **** *
(21.7 bits)      0.9 ******* ****** *
                 0.7 ******* ********
                 0.4 ******* ********
                 0.2 ****************
                 0.0 ----------------

Multilevel           GAAACCCGCATTGGAA
consensus            TCG G GATT C  CG
sequence                    C      T
                            T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
35498                       200  1.77e-08 TAATTGCCGT GAAACCCGTTTTGGAA TGCCTAGTCT
48282                       236  2.49e-08 TTTTTTAAAG GAAAGCCTCATTGGAA GCACGAGCCA
36118                        93  4.13e-08 ATCCGTTTGT GAGACCCCCTTCGGAA CAGGAATTGC
40470                       477  5.61e-08 CGCAGTTGCG GCAACCGGCATCGGCA ACCGGATC
39266                       291  3.94e-07 TGAGTGCTCG TCAACCCACATTGGTG TCGATCTCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35498                             1.8e-08  199_[+1]_285
48282                             2.5e-08  235_[+1]_249
36118                             4.1e-08  92_[+1]_392
40470                             5.6e-08  476_[+1]_8
39266                             3.9e-07  290_[+1]_194
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=5
35498                    (  200) GAAACCCGTTTTGGAA  1
48282                    (  236) GAAAGCCTCATTGGAA  1
36118                    (   93) GAGACCCCCTTCGGAA  1
40470                    (  477) GCAACCGGCATCGGCA  1
39266                    (  291) TCAACCCACATTGGTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 2910 bayes= 9.43433 E= 7.5e+001
  -897   -897    186    -46
   114     78   -897   -897
   156   -897    -14   -897
   188   -897   -897   -897
  -897    178    -14   -897
  -897    211   -897   -897
  -897    178    -14   -897
   -44    -21     86    -46
  -897    178   -897    -46
   114   -897   -897     53
  -897   -897   -897    186
  -897     78   -897    112
  -897   -897    218   -897
  -897   -897    218   -897
   114    -21   -897    -46
   156   -897    -14   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 7.5e+001
 0.000000  0.000000  0.800000  0.200000
 0.600000  0.400000  0.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.200000  0.200000  0.400000  0.200000
 0.000000  0.800000  0.000000  0.200000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.600000  0.200000  0.000000  0.200000
 0.800000  0.000000  0.200000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[GT][AC][AG]A[CG]C[CG][GACT][CT][AT]T[TC]GG[ACT][AG]
--------------------------------------------------------------------------------

Time  0.33 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  17  sites =   6  llr = 84  E-value = 1.7e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::32:375:::22a
pos.-specific     C  8::72:::3::::::2:
probability       G  2a::83:82:28:73::
matrix            T  ::a3:3822332a357:

         bits    2.2  *
                 2.0  **         *   *
                 1.7  **         *   *
                 1.5 *** *  *   **   *
Relative         1.3 *** * **   **   *
Entropy          1.1 ***** **   ***  *
(20.2 bits)      0.9 ***** ** * ***  *
                 0.7 ***** ** * *** **
                 0.4 ******** ********
                 0.2 ******** ********
                 0.0 -----------------

Multilevel           CGTCGATGAAAGTGTTA
consensus               T G  CTT  TG
sequence                  T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            -----------------
39266                       399  3.17e-08 TTGGAAACAA CGTTGGTGGTAGTGGTA GTGGTAGCAT
32906                       481  4.44e-08 TCTCTTCCTG CGTCGATGAAAGTGAAA ATT
48282                       193  2.43e-07 GAGTCTTTCA GGTCGTTGATTGTTTTA GGAGCTATGC
35498                       453  2.63e-07 GAAAGGACAC CGTTCGAGCAAGTGTTA CGCCAAAGCA
40470                        97  3.04e-07 AGTAACTCTC CGTCGTTGCAGTTGTCA GAAGAGCCGA
36118                       223  3.78e-07 TCACGAATAT CGTCGATTTATGTTGTA CCGATACTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39266                             3.2e-08  398_[+2]_85
32906                             4.4e-08  480_[+2]_3
48282                             2.4e-07  192_[+2]_291
35498                             2.6e-07  452_[+2]_31
40470                               3e-07  96_[+2]_387
36118                             3.8e-07  222_[+2]_261
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=17 seqs=6
39266                    (  399) CGTTGGTGGTAGTGGTA  1
32906                    (  481) CGTCGATGAAAGTGAAA  1
48282                    (  193) GGTCGTTGATTGTTTTA  1
35498                    (  453) CGTTCGAGCAAGTGTTA  1
40470                    (   97) CGTCGTTGCAGTTGTCA  1
36118                    (  223) CGTCGATTTATGTTGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 2904 bayes= 8.91588 E= 1.7e+002
  -923    184    -40   -923
  -923   -923    218   -923
  -923   -923   -923    186
  -923    152   -923     27
  -923    -48    192   -923
    30   -923     59     27
   -70   -923   -923    159
  -923   -923    192    -73
    30     52    -40    -73
   130   -923   -923     27
    88   -923    -40     27
  -923   -923    192    -73
  -923   -923   -923    186
  -923   -923    159     27
   -70   -923     59     86
   -70    -48   -923    127
   188   -923   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 6 E= 1.7e+002
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.166667  0.833333  0.000000
 0.333333  0.000000  0.333333  0.333333
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  0.833333  0.166667
 0.333333  0.333333  0.166667  0.166667
 0.666667  0.000000  0.000000  0.333333
 0.500000  0.000000  0.166667  0.333333
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.666667  0.333333
 0.166667  0.000000  0.333333  0.500000
 0.166667  0.166667  0.000000  0.666667
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
CGT[CT]G[AGT]TG[AC][AT][AT]GT[GT][TG]TA
--------------------------------------------------------------------------------

Time  0.74 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  15  sites =   4  llr = 63  E-value = 2.2e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::8:3::35::::a
pos.-specific     C  3:::::8a:5a38a:
probability       G  8a53a:3:3::83::
matrix            T  ::5::8::5::::::

         bits    2.2  *  *  *  *  *
                 2.0  *  *  *  *  **
                 1.7  *  *  *  *  **
                 1.5  *  *  *  *  **
Relative         1.3 **  * **  *****
Entropy          1.1 ******** ******
(22.5 bits)      0.9 ******** ******
                 0.7 ******** ******
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GGGAGTCCTACGCCA
consensus            C TG AG AC CG
sequence                     G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
32906                       379  5.39e-09 TTTCAATTAT GGGGGTCCTCCGCCA TAGCATTCAT
39266                       158  2.86e-08 ACAGCGACCT GGTAGACCGCCGCCA AGCCCCGCGT
40470                        60  7.09e-08 TACTCTCGAG CGGAGTCCTACCCCA TCAACAACGC
35498                        28  1.26e-07 TGCGATAAGG GGTAGTGCAACGGCA CCAGTCGGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32906                             5.4e-09  378_[+3]_107
39266                             2.9e-08  157_[+3]_328
40470                             7.1e-08  59_[+3]_426
35498                             1.3e-07  27_[+3]_458
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=4
32906                    (  379) GGGGGTCCTCCGCCA  1
39266                    (  158) GGTAGACCGCCGCCA  1
40470                    (   60) CGGAGTCCTACCCCA  1
35498                    (   28) GGTAGTGCAACGGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 2916 bayes= 9.50779 E= 2.2e+002
  -865     11    176   -865
  -865   -865    218   -865
  -865   -865    118     86
   146   -865     18   -865
  -865   -865    218   -865
   -12   -865   -865    144
  -865    169     18   -865
  -865    210   -865   -865
   -12   -865     18     86
    88    111   -865   -865
  -865    210   -865   -865
  -865     11    176   -865
  -865    169     18   -865
  -865    210   -865   -865
   188   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 4 E= 2.2e+002
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.750000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.250000  0.500000
 0.500000  0.500000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[GC]G[GT][AG]G[TA][CG]C[TAG][AC]C[GC][CG]CA
--------------------------------------------------------------------------------

Time  1.04 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48282                            2.80e-07  192_[+2(2.43e-07)]_26_\
    [+1(2.49e-08)]_249
36118                            2.73e-07  92_[+1(4.13e-08)]_114_\
    [+2(3.78e-07)]_261
39266                            2.07e-11  157_[+3(2.86e-08)]_118_\
    [+1(3.94e-07)]_92_[+2(3.17e-08)]_85
32906                            1.45e-08  378_[+3(5.39e-09)]_87_\
    [+2(4.44e-08)]_3
40470                            6.48e-11  59_[+3(7.09e-08)]_22_[+2(3.04e-07)]_\
    363_[+1(5.61e-08)]_8
35498                            3.29e-11  27_[+3(1.26e-07)]_157_\
    [+1(1.77e-08)]_237_[+2(2.63e-07)]_31
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************