********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/266/266.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42708                    1.0000    500  14688                    1.0000    500
12535                    1.0000    500  35504                    1.0000    500
42790                    1.0000    500  47995                    1.0000    500
34638                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/266/266.seqs.fa -oc motifs/266 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.270 C 0.229 G 0.229 T 0.272
Background letter frequencies (from dataset with add-one prior applied):
A 0.270 C 0.229 G 0.229 T 0.272
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 6 llr = 102 E-value = 2.9e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  85:7222::::::322::::7
pos.-specific     C  2:2:38:5:2:5:7:2::2a3
probability       G  :58:2:53a52:2:5::a:::
matrix            T  :::33:32:3858:37a:8::

         bits    2.1         *        * *
                 1.9         *       ** *
                 1.7         *       ** *
                 1.5   *  *  *       ** *
Relative         1.3 * *  *  * * *   ****
Entropy          1.1 **** *  * ****  *****
(24.5 bits)      0.9 **** *  * ****  *****
                 0.6 **** ****************
                 0.4 **** ****************
                 0.2 **** ****************
                 0.0 ---------------------

Multilevel           AAGACCGCGGTCTCGTTGTCA
consensus             G TT TG T T AT     C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
47995                       446  5.58e-10 CAAGGTGAAA AGGTCCGCGGTCTCGTTGCCC GTACTTCATA
14688                        97  1.03e-09 CGCGCCGCGG AAGAGCGCGCTCTCTTTGTCC GTAGTACTCT
42708                       447  3.58e-09 ACCTTGGTCA AACATCGGGGTTTATTTGTCA AATTTCCCTT
42790                       357  6.93e-09 CATGGTATCC AGGATCTTGGTCTCACTGTCA GCAGCACATT
35504                        87  3.52e-08 CACAAGCCAA AAGTACTGGTTTGCGATGTCA CTTCGGCCTA
34638                       234  8.69e-08 TATAATTTGG CGGACAACGTGTTAGTTGTCA GTTAAATGGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47995                             5.6e-10  445_[+1]_34
14688                               1e-09  96_[+1]_383
42708                             3.6e-09  446_[+1]_33
42790                             6.9e-09  356_[+1]_123
35504                             3.5e-08  86_[+1]_393
34638                             8.7e-08  233_[+1]_246
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
47995                    (  446) AGGTCCGCGGTCTCGTTGCCC  1
14688                    (   97) AAGAGCGCGCTCTCTTTGTCC  1
42708                    (  447) AACATCGGGGTTTATTTGTCA  1
42790                    (  357) AGGATCTTGGTCTCACTGTCA  1
35504                    (   87) AAGTACTGGTTTGCGATGTCA  1
34638                    (  234) CGGACAACGTGTTAGTTGTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 9.57485 E= 2.9e+001
   162    -45   -923   -923
    89   -923    112   -923
  -923    -45    186   -923
   130   -923   -923     29
   -69     54    -46     29
   -69    186   -923   -923
   -69   -923    112     29
  -923    113     54    -71
  -923   -923    212   -923
  -923    -45    112     29
  -923   -923    -46    161
  -923    113   -923     88
  -923   -923    -46    161
    30    154   -923   -923
   -69   -923    112     29
   -69    -45   -923    129
  -923   -923   -923    188
  -923   -923    212   -923
  -923    -45   -923    161
  -923    213   -923   -923
   130     54   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.9e+001
 0.833333  0.166667  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.166667  0.333333  0.166667  0.333333
 0.166667  0.833333  0.000000  0.000000
 0.166667  0.000000  0.500000  0.333333
 0.000000  0.500000  0.333333  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.166667  0.500000  0.333333
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  0.166667  0.833333
 0.333333  0.666667  0.000000  0.000000
 0.166667  0.000000  0.500000  0.333333
 0.166667  0.166667  0.000000  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
A[AG]G[AT][CT]C[GT][CG]G[GT]T[CT]T[CA][GT]TTGTC[AC]
--------------------------------------------------------------------------------

Time 0.43 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 14 sites = 7 llr = 84 E-value = 2.7e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::17:a::39971:
pos.-specific     C  9:::9:1:::1143
probability       G  ::6:::9171::3:
matrix            T  1a331::9:::117

         bits    2.1
                 1.9  *   *
                 1.7  *   *
                 1.5 **  ***
Relative         1.3 **  *******
Entropy          1.1 ** ********  *
(17.2 bits)      0.9 ** ********* *
                 0.6 ************ *
                 0.4 ************ *
                 0.2 **************
                 0.0 --------------

Multilevel           CTGACAGTGAAACT
consensus              TT    A   GC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
34638                       365  8.33e-09 TTGACATAAA CTGACAGTGAAAGT GAGTCACAGT
47995                        82  5.14e-07 TTGACACCGA CTGACACTGAAAGC CAAGAGAATC
35504                       315  1.98e-06 ATATTTGTCC TTGTCAGTGAAATT GGTATAGCTA
12535                       427  2.38e-06 GTTGAACTTC CTGACAGGGAACAT TGGAAATAGA
42790                       172  2.84e-06 CTAATGTTAA CTTTCAGTAAATCT AAGAAAATCG
42708                       331  2.96e-06 CGAAGTCATG CTAATAGTAAAACT AACAAATCAC
14688                        27  3.85e-06 CGCAGCATCT CTTACAGTGGCACC CGCATCAGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34638                             8.3e-09  364_[+2]_122
47995                             5.1e-07  81_[+2]_405
35504                               2e-06  314_[+2]_172
12535                             2.4e-06  426_[+2]_60
42790                             2.8e-06  171_[+2]_315
42708                               3e-06  330_[+2]_156
14688                             3.8e-06  26_[+2]_460
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=7
34638                    (  365) CTGACAGTGAAAGT  1
47995                    (   82) CTGACACTGAAAGC  1
35504                    (  315) TTGTCAGTGAAATT  1
12535                    (  427) CTGACAGGGAACAT  1
42790                    (  172) CTTTCAGTAAATCT  1
42708                    (  331) CTAATAGTAAAACT  1
14688                    (   27) CTTACAGTGGCACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 3409 bayes= 8.92481 E= 2.7e+002
  -945    191   -945    -93
  -945   -945   -945    188
   -92   -945    132      7
   140   -945   -945      7
  -945    191   -945    -93
   189   -945   -945   -945
  -945    -68    190   -945
  -945   -945    -68    165
     8   -945    164   -945
   167   -945    -68   -945
   167    -68   -945   -945
   140    -68   -945    -93
   -92     91     32    -93
  -945     32   -945    139
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 7 E= 2.7e+002
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.000000  0.571429  0.285714
 0.714286  0.000000  0.000000  0.285714
 0.000000  0.857143  0.000000  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.285714  0.000000  0.714286  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.714286  0.142857  0.000000  0.142857
 0.142857  0.428571  0.285714  0.142857
 0.000000  0.285714  0.000000  0.714286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
CT[GT][AT]CAGT[GA]AAA[CG][TC]
--------------------------------------------------------------------------------

Time 0.86 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 14 sites = 4 llr = 62 E-value = 4.6e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::8:a:8::::3::
pos.-specific     C  :::a:3:::3a:::
probability       G  aa:::3::a5::::
matrix            T  ::3::53a:3:8aa

         bits    2.1 ** *    * *
                 1.9 ** **  ** * **
                 1.7 ** **  ** * **
                 1.5 ** **  ** * **
Relative         1.3 ** **  ** * **
Entropy          1.1 ***** *** ****
(22.4 bits)      0.9 ***** *** ****
                 0.6 ***** ********
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           GGACATATGGCTTT
consensus              T  CT  C A
sequence                  G   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
47995                       356  1.13e-08 ACATTATTAT GGACAGATGGCTTT CGATGGACCT
42790                        30  1.56e-08 CAGATTATAA GGACATATGCCTTT TAGCGAACAA
12535                       191  7.03e-08 TATTAGAAGG GGACACATGGCATT ACTCGCATCG
35504                       206  2.07e-07 GGACAAAAAT GGTCATTTGTCTTT GAATATTATA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47995                             1.1e-08  355_[+3]_131
42790                             1.6e-08  29_[+3]_457
12535                               7e-08  190_[+3]_296
35504                             2.1e-07  205_[+3]_281
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=14 seqs=4
47995                    (  356) GGACAGATGGCTTT  1
42790                    (   30) GGACATATGCCTTT  1
12535                    (  191) GGACACATGGCATT  1
35504                    (  206) GGTCATTTGTCTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 3409 bayes= 9.73344 E= 4.6e+002
  -865   -865    212   -865
  -865   -865    212   -865
   147   -865   -865    -12
  -865    213   -865   -865
   189   -865   -865   -865
  -865     13     13     88
   147   -865   -865    -12
  -865   -865   -865    187
  -865   -865    212   -865
  -865     13    112    -12
  -865    213   -865   -865
   -11   -865   -865    146
  -865   -865   -865    187
  -865   -865   -865    187
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 4 E= 4.6e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.250000  0.250000  0.500000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
GG[AT]CA[TCG][AT]TG[GCT]C[TA]TT
--------------------------------------------------------------------------------

Time 1.29 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42708                            7.56e-08  330_[+2(2.96e-06)]_102_\
                                           [+1(3.58e-09)]_33
14688                            1.76e-07  26_[+2(3.85e-06)]_56_[+1(1.03e-09)]_\
                                           383
12535                            5.90e-06  190_[+3(7.03e-08)]_222_\
                                           [+2(2.38e-06)]_60
35504                            6.52e-10  86_[+1(3.52e-08)]_98_[+3(2.07e-07)]_\
                                           95_[+2(1.98e-06)]_172
42790                            1.79e-11  29_[+3(1.56e-08)]_128_\
                                           [+2(2.84e-06)]_171_[+1(6.93e-09)]_123
47995                            2.47e-13  81_[+2(5.14e-07)]_260_\
                                           [+3(1.13e-08)]_76_[+1(5.58e-10)]_34
34638                            2.20e-08  233_[+1(8.69e-08)]_110_\
                                           [+2(8.33e-09)]_122
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************