********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/208/208.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42609                    1.0000    500  42729                    1.0000    500
43510                    1.0000    500  43779                    1.0000    500
48916                    1.0000    500  40101                    1.0000    500
44001                    1.0000    500  44695                    1.0000    500
45319                    1.0000    500  19982                    1.0000    500
46059                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/208/208.seqs.fa -oc motifs/208 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.278 C 0.226 G 0.229 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.278 C 0.226 G 0.229 T 0.267
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  12  sites =   9  llr = 97  E-value = 1.3e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a::12::1:21:
pos.-specific     C  :1::413:6:::
probability       G  :6:429794::a
matrix            T  :3a41::::89:

         bits    2.1            *
                 1.9 * *        *
                 1.7 * *  *     *
                 1.5 * *  * *   *
Relative         1.3 * *  ***  **
Entropy          1.1 * *  *******
(15.5 bits)      0.9 * *  *******
                 0.6 **** *******
                 0.4 **** *******
                 0.2 ************
                 0.0 ------------

Multilevel           AGTGCGGGCTTG
consensus             T TA C GA
sequence                 G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
46059                       127  4.37e-07 ACACTGAAAT AGTTCGCGCTTG GCAGACGTCA
43510                       150  1.69e-06 GATGTGTGGG ACTGCGGGCTTG TTGCTAACTA
44695                        59  1.75e-06 GATAACTCTA AGTTTGGGCTTG GAATTCTGAT
19982                       228  2.35e-06 TCACGAAATG AGTACGGGGTTG CCGCCACCAG
45319                       104  4.49e-06 GACAGACGAC AGTGCGGGGTAG GTCCCACGAC
43779                       339  9.39e-06 CCATGCTTCG AGTGAGCGGATG AGCCCTTGAG
44001                       247  1.05e-05 TCCATAATTT ATTGGGCGCATG TACTATTTAT
42609                        23  1.57e-05 ATTATACCCA ATTTGGGACTTG ACCAAAGCAT
40101                        83  1.89e-05 CAATATCTTC ATTTACGGGTTG ATCCCAATCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46059                             4.4e-07  126_[+1]_362
43510                             1.7e-06  149_[+1]_339
44695                             1.7e-06  58_[+1]_430
19982                             2.4e-06  227_[+1]_261
45319                             4.5e-06  103_[+1]_385
43779                             9.4e-06  338_[+1]_150
44001                             1.1e-05  246_[+1]_242
42609                             1.6e-05  22_[+1]_466
40101                             1.9e-05  82_[+1]_406
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
46059                    (  127) AGTTCGCGCTTG  1
43510                    (  150) ACTGCGGGCTTG  1
44695                    (   59) AGTTTGGGCTTG  1
19982                    (  228) AGTACGGGGTTG  1
45319                    (  104) AGTGCGGGGTAG  1
43779                    (  339) AGTGAGCGGATG  1
44001                    (  247) ATTGGGCGCATG  1
42609                    (   23) ATTTGGGACTTG  1
40101                    (   83) ATTTACGGGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 9.3553 E= 1.3e+002
   185   -982   -982   -982
  -982   -102    128     32
  -982   -982   -982    190
  -132   -982     96     73
   -32     98     -4   -126
  -982   -102    196   -982
  -982     56    154   -982
  -132   -982    196   -982
  -982    130     96   -982
   -32   -982   -982    154
  -132   -982   -982    173
  -982   -982    213   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 1.3e+002
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.111111  0.555556  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.000000  0.444444  0.444444
 0.222222  0.444444  0.222222  0.111111
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.000000  0.555556  0.444444  0.000000
 0.222222  0.000000  0.000000  0.777778
 0.111111  0.000000  0.000000  0.888889
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
A[GT]T[GT][CAG]G[GC]G[CG][TA]TG
--------------------------------------------------------------------------------

Time 1.10 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  18  sites =   4  llr = 75  E-value = 1.8e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3:33a3::8::::3a::a
pos.-specific     C  8:3::5a5:aa::::3a:
probability       G  :a5::::33::aa3::::
matrix            T  :::8:3:3:::::5:8::

         bits    2.1  *    *  ****   *
                 1.9  *  * *  **** * **
                 1.7  *  * *  **** * **
                 1.5  *  * *  **** * **
Relative         1.3 **  * *  **** * **
Entropy          1.1 ** ** * ***** ****
(27.2 bits)      0.9 ** ** * ***** ****
                 0.6 ***** ******* ****
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           CGGTACCCACCGGTATCA
consensus            A AA A GG    A C
sequence               C  T T     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ------------------
45319                       146  1.07e-10 TACCTGTATA CGGTACCTACCGGTATCA CTCCGCACAA
44695                       441  1.53e-09 GCATCGGAGG CGATATCGACCGGTATCA TAGTTTATGT
40101                       244  3.93e-09 TCGTTTCTGA AGCTACCCGCCGGGATCA TGGCGGCGGC
42729                       290  4.97e-09 TGATGCCATA CGGAAACCACCGGAACCA CTAACTAGCA
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45319                             1.1e-10  145_[+2]_337
44695                             1.5e-09  440_[+2]_42
40101                             3.9e-09  243_[+2]_239
42729                               5e-09  289_[+2]_193
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=4
45319                    (  146) CGGTACCTACCGGTATCA  1
44695                    (  441) CGATATCGACCGGTATCA  1
40101                    (  244) AGCTACCCGCCGGGATCA  1
42729                    (  290) CGGAAACCACCGGAACCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 5313 bayes= 10.3742 E= 1.8e+002
   -15    173   -865   -865
  -865   -865    213   -865
   -15     15    113   -865
   -15   -865   -865    149
   184   -865   -865   -865
   -15    114   -865    -10
  -865    214   -865   -865
  -865    114     13    -10
   143   -865     13   -865
  -865    214   -865   -865
  -865    214   -865   -865
  -865   -865    213   -865
  -865   -865    213   -865
   -15   -865     13     90
   184   -865   -865   -865
  -865     15   -865    149
  -865    214   -865   -865
   184   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 4 E= 1.8e+002
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.250000  0.500000  0.000000
 0.250000  0.000000  0.000000  0.750000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.500000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.750000  0.000000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.250000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[CA]G[GAC][TA]A[CAT]C[CGT][AG]CCGG[TAG]A[TC]CA
--------------------------------------------------------------------------------

Time 2.22 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  16  sites =   2  llr = 42  E-value = 6.8e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::5::a::5:::::
pos.-specific     C  ::aa:aa:aa:aa5aa
probability       G  :a::::::::5::5::
matrix            T  a:::5:::::::::::

         bits    2.1  *** ** ** ** **
                 1.9 **** ***** ** **
                 1.7 **** ***** ** **
                 1.5 **** ***** ** **
Relative         1.3 **** ***** ** **
Entropy          1.1 **** ***********
(30.3 bits)      0.9 ****************
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TGCCACCACCACCCCC
consensus                T     G  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ----------------
19982                       367  3.36e-10 CCGGGTCTAA TGCCTCCACCGCCGCC AGGAATGTAC
40101                       347  7.43e-10 CCACAATATC TGCCACCACCACCCCC ACCAACACCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
19982                             3.4e-10  366_[+3]_118
40101                             7.4e-10  346_[+3]_138
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=2
19982                    (  367) TGCCTCCACCGCCGCC  1
40101                    (  347) TGCCACCACCACCCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5335 bayes= 11.3807 E= 6.8e+002
  -765   -765   -765    190
  -765   -765    212   -765
  -765    214   -765   -765
  -765    214   -765   -765
    84   -765   -765     90
  -765    214   -765   -765
  -765    214   -765   -765
   184   -765   -765   -765
  -765    214   -765   -765
  -765    214   -765   -765
    84   -765    112   -765
  -765    214   -765   -765
  -765    214   -765   -765
  -765    114    112   -765
  -765    214   -765   -765
  -765    214   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 2 E= 6.8e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
TGCC[AT]CCACC[AG]CC[CG]CC
--------------------------------------------------------------------------------

Time 3.23 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42609                            5.31e-02  22_[+1(1.57e-05)]_466
42729                            5.52e-05  289_[+2(4.97e-09)]_193
43510                            4.67e-03  149_[+1(1.69e-06)]_339
43779                            4.17e-02  338_[+1(9.39e-06)]_150
48916                            9.42e-01  500
40101                            3.57e-12  82_[+1(1.89e-05)]_149_\
    [+2(3.93e-09)]_85_[+3(7.43e-10)]_138
44001                            2.71e-02  246_[+1(1.05e-05)]_242
44695                            8.42e-08  58_[+1(1.75e-06)]_370_\
    [+2(1.53e-09)]_42
45319                            8.52e-09  103_[+1(4.49e-06)]_30_\
    [+2(1.07e-10)]_337
19982                            3.52e-09  165_[+3(7.01e-06)]_46_\
    [+1(2.35e-06)]_67_[+3(1.81e-05)]_44_[+3(3.36e-10)]_45_[+3(1.04e-05)]_57
46059                            7.68e-03  126_[+1(4.37e-07)]_362
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
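
Note on the "Relative Entropy" figures reported in the motif descriptions.
The sketch below is not part of the MEME output. It assumes the reported
value is the sum over motif positions of p * log2(p / b), where p is the
letter probability from the motif's letter-probability matrix and b is the
background letter frequency from the COMMAND LINE SUMMARY, with zero
probabilities contributing nothing. The function name relative_entropy and
the constant names are hypothetical, chosen for illustration only. Under
that assumption, the Motif 1 matrix gives a value close to the 15.5 bits
shown in its description.

```python
import math

# Background letter frequencies from the report, in the order A, C, G, T.
BACKGROUND = [0.278, 0.226, 0.229, 0.267]

# Motif 1 letter-probability matrix copied from the report
# (one row per motif position, columns in the order A, C, G, T).
MOTIF_1 = [
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.000000, 0.111111, 0.555556, 0.333333],
    [0.000000, 0.000000, 0.000000, 1.000000],
    [0.111111, 0.000000, 0.444444, 0.444444],
    [0.222222, 0.444444, 0.222222, 0.111111],
    [0.000000, 0.111111, 0.888889, 0.000000],
    [0.000000, 0.333333, 0.666667, 0.000000],
    [0.111111, 0.000000, 0.888889, 0.000000],
    [0.000000, 0.555556, 0.444444, 0.000000],
    [0.222222, 0.000000, 0.000000, 0.777778],
    [0.111111, 0.000000, 0.000000, 0.888889],
    [0.000000, 0.000000, 1.000000, 0.000000],
]


def relative_entropy(matrix, background):
    """Sum p * log2(p / b) over all positions and letters (assumed formula)."""
    total = 0.0
    for row in matrix:
        for p, b in zip(row, background):
            if p > 0.0:  # a zero probability contributes nothing to the sum
                total += p * math.log2(p / b)
    return total


bits = relative_entropy(MOTIF_1, BACKGROUND)
print(f"Motif 1 relative entropy: {bits:.2f} bits")
```

The same function can be applied to the Motif 2 and Motif 3 matrices to
compare against the 27.2 and 30.3 bits shown in their descriptions. Small
differences are possible because the report rounds the background
frequencies to three decimals.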