********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/61/61.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
30335                    1.0000    500  37880                    1.0000    500
37959                    1.0000    500  bd1112                   1.0000     98
ThpsCp089                1.0000    500  ThpsCp126                1.0000    500
ThpsCs001                1.0000    500  ThpsCt006                1.0000    500
ThpsCt007                1.0000    500  ThpsCt018                1.0000    500
ThpsCt025                1.0000    500  ThpsCt027                1.0000    500
ThpsCt031                1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/61/61.seqs.fa -oc motifs/61 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 13 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 6098 N= 13 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.336 C 0.158 G 0.156 T 0.350 Background letter frequencies (from dataset with add-one prior applied): A 0.336 C 0.158 G 0.156 T 0.350 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 21 sites = 7 llr = 139 E-value = 6.2e-008 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 9::::336:::1a::67:71: pos.-specific C :17:::3:::67:::136::a probability G 13:aa741:::1:963::::: matrix T :63::::3aa4::14::439: bits 2.7 ** * 2.4 ** * 2.1 ** * 1.9 ** * * Relative 1.6 *** ** ** * Entropy 1.3 **** ** *** * (28.7 bits) 1.1 * **** ******* ** * 0.8 * ***** ******* ***** 0.5 ******* ************* 0.3 ********************* 0.0 --------------------- Multilevel ATCGGGGATTCCAGGAACATC consensus GT AAT T TGCTT sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- ThpsCt031 39 6.79e-12 
CTAATGGAAA AGTGGGCATTCCAGGAACATC TGGATAACCC ThpsCt018 37 6.79e-12 CTAATGGAAA AGTGGGCATTCCAGGAACATC TGGATAACCC ThpsCt007 132 1.26e-10 TTCTTATCGT ATCGGGGTTTTCAGGGCTTTC TATTCCGAAG 30335 364 6.13e-10 AAGTCACGCC ACCGGGGTTTCAAGGGACAAC GAGAGCGCAC ThpsCt027 236 1.37e-09 AATTAAATGC ATCGGAAATTTCAGTAATATC TCGAGGTAAA ThpsCp126 168 1.37e-09 AATTAAATGC ATCGGAAATTTCAGTAATATC TCGAGGTAAA ThpsCs001 418 7.75e-09 GATTCTGAAA GTCGGGGGTTCGATTCCCTTC ATGCTTATTG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- ThpsCt031 6.8e-12 38_[+1]_441 ThpsCt018 6.8e-12 36_[+1]_443 ThpsCt007 1.3e-10 131_[+1]_348 30335 6.1e-10 363_[+1]_116 ThpsCt027 1.4e-09 235_[+1]_244 ThpsCp126 1.4e-09 167_[+1]_312 ThpsCs001 7.8e-09 417_[+1]_62 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=7 ThpsCt031 ( 39) AGTGGGCATTCCAGGAACATC 1 ThpsCt018 ( 37) AGTGGGCATTCCAGGAACATC 1 ThpsCt007 ( 132) ATCGGGGTTTTCAGGGCTTTC 1 30335 ( 364) ACCGGGGTTTCAAGGGACAAC 1 ThpsCt027 ( 236) ATCGGAAATTTCAGTAATATC 1 ThpsCp126 ( 168) ATCGGAAATTTCAGTAATATC 1 ThpsCs001 ( 418) GTCGGGGGTTCGATTCCCTTC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 5838 bayes= 9.54586 E= 6.2e-008 135 -945 -13 -945 -945 -15 87 71 -945 217 -945 -29 -945 -945 
268 -945 -945 -945 268 -945 -23 -945 219 -945 -23 85 146 -945 77 -945 -13 -29 -945 -945 -945 151 -945 -945 -945 151 -945 185 -945 29 -123 217 -13 -945 157 -945 -945 -945 -945 -945 246 -129 -945 -945 187 29 77 -15 87 -945 109 85 -945 -945 -945 185 -945 29 109 -945 -945 -29 -123 -945 -945 129 -945 266 -945 -945 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 6.2e-008 0.857143 0.000000 0.142857 0.000000 0.000000 0.142857 0.285714 0.571429 0.000000 0.714286 0.000000 0.285714 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.285714 0.000000 0.714286 0.000000 0.285714 0.285714 0.428571 0.000000 0.571429 0.000000 0.142857 0.285714 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.571429 0.000000 0.428571 0.142857 0.714286 0.142857 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.857143 0.142857 0.000000 0.000000 0.571429 0.428571 0.571429 0.142857 0.285714 0.000000 0.714286 0.285714 0.000000 0.000000 0.000000 0.571429 0.000000 0.428571 0.714286 0.000000 0.000000 0.285714 0.142857 0.000000 0.000000 0.857143 0.000000 1.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- A[TG][CT]GG[GA][GAC][AT]TT[CT]CAG[GT][AG][AC][CT][AT]TC -------------------------------------------------------------------------------- Time 1.16 secs. 
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 19 sites = 5 llr = 101 E-value = 6.3e-004 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 22:466::a:::6:82::: pos.-specific C 8:2:4::::282462::68 probability G ::86:4a8:228:::8a:2 matrix T :8:::::2:6:::4:::4: bits 2.7 * * 2.4 * * 2.1 * * 1.9 * * ** * * Relative 1.6 * * *** ** ** * Entropy 1.3 * ** *** ** * **** (29.2 bits) 1.1 * ******* ********* 0.8 ********* ********* 0.5 ******************* 0.3 ******************* 0.0 ------------------- Multilevel CTGGAAGGATCGACAGGCC consensus AACACG T CGCCTCA TG sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------- 37880 165 1.69e-11 AGTTGGATTT CTGGCAGGAGCGACCGGTC TCACACTCAG bd1112 69 4.09e-11 ATTCTAAAAA CTGAAAGGATCGATAGGCC ACGGTTTCCC 37959 471 4.09e-11 ATTCTAAAAA CTGAAAGGATCGATAGGCC ACGGTTTCCC 30335 199 1.53e-09 CGCAACATCA AACGAGGGACCCCCAGGCC CAACCCTCAG ThpsCt025 352 4.06e-09 AACAGTATTT CTGGCGGTATGGCCAAGTG GTAAGGCAGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 37880 1.7e-11 164_[+2]_317 bd1112 
4.1e-11 68_[+2]_11 37959 4.1e-11 470_[+2]_11 30335 1.5e-09 198_[+2]_283 ThpsCt025 4.1e-09 351_[+2]_130 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=19 seqs=5 37880 ( 165) CTGGCAGGAGCGACCGGTC 1 bd1112 ( 69) CTGAAAGGATCGATAGGCC 1 37959 ( 471) CTGAAAGGATCGATAGGCC 1 30335 ( 199) AACGAGGGACCCCCAGGCC 1 ThpsCt025 ( 352) CTGGCGGTATGGCCAAGTG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 19 n= 5864 bayes= 10.4462 E= 6.3e-004 -74 234 -897 -897 -74 -897 -897 119 -897 34 236 -897 25 -897 194 -897 84 134 -897 -897 84 -897 136 -897 -897 -897 268 -897 -897 -897 236 -81 157 -897 -897 -897 -897 34 36 78 -897 234 36 -897 -897 34 236 -897 84 134 -897 -897 -897 192 -897 19 125 34 -897 -897 -74 -897 236 -897 -897 -897 268 -897 -897 192 -897 19 -897 234 36 -897 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 19 nsites= 5 E= 6.3e-004 0.200000 0.800000 0.000000 0.000000 0.200000 0.000000 0.000000 0.800000 0.000000 0.200000 0.800000 0.000000 0.400000 0.000000 0.600000 0.000000 0.600000 0.400000 0.000000 0.000000 0.600000 0.000000 0.400000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.800000 0.200000 1.000000 0.000000 0.000000 0.000000 0.000000 0.200000 0.200000 0.600000 0.000000 0.800000 
0.200000 0.000000 0.000000 0.200000 0.800000 0.000000 0.600000 0.400000 0.000000 0.000000 0.000000 0.600000 0.000000 0.400000 0.800000 0.200000 0.000000 0.000000 0.200000 0.000000 0.800000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.600000 0.000000 0.400000 0.000000 0.800000 0.200000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- [CA][TA][GC][GA][AC][AG]G[GT]A[TCG][CG][GC][AC][CT][AC][GA]G[CT][CG] -------------------------------------------------------------------------------- Time 2.34 secs. ******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 20 sites = 6 llr = 111 E-value = 3.2e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :38:85:::35aa5:::a:: pos.-specific C 332:255732::::288::a probability G 73:7::5233:::3::::a: matrix T :::3:::2325::2822::: bits 2.7 ** 2.4 ** 2.1 ** 1.9 * ** ** Relative 1.6 * * ** ***** Entropy 1.3 * * ** ** ***** (26.7 bits) 1.1 * ****** ** ****** 0.8 ********* ** ****** 0.5 ********* ********** 0.3 ******************** 0.0 -------------------- Multilevel GAAGAACCCAAAAATCCAGC consensus CC T CG GGT G sequence G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- 
--------- -------------------- 37880 93 7.55e-11 TCTGGTGCCA GCAGCCGCGGTAATTCCAGC TCCAATAGCG 30335 146 2.09e-10 CCGTCCGAAG GAAGACCTGGAAAGTCCAGC GTGAGCCAGT ThpsCt031 61 4.72e-10 AGGAACATCT GGATAACCCAAAAATCCAGC GATGGAATCA ThpsCt018 59 4.72e-10 AGGAACATCT GGATAACCCAAAAATCCAGC GATGGAATCA 37959 276 9.10e-09 TAACCTGTCT CACGACGGTCTAAACCCAGC TCACGTTCCC ThpsCs001 22 3.36e-08 TGGTTTAGAT CCAGAAGCTTTAAGTTTAGC AACTCCACGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 37880 7.5e-11 92_[+3]_388 30335 2.1e-10 145_[+3]_335 ThpsCt031 4.7e-10 60_[+3]_420 ThpsCt018 4.7e-10 58_[+3]_422 37959 9.1e-09 275_[+3]_205 ThpsCs001 3.4e-08 21_[+3]_459 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=20 seqs=6 37880 ( 93) GCAGCCGCGGTAATTCCAGC 1 30335 ( 146) GAAGACCTGGAAAGTCCAGC 1 ThpsCt031 ( 61) GGATAACCCAAAAATCCAGC 1 ThpsCt018 ( 59) GGATAACCCAAAAATCCAGC 1 37959 ( 276) CACGACGGTCTAAACCCAGC 1 ThpsCs001 ( 22) CCAGAAGCTTTAAGTTTAGC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 5851 bayes= 10.3759 E= 3.2e-002 -923 107 209 -923 -1 107 109 -923 131 8 -923 -923 -923 -923 209 -7 131 8 -923 -923 57 166 -923 -923 -923 166 168 -923 -923 207 10 -107 -923 107 109 -7 -1 8 109 -107 57 -923 -923 51 157 
-923 -923 -923 157 -923 -923 -923 57 -923 109 -107 -923 8 -923 125 -923 240 -923 -107 -923 240 -923 -107 157 -923 -923 -923 -923 -923 268 -923 -923 266 -923 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 3.2e-002 0.000000 0.333333 0.666667 0.000000 0.333333 0.333333 0.333333 0.000000 0.833333 0.166667 0.000000 0.000000 0.000000 0.000000 0.666667 0.333333 0.833333 0.166667 0.000000 0.000000 0.500000 0.500000 0.000000 0.000000 0.000000 0.500000 0.500000 0.000000 0.000000 0.666667 0.166667 0.166667 0.000000 0.333333 0.333333 0.333333 0.333333 0.166667 0.333333 0.166667 0.500000 0.000000 0.000000 0.500000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.500000 0.000000 0.333333 0.166667 0.000000 0.166667 0.000000 0.833333 0.000000 0.833333 0.000000 0.166667 0.000000 0.833333 0.000000 0.166667 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [GC][ACG]A[GT]A[AC][CG]C[CGT][AG][AT]AA[AG]TCCAGC -------------------------------------------------------------------------------- Time 3.45 secs. 
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
30335                            2.34e-17  145_[+3(2.09e-10)]_33_\
    [+2(1.53e-09)]_146_[+1(6.13e-10)]_116
37880                            2.66e-14  92_[+3(7.55e-11)]_52_[+2(1.69e-11)]_\
    317
37959                            2.45e-11  275_[+3(9.10e-09)]_175_\
    [+2(4.09e-11)]_11
bd1112                           1.85e-08  68_[+2(4.09e-11)]_11
ThpsCp089                        4.61e-01  500
ThpsCp126                        1.60e-05  167_[+1(1.37e-09)]_287_\
    [+1(6.63e-06)]_4
ThpsCs001                        1.75e-08  21_[+3(3.36e-08)]_376_\
    [+1(7.75e-09)]_62
ThpsCt006                        2.90e-01  500
ThpsCt007                        9.18e-06  131_[+1(1.26e-10)]_348
ThpsCt018                        4.13e-13  36_[+1(6.79e-12)]_1_[+3(4.72e-10)]_\
    422
ThpsCt025                        6.87e-05  351_[+2(4.06e-09)]_130
ThpsCt027                        9.08e-06  235_[+1(1.37e-09)]_244
ThpsCt031                        4.13e-13  38_[+1(6.79e-12)]_1_[+3(4.72e-10)]_\
    420
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************