********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/81/81.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8744                     1.0000    500  31649                    1.0000    500
1358                     1.0000    500  43084                    1.0000    500
36600                    1.0000    500  21046                    1.0000    500
13237                    1.0000    500  47396                    1.0000    500
29223                    1.0000    500  18132                    1.0000    500
42276                    1.0000    500  23862                    1.0000    500
44063                    1.0000    500  10655                    1.0000    500
34303                    1.0000    500  19324                    1.0000    500
44922                    1.0000    500  11777                    1.0000    500
45473                    1.0000    500  12645                    1.0000    500
46201                    1.0000    500  47880                    1.0000    500
47061                    1.0000    500  44287                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/81/81.seqs.fa -oc motifs/81 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       24    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           12000    N=              24
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.240 G 0.224 T 0.271
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.241 G 0.224 T 0.271
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =  12  llr = 172  E-value = 1.5e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :33:7:3:3:213:8539:1:
pos.-specific     C  4:5:21:7721::7214::13
probability       G  5128:::2:::9:1:::18:4
matrix            T  16132982:88:73:43:283

         bits    2.2
                 1.9
                 1.7            *
                 1.5      *     *     **
Relative         1.3    * *   * *  *  **
Entropy          1.1    * ** ** ** *  ***
(20.6 bits)      0.9 *  * **********  ***
                 0.6 ** ************* ***
                 0.4 ** ******************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GTCGATTCCTTGTCAACAGTG
consensus            CAAT  A A   AT TT   C
sequence                             A   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
47061                       198  2.33e-10 TATAAAACCC GTCGATACCTTGACAAAAGTG TTTACGAAAC
31649                        84  7.20e-10 TGCATCGATG GAAGATTCCTTGACATCAGTT GTGTACGACA
23862                       420  1.78e-08 CGGAAAGGAG CTCTCTTCCTTGTCCACAGTT TATTCTTTAC
44063                       187  6.00e-08 ATTCTATATT CTCGTTTCCCTATCAATAGTG AGGCTTATCA
19324                       263  1.14e-07 TTGGATGTCG CTGTATTCACAGTCAACAGTC AAAGTTCGAT
47880                       263  1.24e-07 TATACGACGA GGGGTTTCATTGACATAAGTC GGTCTGTCGT
42276                       184  1.24e-07 TGGGGCAATC GACTCCTCATTGTCATTAGTG TTGTTGGGTG
34303                        98  1.47e-07 TGGTTCTTGT CTTGATATCTTGTCAACAGAG CGGACCGTTT
21046                       236  1.74e-07 GGTCCGTTTT TTCGATTGCTTGTTATTGGTG TTGTTCTATT
10655                       100  3.29e-07 AAAATTCTGA CTCGATTGATTGTGAAAAGCT ACCCTGAAAA
36600                       364  1.30e-06 TGGTTAATTT GAAGATTCCTAGTTCCTATTC GGATGGTGGA
43084                       439  1.75e-06 AATTGAGAGT GAAGATATCTCGATATCATTC TCTCTTTCTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47061                             2.3e-10  197_[+1]_282
31649                             7.2e-10  83_[+1]_396
23862                             1.8e-08  419_[+1]_60
44063                               6e-08  186_[+1]_293
19324                             1.1e-07  262_[+1]_217
47880                             1.2e-07  262_[+1]_217
42276                             1.2e-07  183_[+1]_296
34303                             1.5e-07  97_[+1]_382
21046                             1.7e-07  235_[+1]_244
10655                             3.3e-07  99_[+1]_380
36600                             1.3e-06  363_[+1]_116
43084                             1.8e-06  438_[+1]_41
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=12
47061                    (  198) GTCGATACCTTGACAAAAGTG  1
31649                    (   84) GAAGATTCCTTGACATCAGTT  1
23862                    (  420) CTCTCTTCCTTGTCCACAGTT  1
44063                    (  187) CTCGTTTCCCTATCAATAGTG  1
19324                    (  263) CTGTATTCACAGTCAACAGTC  1
47880                    (  263) GGGGTTTCATTGACATAAGTC  1
42276                    (  184) GACTCCTCATTGTCATTAGTG  1
34303                    (   98) CTTGATATCTTGTCAACAGAG  1
21046                    (  236) TTCGATTGCTTGTTATTGGTG  1
10655                    (  100) CTCGATTGATTGTGAAAAGCT  1
36600                    (  364) GAAGATTCCTAGTTCCTATTC  1
43084                    (  439) GAAGATATCTCGATATCATTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 11520 bayes= 10.3532 E= 1.5e+000
 -1023     79    116   -170
    33  -1023   -143    111
    -8    106    -43   -170
 -1023  -1023    174    -11
   133    -53  -1023    -70
 -1023   -153  -1023    176
    -8  -1023  -1023    147
 -1023    147    -43    -70
    33    147  -1023  -1023
 -1023    -53  -1023    162
   -67   -153  -1023    147
  -167  -1023    203  -1023
    33  -1023  -1023    130
 -1023    147   -143    -11
   165    -53  -1023  -1023
    92   -153  -1023     62
    -8     79  -1023     30
   179  -1023   -143  -1023
 -1023  -1023    189    -70
  -167   -153  -1023    162
 -1023     47     89    -11
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 12 E= 1.5e+000
 0.000000  0.416667  0.500000  0.083333
 0.333333  0.000000  0.083333  0.583333
 0.250000  0.500000  0.166667  0.083333
 0.000000  0.000000  0.750000  0.250000
 0.666667  0.166667  0.000000  0.166667
 0.000000  0.083333  0.000000  0.916667
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.666667  0.166667  0.166667
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.083333  0.000000  0.750000
 0.083333  0.000000  0.916667  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.666667  0.083333  0.250000
 0.833333  0.166667  0.000000  0.000000
 0.500000  0.083333  0.000000  0.416667
 0.250000  0.416667  0.000000  0.333333
 0.916667  0.000000  0.083333  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.083333  0.083333  0.000000  0.833333
 0.000000  0.333333  0.416667  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GC][TA][CA][GT]AT[TA]C[CA]TTG[TA][CT]A[AT][CTA]AGT[GCT]
--------------------------------------------------------------------------------

Time  5.05 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   9  llr = 133  E-value = 5.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3:167:::17a941:3:1:
pos.-specific     C  114:1:9::31::::36238
probability       G  8:1:13:1a32:11:71:6:
matrix            T  16492:19:2:::49::8:2

         bits    2.2         *
                 1.9         *  *
                 1.7         *  *
                 1.5       ***  **
Relative         1.3    *  ***  ** **   *
Entropy          1.1 *  * ****  ** ** * *
(21.4 bits)      0.9 *  * **** *** ** * *
                 0.6 **** **** *** ******
                 0.4 **** **** **********
                 0.2 ********************
                 0.0 --------------------

Multilevel           GTCTAACTGCAAAATGCTGC
consensus             AT TG   GG  T CACCT
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
19324                       331  5.01e-09 ATAGTAGTTG GTCTTACTGGAAATTCGTGC CGAACGGGTC
47061                       134  2.18e-08 CAGACTCTAC GTTTAACTGAAAAGTGACGC AGATGTCGTC
44063                       343  2.41e-08 AGAAAAAACG TACTAACTGGCAATTGCTGC CTGTCAAGAC
1358                        265  4.33e-08 ACTTCAGTAA GTTTAGTTGCAAGTTGCTGC CCTTAAAAGA
12645                        41  7.41e-08 CTAACAGTGT GAGTCACTGTGAAATGCTCC TTTGGACCAA
13237                       408  8.07e-08 AAACGATCAA GCCAAGCTGCAAAATGCTGT CCAGCTCTTA
44922                       129  1.13e-07 GCTGTTAGTT CATTAACTGGAAAAAGATCC TTCTACCGCT
44287                         6  2.24e-07      CTTTG GTCTTGCTGCAAAATCCCAT TGTCGACTCT
18132                       123  2.77e-07 GCCAGCTGTT GTTTGACGGTGAATTCATCC TTGTGCTCCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
19324                               5e-09  330_[+2]_150
47061                             2.2e-08  133_[+2]_347
44063                             2.4e-08  342_[+2]_138
1358                              4.3e-08  264_[+2]_216
12645                             7.4e-08  40_[+2]_440
13237                             8.1e-08  407_[+2]_73
44922                             1.1e-07  128_[+2]_352
44287                             2.2e-07  5_[+2]_475
18132                             2.8e-07  122_[+2]_358
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=9
19324                    (  331) GTCTTACTGGAAATTCGTGC  1
47061                    (  134) GTTTAACTGAAAAGTGACGC  1
44063                    (  343) TACTAACTGGCAATTGCTGC  1
1358                     (  265) GTTTAGTTGCAAGTTGCTGC  1
12645                    (   41) GAGTCACTGTGAAATGCTCC  1
13237                    (  408) GCCAAGCTGCAAAATGCTGT  1
44922                    (  129) CATTAACTGGAAAAAGATCC  1
44287                    (    6) GTCTTGCTGCAAAATCCCAT  1
18132                    (  123) GTTTGACGGTGAATTCATCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 11544 bayes= 9.62513 E= 5.9e+002
  -982   -111    179   -128
    33   -111   -982    104
  -982     89   -101     72
  -125   -982   -982    171
   107   -111   -101    -28
   133   -982     57   -982
  -982    188   -982   -128
  -982   -982   -101    171
  -982   -982    216   -982
  -125     47     57    -28
   133   -111     -1   -982
   192   -982   -982   -982
   175   -982   -101   -982
    75   -982   -101     72
  -125   -982   -982    171
  -982     47    157   -982
    33    121   -101   -982
  -982    -11   -982    152
  -125     47    131   -982
  -982    169   -982    -28
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 9 E= 5.9e+002
 0.000000  0.111111  0.777778  0.111111
 0.333333  0.111111  0.000000  0.555556
 0.000000  0.444444  0.111111  0.444444
 0.111111  0.000000  0.000000  0.888889
 0.555556  0.111111  0.111111  0.222222
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.888889  0.000000  0.111111
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.333333  0.333333  0.222222
 0.666667  0.111111  0.222222  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.444444  0.000000  0.111111  0.444444
 0.111111  0.000000  0.000000  0.888889
 0.000000  0.333333  0.666667  0.000000
 0.333333  0.555556  0.111111  0.000000
 0.000000  0.222222  0.000000  0.777778
 0.111111  0.333333  0.555556  0.000000
 0.000000  0.777778  0.000000  0.222222
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[TA][CT]T[AT][AG]CTG[CGT][AG]AA[AT]T[GC][CA][TC][GC][CT]
--------------------------------------------------------------------------------

Time  9.93 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  21  sites =   6  llr = 106  E-value = 7.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  8::2:5:::2227222:2:::
pos.-specific     C  :2:22::8:25:3:::a::8:
probability       G  ::a78:a:7522:528:782a
matrix            T  28:::5:23227:37::22::

         bits    2.2   *   *         *   *
                 1.9   *   *         *   *
                 1.7   *   *         *   *
                 1.5   * * *        ** ***
Relative         1.3 *** * **       ** ***
Entropy          1.1 *** * ***   *  ** ***
(25.5 bits)      0.9 *********   *  ******
                 0.6 *********  **********
                 0.4 *********  **********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           ATGGGAGCGGCTAGTGCGGCG
consensus                 T  T   CT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
18132                       255  6.02e-11 AAAATGAACC ATGGCAGCGCCTAGTGCGGCG TGGATACGGC
1358                        453  6.28e-10 AGGAGTGGAG ATGCGTGCTGCTATTGCTGCG AAAGAGGCGG
36600                        33  2.87e-09 ATAAGCGACA ATGGGAGCGGAGAGTACAGCG TTGGACAATC
13237                       121  1.17e-08 GTCCGCCTGG TTGGGTGCGTGTCAAGCGGCG GAAGCGGTAC
29223                       136  1.44e-08 TACCGACACA ACGAGAGCTACTCGTGCGGGG GAAGCCTGCC
31649                       395  2.01e-08 TTGCCCAAGC ATGGGTGTGGTAATGGCGTCG GCCATGTTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
18132                               6e-11  254_[+3]_225
1358                              6.3e-10  452_[+3]_27
36600                             2.9e-09  32_[+3]_447
13237                             1.2e-08  120_[+3]_359
29223                             1.4e-08  135_[+3]_344
31649                               2e-08  394_[+3]_85
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=6
18132                    (  255) ATGGCAGCGCCTAGTGCGGCG  1
1358                     (  453) ATGCGTGCTGCTATTGCTGCG  1
36600                    (   33) ATGGGAGCGGAGAGTACAGCG  1
13237                    (  121) TTGGGTGCGTGTCAAGCGGCG  1
29223                    (  136) ACGAGAGCTACTCGTGCGGGG  1
31649                    (  395) ATGGGTGTGGTAATGGCGTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 11520 bayes= 11.3538 E= 7.2e+002
   165   -923   -923    -70
  -923    -53   -923    162
  -923   -923    216   -923
   -67    -53    157   -923
  -923    -53    189   -923
    92   -923   -923     88
  -923   -923    216   -923
  -923    179   -923    -70
  -923   -923    157     30
   -67    -53    116    -70
   -67    105    -43    -70
   -67   -923    -43    130
   133     47   -923   -923
   -67   -923    116     30
   -67   -923    -43    130
   -67   -923    189   -923
  -923    205   -923   -923
   -67   -923    157    -70
  -923   -923    189    -70
  -923    179    -43   -923
  -923   -923    216   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 7.2e+002
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.166667  0.666667  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.000000  0.666667  0.333333
 0.166667  0.166667  0.500000  0.166667
 0.166667  0.500000  0.166667  0.166667
 0.166667  0.000000  0.166667  0.666667
 0.666667  0.333333  0.000000  0.000000
 0.166667  0.000000  0.500000  0.333333
 0.166667  0.000000  0.166667  0.666667
 0.166667  0.000000  0.833333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.000000  0.666667  0.166667
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
ATGGG[AT]GC[GT]GCT[AC][GT]TGCGGCG
--------------------------------------------------------------------------------

Time 14.61 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8744                             5.52e-01  500
31649                            8.74e-11  83_[+1(7.20e-10)]_290_\
    [+3(2.01e-08)]_85
1358                             1.21e-09  264_[+2(4.33e-08)]_168_\
    [+3(6.28e-10)]_27
43084                            8.75e-03  438_[+1(1.75e-06)]_41
36600                            1.47e-07  32_[+3(2.87e-09)]_310_\
    [+1(1.30e-06)]_116
21046                            2.52e-04  235_[+1(1.74e-07)]_244
13237                            5.88e-08  120_[+3(1.17e-08)]_96_\
    [+2(8.55e-05)]_150_[+2(8.07e-08)]_73
47396                            2.41e-01  228_[+2(9.06e-05)]_99_\
    [+2(7.60e-05)]_133
29223                            1.44e-04  135_[+3(1.44e-08)]_344
18132                            1.03e-09  122_[+2(2.77e-07)]_112_\
    [+3(6.02e-11)]_225
42276                            1.13e-03  183_[+1(1.24e-07)]_296
23862                            1.43e-04  419_[+1(1.78e-08)]_60
44063                            8.65e-08  186_[+1(6.00e-08)]_135_\
    [+2(2.41e-08)]_138
10655                            9.04e-06  99_[+1(3.29e-07)]_380
34303                            3.04e-05  29_[+2(9.32e-05)]_48_[+1(1.47e-07)]_\
    382
19324                            9.76e-09  262_[+1(1.14e-07)]_47_\
    [+2(5.01e-09)]_150
44922                            1.48e-03  128_[+2(1.13e-07)]_352
11777                            7.01e-01  500
45473                            3.97e-01  500
12645                            4.34e-04  40_[+2(7.41e-08)]_440
46201                            6.11e-01  500
47880                            1.17e-03  91_[+1(1.75e-06)]_150_\
    [+1(1.24e-07)]_217
47061                            2.39e-10  133_[+2(2.18e-08)]_44_\
    [+1(2.33e-10)]_282
44287                            7.74e-04  5_[+2(2.24e-07)]_139_[+2(9.06e-05)]_\
    316
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************