********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/482/482.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
48704                    1.0000    500  2215                     1.0000    500
18180                    1.0000    500  39601                    1.0000    500
49027                    1.0000    500  49701                    1.0000    500
44339                    1.0000    500  44641                    1.0000    500
34307                    1.0000    500  34715                    1.0000    500
11710                    1.0000    500  35419                    1.0000    500
32745                    1.0000    500  40515                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/482/482.seqs.fa -oc motifs/482 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.276 C 0.247 G 0.229 T 0.247
Background letter frequencies (from dataset with add-one prior applied):
A 0.276 C 0.247 G 0.229 T 0.247
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 12 llr = 123 E-value = 1.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  33:91a3::987
pos.-specific     C  ::a:5:::9::3
probability       G  13::1:71:1::
matrix            T  75:13:191:2:

         bits    2.1
                 1.9   *  *
                 1.7   *  * **
                 1.5   ** * ***
Relative         1.3   ** * ****
Entropy          1.1   ** * *****
(14.8 bits)      0.8 * ** *******
                 0.6 * ** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTCACAGTCAAA
consensus            AA  T A    C
sequence              G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value             Site
-------------             ----- ---------            ------------
49701                       458  3.18e-07 AGTCGTAACT TTCACAGTCAAC ACAGCGCTCG
2215                        199  3.18e-07 GACGATACTA TGCACAGTCAAA TTGACAATCA
49027                       258  4.11e-07 GTGAGTATCC TACACAGTCAAA AACTCTTCCT
35419                       370  1.59e-06 TCTCTATTTA ATCACAGTCAAC AAAAGCTACC
44641                        74  4.15e-06 TCTACCAGTT TGCAAAGTCAAA TATATTATCT
11710                        61  4.95e-06 ATATTTCAAA TTCATAGTCGAA CTGGAACGCT
34307                       213  1.41e-05 TCAACCGGTA TTCACATTCATA GATCAAAGCA
44339                       274  1.50e-05 TGGCGTTGAC TACAGAATCAAA AACAGCTAGG
39601                       326  1.50e-05 TGGTCATGGG GTCACAATCAAC GTGTCAGAAA
40515                        91  3.25e-05 AAGCCGAATG AACATAGGCAAA CAGTGTCGTG
32745                        60  3.85e-05 TGTCTTTGTG TGCATAGTTATA GAGGGTTGTC
18180                       264  5.95e-05 CCGAAATATC ATCTTAATCAAC AAGTATCAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49701                             3.2e-07  457_[+1]_31
2215                              3.2e-07  198_[+1]_290
49027                             4.1e-07  257_[+1]_231
35419                             1.6e-06  369_[+1]_119
44641                             4.2e-06  73_[+1]_415
11710                               5e-06  60_[+1]_428
34307                             1.4e-05  212_[+1]_276
44339                             1.5e-05  273_[+1]_215
39601                             1.5e-05  325_[+1]_163
40515                             3.3e-05  90_[+1]_398
32745                             3.9e-05  59_[+1]_429
18180                             5.9e-05  263_[+1]_225
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=12
49701                    (  458) TTCACAGTCAAC  1
2215                     (  199) TGCACAGTCAAA  1
49027                    (  258) TACACAGTCAAA  1
35419                    (  370) ATCACAGTCAAC  1
44641                    (   74) TGCAAAGTCAAA  1
11710                    (   61) TTCATAGTCGAA  1
34307                    (  213) TTCACATTCATA  1
44339                    (  274) TACAGAATCAAA  1
39601                    (  326) GTCACAATCAAC  1
40515                    (   91) AACATAGGCAAA  1
32745                    (   60) TGCATAGTTATA  1
18180                    (  264) ATCTTAATCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.60169 E= 1.5e+001
   -14  -1023   -146    143
   -14  -1023     12    101
 -1023    201  -1023  -1023
   173  -1023  -1023   -157
  -172    101   -146     43
   186  -1023  -1023  -1023
   -14  -1023    154   -157
 -1023  -1023   -146    189
 -1023    189  -1023   -157
   173  -1023   -146  -1023
   159  -1023  -1023    -57
   127     43  -1023  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 1.5e+001
 0.250000  0.000000  0.083333  0.666667
 0.250000  0.000000  0.250000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.916667  0.000000  0.000000  0.083333
 0.083333  0.500000  0.083333  0.333333
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.666667  0.083333
 0.000000  0.000000  0.083333  0.916667
 0.000000  0.916667  0.000000  0.083333
 0.916667  0.000000  0.083333  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.666667  0.333333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TA][TAG]CA[CT]A[GA]TCAA[AC]
--------------------------------------------------------------------------------

Time 1.73 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 14 llr = 132 E-value = 3.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::3:6:431:::
pos.-specific     C  93:61a6:32:9
probability       G  :7:2::112::1
matrix            T  1:712::648a:

         bits    2.1
                 1.9      *    *
                 1.7 *    *    *
                 1.5 *    *    **
Relative         1.3 **   *   ***
Entropy          1.1 ***  *   ***
(13.6 bits)      0.8 **** *   ***
                 0.6 ******** ***
                 0.4 ******** ***
                 0.2 ************
                 0.0 ------------

Multilevel           CGTCACCTTTTC
consensus             CAGT AACC
sequence                     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value             Site
-------------             ----- ---------            ------------
18180                       439  1.07e-07 GTCGAATCGA CGTCACCTCTTC CGAAGGAAAG
2215                         95  2.16e-07 AAGTTGAATC CGTCACCTGTTC TCCGTCCGGA
34715                       425  3.93e-07 AGCAGAGGTC CCTCACCTTTTC AAGAGGGATC
49701                       281  3.59e-06 ACCGCCAACA CGTCACCATCTC CGTCGCCCAC
35419                       427  1.58e-05 GTATCTTCAT CCTCACATTTTG ACTGAGACTC
49027                       358  2.12e-05 TACGATGCGT CGACACATATTC ACAATGAAAC
48704                       224  2.74e-05 TCGGCTTTTT CGTTTCCTTCTC TGGACGACAC
40515                       200  3.24e-05 GCCGTTACGA CGTGACGATTTC TCAATATTTA
32745                       270  3.75e-05 CTAAGGGGTT CGACTCAAGTTC GAATGTAGTC
44339                       436  4.03e-05 CGGATAGACC CGACCCAACTTC CCCCAAGGGT
34307                        29  4.72e-05 AGCTGTAAAG CGTGTCCTGTTG GAATAGTTGT
44641                       177  5.07e-05 GTGGCAGAGC TCTGACCTTTTC AAATCAGCTC
11710                       144  7.45e-05 TCAGCAATCG CCACACCGCCTC TCACAGTCAA
39601                       302  8.35e-05 ATTGCAACCA CGTTCCAGCTTC ACTGGTCATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
18180                             1.1e-07  438_[+2]_50
2215                              2.2e-07  94_[+2]_394
34715                             3.9e-07  424_[+2]_64
49701                             3.6e-06  280_[+2]_208
35419                             1.6e-05  426_[+2]_62
49027                             2.1e-05  357_[+2]_131
48704                             2.7e-05  223_[+2]_265
40515                             3.2e-05  199_[+2]_289
32745                             3.8e-05  269_[+2]_219
44339                               4e-05  435_[+2]_53
34307                             4.7e-05  28_[+2]_460
44641                             5.1e-05  176_[+2]_312
11710                             7.4e-05  143_[+2]_345
39601                             8.4e-05  301_[+2]_187
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=14
18180                    (  439) CGTCACCTCTTC  1
2215                     (   95) CGTCACCTGTTC  1
34715                    (  425) CCTCACCTTTTC  1
49701                    (  281) CGTCACCATCTC  1
35419                    (  427) CCTCACATTTTG  1
49027                    (  358) CGACACATATTC  1
48704                    (  224) CGTTTCCTTCTC  1
40515                    (  200) CGTGACGATTTC  1
32745                    (  270) CGACTCAAGTTC  1
44339                    (  436) CGACCCAACTTC  1
34307                    (   29) CGTGTCCTGTTG  1
44641                    (  177) TCTGACCTTTTC  1
11710                    (  144) CCACACCGCCTC  1
39601                    (  302) CGTTCCAGCTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 8.93074 E= 3.5e+001
 -1045    191  -1045   -179
 -1045     21    164  -1045
     5  -1045  -1045    153
 -1045    138    -10    -79
   122    -79  -1045    -21
 -1045    201  -1045  -1045
    37    121   -168  -1045
     5  -1045    -68    121
  -195     21    -10     79
 -1045    -21  -1045    167
 -1045  -1045  -1045    201
 -1045    179    -68  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 3.5e+001
 0.000000  0.928571  0.000000  0.071429
 0.000000  0.285714  0.714286  0.000000
 0.285714  0.000000  0.000000  0.714286
 0.000000  0.642857  0.214286  0.142857
 0.642857  0.142857  0.000000  0.214286
 0.000000  1.000000  0.000000  0.000000
 0.357143  0.571429  0.071429  0.000000
 0.285714  0.000000  0.142857  0.571429
 0.071429  0.285714  0.214286  0.428571
 0.000000  0.214286  0.000000  0.785714
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.857143  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[GC][TA][CG][AT]C[CA][TA][TCG][TC]TC
--------------------------------------------------------------------------------

Time 3.56 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 19 sites = 11 llr = 142 E-value = 1.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :5aa74:13::::::2::5
pos.-specific     C  :5::235221235438453
probability       G  9:::145252:1541::52
matrix            T  1::::::51786:36:6::

         bits    2.1
                 1.9   **
                 1.7 * **
                 1.5 * **
Relative         1.3 * **      *    *
Entropy          1.1 * **  *   * *  ***
(18.6 bits)      0.8 ***** *  **** ****
                 0.6 ***** *  **** ****
                 0.4 ******** **********
                 0.2 *******************
                 0.0 -------------------

Multilevel           GAAAAACTGTTTGCTCTGA
consensus             C   GG A  CCGC CCC
sequence                  C       T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            -------------------
39601                         8  2.25e-09    CATTAAG GAAAAGCTCTTTGCTCTCA AATCGAAAAG
44641                       423  1.17e-08 GTAACACAAA GAAAAAGTATTTCGCCTGA AGTGTAAGAG
34307                       104  3.43e-08 TAATTGGAGA GAAAAGCTGTTGCGTCCGA GCTACCGCCT
34715                       319  2.16e-07 ACAGGTCTTG GAAAAACAGGTTGGTCTGC CTGCCATACC
44339                        96  2.67e-07 CGCGTGTCGT GCAAAGGCGTTTGCGCTGG GGGTATCTTG
18180                       327  8.09e-07 GACGGCATCC GCAACGGCATTTGCTCCCG CGCAGACGGC
40515                       263  1.30e-06 CGTTGCCTAC GCAAGCGTGTCCGTTCCGA ATTTCCTCTC
11710                       249  1.40e-06 CCAGCGTTAC GAAAAACGATCCCCCCTCA GACCGCGGTC
32745                       118  2.42e-06 GTAACTGTAA GCAAACGTTTTCGTCACGA GACTGGCGAG
48704                       460  2.58e-06 TCGCTAACCA GAAACACTGCTTCTTATCC CTTGCGACTC
49701                       307  4.11e-06 GCCCACATTT TCAAACCGCGTTCGTCTCC GAATCCACCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39601                             2.2e-09  7_[+3]_474
44641                             1.2e-08  422_[+3]_59
34307                             3.4e-08  103_[+3]_378
34715                             2.2e-07  318_[+3]_163
44339                             2.7e-07  95_[+3]_386
18180                             8.1e-07  326_[+3]_155
40515                             1.3e-06  262_[+3]_219
11710                             1.4e-06  248_[+3]_233
32745                             2.4e-06  117_[+3]_364
48704                             2.6e-06  459_[+3]_22
49701                             4.1e-06  306_[+3]_175
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=11
39601                    (    8) GAAAAGCTCTTTGCTCTCA  1
44641                    (  423) GAAAAAGTATTTCGCCTGA  1
34307                    (  104) GAAAAGCTGTTGCGTCCGA  1
34715                    (  319) GAAAAACAGGTTGGTCTGC  1
44339                    (   96) GCAAAGGCGTTTGCGCTGG  1
18180                    (  327) GCAACGGCATTTGCTCCCG  1
40515                    (  263) GCAAGCGTGTCCGTTCCGA  1
11710                    (  249) GAAAAACGATCCCCCCTCA  1
32745                    (  118) GCAAACGTTTTCGTCACGA  1
48704                    (  460) GAAACACTGCTTCTTATCC  1
49701                    (  307) TCAAACCGCGTTCGTCTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 6748 bayes= 9.61407 E= 1.1e+002
 -1010  -1010    199   -144
    98     88  -1010  -1010
   186  -1010  -1010  -1010
   186  -1010  -1010  -1010
   140    -44   -133  -1010
    40     14     66  -1010
 -1010    114     99  -1010
  -160    -44    -33    114
    -2    -44     99   -144
 -1010   -144    -33    155
 -1010    -44  -1010    172
 -1010     14   -133    136
 -1010     88    125  -1010
 -1010     56     66     14
 -1010     14   -133    136
   -60    172  -1010  -1010
 -1010     56  -1010    136
 -1010     88    125  -1010
    98     14    -33  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 11 E= 1.1e+002
 0.000000  0.000000  0.909091  0.090909
 0.545455  0.454545  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.727273  0.181818  0.090909  0.000000
 0.363636  0.272727  0.363636  0.000000
 0.000000  0.545455  0.454545  0.000000
 0.090909  0.181818  0.181818  0.545455
 0.272727  0.181818  0.454545  0.090909
 0.000000  0.090909  0.181818  0.727273
 0.000000  0.181818  0.000000  0.818182
 0.000000  0.272727  0.090909  0.636364
 0.000000  0.454545  0.545455  0.000000
 0.000000  0.363636  0.363636  0.272727
 0.000000  0.272727  0.090909  0.636364
 0.181818  0.818182  0.000000  0.000000
 0.000000  0.363636  0.000000  0.636364
 0.000000  0.454545  0.545455  0.000000
 0.545455  0.272727  0.181818  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[AC]AAA[AGC][CG]T[GA]TT[TC][GC][CGT][TC]C[TC][GC][AC]
--------------------------------------------------------------------------------

Time 5.11 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48704                            8.99e-04  223_[+2(2.74e-05)]_224_\
    [+3(2.58e-06)]_22
2215                             1.71e-06  94_[+2(2.16e-07)]_92_[+1(3.18e-07)]_\
    290
18180                            1.45e-07  263_[+1(5.95e-05)]_51_\
    [+3(8.09e-07)]_93_[+2(1.07e-07)]_19_[+3(7.41e-05)]_12
39601                            8.31e-08  7_[+3(2.25e-09)]_275_[+2(8.35e-05)]_\
    12_[+1(1.50e-05)]_163
49027                            1.69e-04  257_[+1(4.11e-07)]_88_\
    [+2(2.12e-05)]_131
49701                            1.35e-07  280_[+2(3.59e-06)]_14_\
    [+3(4.11e-06)]_132_[+1(3.18e-07)]_31
44339                            3.26e-06  95_[+3(2.67e-07)]_159_\
    [+1(1.50e-05)]_150_[+2(4.03e-05)]_53
44641                            7.42e-08  73_[+1(4.15e-06)]_91_[+2(5.07e-05)]_\
    234_[+3(1.17e-08)]_59
34307                            5.61e-07  28_[+2(4.72e-05)]_63_[+3(3.43e-08)]_\
    90_[+1(1.41e-05)]_276
34715                            2.13e-06  318_[+3(2.16e-07)]_87_\
    [+2(3.93e-07)]_64
11710                            9.11e-06  60_[+1(4.95e-06)]_71_[+2(7.45e-05)]_\
    93_[+3(1.40e-06)]_233
35419                            1.61e-04  369_[+1(1.59e-06)]_45_\
    [+2(1.58e-05)]_62
32745                            4.92e-05  59_[+1(3.85e-05)]_46_[+3(2.42e-06)]_\
    133_[+2(3.75e-05)]_219
40515                            2.16e-05  90_[+1(3.25e-05)]_97_[+2(3.24e-05)]_\
    51_[+3(1.30e-06)]_219
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************