********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/20/20.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10336                    1.0000    500  12729                    1.0000    500
1785                     1.0000    500  20745                    1.0000    500
20923                    1.0000    500  21507                    1.0000    500
22548                    1.0000    500  22712                    1.0000    500
23363                    1.0000    500  24970                    1.0000    500
25167                    1.0000    500  25570                    1.0000    500
25867                    1.0000    500  261176                   1.0000    500
262078                   1.0000    500  262367                   1.0000    500
268301                   1.0000    500  31108                    1.0000    500
31287                    1.0000    500  37493                    1.0000    500
3770                     1.0000    500  4049                     1.0000    500
5874                     1.0000    500  6731                     1.0000    500
9373                     1.0000    500  9688                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/20/20.seqs.fa -oc motifs/20 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 26 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 13000 N= 26 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.269 C 0.240 G 0.225 T 0.266 Background letter frequencies (from dataset with add-one prior applied): A 0.269 C 0.240 G 0.225 T 0.266 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 13 llr = 166 E-value = 4.0e-003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 12:32::1211::5:2 pos.-specific C :42::3::::5::1:: probability G 9:6552a:59:9a:88 matrix T :52245:93:51:42: bits 2.1 * * 1.9 * * 1.7 * * * ** 1.5 * ** * ** Relative 1.3 * ** * ** ** Entropy 1.1 * ** * ** ** (18.4 bits) 0.9 * * ** * ** ** 0.6 * * ************ 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel GTGGGTGTGGCGGAGG consensus CCATC T T TTA sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 31108 187 2.53e-09 TTGGTTGTTG GCGAGTGTGGTGGAGG TGACATGATG 9373 129 1.31e-08 
GGCGACGGAG GCCGGTGTGGCGGTGG CGCTTGTGTG 9688 375 5.07e-08 CACCTAAGCC GCGTGCGTTGTGGAGG CAGGTGAACC 262078 334 6.60e-08 ACTAACGGCC GTCATTGTGGTGGTGG CGTAGCTTGT 262367 247 1.23e-07 TGAGGGGGAG GTGGAGGTGGTGGTGG TTGTGTGTGG 12729 33 4.99e-07 AGCCGGGTGA GTGTTTGTAGCGGAGA ACATGGTGTC 261176 382 8.28e-07 TTGACGTCAG GCGGGCGTGACGGAGA GTGACTTTTG 5874 158 8.97e-07 GTGATATTGT GTTTGTGTGGAGGTGG ATCGAAGTCA 21507 87 8.97e-07 GTTGCTTCAC GTGATGGTTGTGGTTG TGAGATAGGA 20745 68 1.87e-06 TTTCGTCGAA GCTGTTGTTGTTGAGG GAGGTTGAAC 37493 20 3.53e-06 GTGAGTTTGA AAGGGCGTTGCGGATG ATGCATGGTA 22712 79 4.20e-06 TTGGCAGATG GACGATGTGGCGGCGA CGGTAGGGAA 25867 169 4.44e-06 CCAGGTTGAT GTGATCGAAGCGGATG GTTCGTCCCT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 31108 2.5e-09 186_[+1]_298 9373 1.3e-08 128_[+1]_356 9688 5.1e-08 374_[+1]_110 262078 6.6e-08 333_[+1]_151 262367 1.2e-07 246_[+1]_238 12729 5e-07 32_[+1]_452 261176 8.3e-07 381_[+1]_103 5874 9e-07 157_[+1]_327 21507 9e-07 86_[+1]_398 20745 1.9e-06 67_[+1]_417 37493 3.5e-06 19_[+1]_465 22712 4.2e-06 78_[+1]_406 25867 4.4e-06 168_[+1]_316 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=13 31108 ( 187) GCGAGTGTGGTGGAGG 1 9373 ( 129) GCCGGTGTGGCGGTGG 1 9688 ( 375) GCGTGCGTTGTGGAGG 1 262078 ( 334) GTCATTGTGGTGGTGG 1 262367 ( 247) GTGGAGGTGGTGGTGG 1 12729 ( 33) GTGTTTGTAGCGGAGA 1 261176 ( 382) GCGGGCGTGACGGAGA 1 5874 ( 158) GTTTGTGTGGAGGTGG 1 21507 ( 87) GTGATGGTTGTGGTTG 1 20745 ( 68) GCTGTTGTTGTTGAGG 1 37493 ( 20) 
AAGGGCGTTGCGGATG 1 22712 ( 79) GACGATGTGGCGGCGA 1 25867 ( 169) GTGATCGAAGCGGATG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 12610 bayes= 11.5514 E= 4.0e-003 -180 -1035 203 -1035 -80 68 -1035 80 -1035 -6 145 -79 19 -1035 103 -20 -80 -1035 103 53 -1035 36 -55 102 -1035 -1035 215 -1035 -180 -1035 -1035 180 -80 -1035 126 21 -180 -1035 203 -1035 -180 94 -1035 80 -1035 -1035 203 -179 -1035 -1035 215 -1035 100 -164 -1035 53 -1035 -1035 177 -20 -22 -1035 177 -1035 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 4.0e-003 0.076923 0.000000 0.923077 0.000000 0.153846 0.384615 0.000000 0.461538 0.000000 0.230769 0.615385 0.153846 0.307692 0.000000 0.461538 0.230769 0.153846 0.000000 0.461538 0.384615 0.000000 0.307692 0.153846 0.538462 0.000000 0.000000 1.000000 0.000000 0.076923 0.000000 0.000000 0.923077 0.153846 0.000000 0.538462 0.307692 0.076923 0.000000 0.923077 0.000000 0.076923 0.461538 0.000000 0.461538 0.000000 0.000000 0.923077 0.076923 0.000000 0.000000 1.000000 0.000000 0.538462 0.076923 0.000000 0.384615 0.000000 0.000000 0.769231 0.230769 0.230769 0.000000 0.769231 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- G[TC][GC][GAT][GT][TC]GT[GT]G[CT]GG[AT][GT][GA] 
-------------------------------------------------------------------------------- Time 5.90 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 11 llr = 146 E-value = 1.5e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 9:179:37:72:69:8 pos.-specific C :17::911a:59::a2 probability G :323::42:31141:: matrix T 16::113:::3::::: bits 2.1 * * 1.9 * * 1.7 * * * 1.5 * ** * * ** Relative 1.3 * ** * * *** Entropy 1.1 * *** ** ***** (19.2 bits) 0.9 * **** *** ***** 0.6 ****** *** ***** 0.4 ****** *** ***** 0.2 **************** 0.0 ---------------- Multilevel ATCAACGACACCAACA consensus G G A GT G sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 1785 456 3.08e-09 TCCATCAACC ATCAACGACGCCAACA CCTCGGCCGC 25570 149 1.27e-08 AGGTTGTGGC ATCAACAACAACAACA ACATCGTCAC 9373 16 6.67e-08 TGACACTGGC ATCAACTGCATCGACA GCTCCACCGT 25867 64 1.88e-07 GGGTGGAAGA ATAGACAACACCAACA CGACGTCGAA 5874 360 2.50e-07 ACGTTGACTG ACCGACGACGCCGACA CGTCAGCTGA 6731 473 3.23e-07 AATTATCAAT ATCAATTACAACAACA CAATTGAAAC 10336 256 5.73e-07 CATCAGACCA ATCAACCACGTCGACC AATCAAGATG 9688 112 1.03e-06 ACCGAATGGA AGCGACTCCACCAACC GGCGAGGAAC 261176 308 1.43e-06 ACGATGGAGA AGGAACGACAGCAGCA GTGCAGACCG 25167 424 1.43e-06 ACTTTTTACG AGGAACAACATGGACA CAGTTCTGAA 21507 374 1.62e-06 GCAGTTCTGA TTCATCGGCACCAACA GTCATACGAA 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 1785 3.1e-09 455_[+2]_29 25570 1.3e-08 148_[+2]_336 9373 6.7e-08 15_[+2]_469 25867 1.9e-07 63_[+2]_421 5874 2.5e-07 359_[+2]_125 6731 3.2e-07 472_[+2]_12 10336 5.7e-07 255_[+2]_229 9688 1e-06 111_[+2]_373 261176 1.4e-06 307_[+2]_177 25167 1.4e-06 423_[+2]_61 21507 1.6e-06 373_[+2]_111 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=16 seqs=11 1785 ( 456) ATCAACGACGCCAACA 1 25570 ( 149) ATCAACAACAACAACA 1 9373 ( 16) ATCAACTGCATCGACA 1 25867 ( 64) ATAGACAACACCAACA 1 5874 ( 360) ACCGACGACGCCGACA 1 6731 ( 473) ATCAATTACAACAACA 1 10336 ( 256) ATCAACCACGTCGACC 1 9688 ( 112) AGCGACTCCACCAACC 1 261176 ( 308) AGGAACGACAGCAGCA 1 25167 ( 424) AGGAACAACATGGACA 1 21507 ( 374) TTCATCGGCACCAACA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 12610 bayes= 10.517 E= 1.5e-001 176 -1010 -1010 -154 -1010 -140 27 126 -156 160 -31 -1010 143 -1010 27 -1010 176 -1010 -1010 -154 -1010 192 -1010 -154 2 -140 69 4 143 -140 -31 -1010 -1010 206 -1010 -1010 143 -1010 27 -1010 -56 92 -131 4 -1010 192 -131 -1010 124 -1010 69 -1010 176 -1010 -131 -1010 -1010 206 -1010 -1010 160 -40 -1010 -1010 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 1.5e-001 0.909091 0.000000 0.000000 0.090909 0.000000 0.090909 0.272727 0.636364 0.090909 0.727273 0.181818 0.000000 0.727273 0.000000 0.272727 0.000000 0.909091 0.000000 0.000000 0.090909 0.000000 0.909091 0.000000 0.090909 0.272727 0.090909 0.363636 0.272727 0.727273 0.090909 0.181818 0.000000 0.000000 1.000000 0.000000 0.000000 0.727273 0.000000 0.272727 0.000000 0.181818 0.454545 0.090909 0.272727 0.000000 0.909091 0.090909 0.000000 0.636364 0.000000 0.363636 0.000000 0.909091 0.000000 0.090909 0.000000 0.000000 1.000000 0.000000 0.000000 0.818182 0.181818 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- A[TG]C[AG]AC[GAT]AC[AG][CT]C[AG]ACA -------------------------------------------------------------------------------- Time 11.44 secs. 
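
The regular expression above is a compact summary of Motif 2: each bracket lists the letters that are common at that position. As a minimal sketch (Python is an assumption here, since this file prescribes no language, and the variable and function names are illustrative), the snippet below applies that expression to the eleven Motif 2 sites from the BLOCKS section and reports how many of them it reproduces exactly.

```python
import re

# Regular expression copied from the "Motif 2 regular expression" section.
MOTIF2_REGEX = re.compile(r"A[TG]C[AG]AC[GAT]AC[AG][CT]C[AG]ACA")

# Site strings copied from the "Motif 2 in BLOCKS format" section,
# keyed by sequence name.
MOTIF2_SITES = {
    "1785": "ATCAACGACGCCAACA",
    "25570": "ATCAACAACAACAACA",
    "9373": "ATCAACTGCATCGACA",
    "25867": "ATAGACAACACCAACA",
    "5874": "ACCGACGACGCCGACA",
    "6731": "ATCAATTACAACAACA",
    "10336": "ATCAACCACGTCGACC",
    "9688": "AGCGACTCCACCAACC",
    "261176": "AGGAACGACAGCAGCA",
    "25167": "AGGAACAACATGGACA",
    "21507": "TTCATCGGCACCAACA",
}


def exact_matches(pattern, sites):
    """Return the names of the sites that the pattern matches in full."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


matching = exact_matches(MOTIF2_REGEX, MOTIF2_SITES)
print(f"{len(matching)} of {len(MOTIF2_SITES)} sites match the expression exactly")
```

The expression appears to keep only the more frequent letters in each column, so a site carrying a rarer letter at any single position fails the exact match. For scanning new sequences, the probability or log-odds matrix is therefore likely the more faithful representation.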
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 12 llr = 148 E-value = 3.3e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 312::7::::2::438 pos.-specific C 12::1:1:::31::2: probability G :6::8162:a2:9:32 matrix T 728a1338a:491621 bits 2.1 * 1.9 * ** 1.7 * ** * 1.5 * ** ** Relative 1.3 *** *** ** Entropy 1.1 *** *** ** (17.8 bits) 0.9 *** **** *** * 0.6 * ******** *** * 0.4 ********** *** * 0.2 ********** *** * 0.0 ---------------- Multilevel TGTTGAGTTGTTGTAA consensus A TT C AG sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 31108 241 1.29e-08 TGATGTACCG TGTTGAGTTGGTGTCA AGGATGATGT 261176 194 3.16e-07 GAAAAAACAT AGATGAGTTGTTGACA TCTCTCTCAA 25867 13 3.84e-07 TATGACGTGT TGTTGGGTTGCTGTAG ATGCGACGAT 23363 144 5.65e-07 CATCGGTTTC AATTGTGTTGTTGTAA TTGATATTGA 4049 51 6.81e-07 TACGGGTGGC ATTTGATTTGTTGATA ACCAGCATCT 37493 63 7.45e-07 ATGTACGAAT TCTTGTGTTGATGATA CATTAAAGAA 262367 316 8.88e-07 TGTCTATTCA TGTTCACTTGTTGTAA CAACGAGATA 5874 442 1.15e-06 GAGAAGTAGA TGTTGAGGTGACGTAA CAACAACAAC 12729 204 1.15e-06 CGCTCCGGTC CGTTGATTTGCTGAGG TGTCAGCTGA 9373 292 1.25e-06 ATAGAGTAGG TCTTTATTTGTTGAGA CTTATTGGAC 3770 373 1.85e-06 AAAGTGGTTG TTTTGTGTTGCTTTGA GACGGTTGTG 22712 304 5.93e-06 AATCATGTCC TGATGATGTGGTGTGT CGTCTTCAAC 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 31108 1.3e-08 240_[+3]_244 261176 3.2e-07 193_[+3]_291 25867 3.8e-07 12_[+3]_472 23363 5.7e-07 143_[+3]_341 4049 6.8e-07 50_[+3]_434 37493 7.5e-07 62_[+3]_422 262367 8.9e-07 315_[+3]_169 5874 1.1e-06 441_[+3]_43 12729 1.1e-06 203_[+3]_281 9373 1.2e-06 291_[+3]_193 3770 1.9e-06 372_[+3]_112 22712 5.9e-06 303_[+3]_181 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=16 seqs=12 31108 ( 241) TGTTGAGTTGGTGTCA 1 261176 ( 194) AGATGAGTTGTTGACA 1 25867 ( 13) TGTTGGGTTGCTGTAG 1 23363 ( 144) AATTGTGTTGTTGTAA 1 4049 ( 51) ATTTGATTTGTTGATA 1 37493 ( 63) TCTTGTGTTGATGATA 1 262367 ( 316) TGTTCACTTGTTGTAA 1 5874 ( 442) TGTTGAGGTGACGTAA 1 12729 ( 204) CGTTGATTTGCTGAGG 1 9373 ( 292) TCTTTATTTGTTGAGA 1 3770 ( 373) TTTTGTGTTGCTTTGA 1 22712 ( 304) TGATGATGTGGTGTGT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 12610 bayes= 9.69454 E= 3.3e+000 -10 -152 -1023 133 -169 -53 137 -67 -69 -1023 -1023 165 -1023 -1023 -1023 191 -1023 -152 189 -167 131 -1023 -143 -9 -1023 -152 137 33 -1023 -1023 -44 165 -1023 -1023 -1023 191 -1023 -1023 215 -1023 -69 6 -44 65 -1023 -152 -1023 179 -1023 -1023 202 -167 63 -1023 -1023 113 31 -53 56 -67 148 
-1023 -44 -167 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 3.3e+000 0.250000 0.083333 0.000000 0.666667 0.083333 0.166667 0.583333 0.166667 0.166667 0.000000 0.000000 0.833333 0.000000 0.000000 0.000000 1.000000 0.000000 0.083333 0.833333 0.083333 0.666667 0.000000 0.083333 0.250000 0.000000 0.083333 0.583333 0.333333 0.000000 0.000000 0.166667 0.833333 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.166667 0.250000 0.166667 0.416667 0.000000 0.083333 0.000000 0.916667 0.000000 0.000000 0.916667 0.083333 0.416667 0.000000 0.000000 0.583333 0.333333 0.166667 0.333333 0.166667 0.750000 0.000000 0.166667 0.083333 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [TA]GTTG[AT][GT]TTG[TC]TG[TA][AG]A -------------------------------------------------------------------------------- Time 17.24 secs. 
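
The log-odds matrix and the letter-probability matrix for a motif describe the same columns in two scales. The values in this file are consistent with each log-odds entry being roughly 100 * log2(probability / background frequency), rounded to an integer. The sketch below checks that reading against the first three positions of Motif 3. It is an illustration under that assumption, not MEME's own code: the background frequencies are the three-decimal values printed in the command line summary, so differences of a unit or so are expected, and zero-probability entries (printed as -1023 in the log-odds matrix) are skipped.

```python
import math

# Background letter frequencies copied from the COMMAND LINE SUMMARY,
# in the file's column order A, C, G, T.
BACKGROUND = [0.269, 0.240, 0.225, 0.266]

# First three positions of Motif 3, copied from the letter-probability
# matrix and the log-odds matrix respectively.
PROB_ROWS = [
    [0.250000, 0.083333, 0.000000, 0.666667],
    [0.083333, 0.166667, 0.583333, 0.166667],
    [0.166667, 0.000000, 0.000000, 0.833333],
]
REPORTED_ROWS = [
    [-10, -152, -1023, 133],
    [-169, -53, 137, -67],
    [-69, -1023, -1023, 165],
]


def log_odds(prob, background):
    """Return round(100 * log2(prob / background)), or None when prob is 0."""
    if prob == 0.0:
        return None
    return round(100 * math.log2(prob / background))


# Collect |computed - reported| for every non-zero probability entry.
differences = []
for probs, reported in zip(PROB_ROWS, REPORTED_ROWS):
    for p, bg, rep in zip(probs, BACKGROUND, reported):
        computed = log_odds(p, bg)
        if computed is not None:
            differences.append(abs(computed - rep))

print("entries compared:", len(differences))
print("largest absolute difference:", max(differences))
```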
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10336                            6.41e-03  255_[+2(5.73e-07)]_201_\
    [+2(6.09e-05)]_12
12729                            7.36e-06  32_[+1(4.99e-07)]_155_\
    [+3(1.15e-06)]_281
1785                             4.79e-05  455_[+2(3.08e-09)]_29
20745                            6.07e-03  67_[+1(1.87e-06)]_417
20923                            6.14e-01  500
21507                            2.13e-05  86_[+1(8.97e-07)]_271_\
    [+2(1.62e-06)]_111
22548                            9.70e-01  500
22712                            4.44e-04  78_[+1(4.20e-06)]_117_\
    [+3(6.76e-05)]_76_[+3(5.93e-06)]_181
23363                            3.94e-04  143_[+3(5.65e-07)]_341
24970                            5.70e-01  500
25167                            1.81e-03  423_[+2(1.43e-06)]_61
25570                            7.94e-05  148_[+2(1.27e-08)]_199_\
    [+2(4.80e-05)]_121
25867                            1.15e-08  12_[+3(3.84e-07)]_35_[+2(1.88e-07)]_\
    89_[+1(4.44e-06)]_292_[+2(2.77e-05)]_8
261176                           1.32e-08  193_[+3(3.16e-07)]_98_\
    [+2(1.43e-06)]_58_[+1(8.28e-07)]_103
262078                           2.09e-04  333_[+1(6.60e-08)]_151
262367                           2.63e-06  1_[+1(2.16e-05)]_229_[+1(1.23e-07)]_\
    53_[+3(8.88e-07)]_169
268301                           6.64e-01  500
31108                            1.11e-09  186_[+1(2.53e-09)]_38_\
    [+3(1.29e-08)]_244
31287                            2.11e-01  500
37493                            8.12e-06  19_[+1(3.53e-06)]_27_[+3(7.45e-07)]_\
    399_[+3(3.18e-05)]_7
3770                             2.40e-03  372_[+3(1.85e-06)]_112
4049                             9.29e-04  50_[+3(6.81e-07)]_434
5874                             9.39e-09  157_[+1(8.97e-07)]_186_\
    [+2(2.50e-07)]_66_[+3(1.15e-06)]_43
6731                             1.79e-03  472_[+2(3.23e-07)]_12
9373                             5.89e-11  15_[+2(6.67e-08)]_97_[+1(1.31e-08)]_\
    147_[+3(1.25e-06)]_193
9688                             1.21e-06  111_[+2(1.03e-06)]_247_\
    [+1(5.07e-08)]_110
--------------------------------------------------------------------------------
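
Each motif diagram in the summary encodes a sequence as alternating spacer lengths and motif occurrences joined by underscores, with a trailing backslash marking a wrapped line. The sketch below is one way to turn such a string back into 1-based start positions. It is an assumed reading of the notation rather than an official parser: the function name, the tuple layout, and the width table (taken from the "width = 16" values in the three MOTIF headers) are illustrative choices.

```python
import re

# Motif widths copied from the MOTIF header lines in this file.
MOTIF_WIDTHS = {1: 16, 2: 16, 3: 16}

# One motif occurrence, e.g. "[+2(6.67e-08)]": strand, motif number, p-value.
OCCURRENCE = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram, widths):
    """Return ([(motif, start, p_value), ...], total_length) for one diagram."""
    # Drop the "\" line-continuation markers and the whitespace after them.
    cleaned = re.sub(r"\\\s*", "", diagram).strip()
    position = 0  # number of sequence positions consumed so far
    sites = []
    for token in cleaned.split("_"):
        match = OCCURRENCE.fullmatch(token)
        if match:
            motif = int(match.group(2))
            # The site begins right after everything consumed so far.
            sites.append((motif, position + 1, float(match.group(3))))
            position += widths[motif]
        else:
            position += int(token)  # plain number: spacer length
    return sites, position


# Diagram for sequence 9373, copied from the summary table above.
diagram_9373 = "15_[+2(6.67e-08)]_97_[+1(1.31e-08)]_\\ 147_[+3(1.25e-06)]_193"
sites, length = parse_diagram(diagram_9373, MOTIF_WIDTHS)
for motif, start, p_value in sites:
    print(f"motif {motif} starts at {start} (p = {p_value:g})")
print("sequence length:", length)
```

For sequence 9373 the recovered starts can be compared with the Start column of the per-motif site tables, and the consumed length with the Length column of the training set.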
********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************