********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/440/440.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11087                    1.0000    500  21271                    1.0000    500
21868                    1.0000    500  22681                    1.0000    500
24211                    1.0000    500  26365                    1.0000    500
268226                   1.0000    500  269316                   1.0000    500
34809                    1.0000    500  35710                    1.0000    500
3627                     1.0000    500  6416                     1.0000    500
8055                     1.0000    500  9630                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/440/440.seqs.fa -oc motifs/440 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 14 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 7000 N= 14 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.265 C 0.233 G 0.232 T 0.270 Background letter frequencies (from dataset with add-one prior applied): A 0.265 C 0.233 G 0.232 T 0.270 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 14 llr = 154 E-value = 6.8e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 966:398:541:1168 pos.-specific C 133:7::a2:324711 probability G :::6::::34::1:3: matrix T :114:12::26851:1 bits 2.1 * 1.9 * 1.7 * 1.5 * * * Relative 1.3 * **** * Entropy 1.1 * ***** * * (15.8 bits) 0.8 ******** * * * 0.6 ******** ** *** 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel AAAGCAACAATTTCAA consensus CCTA T GGCCC G sequence CT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 26365 467 3.77e-08 AACAAACCAC AACGCAACCGTTTCAA CCATCGATAC 35710 422 1.34e-07 CCTCATAAAC 
AAAGCAACCATCCCAA CACAACTAGC 269316 473 1.80e-07 TCAATATCGG AACGAAACAACTCCAA CTACAATTCA 6416 314 2.77e-07 TTTCTTAACG ACAGCAACAGTTCAAA ACAGTAACAC 3627 83 6.67e-07 GATATTAGTA AAATAATCAGTTTCAA GAGGACAGTG 8055 5 1.17e-06 TTAG AATGCAACGACTTCGA GTGTGCTGAG 34809 481 1.76e-06 TTCAATCTCA AAATCAACCATTTCAT CAAC 21868 226 7.36e-06 CGGAGAAGCA AAATCAACAGCTGTGA CTTCGTCAGA 22681 432 9.99e-06 CACCGACGTC CCAGAAACGTCTTCAA TAAACAAGCT 11087 82 9.99e-06 TGGAGAGAAT ACAGCATCGGATACAA AATCTTTGAT 9630 57 1.15e-05 ACCTCTCACA ACAGCATCAATCTCCC AACGAACATC 24211 120 1.24e-05 AATTATGATA AAATCAACAATCCAGC AGTGTTGGGT 268226 460 1.63e-05 TACTCAATCC AACTCTACGTTTTCCA ACTCTACGCT 21271 194 5.93e-05 ACAAGTAATG ATCGAAACATATCTGA TGAACGCTGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 26365 3.8e-08 466_[+1]_18 35710 1.3e-07 421_[+1]_63 269316 1.8e-07 472_[+1]_12 6416 2.8e-07 313_[+1]_171 3627 6.7e-07 82_[+1]_402 8055 1.2e-06 4_[+1]_480 34809 1.8e-06 480_[+1]_4 21868 7.4e-06 225_[+1]_259 22681 1e-05 431_[+1]_53 11087 1e-05 81_[+1]_403 9630 1.2e-05 56_[+1]_428 24211 1.2e-05 119_[+1]_365 268226 1.6e-05 459_[+1]_25 21271 5.9e-05 193_[+1]_291 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=14 26365 ( 467) AACGCAACCGTTTCAA 1 35710 ( 422) AAAGCAACCATCCCAA 1 269316 ( 473) AACGAAACAACTCCAA 1 6416 ( 314) ACAGCAACAGTTCAAA 1 3627 ( 83) AAATAATCAGTTTCAA 1 8055 ( 5) AATGCAACGACTTCGA 1 34809 ( 481) AAATCAACCATTTCAT 1 21868 ( 226) AAATCAACAGCTGTGA 1 22681 ( 432) CCAGAAACGTCTTCAA 1 
11087 ( 82) ACAGCATCGGATACAA 1 9630 ( 57) ACAGCATCAATCTCCC 1 24211 ( 120) AAATCAACAATCCAGC 1 268226 ( 460) AACTCTACGTTTTCCA 1 21271 ( 194) ATCGAAACATATCTGA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 9.52561 E= 6.8e-001 181 -170 -1045 -1045 128 29 -1045 -191 128 29 -1045 -191 -1045 -1045 147 40 11 162 -1045 -1045 181 -1045 -1045 -191 157 -1045 -1045 -33 -1045 210 -1045 -1045 91 -12 30 -1045 69 -1045 62 -33 -89 29 -1045 108 -1045 -12 -1045 154 -189 62 -170 89 -89 162 -1045 -92 111 -70 30 -1045 157 -70 -1045 -191 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 14 E= 6.8e-001 0.928571 0.071429 0.000000 0.000000 0.642857 0.285714 0.000000 0.071429 0.642857 0.285714 0.000000 0.071429 0.000000 0.000000 0.642857 0.357143 0.285714 0.714286 0.000000 0.000000 0.928571 0.000000 0.000000 0.071429 0.785714 0.000000 0.000000 0.214286 0.000000 1.000000 0.000000 0.000000 0.500000 0.214286 0.285714 0.000000 0.428571 0.000000 0.357143 0.214286 0.142857 0.285714 0.000000 0.571429 0.000000 0.214286 0.000000 0.785714 0.071429 0.357143 0.071429 0.500000 0.142857 0.714286 0.000000 0.142857 0.571429 0.142857 0.285714 0.000000 0.785714 0.142857 0.000000 0.071429 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression 
-------------------------------------------------------------------------------- A[AC][AC][GT][CA]A[AT]C[AGC][AGT][TC][TC][TC]C[AG]A -------------------------------------------------------------------------------- Time 1.56 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 8 llr = 112 E-value = 4.6e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 133:8::6:::6::33 pos.-specific C 9:39::9:18::8:68 probability G :851:1:41:a1:a1: matrix T ::::391:83:33::: bits 2.1 * * 1.9 * * 1.7 * * 1.5 * * ** * * Relative 1.3 ** * ** ** ** * Entropy 1.1 ** ***** ** ** * (20.1 bits) 0.8 ** ******** **** 0.6 **************** 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel CGGCATCATCGACGCC consensus AA T G T TT AA sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 34809 239 3.68e-09 GATCTCGGAT CGGCATCGTCGACGCA CCTTTCATTG 3627 403 6.33e-09 CATCACCTCG AGGCATCATCGACGCC AGAGAGCTTT 24211 294 4.32e-08 GTTCTGACAC CGACTTCATCGATGCC GTCGAGACTG 35710 362 2.63e-07 TTTCGTCTGC CACCATCAGTGACGCC TCGCGTCTTC 26365 262 2.87e-07 TGGAATCCGT CGCGATCGTCGTCGAC TGACTGAGTC 22681 199 4.49e-07 CCTCAAGATT CGGCAGTACCGACGCC TCTTGCAGTT 9630 216 6.32e-07 GCCATGGCAC CAACATCGTTGTCGAC CGTTTGGTGA 21868 350 1.17e-06 TCATCGAGGG CGGCTTCATCGGTGGA ACGAGAGGCC 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34809                             3.7e-09  238_[+2]_246
3627                              6.3e-09  402_[+2]_82
24211                             4.3e-08  293_[+2]_191
35710                             2.6e-07  361_[+2]_123
26365                             2.9e-07  261_[+2]_223
22681                             4.5e-07  198_[+2]_286
9630                              6.3e-07  215_[+2]_269
21868                             1.2e-06  349_[+2]_135
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=8
34809                    (  239) CGGCATCGTCGACGCA  1
3627                     (  403) AGGCATCATCGACGCC  1
24211                    (  294) CGACTTCATCGATGCC  1
35710                    (  362) CACCATCAGTGACGCC  1
26365                    (  262) CGCGATCGTCGTCGAC  1
22681                    (  199) CGGCAGTACCGACGCC  1
9630                     (  216) CAACATCGTTGTCGAC  1
21868                    (  350) CGGCTTCATCGGTGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 10.4651 E= 4.6e+000
  -108    191   -965   -965
    -8   -965    169   -965
    -8     10    111   -965
  -965    191    -89   -965
   150   -965   -965    -11
  -965   -965    -89    170
  -965    191   -965   -111
   124   -965     69   -965
  -965    -90    -89    147
  -965    169   -965    -11
  -965   -965    210   -965
   124   -965    -89    -11
  -965    169   -965    -11
  -965   -965    210   -965
    -8    142    -89   -965
    -8    169   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 4.6e+000
 0.125000  0.875000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.250000  0.250000  0.500000  0.000000
 0.000000  0.875000  0.125000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.875000  0.000000  0.125000
 0.625000  0.000000  0.375000  0.000000
 0.000000  0.125000  0.125000  0.750000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.625000  0.000000  0.125000  0.250000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.625000  0.125000  0.000000
 0.250000  0.750000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[GA][GAC]C[AT]TC[AG]T[CT]G[AT][CT]G[CA][CA]
--------------------------------------------------------------------------------

Time 3.23 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 15 sites = 8 llr = 103 E-value = 2.5e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::84:3::1:99149 pos.-specific C 4::11::::a1:51: probability G 68:388:a:::14:1 matrix T :3331:a:9::::5: bits 2.1 * * 1.9 ** * 1.7 ** * 1.5 ** *** * Relative 1.3 * ******* * Entropy 1.1 *** ******** * (18.7 bits) 0.8 *** ******** * 0.6 *** ********* * 0.4 *** *********** 0.2 *** *********** 0.0 --------------- Multilevel GGAAGGTGTCAACTA consensus CTTG A GA sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------- 26365 365 8.06e-09 TCGCGTCACC GGAGGGTGTCAACAA CACAAAGCCA 11087 433 9.14e-08 GATGTGTGAA GGAAGGTGTCAGGTA CTTGCTGTGA 8055 283 2.41e-07 TTAACTTCAA CGAGGGTGTCAAGTG TCAAGACAAC 35710 397 8.19e-07 CAGGCTGACC GTACGATGTCAACAA CCTCATAAAC 21868 442 1.03e-06 GACGGCGAGA GGATCGTGTCCACTA ATTTGTCGTC 6416 235 1.11e-06 AAAAACGAAC GGTAGATGTCAAAAA GCTTATTTCT 34809 199 1.53e-06 CAAACGCCTG CTTTGGTGTCAACCA CCGCCGACGA 21271 176 1.53e-06 AAAGCCATGA CGAATGTGACAAGTA ATGATCGAAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM 
-------------            ----------------  -------------
26365                             8.1e-09  364_[+3]_121
11087                             9.1e-08  432_[+3]_53
8055                              2.4e-07  282_[+3]_203
35710                             8.2e-07  396_[+3]_89
21868                               1e-06  441_[+3]_44
6416                              1.1e-06  234_[+3]_251
34809                             1.5e-06  198_[+3]_287
21271                             1.5e-06  175_[+3]_310
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=8
26365                    (  365) GGAGGGTGTCAACAA  1
11087                    (  433) GGAAGGTGTCAGGTA  1
8055                     (  283) CGAGGGTGTCAAGTG  1
35710                    (  397) GTACGATGTCAACAA  1
21868                    (  442) GGATCGTGTCCACTA  1
6416                     (  235) GGTAGATGTCAAAAA  1
34809                    (  199) CTTTGGTGTCAACCA  1
21271                    (  176) CGAATGTGACAAGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6804 bayes= 9.73047 E= 2.5e+002
  -965     69    143   -965
  -965   -965    169    -11
   150   -965   -965    -11
    50    -90     11    -11
  -965    -90    169   -111
    -8   -965    169   -965
  -965   -965   -965    189
  -965   -965    210   -965
  -108   -965   -965    170
  -965    210   -965   -965
   172    -90   -965   -965
   172   -965    -89   -965
  -108    110     69   -965
    50    -90   -965     89
   172   -965    -89   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 8 E= 2.5e+002
 0.000000  0.375000  0.625000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.750000  0.000000  0.000000  0.250000
 0.375000  0.125000  0.250000  0.250000
 0.000000  0.125000  0.750000  0.125000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.125000  0.500000  0.375000  0.000000
 0.375000  0.125000  0.000000  0.500000
 0.875000  0.000000  0.125000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC][GT][AT][AGT]G[GA]TGTCAA[CG][TA]A
--------------------------------------------------------------------------------

Time 4.90 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11087                            1.01e-05  81_[+1(9.99e-06)]_335_\
    [+3(9.14e-08)]_53
21271                            6.87e-04  175_[+3(1.53e-06)]_3_[+1(5.93e-05)]_\
    291
21868                            2.39e-07  225_[+1(7.36e-06)]_108_\
    [+2(1.17e-06)]_76_[+3(1.03e-06)]_44
22681                            9.88e-05  198_[+2(4.49e-07)]_143_\
    [+2(7.17e-05)]_58_[+1(9.99e-06)]_53
24211                            1.70e-05  119_[+1(1.24e-05)]_158_\
    [+2(4.32e-08)]_134_[+2(9.00e-05)]_41
26365                            5.51e-12  261_[+2(2.87e-07)]_87_\
    [+3(8.06e-09)]_10_[+2(5.44e-05)]_32_[+1(6.80e-05)]_13_[+1(3.77e-08)]_18
268226                           3.09e-03  120_[+2(9.87e-05)]_323_\
    [+1(1.63e-05)]_25
269316                           1.02e-03  472_[+1(1.80e-07)]_12
34809                            4.61e-10  198_[+3(1.53e-06)]_25_\
    [+2(3.68e-09)]_226_[+1(1.76e-06)]_4
35710                            1.25e-09  23_[+3(6.27e-05)]_323_\
    [+2(2.63e-07)]_19_[+3(8.19e-07)]_10_[+1(1.34e-07)]_28_[+1(3.02e-05)]_19
3627                             2.14e-07  82_[+1(6.67e-07)]_304_\
    [+2(6.33e-09)]_82
6416                             8.92e-06  234_[+3(1.11e-06)]_64_\
    [+1(2.77e-07)]_171
8055                             6.69e-06  4_[+1(1.17e-06)]_262_[+3(2.41e-07)]_\
    203
9630                             1.35e-04  56_[+1(1.15e-05)]_78_[+2(7.66e-05)]_\
    49_[+2(6.32e-07)]_269
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************