********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/444/444.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10673                    1.0000    500  1161                     1.0000    500
11650                    1.0000    500  11896                    1.0000    500
1412                     1.0000    500  21750                    1.0000    500
22394                    1.0000    500  24615                    1.0000    500
264582                   1.0000    500  30913                    1.0000    500
35107                    1.0000    500  4507                     1.0000    500
5284                     1.0000    500  6843                     1.0000    500
7519                     1.0000    500  7700                     1.0000    500
7778                     1.0000    500  9162                     1.0000    500
9771                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/444/444.seqs.fa -oc motifs/444 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       19    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9500    N=              19
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.262 C 0.232 G 0.232 T 0.274
Background letter frequencies (from dataset with add-one prior applied):
A 0.262 C 0.232 G 0.232 T 0.274
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   7  llr = 121  E-value = 1.4e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  7:3479:3::613719a13a:
pos.-specific     C  397611736a36439::67:a
probability       G  :1::1:314::::::::::::
matrix            T  :::::::3::133::1:3:::

[Information content plot: 0.0 to 2.1 bits per position; Relative Entropy (25.0 bits)]

Multilevel           ACCCAACACCACCACAACCAC
consensus            C AA  GCG CTAC   TA
sequence                    T    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
4507                        425  1.25e-13 AGAAGAGCCC ACCCAACCCCACCACAACCAC CTCCTCCCAG
30913                       244  3.38e-09 CCCTAGCCCC ACCCAACTCCACACAAACAAC CAACGCCCTC
24615                       459  3.38e-09 TCTCCAAGAA CGCCAACAGCAACACAACCAC ATCACTCCAA
21750                       182  5.35e-09 TAATCACAAA ACAAAACAGCCTAACAACAAC CAAACAACGA
6843                        274  7.53e-09 GTCCGTTGCA ACAAAAGGCCACTACAAACAC AGAGCAGCCC
11896                       382  2.42e-08 AAGTTCGCCC ACCAGAGTGCCTCCCAATCAC GAATCTCTCG
9771                        454  9.96e-08 CTCAAACCCT CCCCCCCCCCTCTACTATCAC CTGCCGATAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4507                              1.3e-13  424_[+1]_55
30913                             3.4e-09  243_[+1]_236
24615                             3.4e-09  458_[+1]_21
21750                             5.3e-09  181_[+1]_298
6843                              7.5e-09  273_[+1]_206
11896                             2.4e-08  381_[+1]_98
9771                                1e-07  453_[+1]_26
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=7
4507                     (  425) ACCCAACCCCACCACAACCAC  1
30913                    (  244) ACCCAACTCCACACAAACAAC  1
24615                    (  459) CGCCAACAGCAACACAACCAC  1
21750                    (  182) ACAAAACAGCCTAACAACAAC  1
6843                     (  274) ACAAAAGGCCACTACAAACAC  1
11896                    (  382) ACCAGAGTGCCTCCCAATCAC  1
9771                     (  454) CCCCCCCCCCTCTACTATCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 9120 bayes= 10.9525 E= 1.4e+001
   145     30   -945   -945
  -945    188    -70   -945
    13    162   -945   -945
    71    130   -945   -945
   145    -70    -70   -945
   171    -70   -945   -945
  -945    162     30   -945
    13     30    -70      6
  -945    130     89   -945
  -945    210   -945   -945
   112     30   -945    -94
   -87    130   -945      6
    13     88   -945      6
   145     30   -945   -945
   -87    188   -945   -945
   171   -945   -945    -94
   193   -945   -945   -945
   -87    130   -945      6
    13    162   -945   -945
   193   -945   -945   -945
  -945    210   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 1.4e+001
 0.714286  0.285714  0.000000  0.000000
 0.000000  0.857143  0.142857  0.000000
 0.285714  0.714286  0.000000  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.714286  0.142857  0.142857  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.000000  0.714286  0.285714  0.000000
 0.285714  0.285714  0.142857  0.285714
 0.000000  0.571429  0.428571  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.571429  0.285714  0.000000  0.142857
 0.142857  0.571429  0.000000  0.285714
 0.285714  0.428571  0.000000  0.285714
 0.714286  0.285714  0.000000  0.000000
 0.142857  0.857143  0.000000  0.000000
 0.857143  0.000000  0.000000  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.571429  0.000000  0.285714
 0.285714  0.714286  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AC]C[CA][CA]AA[CG][ACT][CG]C[AC][CT][CAT][AC]CAA[CT][CA]AC
--------------------------------------------------------------------------------

Time  3.22 secs.
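
The regular expression reported for Motif 1 is a simplified summary of the probability matrix, so it need not match every site that contributed to the motif. The following is a minimal Python sketch that checks this, assuming the seven sites are copied by hand from the BLOCKS listing above; the names `MOTIF1_REGEX`, `MOTIF1_SITES`, and `matching_sites` are illustrative and not part of MEME.

```python
import re

# Motif 1 regular expression, copied from the MEME output above.
MOTIF1_REGEX = re.compile(
    r"[AC]C[CA][CA]AA[CG][ACT][CG]C[AC][CT][CAT][AC]CAA[CT][CA]AC"
)

# Motif 1 sites from the BLOCKS section, keyed by sequence name.
MOTIF1_SITES = {
    "4507": "ACCCAACCCCACCACAACCAC",
    "30913": "ACCCAACTCCACACAAACAAC",
    "24615": "CGCCAACAGCAACACAACCAC",
    "21750": "ACAAAACAGCCTAACAACAAC",
    "6843": "ACAAAAGGCCACTACAAACAC",
    "11896": "ACCAGAGTGCCTCCCAATCAC",
    "9771": "CCCCCCCCCCTCTACTATCAC",
}


def matching_sites(pattern, sites):
    """Return the names of the sites that the pattern matches in full."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


if __name__ == "__main__":
    matched = matching_sites(MOTIF1_REGEX, MOTIF1_SITES)
    print(f"{len(matched)} of {len(MOTIF1_SITES)} sites match: {matched}")
```

With the sites as listed, only 4507 and 21750 match the expression in full, which suggests treating the regular expression as a rough summary and using the position-specific matrices when sensitivity matters.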
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   7  llr = 88  E-value = 1.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  4::9:::a:393
pos.-specific     C  :1:1::3:::::
probability       G  49a:aa7:a7:6
matrix            T  1:::::::::11

[Information content plot: 0.0 to 2.1 bits per position; Relative Entropy (18.2 bits)]

Multilevel           AGGAGGGAGGAG
consensus            G     C  A A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
6843                        220  7.46e-08 GGGCGATGGA AGGAGGGAGGAG GAGAAAACAT
22394                       181  1.14e-07 AGGTTTTGGT GGGAGGGAGGAA GGGGGACGCA
7519                          4  2.73e-07        GAC GGGAGGGAGAAG GATTGGAGCT
35107                       306  5.88e-07 GGCTTTCTGG GCGAGGGAGGAG AGGTTCCACA
10673                       309  2.33e-06 TGGTTGGGTT TGGAGGGAGGTG TATCTCTGTC
264582                       35  2.93e-06 GTGGGATTCA AGGCGGCAGGAA ACAGGGGAGA
9771                        208  3.23e-06 ATTTGATTGT AGGAGGCAGAAT ATTGATCTAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6843                              7.5e-08  219_[+2]_269
22394                             1.1e-07  180_[+2]_308
7519                              2.7e-07  3_[+2]_485
35107                             5.9e-07  305_[+2]_183
10673                             2.3e-06  308_[+2]_180
264582                            2.9e-06  34_[+2]_454
9771                              3.2e-06  207_[+2]_281
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=7
6843                     (  220) AGGAGGGAGGAG  1
22394                    (  181) GGGAGGGAGGAA  1
7519                     (    4) GGGAGGGAGAAG  1
35107                    (  306) GCGAGGGAGGAG  1
10673                    (  309) TGGAGGGAGGTG  1
264582                   (   35) AGGCGGCAGGAA  1
9771                     (  208) AGGAGGCAGAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 10.9793 E= 1.4e+002
    71   -945     89    -94
  -945    -70    188   -945
  -945   -945    211   -945
   171    -70   -945   -945
  -945   -945    211   -945
  -945   -945    211   -945
  -945     30    162   -945
   193   -945   -945   -945
  -945   -945    211   -945
    13   -945    162   -945
   171   -945   -945    -94
    13   -945    130    -94
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 1.4e+002
 0.428571  0.000000  0.428571  0.142857
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.285714  0.714286  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.285714  0.000000  0.714286  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.285714  0.000000  0.571429  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AG]GGAGG[GC]AG[GA]A[GA]
--------------------------------------------------------------------------------

Time  6.59 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =  14  llr = 146  E-value = 1.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1::::22::44:
pos.-specific     C  :8::9:14::3:
probability       G  12a::7:6a::5
matrix            T  8::a117::645

[Information content plot: 0.0 to 2.1 bits per position; Relative Entropy (15.1 bits)]

Multilevel           TCGTCGTGGTAG
consensus             G   AAC ATT
sequence                       C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
5284                         48  4.98e-07 ACCACTCTCC TCGTCGTCGTCG CGACCTGTTG
22394                       460  4.98e-07 TACCTTTATT TCGTCGTCGTTG CAAAGCAGCC
1161                        194  1.66e-06 CCGCCATCGT TCGTCATGGTAG AGATGCACAT
21750                       127  2.00e-06 ATTAGTGTTC TCGTCGAGGTCT TGGAGTTCGT
1412                        252  2.95e-06 TGTATGAAAA GCGTCGTGGTAT ATTGTACTCA
9771                        114  4.56e-06 CGCATCATCA TCGTCGAGGATT TTCACCTGGT
9162                        297  4.96e-06 GGCACATTTC GCGTCGTGGAAG TTTGGTTGTG
35107                       154  7.04e-06 GTCGTCAGCG ACGTCGTGGTAG TAGTAGCACC
6843                        172  8.61e-06 TTGGGCAAGC TCGTCGCCGTCG GAGGGAAGGG
4507                         80  9.30e-06 CTGTTCTCTT TCGTCTTGGTAT CCTCGACGTG
7778                        105  1.74e-05 GCAAAAAGTG TGGTTGTGGTTG GCTGACAAGG
10673                        92  2.15e-05 GGCGAACTGG TGGTCGACGACT TGGGCATGTT
264582                       83  2.44e-05 TCGATTGAAA TGGTCATCGATT TCTCCTTTCA
11896                        67  3.70e-05 AGAGTATGTA TCGTTATCGATT ACTTTCGCCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5284                                5e-07  47_[+3]_441
22394                               5e-07  459_[+3]_29
1161                              1.7e-06  193_[+3]_295
21750                               2e-06  126_[+3]_362
1412                              2.9e-06  251_[+3]_237
9771                              4.6e-06  113_[+3]_375
9162                                5e-06  296_[+3]_192
35107                               7e-06  153_[+3]_335
6843                              8.6e-06  171_[+3]_317
4507                              9.3e-06  79_[+3]_409
7778                              1.7e-05  104_[+3]_384
10673                             2.1e-05  91_[+3]_397
264582                            2.4e-05  82_[+3]_406
11896                             3.7e-05  66_[+3]_422
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=14
5284                     (   48) TCGTCGTCGTCG  1
22394                    (  460) TCGTCGTCGTTG  1
1161                     (  194) TCGTCATGGTAG  1
21750                    (  127) TCGTCGAGGTCT  1
1412                     (  252) GCGTCGTGGTAT  1
9771                     (  114) TCGTCGAGGATT  1
9162                     (  297) GCGTCGTGGAAG  1
35107                    (  154) ACGTCGTGGTAG  1
6843                     (  172) TCGTCGCCGTCG  1
4507                     (   80) TCGTCTTGGTAT  1
7778                     (  105) TGGTTGTGGTTG  1
10673                    (   92) TGGTCGACGACT  1
264582                   (   83) TGGTCATCGATT  1
11896                    (   67) TCGTTATCGATT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 9291 bayes= 10.5957 E= 1.6e+000
  -187  -1045    -70    152
 -1045    176    -11  -1045
 -1045  -1045    211  -1045
 -1045  -1045  -1045    187
 -1045    188  -1045    -94
   -29  -1045    162   -194
   -29   -170  -1045    138
 -1045     88    130  -1045
 -1045  -1045    211  -1045
    45  -1045  -1045    123
    45     30  -1045     38
 -1045  -1045    111     87
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 1.6e+000
 0.071429  0.000000  0.142857  0.785714
 0.000000  0.785714  0.214286  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.857143  0.000000  0.142857
 0.214286  0.000000  0.714286  0.071429
 0.214286  0.071429  0.000000  0.714286
 0.000000  0.428571  0.571429  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.357143  0.000000  0.000000  0.642857
 0.357143  0.285714  0.000000  0.357143
 0.000000  0.000000  0.500000  0.500000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[CG]GTC[GA][TA][GC]G[TA][ATC][GT]
--------------------------------------------------------------------------------

Time  9.61 secs.
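
The nonzero log-odds entries above are consistent with 100 × log2(p / background), rounded to an integer, where p is the matching entry of the probability matrix. The following is a minimal Python sketch under that assumption, using the background frequencies printed in the command line summary. The function name `log_odds_row` is illustrative. Zero probabilities are returned as `None` because the floor values reported for them (-945, -1045) depend on details this sketch does not model.

```python
import math

# Background letter frequencies, copied from the command line summary.
BACKGROUND = {"A": 0.262, "C": 0.232, "G": 0.232, "T": 0.274}


def log_odds_row(probs, background=BACKGROUND, scale=100):
    """Convert one probability-matrix row (A, C, G, T order) to scaled log-odds.

    Assumes score = scale * log2(p / background). Zero probabilities yield
    None, since the reported floor values are not reproduced here.
    """
    row = []
    for letter, p in zip("ACGT", probs):
        if p == 0.0:
            row.append(None)
        else:
            row.append(round(scale * math.log2(p / background[letter])))
    return row


if __name__ == "__main__":
    # First row of the Motif 3 probability matrix.
    # Reported log-odds row for comparison: -187 -1045 -70 152
    print(log_odds_row([0.071429, 0.000000, 0.142857, 0.785714]))
```

Because the background frequencies are printed to only three decimals, a recomputed score may differ from the reported one by a unit.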
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10673                            4.21e-05  91_[+3(2.15e-05)]_205_\
                                           [+2(2.33e-06)]_151_[+1(5.88e-05)]_8
1161                             1.70e-02  193_[+3(1.66e-06)]_24_\
                                           [+3(2.61e-05)]_259
11650                            1.78e-01  254_[+2(7.71e-05)]_234
11896                            1.30e-05  66_[+3(3.70e-05)]_303_\
                                           [+1(2.42e-08)]_98
1412                             1.40e-02  251_[+3(2.95e-06)]_237
21750                            5.02e-07  126_[+3(2.00e-06)]_43_\
                                           [+1(5.35e-09)]_298
22394                            5.95e-07  180_[+2(1.14e-07)]_267_\
                                           [+3(4.98e-07)]_29
24615                            5.71e-05  458_[+1(3.38e-09)]_21
264582                           6.33e-04  34_[+2(2.93e-06)]_36_[+3(2.44e-05)]_\
                                           406
30913                            1.18e-04  243_[+1(3.38e-09)]_25_\
                                           [+1(9.71e-06)]_190
35107                            7.99e-05  153_[+3(7.04e-06)]_140_\
                                           [+2(5.88e-07)]_183
4507                             3.85e-11  79_[+3(9.30e-06)]_333_\
                                           [+1(1.25e-13)]_55
5284                             7.85e-03  47_[+3(4.98e-07)]_441
6843                             2.37e-10  171_[+3(8.61e-06)]_36_\
                                           [+2(7.46e-08)]_42_[+1(7.53e-09)]_2_[+1(6.16e-05)]_183
7519                             3.29e-03  3_[+2(2.73e-07)]_70_[+2(2.82e-05)]_\
                                           403
7700                             8.86e-01  500
7778                             8.01e-02  104_[+3(1.74e-05)]_384
9162                             2.66e-02  296_[+3(4.96e-06)]_192
9771                             4.66e-08  113_[+3(4.56e-06)]_82_\
                                           [+2(3.23e-06)]_234_[+1(9.96e-08)]_26
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
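
Each block diagram in this report alternates spacer lengths with bracketed sites (strand, motif number, and in the combined diagrams a p-value), joined by underscores. Spacers plus motif widths add up to the sequence length, 500 here. The following is a minimal Python sketch of a parser for that notation. It assumes continuation lines ending in a backslash have already been joined, and that motif widths are taken from the MOTIF header lines. The names `SITE`, `MOTIF_WIDTHS`, and `parse_diagram` are illustrative and not part of MEME.

```python
import re

# A diagram element is either a spacer length ("79") or a site such as
# "[+1(1.25e-13)]"; the p-value in parentheses is optional.
SITE = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")

# Motif widths, copied from the MOTIF header lines of this report.
MOTIF_WIDTHS = {1: 21, 2: 12, 3: 12}


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for a block diagram string.

    Each site is a tuple (start, strand, motif, p_value) with a 1-based
    start; p_value is None when the diagram does not carry one.
    """
    sites = []
    position = 0
    for token in diagram.split("_"):
        if not token:
            continue
        match = SITE.fullmatch(token)
        if match is None:
            position += int(token)  # spacer of that many letters
        else:
            strand, motif, p_value = match.groups()
            motif = int(motif)
            sites.append((position + 1, strand, motif,
                          float(p_value) if p_value else None))
            position += widths[motif]
    return sites, position


if __name__ == "__main__":
    # Combined diagram for sequence 4507, with the continuation line joined.
    sites, length = parse_diagram("79_[+3(9.30e-06)]_333_[+1(1.25e-13)]_55")
    for site in sites:
        print(site)
    print("sequence length:", length)
```

For sequence 4507 the recovered starts, 80 for motif 3 and 425 for motif 1, agree with the Start column of the per-motif site tables.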