********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/42/42.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11777                    1.0000    500  13884                    1.0000    500
20564                    1.0000    500  20833                    1.0000    500
22725                    1.0000    500  25587                    1.0000    500
26051                    1.0000    500  260833                   1.0000    500
263707                   1.0000    500  268644                   1.0000    500
26931                    1.0000    500  273                      1.0000    500
33044                    1.0000    500  36576                    1.0000    500
40554                    1.0000    500  ThpsCp091                1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/42/42.seqs.fa -oc motifs/42 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 16 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8000 N= 16 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.284 C 0.243 G 0.216 T 0.256 Background letter frequencies (from dataset with add-one prior applied): A 0.284 C 0.243 G 0.217 T 0.256 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 20 sites = 12 llr = 162 E-value = 2.6e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A ::2:28:1:2:4325:3832 pos.-specific C :1123::3:::3::1::2:2 probability G :9544:37819377:87137 matrix T a:34238:381:12421:5: bits 2.2 2.0 * 1.8 ** * 1.5 ** * * Relative 1.3 ** * * * * Entropy 1.1 ** ** * * * (19.5 bits) 0.9 ** ****** ** *** * 0.7 ** * ****** ****** * 0.4 **** *************** 0.2 ******************** 0.0 -------------------- Multilevel TGGGGATGGTGAGGAGGATG consensus TTCTGCT GA T A A sequence C G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 40554 256 5.51e-11 AGGTTGAAGA 
TGTTGATGGTGGGGAGGAGG TTGCTGTGTA
260833                      227  2.61e-09 AGGAGGGACA TGGTTATGGAGAGGTGGAGG TGGTGTACCG
20833                       294  5.47e-08 TCAACGTTAA TGGTAATGTTGAAGATGATG GAAGTGCTCA
26051                       379  1.35e-07 TGTTGTCCTA TGTGGATAGTGAGGAGACAG CGAGACCCTC
268644                      226  1.96e-07 ATGGAGATGC TGCCGATGTGGCGGTGGATG ACGAGACGGA
273                         155  2.35e-07 AACGTTTGGC TCGGCAGCGTGAAGAGGAGG AACGTCTTGC
13884                       255  3.06e-07 TGCGAGAGGG TGGGGTGCGTGCGTAGGAAC GAGAAAGGGG
263707                      229  5.07e-07 GGCTGGGTTT TGGTTTTGGTGCATTGTATG ATTTTCATGT
11777                        60  1.10e-06 GCTTCGTTTA TGACCATGGTTGGGTGAATA GCCGAGCACC
36576                        14  1.46e-06 CAGAGCCTCT TGTGCTTGGTGAGATTGCTA GTGAGAACAT
33044                       191  1.68e-06 AGACGAGAGA TGGGGAGCTTGGTGCGGGTG ATGTGTGATC
26931                       405  2.19e-06 TGGTGAACTC TGATAATGGAGGGAAGAAAC GACAACGCCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40554                             5.5e-11  255_[+1]_225
260833                            2.6e-09  226_[+1]_254
20833                             5.5e-08  293_[+1]_187
26051                             1.3e-07  378_[+1]_102
268644                              2e-07  225_[+1]_255
273                               2.4e-07  154_[+1]_326
13884                             3.1e-07  254_[+1]_226
263707                            5.1e-07  228_[+1]_252
11777                             1.1e-06  59_[+1]_421
36576                             1.5e-06  13_[+1]_467
33044                             1.7e-06  190_[+1]_290
26931                             2.2e-06  404_[+1]_76
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=12
40554                    (  256) TGTTGATGGTGGGGAGGAGG  1
260833                   (  227) TGGTTATGGAGAGGTGGAGG  1
20833                    (  294) TGGTAATGTTGAAGATGATG  1
26051                    (  379) TGTGGATAGTGAGGAGACAG  1
268644                   (  226) TGCCGATGTGGCGGTGGATG  1
273                      (  155) TCGGCAGCGTGAAGAGGAGG  1
13884                    (  255) TGGGGTGCGTGCGTAGGAAC  1
263707                   (  229) TGGTTTTGGTGCATTGTATG  1
11777                    (   60) TGACCATGGTTGGGTGAATA  1
36576                    (   14) TGTGCTTGGTGAGATTGCTA  1
33044                    (  191) TGGGGAGCTTGGTGCGGGTG  1
26931                    (  405) TGATAATGGAGGGAAGAAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 7696 bayes= 9.77074 E= 2.6e-001
 -1023  -1023  -1023    196
 -1023   -154    208  -1023
   -77   -154    121     -3
 -1023    -55     94     70
   -77      4     94    -62
   140  -1023  -1023     -3
 -1023  -1023     21    155
  -177      4    162  -1023
 -1023  -1023    179     -3
   -77  -1023   -138    155
 -1023  -1023    208   -162
    55      4     62  -1023
   -18  -1023    162   -162
   -77  -1023    162    -62
    81   -154  -1023     70
 -1023  -1023    194    -62
   -18  -1023    162   -162
   140    -55   -138  -1023
   -18  -1023     21     97
   -77    -55    162  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 12 E= 2.6e-001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.083333  0.916667  0.000000
 0.166667  0.083333  0.500000  0.250000
 0.000000  0.166667  0.416667  0.416667
 0.166667  0.250000  0.416667  0.166667
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.000000  0.250000  0.750000
 0.083333  0.250000  0.666667  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.166667  0.000000  0.083333  0.750000
 0.000000  0.000000  0.916667  0.083333
 0.416667  0.250000  0.333333  0.000000
 0.250000  0.000000  0.666667  0.083333
 0.166667  0.000000  0.666667  0.166667
 0.500000  0.083333  0.000000  0.416667
 0.000000  0.000000  0.833333  0.166667
 0.250000  0.000000  0.666667  0.083333
 0.750000  0.166667  0.083333  0.000000
 0.250000  0.000000  0.250000  0.500000
 0.166667  0.166667  0.666667  0.000000
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- TG[GT][GT][GC][AT][TG][GC][GT]TG[AGC][GA]G[AT]G[GA]A[TAG]G -------------------------------------------------------------------------------- Time 2.08 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 9 llr = 121 E-value = 5.9e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 8:88:42999:8a:6a pos.-specific C 1722347:::21:14: probability G :3::7:111181:9:: matrix T 1::::1:::::::::: bits 2.2 2.0 1.8 ** * 1.5 ** * Relative 1.3 * **** ** * Entropy 1.1 **** **** ** * (19.4 bits) 0.9 ***** ********** 0.7 ***** ********** 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel ACAAGACAAAGAAGAA consensus GCCCCA C C sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 268644 334 1.87e-08 GTACAGAAGA AGCAGCCAAAGAAGAA GAAGCAGCTC 40554 112 3.13e-08 AACTAAAGGC TCAAGCCAAAGAAGCA GCCAAAAAGG 26931 171 3.75e-08 TTCATACACA ACAAGACGAAGAAGAA TGCGTGCTAT 273 365 2.27e-07 CGTAAGAAGC AGAAGCAAAAGGAGAA AGTATGTACC 11777 28 3.40e-07 CTCTCGATAA ACACGTAAAAGAAGCA GCTTAAGCTT 263707 48 4.50e-07 CTGAGATACC ACAACACAAGCAAGCA 
CTCTCCCAAT
36576                       259  5.92e-07 CAGCTGAACA ACAACAGAGAGAAGAA GTCAAATGAT
20564                       417  1.32e-06 GCCCAGGCAA CGCAGCCAAAGCAGCA ACAGTAGAGG
260833                      440  1.53e-06 ACGTCCGTGG ACACCACAAACAACAA CTCTTCTCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
268644                            1.9e-08  333_[+2]_151
40554                             3.1e-08  111_[+2]_373
26931                             3.7e-08  170_[+2]_314
273                               2.3e-07  364_[+2]_120
11777                             3.4e-07  27_[+2]_457
263707                            4.5e-07  47_[+2]_437
36576                             5.9e-07  258_[+2]_226
20564                             1.3e-06  416_[+2]_68
260833                            1.5e-06  439_[+2]_45
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=9
268644                   (  334) AGCAGCCAAAGAAGAA  1
40554                    (  112) TCAAGCCAAAGAAGCA  1
26931                    (  171) ACAAGACGAAGAAGAA  1
273                      (  365) AGAAGCAAAAGGAGAA  1
11777                    (   28) ACACGTAAAAGAAGCA  1
263707                   (   48) ACAACACAAGCAAGCA  1
36576                    (  259) ACAACAGAGAGAAGAA  1
20564                    (  417) CGCAGCCAAAGCAGCA  1
260833                   (  440) ACACCACAAACAACAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 9.88469 E= 5.9e+000
   145   -113   -982   -120
  -982    145     62   -982
   145    -13   -982   -982
   145    -13   -982   -982
  -982     45    162   -982
    64     87   -982   -120
   -35    145    -96   -982
   164   -982    -96   -982
   164   -982    -96   -982
   164   -982    -96   -982
  -982    -13    184   -982
   145   -113    -96   -982
   181   -982   -982   -982
  -982   -113    204   -982
    97     87   -982   -982
   181   -982   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 5.9e+000
 0.777778  0.111111  0.000000  0.111111
 0.000000  0.666667  0.333333  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.444444  0.444444  0.000000  0.111111
 0.222222  0.666667  0.111111  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.000000  0.222222  0.777778  0.000000
 0.777778  0.111111  0.111111  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.111111  0.888889  0.000000
 0.555556  0.444444  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
A[CG][AC][AC][GC][AC][CA]AAA[GC]AAG[AC]A
--------------------------------------------------------------------------------




Time  4.30 secs.

******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 14 sites = 13 llr = 141 E-value = 4.6e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :94:35:814382: pos.-specific C a:3855a:86228a probability G ::::21:2::1::: matrix T :1321:::2:41:: bits 2.2 2.0 * * * 1.8 * * * 1.5 ** * * Relative 1.3 ** * ** * Entropy 1.1 ** * *** ** (15.7 bits) 0.9 ** * **** *** 0.7 ** * ***** *** 0.4 **** ***** *** 0.2 ************** 0.0 -------------- Multilevel CAACCACACCTACC consensus C AC AA A sequence T C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------- 33044 359 4.92e-09 GTCCGACAGC CAACCCCACCTACC TCTACCTCCC 20564 390 2.56e-07 CAAGTCTATA CAACAACACCAACC AACGCCCAGG 263707 418 1.08e-06 CTCAGCCAAC CATCCCCATCAACC TCACAGTACA 13884 9 1.08e-06 CAATACTG CATCCCCACATAAC TTTCACTACT 25587 81 3.27e-06 TCTCACAAAA CACCGACGCCAACC TCAAGAGGTA 26051 447 3.91e-06 CCTGCTTTCC CACCACCGCACACC ACAACAACTC 26931 12 5.97e-06 CGACGATGGC CTCCCACACAAACC ACTCCTCACC 36576 469 6.50e-06 CTAACCTACA CATTCACATCTACC ATACTTCCTA 40554 391 8.25e-06 AACAAACAGG CAACTGCACCCACC GACGATGAAA 11777 155 9.66e-06 TTGAGAGTGT CAACGACAACCACC ACATCGTCCT 22725 483 1.12e-05 CTTCAACAGC CAACAACACATCAC AAAC 268644 424 1.21e-05 CTGTAGAAAC CACCCCCACCGCAC CCGGCGGAGG 20833 484 2.30e-05 CCTTTCTTCA CATTACCACATTCC AAG 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33044                             4.9e-09  358_[+3]_128
20564                             2.6e-07  389_[+3]_97
263707                            1.1e-06  417_[+3]_69
13884                             1.1e-06  8_[+3]_478
25587                             3.3e-06  80_[+3]_406
26051                             3.9e-06  446_[+3]_40
26931                               6e-06  11_[+3]_475
36576                             6.5e-06  468_[+3]_18
40554                             8.2e-06  390_[+3]_96
11777                             9.7e-06  154_[+3]_332
22725                             1.1e-05  482_[+3]_4
268644                            1.2e-05  423_[+3]_63
20833                             2.3e-05  483_[+3]_3
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=14 seqs=13
33044                    (  359) CAACCCCACCTACC  1
20564                    (  390) CAACAACACCAACC  1
263707                   (  418) CATCCCCATCAACC  1
13884                    (    9) CATCCCCACATAAC  1
25587                    (   81) CACCGACGCCAACC  1
26051                    (  447) CACCACCGCACACC  1
26931                    (   12) CTCCCACACAAACC  1
36576                    (  469) CATTCACATCTACC  1
40554                    (  391) CAACTGCACCCACC  1
11777                    (  155) CAACGACAACCACC  1
22725                    (  483) CAACAACACATCAC  1
268644                   (  424) CACCCCCACCGCAC  1
20833                    (  484) CATTACCACATTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 7792 bayes= 9.75619 E= 4.6e-001
 -1035    204  -1035  -1035
   170  -1035  -1035   -173
    44     34  -1035     27
 -1035    180  -1035    -73
    11     92    -49   -173
    70     92   -149  -1035
 -1035    204  -1035  -1035
   157  -1035    -49  -1035
  -188    166  -1035    -73
    44    134  -1035  -1035
    11     -8   -149     59
   144    -66  -1035   -173
   -30    166  -1035  -1035
 -1035    204  -1035  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 13 E= 4.6e-001
 0.000000  1.000000  0.000000  0.000000
 0.923077  0.000000  0.000000  0.076923
 0.384615  0.307692  0.000000  0.307692
 0.000000  0.846154  0.000000  0.153846
 0.307692  0.461538  0.153846  0.076923
 0.461538  0.461538  0.076923  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.846154  0.000000  0.153846  0.000000
 0.076923  0.769231  0.000000  0.153846
 0.384615  0.615385  0.000000  0.000000
 0.307692  0.230769  0.076923  0.384615
 0.769231  0.153846  0.000000  0.076923
 0.230769  0.769231  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
CA[ACT]C[CA][AC]CAC[CA][TAC]A[CA]C
--------------------------------------------------------------------------------




Time  6.35 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11777                            1.05e-07  27_[+2(3.40e-07)]_16_[+1(1.10e-06)]_\
    7_[+2(6.14e-05)]_52_[+3(9.66e-06)]_332
13884                            5.28e-06  8_[+3(1.08e-06)]_232_[+1(3.06e-07)]_\
    226
20564                            9.51e-06  389_[+3(2.56e-07)]_13_\
    [+2(1.32e-06)]_68
20833                            2.61e-05  293_[+1(5.47e-08)]_170_\
    [+3(2.30e-05)]_3
22725                            8.95e-02  358_[+3(2.87e-05)]_110_\
    [+3(1.12e-05)]_4
25587                            2.84e-02  80_[+3(3.27e-06)]_406
26051                            1.09e-05  378_[+1(1.35e-07)]_48_\
    [+3(3.91e-06)]_40
260833                           1.59e-08  226_[+1(2.61e-09)]_193_\
    [+2(1.53e-06)]_45
263707                           9.01e-09  47_[+2(4.50e-07)]_165_\
    [+1(5.07e-07)]_169_[+3(1.08e-06)]_69
268644                           1.84e-09  225_[+1(1.96e-07)]_88_\
    [+2(1.87e-08)]_8_[+2(7.60e-06)]_50_[+3(1.21e-05)]_63
26931                            1.69e-08  11_[+3(5.97e-06)]_64_[+2(3.93e-05)]_\
    10_[+2(5.52e-05)]_39_[+2(3.75e-08)]_218_[+1(2.19e-06)]_76
273                              2.30e-06  60_[+2(2.69e-05)]_78_[+1(2.35e-07)]_\
    190_[+2(2.27e-07)]_120
33044                            4.13e-07  190_[+1(1.68e-06)]_148_\
    [+3(4.92e-09)]_52_[+3(3.36e-05)]_62
36576                            1.57e-07  13_[+1(1.46e-06)]_225_\
    [+2(5.92e-07)]_194_[+3(6.50e-06)]_18
40554                            9.93e-13  111_[+2(3.13e-08)]_128_\
    [+1(5.51e-11)]_4_[+1(8.13e-07)]_91_[+3(8.25e-06)]_44_[+3(9.99e-05)]_38
ThpsCp091                        6.94e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************