********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/330/330.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9653                     1.0000    500  42730                    1.0000    500
42848                    1.0000    500  48219                    1.0000    500
14899                    1.0000    500  16711                    1.0000    500
50310                    1.0000    500  800                      1.0000    500
44370                    1.0000    500  54476                    1.0000    500
12940                    1.0000    500  48761                    1.0000    500
36525                    1.0000    500  34751                    1.0000    500
45483                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/330/330.seqs.fa -oc motifs/330 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.232 G 0.224 T 0.273
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.232 G 0.224 T 0.273
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 12 sites = 14 llr = 151 E-value = 4.6e-007
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:61::a51:
pos.-specific     C  12:8:::4:114
probability       G  :8::19:6:426
matrix            T  9::23:a:::5:

         bits    2.2
                 1.9   *   * *
                 1.7   *  ** *
                 1.5 * *  ** *
Relative         1.3 **** ****
Entropy          1.1 **** ****  *
(15.6 bits)      0.9 **** ****  *
                 0.6 ********** *
                 0.4 ********** *
                 0.2 ************
                 0.0 ------------

Multilevel           TGACAGTGAATG
consensus             C TT  C GGC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
34751                       278  1.17e-07 AACAAAGCCT TGACAGTGAGTG AAGTCCCTCT
54476                       386  2.38e-07 TAAATCTGAC TGACAGTGAATC CTTGCTCACA
14899                        77  1.46e-06 GGGAAATGGT TGACAGTGAGCC CCCAGGATAA
42848                        19  2.17e-06 GATCTACGAC TGACAGTCAAGC AATGCCTTGG
36525                       388  2.54e-06 TATTGAAGGT TGATAGTGAGTC AATAGAGCGA
48219                       236  2.69e-06 AGATGCTGTC TGACAGTGACTG TGGTCCTCAT
42730                       230  2.94e-06 AGTCCTTTAT TCACAGTCAATG CCAATGTCTT
45483                       181  3.25e-06 AAGCCCTGGT TGACTGTGAGCG CTCATGCGGG
12940                        70  4.25e-06 TTGATGTCAA TCACAGTCAGTC AAGCATAAAA
48761                        45  7.54e-06 TGAGTCCCCT TGACGGTGAGGG AATACGTCTG
50310                       160  7.54e-06 GGTGAAGTGA TCACTGTCAATG CCGTCGAGTC
16711                        97  1.77e-05 CTTCCCTGAC TGATTGTGAAAC AAATAGCTGA
44370                       375  2.21e-05 CGTGGTGGAG TGATTGTCAAAG GCCAAGGCGT
800                         116  4.98e-05 GATTCGATTC CGACAATGAAGG AATTAGGACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34751                             1.2e-07  277_[+1]_211
54476                             2.4e-07  385_[+1]_103
14899                             1.5e-06  76_[+1]_412
42848                             2.2e-06  18_[+1]_470
36525                             2.5e-06  387_[+1]_101
48219                             2.7e-06  235_[+1]_253
42730                             2.9e-06  229_[+1]_259
45483                             3.2e-06  180_[+1]_308
12940                             4.2e-06  69_[+1]_419
48761                             7.5e-06  44_[+1]_444
50310                             7.5e-06  159_[+1]_329
16711                             1.8e-05  96_[+1]_392
44370                             2.2e-05  374_[+1]_114
800                                 5e-05  115_[+1]_373
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=14
34751                    (  278) TGACAGTGAGTG  1
54476                    (  386) TGACAGTGAATC  1
14899                    (   77) TGACAGTGAGCC  1
42848                    (   19) TGACAGTCAAGC  1
36525                    (  388) TGATAGTGAGTC  1
48219                    (  236) TGACAGTGACTG  1
42730                    (  230) TCACAGTCAATG  1
45483                    (  181) TGACTGTGAGCG  1
12940                    (   70) TCACAGTCAGTC  1
48761                    (   45) TGACGGTGAGGG  1
50310                    (  160) TCACTGTCAATG  1
16711                    (   97) TGATTGTGAAAC  1
44370                    (  375) TGATTGTCAAAG  1
800                      (  116) CGACAATGAAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 10.2544 E= 4.6e-007
 -1045   -170  -1045    177
 -1045    -11    181  -1045
   188  -1045  -1045  -1045
 -1045    176  -1045    -35
   125  -1045   -165      7
  -192  -1045    205  -1045
 -1045  -1045  -1045    187
 -1045     62    152  -1045
   188  -1045  -1045  -1045
    88   -170     93  -1045
   -92    -70     -7     87
 -1045     88    135  -1045
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 4.6e-007
 0.000000  0.071429  0.000000  0.928571
 0.000000  0.214286  0.785714  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.785714  0.000000  0.214286
 0.642857  0.000000  0.071429  0.285714
 0.071429  0.000000  0.928571  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.357143  0.642857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.071429  0.428571  0.000000
 0.142857  0.142857  0.214286  0.500000
 0.000000  0.428571  0.571429  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
T[GC]A[CT][AT]GT[GC]A[AG][TG][GC]
--------------------------------------------------------------------------------

Time 1.93 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 21 sites = 9 llr = 141 E-value = 8.7e-002
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  7::11679323::42:a137:
pos.-specific     C  37a:311:6328::7::9::a
probability       G  :::92321:1317:17::73:
matrix            T  :3::3:::131136:3:::::

         bits    2.2   *                 *
                 1.9   *             *   *
                 1.7   **            *   *
                 1.5   **   *        **  *
Relative         1.3   **   *        **  *
Entropy          1.1 ****   *   **  ******
(22.6 bits)      0.9 ****   *   **********
                 0.6 **** ****  **********
                 0.4 **** ****  **********
                 0.2 ********* ***********
                 0.0 ---------------------

Multilevel           ACCGCAAACCACGTCGACGAC
consensus            CT  TGG ATG TAAT  AG
sequence                 G    AC

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
36525                       350  8.94e-10 AACGCAAACG ACCGCAAAATACTTCGACAAC AGCATAATAT
12940                       423  1.39e-09 TTCCATTGCT ACCGTAGACAGCGTAGACGAC CACTCTCGCA
800                         135  1.59e-09 AGGAATTAGG ACCGGAAACGGCGAAGACGAC GAGTATATTG
48219                       330  6.29e-09 CTCTCGGTCT CTCGTGAACACCGACGACAAC CATTCATAAG
16711                       248  2.85e-08 CCAATTGTAT ATCGCGCACCACTTCTACGGC CTCTAAGACC
48761                       182  3.11e-08 TCTGGGTTGA ACCGGAAGACATGTCGACGAC GTGACGTACC
42848                       315  1.73e-07 AGTGACATTG ACCGTGAACTTGGAGTACGGC AGCCAGTCTA
45483                       423  2.92e-07 GAAGTCGCCA CTCGCCAAATCCGACGAAAGC GCATCATCGA
9653                        288  3.79e-07 TGAAGTTGAC CCCAAAGATCGCTTCTACGAC GACAAGCTGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36525                             8.9e-10  349_[+2]_130
12940                             1.4e-09  422_[+2]_57
800                               1.6e-09  134_[+2]_345
48219                             6.3e-09  329_[+2]_150
16711                             2.8e-08  247_[+2]_232
48761                             3.1e-08  181_[+2]_298
42848                             1.7e-07  314_[+2]_165
45483                             2.9e-07  422_[+2]_57
9653                              3.8e-07  287_[+2]_192
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=9
36525                    (  350) ACCGCAAAATACTTCGACAAC  1
12940                    (  423) ACCGTAGACAGCGTAGACGAC  1
800                      (  135) ACCGGAAACGGCGAAGACGAC  1
48219                    (  330) CTCGTGAACACCGACGACAAC  1
16711                    (  248) ATCGCGCACCACTTCTACGGC  1
48761                    (  182) ACCGGAAGACATGTCGACGAC  1
42848                    (  315) ACCGTGAACTTGGAGTACGGC  1
45483                    (  423) CTCGCCAAATCCGACGAAAGC  1
9653                     (  288) CCCAAAGATCGCTTCTACGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 10.4909 E= 8.7e-002
   130     52   -982   -982
  -982    152   -982     29
  -982    211   -982   -982
  -128   -982    199   -982
  -128     52     -1     29
   103   -106     57   -982
   130   -106     -1   -982
   171   -982   -101   -982
    30    126   -982   -129
   -29     52   -101     29
    30     -6     57   -129
  -982    174   -101   -129
  -982   -982    157     29
    71   -982   -982    103
   -29    152   -101   -982
  -982   -982    157     29
   188   -982   -982   -982
  -128    194   -982   -982
    30   -982    157   -982
   130   -982     57   -982
  -982    211   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 8.7e-002
 0.666667  0.333333  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.111111  0.333333  0.222222  0.333333
 0.555556  0.111111  0.333333  0.000000
 0.666667  0.111111  0.222222  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.333333  0.555556  0.000000  0.111111
 0.222222  0.333333  0.111111  0.333333
 0.333333  0.222222  0.333333  0.111111
 0.000000  0.777778  0.111111  0.111111
 0.000000  0.000000  0.666667  0.333333
 0.444444  0.000000  0.000000  0.555556
 0.222222  0.666667  0.111111  0.000000
 0.000000  0.000000  0.666667  0.333333
 1.000000  0.000000  0.000000  0.000000
 0.111111  0.888889  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[AC][CT]CG[CTG][AG][AG]A[CA][CTA][AGC]C[GT][TA][CA][GT]AC[GA][AG]C
--------------------------------------------------------------------------------

Time 4.09 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 15 sites = 7 llr = 96 E-value = 1.3e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a::1::344a69:
pos.-specific     C  :9:391:a3:4::::
probability       G  a1:3::::14::41a
matrix            T  :::417a:311::::

         bits    2.2 *      *      *
                 1.9 * *   **   *  *
                 1.7 * *   **   *  *
                 1.5 *** * **   *  *
Relative         1.3 *** * **   * **
Entropy          1.1 *** * **   ****
(19.8 bits)      0.9 *** ****   ****
                 0.6 *** **** * ****
                 0.4 ******** ******
                 0.2 ******** ******
                 0.0 ---------------

Multilevel           GCATCTTCAAAAAAG
consensus               C    CGC G
sequence                G    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
9653                        478  8.29e-10 CTCCTCGAGA GCATCTTCCGCAAAG AAGTCTGG
14899                       218  4.61e-08 TGTATATTTT GCAGCTTCCGAAGAG CGTGGCCAAC
34751                       260  2.46e-07 CCTTCGCCAC GCACCCTCAACAAAG CCTTGACAGT
48219                       213  2.46e-07 GAAAGCTACG GCAGCTTCAATAGAG ATGCTGTCTG
800                         389  4.06e-07 TTTCCGAAAA GGACCTTCTGAAAAG TGCAGAGCTT
50310                        80  9.83e-07 CGTTGCAAGC GCATTTTCTTCAGAG AATCCACAGT
45483                       444  1.50e-06 CGACGAAAGC GCATCATCGAAAAGG GGGGAACATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9653                              8.3e-10  477_[+3]_8
14899                             4.6e-08  217_[+3]_268
34751                             2.5e-07  259_[+3]_226
48219                             2.5e-07  212_[+3]_273
800                               4.1e-07  388_[+3]_97
50310                             9.8e-07  79_[+3]_406
45483                             1.5e-06  443_[+3]_42
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=7
9653                     (  478) GCATCTTCCGCAAAG  1
14899                    (  218) GCAGCTTCCGAAGAG  1
34751                    (  260) GCACCCTCAACAAAG  1
48219                    (  213) GCAGCTTCAATAGAG  1
800                      (  389) GGACCTTCTGAAAAG  1
50310                    (   80) GCATTTTCTTCAGAG  1
45483                    (  444) GCATCATCGAAAAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 7290 bayes= 9.86668 E= 1.3e+002
  -945   -945    216   -945
  -945    188    -65   -945
   188   -945   -945   -945
  -945     30     35     65
  -945    188   -945    -93
   -92    -70   -945    139
  -945   -945   -945    187
  -945    211   -945   -945
     8     30    -65      7
    66   -945     93    -93
    66     88   -945    -93
   188   -945   -945   -945
   107   -945     93   -945
   166   -945    -65   -945
  -945   -945    216   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 1.3e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.857143  0.142857  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.285714  0.285714  0.428571
 0.000000  0.857143  0.000000  0.142857
 0.142857  0.142857  0.000000  0.714286
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.285714  0.285714  0.142857  0.285714
 0.428571  0.000000  0.428571  0.142857
 0.428571  0.428571  0.000000  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.571429  0.000000  0.428571  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
GCA[TCG]CTTC[ACT][AG][AC]A[AG]AG
--------------------------------------------------------------------------------

Time 5.83 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9653                             7.12e-09  287_[+2(3.79e-07)]_169_\
    [+3(8.29e-10)]_8
42730                            8.17e-03  229_[+1(2.94e-06)]_259
42848                            1.27e-05  18_[+1(2.17e-06)]_225_\
    [+1(6.90e-05)]_47_[+2(1.73e-07)]_165
48219                            2.05e-10  212_[+3(2.46e-07)]_8_[+1(2.69e-06)]_\
    82_[+2(6.29e-09)]_150
14899                            1.28e-06  76_[+1(1.46e-06)]_73_[+1(1.77e-05)]_\
    44_[+3(4.61e-08)]_268
16711                            1.45e-05  96_[+1(1.77e-05)]_139_\
    [+2(2.85e-08)]_232
50310                            1.24e-04  79_[+3(9.83e-07)]_65_[+1(7.54e-06)]_\
    329
800                              1.36e-09  115_[+1(4.98e-05)]_7_[+2(1.59e-09)]_\
    233_[+3(4.06e-07)]_97
44370                            1.14e-01  374_[+1(2.21e-05)]_114
54476                            3.12e-03  385_[+1(2.38e-07)]_103
12940                            7.55e-08  69_[+1(4.25e-06)]_341_\
    [+2(1.39e-09)]_57
48761                            6.22e-06  44_[+1(7.54e-06)]_125_\
    [+2(3.11e-08)]_298
36525                            1.08e-07  234_[+1(9.48e-05)]_103_\
    [+2(8.94e-10)]_17_[+1(2.54e-06)]_101
34751                            1.93e-07  259_[+3(2.46e-07)]_3_[+1(1.17e-07)]_\
    75_[+1(5.30e-05)]_124
45483                            4.51e-08  180_[+1(3.25e-06)]_230_\
    [+2(2.92e-07)]_[+3(1.50e-06)]_42
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************