********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/388/388.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10661                    1.0000    500  20707                    1.0000    500
21002                    1.0000    500  21586                    1.0000    500
22382                    1.0000    500  23510                    1.0000    500
23694                    1.0000    500  24204                    1.0000    500
24319                    1.0000    500  261437                   1.0000    500
262644                   1.0000    500  262924                   1.0000    500
264740                   1.0000    500  31762                    1.0000    500
36631                    1.0000    500  5580                     1.0000    500
8942                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/388/388.seqs.fa -oc motifs/388 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 17 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8500 N= 17 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.256 C 0.230 G 0.234 T 0.281 Background letter frequencies (from dataset with add-one prior applied): A 0.256 C 0.230 G 0.234 T 0.281 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 18 sites = 13 llr = 180 E-value = 2.4e-009 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 578:65865:67:86:95 pos.-specific C 52:632221a4:9:4a:4 probability G 122413:22::21::::: matrix T ::::::::2::2:2::11 bits 2.1 * * 1.9 * * 1.7 * * * 1.5 * * ** Relative 1.3 * * * ** ** Entropy 1.1 ** * ** ***** (20.0 bits) 0.8 **** * ******** 0.6 ***** ** ********* 0.4 ****************** 0.2 ****************** 0.0 ------------------ Multilevel AAACAAAAACAACAACAA consensus C GGCGCCG C C C sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------ 20707 476 2.06e-09 GAATACAACC AAACAACAACAACAACAC ACCGGCC 
24319                      469  1.58e-08 CGATAGACCC CAACGAAAACAACACCAC ACTACAACAT
5580                       459  3.80e-08 GAGCATCGAA CAAGAGAGGCCACAACAA ACGCTCTACA
23694                      411  4.28e-08 GCTTTGCTGA CAACAGACACAACAACAT AGCAACACAA
10661                      458  4.79e-08 GATCAATTCT ACACACACACAACACCAA CTATCAAGCA
262924                     248  6.70e-08 TCTAAAGACA AAAGCAAAACCACTCCAA TCACTCTCAT
31762                      174  7.45e-08 TTGTTTTGTA CAGCAACAACAGCAACAA CCTGATGGGC
36631                      342  1.13e-07 TTATCTGAAT CAACCGCATCCACAACAC CCTCCCAGTC
24204                      479  5.26e-07 TCACACTCAT AAAGCAAAACATCAACTC CAAC
8942                       215  5.66e-07 GACTCATGAG GAGGCCAAGCCACAACAA TTTGCTGTGA
264740                      58  1.49e-06 CGTTGTTCGG CGACAAAGTCAAGACCAA GATTCGTTCT
21586                      222  1.58e-06 CTCTCTCCCA ACACACAAGCCTCTCCAA CCCAACGATA
21002                       48  2.66e-06 TGCCCGACGG AGGGAGACCCAGCAACAC CGATTTGACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20707                             2.1e-09  475_[+1]_7
24319                             1.6e-08  468_[+1]_14
5580                              3.8e-08  458_[+1]_24
23694                             4.3e-08  410_[+1]_72
10661                             4.8e-08  457_[+1]_25
262924                            6.7e-08  247_[+1]_235
31762                             7.4e-08  173_[+1]_309
36631                             1.1e-07  341_[+1]_141
24204                             5.3e-07  478_[+1]_4
8942                              5.7e-07  214_[+1]_268
264740                            1.5e-06  57_[+1]_425
21586                             1.6e-06  221_[+1]_261
21002                             2.7e-06  47_[+1]_435
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=13
20707                    (  476) AAACAACAACAACAACAC  1
24319                    (  469) CAACGAAAACAACACCAC  1
5580                     (  459) CAAGAGAGGCCACAACAA  1
23694                    (  411) CAACAGACACAACAACAT  1
10661                    (  458) ACACACACACAACACCAA  1
262924                   (  248) AAAGCAAAACCACTCCAA  1
31762                    (  174) CAGCAACAACAGCAACAA  1
36631                    (  342) CAACCGCATCCACAACAC  1
24204                    (  479) AAAGCAAAACATCAACTC  1
8942                     (  215) GAGGCCAAGCCACAACAA  1
264740                   (   58) CGACAAAGTCAAGACCAA  1
21586                    (  222) ACACACAAGCCTCTCCAA  1
21002                    (   48) AGGGAGACCCAGCAACAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 8211 bayes= 9.83184 E= 2.4e-009
    85    101   -160  -1035
   144    -58    -60  -1035
   159  -1035     -2  -1035
 -1035    142     72  -1035
   127     42   -160  -1035
    85      1     39  -1035
   159      1  -1035  -1035
   127      1    -60  -1035
   107   -158     -2    -87
 -1035    212  -1035  -1035
   127     74  -1035  -1035
   144  -1035    -60    -87
 -1035    201   -160  -1035
   173  -1035  -1035    -87
   127     74  -1035  -1035
 -1035    212  -1035  -1035
   185  -1035  -1035   -187
   107     74  -1035   -187
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 13 E= 2.4e-009
 0.461538  0.461538  0.076923  0.000000
 0.692308  0.153846  0.153846  0.000000
 0.769231  0.000000  0.230769  0.000000
 0.000000  0.615385  0.384615  0.000000
 0.615385  0.307692  0.076923  0.000000
 0.461538  0.230769  0.307692  0.000000
 0.769231  0.230769  0.000000  0.000000
 0.615385  0.230769  0.153846  0.000000
 0.538462  0.076923  0.230769  0.153846
 0.000000  1.000000  0.000000  0.000000
 0.615385  0.384615  0.000000  0.000000
 0.692308  0.000000  0.153846  0.153846
 0.000000  0.923077  0.076923  0.000000
 0.846154  0.000000  0.000000  0.153846
 0.615385  0.384615  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.923077  0.000000  0.000000  0.076923
 0.538462  0.384615  0.000000  0.076923
--------------------------------------------------------------------------------
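
The non-zero entries of the Motif 1 log-odds matrix appear to track 100 * log2(p / background) computed from the probability matrix. The Python sketch below illustrates that relationship only; it is an assumption read off the printed numbers, not MEME's own code. The function name `approx_log_odds` is hypothetical, and because the background frequencies are printed to three decimals, results can differ from the report by about one unit.

```python
import math

# Background letter frequencies as printed in the COMMAND LINE SUMMARY
# (rounded to three decimals in the report).
BACKGROUND = {"A": 0.256, "C": 0.230, "G": 0.234, "T": 0.281}


def approx_log_odds(prob, background):
    """Return round(100 * log2(prob / background)), or None when prob is zero.

    Hypothetical helper: the report prints a large negative constant
    (-1035 for Motif 1) for zero probabilities, which this sketch does
    not try to reproduce.
    """
    if prob <= 0.0:
        return None
    return round(100 * math.log2(prob / background))


# First row of the Motif 1 probability matrix, in A, C, G, T order.
row = [0.461538, 0.461538, 0.076923, 0.000000]
scores = [approx_log_odds(p, BACKGROUND[b]) for p, b in zip(row, "ACGT")]
print(scores)
```

The corresponding row in the report is `85 101 -160 -1035`; the sketch should land within about one unit of the first three values.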
-------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [AC]A[AG][CG][AC][AGC][AC][AC][AG]C[AC]ACA[AC]CA[AC] -------------------------------------------------------------------------------- Time 2.65 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 13 llr = 133 E-value = 2.3e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 1:::::::1::2 pos.-specific C :51291:5:3:1 probability G ::73:9:28::8 matrix T 95251:a417a: bits 2.1 1.9 * * 1.7 *** * 1.5 * *** * Relative 1.3 * *** * * Entropy 1.1 ** *** **** (14.8 bits) 0.8 *** *** **** 0.6 *** ******** 0.4 ************ 0.2 ************ 0.0 ------------ Multilevel TTGTCGTCGTTG consensus CTG T C sequence C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 264740 31 6.39e-08 CCTATTTTGG TCGTCGTCGTTG TCGGTCGTTG 23694 476 6.39e-08 CTATATAATG TCGTCGTCGTTG GCAACGTCAT 5580 249 1.96e-06 TGTGAGTCAT TCTGCGTCGTTG TCTATGCAGG 21002 235 3.16e-06 ACGAGGATCA TCGGCGTGGCTG TTTTCTTCGT 23510 220 3.38e-06 GGACGCCATG TTGGCGTCGTTA TTTTTTCTGG 22382 217 5.48e-06 AGGTCGTCGT TCGTCGTCATTG TATCTCACGG 24319 42 6.66e-06 TTGAAACTCC TTCTCGTTGTTG GCCAAAAAAG 31762 120 1.22e-05 AGGAGAAGGG TTGCCGTCTTTG AGATGAAAGC 261437 96 1.89e-05 TCGTAAGGCC 
TTGTCCTGGTTG TGACGATATG
20707                      240  2.00e-05 CCCCTCCTTC TTTGCGTTGTTA TTGCCCACAC
262644                      25  2.24e-05 TCGGCTGGCA TTGCCGTTGCTC ATGCTCTGCA
8942                        63  2.85e-05 TATCAACTTT TTGCTGTTGCTG CTGCTGCTGT
36631                      393  3.76e-05 ATACATCGGT ACTTCGTTGCTG TGTCACTTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
264740                            6.4e-08  30_[+2]_458
23694                             6.4e-08  475_[+2]_13
5580                                2e-06  248_[+2]_240
21002                             3.2e-06  234_[+2]_254
23510                             3.4e-06  219_[+2]_269
22382                             5.5e-06  216_[+2]_272
24319                             6.7e-06  41_[+2]_447
31762                             1.2e-05  119_[+2]_369
261437                            1.9e-05  95_[+2]_393
20707                               2e-05  239_[+2]_249
262644                            2.2e-05  24_[+2]_464
8942                              2.9e-05  62_[+2]_426
36631                             3.8e-05  392_[+2]_96
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=13
264740                   (   31) TCGTCGTCGTTG  1
23694                    (  476) TCGTCGTCGTTG  1
5580                     (  249) TCTGCGTCGTTG  1
21002                    (  235) TCGGCGTGGCTG  1
23510                    (  220) TTGGCGTCGTTA  1
22382                    (  217) TCGTCGTCATTG  1
24319                    (   42) TTCTCGTTGTTG  1
31762                    (  120) TTGCCGTCTTTG  1
261437                   (   96) TTGTCCTGGTTG  1
20707                    (  240) TTTGCGTTGTTA  1
262644                   (   25) TTGCCGTTGCTC  1
8942                     (   63) TTGCTGTTGCTG  1
36631                    (  393) ACTTCGTTGCTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 10.4841 E= 2.3e+001
  -173  -1035  -1035    172
 -1035    101  -1035     94
 -1035   -158    156    -28
 -1035      1     39     72
 -1035    201  -1035   -187
 -1035   -158    198  -1035
 -1035  -1035  -1035    183
 -1035    101    -60     45
  -173  -1035    185   -187
 -1035     42  -1035    130
 -1035  -1035  -1035    183
   -73   -158    172  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 2.3e+001
 0.076923  0.000000  0.000000  0.923077
 0.000000  0.461538  0.000000  0.538462
 0.000000  0.076923  0.692308  0.230769
 0.000000  0.230769  0.307692  0.461538
 0.000000  0.923077  0.000000  0.076923
 0.000000  0.076923  0.923077  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.461538  0.153846  0.384615
 0.076923  0.000000  0.846154  0.076923
 0.000000  0.307692  0.000000  0.692308
 0.000000  0.000000  0.000000  1.000000
 0.153846  0.076923  0.769231  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
T[TC][GT][TGC]CGT[CT]G[TC]TG
--------------------------------------------------------------------------------




Time  5.85 secs.

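
The regular expression printed for Motif 2 can be applied directly to a sequence for a quick, coarse scan. The Python sketch below shows one way to do that with the standard `re` module; the helper name `find_sites` is hypothetical, and the example string is assembled from the flanks and site listed for sequence 264740 in the sorted-sites table. The expression is a simplification of the probability matrix, so some sites listed in the report (for example the one in 261437) do not match it.

```python
import re

# Regular expression printed for Motif 2 in this report.
MOTIF2_REGEX = "T[TC][GT][TGC]CGT[CT]G[TC]TG"


def find_sites(sequence, regex=MOTIF2_REGEX):
    """Return 1-based start positions of all matches, including overlapping ones."""
    # Wrapping the pattern in a lookahead lets overlapping hits be reported.
    pattern = re.compile("(?=(" + regex + "))")
    return [m.start() + 1 for m in pattern.finditer(sequence)]


# Left flank + site + right flank for sequence 264740, as listed above.
example = "CCTATTTTGG" + "TCGTCGTCGTTG" + "TCGGTCGTTG"
print(find_sites(example))         # prints [11]
print(find_sites("TTGTCCTGGTTG"))  # prints [] (site from 261437 is not matched)
```

For scoring that reflects the per-position probabilities, the log-odds matrix is the better starting point; the regular expression only records which letters are common at each position.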
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 5 llr = 94 E-value = 3.8e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 242822:a:8:8826:2:::: pos.-specific C :28::6a:8::22::248:a6 probability G 82::8:::22a::828422:4 matrix T :2:2:2::::::::2:::8:: bits 2.1 * * * 1.9 ** * * 1.7 ** * * 1.5 *** * * * Relative 1.3 * *** ******** * * * Entropy 1.1 * *** ******** * **** (27.1 bits) 0.8 * *** ******** * **** 0.6 * ******************* 0.4 * ******************* 0.2 * ******************* 0.0 --------------------- Multilevel GACAGCCACAGAAGAGCCTCC consensus ACATAA GG CCAGCGGG G sequence G T T A T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 36631 13 3.35e-10 GTCTTCTCTT GCCAGCCAGAGAAGAGACGCC CAAACAAAGG 23510 449 3.70e-10 TGGAATAGTT GACTACCACAGAAGGGGCTCC CATTGACACG 31762 351 5.34e-10 TTGACCTGTG GACAGACACAGACAAGCCTCG TCGTCTCAAC 10661 190 3.28e-09 AAACCCGTTT GGAAGCCACGGAAGTCCCTCC AAACATAGAG 20707 393 6.04e-09 AGTGCTTGAG ATCAGTCACAGCAGAGGGTCG GTTCATTCAC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- 
-------------
36631                             3.3e-10  12_[+3]_467
23510                             3.7e-10  448_[+3]_31
31762                             5.3e-10  350_[+3]_129
10661                             3.3e-09  189_[+3]_290
20707                               6e-09  392_[+3]_87
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=5
36631                    (   13) GCCAGCCAGAGAAGAGACGCC  1
23510                    (  449) GACTACCACAGAAGGGGCTCC  1
31762                    (  351) GACAGACACAGACAAGCCTCG  1
10661                    (  190) GGAAGCCACGGAAGTCCCTCC  1
20707                    (  393) ATCAGTCACAGCAGAGGGTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 10.9232 E= 3.8e+002
   -35   -897    177   -897
    65    -20    -23    -49
   -35    180   -897   -897
   164   -897   -897    -49
   -35   -897    177   -897
   -35    138   -897    -49
  -897    212   -897   -897
   197   -897   -897   -897
  -897    180    -23   -897
   164   -897    -23   -897
  -897   -897    209   -897
   164    -20   -897   -897
   164    -20   -897   -897
   -35   -897    177   -897
   123   -897    -23    -49
  -897    -20    177   -897
   -35     80     77   -897
  -897    180    -23   -897
  -897   -897    -23    151
  -897    212   -897   -897
  -897    138     77   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 3.8e+002
 0.200000  0.000000  0.800000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.200000  0.800000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.600000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.600000  0.000000  0.200000  0.200000
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.400000  0.400000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.600000  0.400000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[GA][ACGT][CA][AT][GA][CAT]CA[CG][AG]G[AC][AC][GA][AGT][GC][CGA][CG][TG]C[CG]
--------------------------------------------------------------------------------




Time  8.77 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10661                            1.94e-09  189_[+3(3.28e-09)]_247_\
                                           [+1(4.79e-08)]_25
20707                            1.46e-11  239_[+2(2.00e-05)]_141_\
                                           [+3(6.04e-09)]_62_[+1(2.06e-09)]_7
21002                            2.58e-05  47_[+1(2.66e-06)]_169_\
                                           [+2(3.16e-06)]_3_[+2(2.68e-05)]_239
21586                            8.33e-03  221_[+1(1.58e-06)]_261
22382                            4.37e-03  216_[+2(5.48e-06)]_272
23510                            6.90e-08  219_[+2(3.38e-06)]_217_\
                                           [+3(3.70e-10)]_31
23694                            6.18e-08  410_[+1(4.28e-08)]_47_\
                                           [+2(6.39e-08)]_13
24204                            1.81e-03  478_[+1(5.26e-07)]_4
24319                            3.92e-06  41_[+2(6.66e-06)]_415_\
                                           [+1(1.58e-08)]_14
261437                           2.71e-02  95_[+2(1.89e-05)]_393
262644                           1.23e-01  24_[+2(2.24e-05)]_464
262924                           4.90e-04  247_[+1(6.70e-08)]_235
264740                           7.57e-07  30_[+2(6.39e-08)]_15_[+1(1.49e-06)]_\
                                           425
31762                            2.72e-11  119_[+2(1.22e-05)]_42_\
                                           [+1(7.45e-08)]_159_[+3(5.34e-10)]_129
36631                            7.41e-11  12_[+3(3.35e-10)]_262_\
                                           [+1(7.80e-05)]_28_[+1(1.13e-07)]_33_[+2(3.76e-05)]_96
5580                             1.99e-07  248_[+2(1.96e-06)]_198_\
                                           [+1(3.80e-08)]_24
8942                             8.40e-05  62_[+2(2.85e-05)]_140_\
                                           [+1(5.66e-07)]_11_[+1(9.52e-05)]_239
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
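
The combined block diagrams encode each sequence as alternating spacer lengths and motif occurrences, so site start positions can be recovered by walking the diagram and adding up widths. The Python sketch below is one way to do this, under the assumption that spacers are plain integers and occurrences look like `[+2(2.00e-05)]`; the function name `parse_diagram` is hypothetical, and the motif widths are taken from the MOTIF 1 to 3 headers of this report.

```python
import re

# Motif widths from the "MOTIF n MEME width = ..." headers of this report.
WIDTHS = {1: 18, 2: 12, 3: 21}

# One motif occurrence: strand, motif number, optional p-value in parentheses.
SITE = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_diagram(diagram, widths=WIDTHS):
    """Return (sites, length): sites are (motif, strand, start, p_value), 1-based starts."""
    # Drop line-continuation backslashes and any whitespace around them.
    diagram = re.sub(r"[\\\s]", "", diagram)
    sites, pos = [], 0
    for token in diagram.split("_"):
        if not token:
            continue
        match = SITE.fullmatch(token)
        if match:
            motif = int(match.group(2))
            p_value = float(match.group(3)) if match.group(3) else None
            sites.append((motif, match.group(1), pos + 1, p_value))
            pos += widths[motif]
        else:
            pos += int(token)  # spacer: number of positions between occurrences
    return sites, pos


# Combined diagram for sequence 20707, with its continuation line joined.
sites, length = parse_diagram(
    "239_[+2(2.00e-05)]_141_\\ [+3(6.04e-09)]_62_[+1(2.06e-09)]_7"
)
print([(motif, start) for motif, _, start, _ in sites], length)
# prints [(2, 240), (3, 393), (1, 476)] 500
```

The recovered starts (240, 393, 476) agree with the Start column of the per-motif site tables for 20707, and the total matches the sequence length of 500 given in the training set.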