********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/375/375.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10441                    1.0000    500  1863                     1.0000    500
20600                    1.0000    500  22634                    1.0000    500
22648                    1.0000    500  22821                    1.0000    500
2331                     1.0000    500  24656                    1.0000    500
264413                   1.0000    500  264431                   1.0000    500
268206                   1.0000    500  5217                     1.0000    500
5688                     1.0000    500  7119                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/375/375.seqs.fa -oc motifs/375 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.259 C 0.229 G 0.230 T 0.282
Background letter frequencies (from dataset with add-one prior applied):
A 0.259 C 0.229 G 0.230 T 0.282
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 6 llr = 95 E-value = 1.8e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a2583:2a:a:7:75:
pos.-specific     C  :82:2a7:8:83a3:a
probability       G  :::25:2:2:2:::::
matrix            T  ::3:::::::::::5:

         bits    2.1      *      *  *
                 1.9 *    * * *  *  *
                 1.7 *    * * *  *  *
                 1.5 **   * **** *  *
Relative         1.3 ** * * **** *  *
Entropy          1.1 ** * * ******* *
(23.0 bits)      0.9 ** * ***********
                 0.6 ** *************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           ACAAGCCACACACAAC
consensus              T A      C CT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ----------------
22821                       439  1.60e-09 ACTCCATTAC ACTAGCCACACACATC AATACTATCA
5688                        405  2.95e-09 CGGCAGGCAG ACTAACCACACACAAC GCCTCACCTC
5217                        411  1.16e-08 CTTACAAATG ACAAACGACACACATC AAACACTCTC
2331                         47  3.44e-08 TCCTTTGGCA AAAAGCCACACCCCAC CGGACTCACA
24656                       443  1.12e-07 TATATTTACA ACCAGCAAGACACATC GCAGTCTTTT
10441                       462  2.15e-07 CTGCCGCATC ACAGCCCACAGCCCAC GGAGCAATCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
22821                             1.6e-09  438_[+1]_46
5688                              2.9e-09  404_[+1]_80
5217                              1.2e-08  410_[+1]_74
2331                              3.4e-08  46_[+1]_438
24656                             1.1e-07  442_[+1]_42
10441                             2.2e-07  461_[+1]_23
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=6
22821                    ( 439) ACTAGCCACACACATC  1
5688                     ( 405) ACTAACCACACACAAC  1
5217                     ( 411) ACAAACGACACACATC  1
2331                     (  47) AAAAGCCACACCCCAC  1
24656                    ( 443) ACCAGCAAGACACATC  1
10441                    ( 462) ACAGCCCACAGCCCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 10.5908 E= 1.8e+000
   195   -923   -923   -923
   -63    186   -923   -923
    95    -46   -923     24
   168   -923    -47   -923
    36    -46    112   -923
  -923    213   -923   -923
   -63    154    -47   -923
   195   -923   -923   -923
  -923    186    -47   -923
   195   -923   -923   -923
  -923    186    -47   -923
   136     54   -923   -923
  -923    213   -923   -923
   136     54   -923   -923
    95   -923   -923     83
  -923    213   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 1.8e+000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.500000  0.166667  0.000000  0.333333
 0.833333  0.000000  0.166667  0.000000
 0.333333  0.166667  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.666667  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
AC[AT]A[GA]CCACAC[AC]C[AC][AT]C
--------------------------------------------------------------------------------

Time 1.93 secs.
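
Note on the relative entropy figure: the "(23.0 bits)" value in the Motif 1
description can be approximately reproduced from the Motif 1 letter-probability
matrix and the background letter frequencies in the command line summary. The
Python sketch below is illustrative only and is not part of MEME. The names
BACKGROUND, MOTIF1_PPM, column_information and motif_information are made up
for this example, and it assumes the relative entropy is the sum over columns
of p * log2(p / background). Because the frequencies printed in this file are
rounded, the result should only be expected to agree to within about 0.1 bit.

```python
import math

# Background letter frequencies as printed in this report (order A, C, G, T).
BACKGROUND = [0.259, 0.229, 0.230, 0.282]

# Motif 1 letter-probability matrix, copied from this report.
MOTIF1_PPM = [
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.166667, 0.833333, 0.000000, 0.000000],
    [0.500000, 0.166667, 0.000000, 0.333333],
    [0.833333, 0.000000, 0.166667, 0.000000],
    [0.333333, 0.166667, 0.500000, 0.000000],
    [0.000000, 1.000000, 0.000000, 0.000000],
    [0.166667, 0.666667, 0.166667, 0.000000],
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.000000, 0.833333, 0.166667, 0.000000],
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.000000, 0.833333, 0.166667, 0.000000],
    [0.666667, 0.333333, 0.000000, 0.000000],
    [0.000000, 1.000000, 0.000000, 0.000000],
    [0.666667, 0.333333, 0.000000, 0.000000],
    [0.500000, 0.000000, 0.000000, 0.500000],
    [0.000000, 1.000000, 0.000000, 0.000000],
]


def column_information(column, background):
    """Relative entropy (bits) of one motif column against the background."""
    # Zero-probability letters contribute nothing to the sum.
    return sum(p * math.log2(p / b) for p, b in zip(column, background) if p > 0)


def motif_information(ppm, background):
    """Sum of the per-column relative entropies over the whole motif."""
    return sum(column_information(column, background) for column in ppm)


total_bits = motif_information(MOTIF1_PPM, BACKGROUND)
print(f"Motif 1 relative entropy: {total_bits:.2f} bits")
```

The per-column values from column_information are also what the star diagram
in the Motif 1 description visualizes: fully conserved columns reach about
1.9 to 2.1 bits, while mixed columns such as position 3 stay below 0.6 bits.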
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 13 llr = 129 E-value = 2.9e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::617a21:a
pos.-specific     C  :22::8::2:::
probability       G  4:1a213:388:
matrix            T  687:2:::322:

         bits    2.1    *
                 1.9    *   *   *
                 1.7    *   *   *
                 1.5    *   *   *
Relative         1.3    * * *  **
Entropy          1.1 ** * *** ***
(14.4 bits)      0.9 **** *** ***
                 0.6 ******** ***
                 0.4 ******** ***
                 0.2 ******** ***
                 0.0 ------------

Multilevel           TTTGACAAGGGA
consensus            GCC G G T T
sequence                     C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
5688                        213  2.61e-07 GCGATGTGCA TTTGACAACGGA GCTGTGGCGA
22648                       457  6.89e-07 GCCCAACATT TTTGGCAAGGGA GTTGCAACAT
24656                       356  1.41e-06 ATTTGTCGTA TCTGACAATGGA CTCTGCCAAC
264413                      105  2.56e-06 TGCGGTGTCT TTTGTCAATGGA GGATTCCGTA
264431                        3  4.71e-06         TC GTCGACGACGGA CGGGAGAGGA
2331                        109  8.56e-06 CTGATACGTC TCCGACGAGGGA TCGTATTGGA
20600                       211  1.26e-05 AGCCGGGTGA GTTGGCAAGTGA TGTATCCTGT
10441                       121  1.56e-05 AATCAACAAG TTTGACAAAAGA ATGAATATTG
7119                        264  1.85e-05 ATACATTTGC GTCGACGATGTA TTGGAGTGAG
5217                        321  3.11e-05 TGTGTCCCCT TTGGACAATTGA GTCTGTTTTG
1863                        444  3.58e-05 TAGATATCAC TTTGAAAACGTA GTGGGCTGAC
22821                       201  3.84e-05 TACCGATGTG GCTGTCGAAGGA CGCTACAGCC
22634                         4  6.19e-05        TGT GTTGGGAAGGTA GACATCAGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5688                              2.6e-07  212_[+2]_276
22648                             6.9e-07  456_[+2]_32
24656                             1.4e-06  355_[+2]_133
264413                            2.6e-06  104_[+2]_384
264431                            4.7e-06  2_[+2]_486
2331                              8.6e-06  108_[+2]_380
20600                             1.3e-05  210_[+2]_278
10441                             1.6e-05  120_[+2]_368
7119                              1.9e-05  263_[+2]_225
5217                              3.1e-05  320_[+2]_168
1863                              3.6e-05  443_[+2]_45
22821                             3.8e-05  200_[+2]_288
22634                             6.2e-05  3_[+2]_485
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=13
5688                     ( 213) TTTGACAACGGA  1
22648                    ( 457) TTTGGCAAGGGA  1
24656                    ( 356) TCTGACAATGGA  1
264413                   ( 105) TTTGTCAATGGA  1
264431                   (   3) GTCGACGACGGA  1
2331                     ( 109) TCCGACGAGGGA  1
20600                    ( 211) GTTGGCAAGTGA  1
10441                    ( 121) TTTGACAAAAGA  1
7119                     ( 264) GTCGACGATGTA  1
5217                     ( 321) TTGGACAATTGA  1
1863                     ( 444) TTTGAAAACGTA  1
22821                    ( 201) GCTGTCGAAGGA  1
22634                    (   4) GTTGGGAAGGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 8.95154 E= 2.9e+000
 -1035  -1035     74    113
 -1035      1  -1035    145
 -1035      1   -158    130
 -1035  -1035    212  -1035
   125  -1035      0    -87
  -175    189   -158  -1035
   142  -1035     42  -1035
   195  -1035  -1035  -1035
   -75      1     42     13
  -175  -1035    174    -87
 -1035  -1035    174    -29
   195  -1035  -1035  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 2.9e+000
 0.000000  0.000000  0.384615  0.615385
 0.000000  0.230769  0.000000  0.769231
 0.000000  0.230769  0.076923  0.692308
 0.000000  0.000000  1.000000  0.000000
 0.615385  0.000000  0.230769  0.153846
 0.076923  0.846154  0.076923  0.000000
 0.692308  0.000000  0.307692  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.153846  0.230769  0.307692  0.307692
 0.076923  0.000000  0.769231  0.153846
 0.000000  0.000000  0.769231  0.230769
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[TG][TC][TC]G[AG]C[AG]A[GTC]G[GT]A
--------------------------------------------------------------------------------

Time 3.81 secs.
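
Note on the regular expression: the Motif 2 regular expression is a consensus
summary, so it need not match every site listed for the motif. The Python
sketch below is illustrative only and is not part of MEME. It uses the
standard re module to test the 13 Motif 2 sites from the BLOCKS section
against the expression. The names MOTIF2_REGEX, MOTIF2_SITES and
sites_matching are made up for this example.

```python
import re

# Motif 2 regular expression, copied from this report.
MOTIF2_REGEX = re.compile(r"[TG][TC][TC]G[AG]C[AG]A[GTC]G[GT]A")

# Motif 2 sites keyed by sequence name, copied from the BLOCKS section.
MOTIF2_SITES = {
    "5688": "TTTGACAACGGA",
    "22648": "TTTGGCAAGGGA",
    "24656": "TCTGACAATGGA",
    "264413": "TTTGTCAATGGA",
    "264431": "GTCGACGACGGA",
    "2331": "TCCGACGAGGGA",
    "20600": "GTTGGCAAGTGA",
    "10441": "TTTGACAAAAGA",
    "7119": "GTCGACGATGTA",
    "5217": "TTGGACAATTGA",
    "1863": "TTTGAAAACGTA",
    "22821": "GCTGTCGAAGGA",
    "22634": "GTTGGGAAGGTA",
}


def sites_matching(pattern, sites):
    """Return the names of the sites that the pattern matches in full."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


matched = sites_matching(MOTIF2_REGEX, MOTIF2_SITES)
print(f"{len(matched)} of {len(MOTIF2_SITES)} sites match: {matched}")
```

Only 6 of the 13 sites match in full. Each of the others has at least one
letter that occurs too rarely at its position to appear in the expression,
so scanning new sequences with the expression alone would miss sites that
the scoring matrix can still recognize.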
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 10 llr = 141 E-value = 4.0e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :a:28255:42a257269:a8
pos.-specific     C  5:22212185::5112::5:1
probability       G  2:74:61:218:241:415:1
matrix            T  3:12:124::::1:16:::::

         bits    2.1
                 1.9  *         *       *
                 1.7  *         *       *
                 1.5  *      *  *     * *
Relative         1.3  *  *   * **     * *
Entropy          1.1  *  *   * **    *****
(20.3 bits)      0.9  ** *   * **    *****
                 0.6 *** *  ***** ********
                 0.4 *** ** ***** ********
                 0.2 *** *****************
                 0.0 ---------------------

Multilevel           CAGGAGAACCGACAATAACAA
consensus            T CACACTGAA AG AG G
sequence             G  C  T     G  C
                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
268206                      382  4.65e-10 TTCAAACGGA TAGTAGAACCGAGAATAAGAA TGTGGAAATC
5688                        148  1.38e-09 TAGACTACCT CATGAGAACCGACAACAACAA CTGATGTTCC
22634                       170  5.37e-08 ACACTGATAC CAGCACCACAGAGGACAACAA TATTCCTCTT
1863                        402  1.26e-07 GTCAATGCAA TACACACTCCGACAATAACAA ATACTGACCA
7119                        315  1.78e-07 TATCGAACGT CAGGAGAACAGATGATGGGAG TTGCAATCAC
20600                       289  2.92e-07 GATTATATAT GAGGAGAAGAGACGTTGACAC GGTTGCAACG
22821                        56  3.98e-07 ACAGCAGCAG TAGGATTTGCAACGATGAGAA CTGCCTCATG
5217                        199  5.34e-07 ATTGCAAGTG CAGCAGTCCAGAACGTAACAA AGATCTTCAT
22648                       166  6.16e-07 ACTATGTCGG GAGTCAATCCAAAAAAAAGAA CCTCGCCTCC
264431                      288  9.83e-07 GCGGGAGATA CACAAGGTCGGACACAGAGAA GTGGTGATGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
268206                            4.6e-10  381_[+3]_98
5688                              1.4e-09  147_[+3]_332
22634                             5.4e-08  169_[+3]_310
1863                              1.3e-07  401_[+3]_78
7119                              1.8e-07  314_[+3]_165
20600                             2.9e-07  288_[+3]_191
22821                               4e-07  55_[+3]_424
5217                              5.3e-07  198_[+3]_281
22648                             6.2e-07  165_[+3]_314
264431                            9.8e-07  287_[+3]_192
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=10
268206                   ( 382) TAGTAGAACCGAGAATAAGAA  1
5688                     ( 148) CATGAGAACCGACAACAACAA  1
22634                    ( 170) CAGCACCACAGAGGACAACAA  1
1863                     ( 402) TACACACTCCGACAATAACAA  1
7119                     ( 315) CAGGAGAACAGATGATGGGAG  1
20600                    ( 289) GAGGAGAAGAGACGTTGACAC  1
22821                    (  56) TAGGATTTGCAACGATGAGAA  1
5217                     ( 199) CAGCAGTCCAGAACGTAACAA  1
22648                    ( 166) GAGTCAATCCAAAAAAAAGAA  1
264431                   ( 288) CACAAGGTCGGACACAGAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6720 bayes= 9.64205 E= 4.0e+001
  -997    113    -20      9
   195   -997   -997   -997
  -997    -19    160   -149
   -37    -19     80    -49
   163    -19   -997   -997
   -37   -119    138   -149
    95    -19   -120    -49
    95   -119   -997     50
  -997    180    -20   -997
    63    113   -120   -997
   -37   -997    180   -997
   195   -997   -997   -997
   -37    113    -20   -149
    95   -119     80   -997
   143   -119   -120   -149
   -37    -19   -997    109
   121   -997     80   -997
   180   -997   -120   -997
  -997    113    112   -997
   195   -997   -997   -997
   163   -119   -120   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 4.0e+001
 0.000000  0.500000  0.200000  0.300000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.200000  0.700000  0.100000
 0.200000  0.200000  0.400000  0.200000
 0.800000  0.200000  0.000000  0.000000
 0.200000  0.100000  0.600000  0.100000
 0.500000  0.200000  0.100000  0.200000
 0.500000  0.100000  0.000000  0.400000
 0.000000  0.800000  0.200000  0.000000
 0.400000  0.500000  0.100000  0.000000
 0.200000  0.000000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.500000  0.200000  0.100000
 0.500000  0.100000  0.400000  0.000000
 0.700000  0.100000  0.100000  0.100000
 0.200000  0.200000  0.000000  0.600000
 0.600000  0.000000  0.400000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.000000  0.500000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.100000  0.100000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[CTG]A[GC][GACT][AC][GA][ACT][AT][CG][CA][GA]A[CAG][AG]A[TAC][AG]A[CG]AA
--------------------------------------------------------------------------------

Time 5.65 secs.
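
Note on the two Motif 3 matrices: the nonzero entries of the log-odds matrix
appear to be 100 * log2(p / background) for the matching entries of the
letter-probability matrix. The Python sketch below is illustrative only and is
not part of MEME. It checks that assumed relationship on the first and third
rows of Motif 3. The names BACKGROUND, PPM_ROWS, PSSM_ROWS and log_odds_row are
made up for this example. Zero-probability entries, which the log-odds matrix
reports as a large negative value (-997 here), are returned as None rather
than reproduced. Since the background frequencies printed in this file are
rounded, agreement is only expected to within a unit or so.

```python
import math

# Background letter frequencies as printed in this report (order A, C, G, T).
BACKGROUND = [0.259, 0.229, 0.230, 0.282]

# Rows 1 and 3 of the Motif 3 letter-probability matrix.
PPM_ROWS = [
    [0.000000, 0.500000, 0.200000, 0.300000],
    [0.000000, 0.200000, 0.700000, 0.100000],
]

# The same two rows of the Motif 3 log-odds matrix.
PSSM_ROWS = [
    [-997, 113, -20, 9],
    [-997, -19, 160, -149],
]


def log_odds_row(probabilities, background):
    """Return round(100 * log2(p / b)) per letter, or None where p is zero."""
    return [
        round(100 * math.log2(p / b)) if p > 0 else None
        for p, b in zip(probabilities, background)
    ]


for ppm_row, pssm_row in zip(PPM_ROWS, PSSM_ROWS):
    print("computed:", log_odds_row(ppm_row, BACKGROUND), "reported:", pssm_row)
```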
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10441                            5.21e-05  120_[+2(1.56e-05)]_329_\
    [+1(2.15e-07)]_23
1863                             6.22e-05  401_[+3(1.26e-07)]_21_\
    [+2(3.58e-05)]_45
20600                            7.13e-05  210_[+2(1.26e-05)]_66_\
    [+3(2.92e-07)]_191
22634                            4.21e-05  3_[+2(6.19e-05)]_154_[+3(5.37e-08)]_\
    310
22648                            8.31e-06  165_[+3(6.16e-07)]_270_\
    [+2(6.89e-07)]_20_[+2(3.11e-05)]
22821                            1.05e-09  55_[+3(3.98e-07)]_124_\
    [+2(3.84e-05)]_226_[+1(1.60e-09)]_46
2331                             9.53e-06  46_[+1(3.44e-08)]_46_[+2(8.56e-06)]_\
    380
24656                            4.38e-06  355_[+2(1.41e-06)]_75_\
    [+1(1.12e-07)]_42
264413                           4.20e-03  104_[+2(2.56e-06)]_384
264431                           9.91e-05  2_[+2(4.71e-06)]_273_[+3(9.83e-07)]_\
    192
268206                           6.22e-06  381_[+3(4.65e-10)]_98
5217                             7.11e-09  73_[+1(8.34e-05)]_109_\
    [+3(5.34e-07)]_101_[+2(3.11e-05)]_78_[+1(1.16e-08)]_74
5688                             8.58e-14  147_[+3(1.38e-09)]_44_\
    [+2(2.61e-07)]_180_[+1(2.95e-09)]_80
7119                             8.75e-05  263_[+2(1.85e-05)]_39_\
    [+3(1.78e-07)]_165
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
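
Note on reading the block diagrams: in a motif diagram, each plain number is a
run of unmatched positions and each [+N] or [+N(p-value)] block is one
occurrence of motif N on the given strand. The runs plus the motif widths
should therefore add up to the sequence length (500 for every sequence in this
training set). The Python sketch below is illustrative only and is not part of
MEME. The names MOTIF_WIDTHS and diagram_length are made up for this example,
and diagrams that are wrapped with a trailing backslash in this report are
assumed to have been joined into a single string first.

```python
import re

# Motif widths as reported in the MOTIF 1, 2 and 3 header lines of this report.
MOTIF_WIDTHS = {1: 16, 2: 12, 3: 21}

# One motif occurrence: strand sign, motif number, optional "(p-value)".
SITE_TOKEN = re.compile(r"\[[+-](\d+)(?:\([^)]*\))?\]")


def diagram_length(diagram, widths):
    """Total sequence length implied by a block diagram string."""
    total = 0
    for token in diagram.split("_"):
        site = SITE_TOKEN.fullmatch(token)
        if site:
            # A motif occurrence covers as many positions as the motif is wide.
            total += widths[int(site.group(1))]
        else:
            # Anything else is a count of unmatched positions.
            total += int(token)
    return total


# Combined diagram for sequence 5688, with the line continuation joined.
DIAGRAM_5688 = "147_[+3(1.38e-09)]_44_[+2(2.61e-07)]_180_[+1(2.95e-09)]_80"
print(diagram_length(DIAGRAM_5688, MOTIF_WIDTHS))
```

The same check works on the per-motif diagrams, for example 438_[+1]_46, and
on diagrams that end with a motif block, as the combined diagram for 22648
does.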