********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/35/35.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
20950                    1.0000    500  22378                    1.0000    500
23889                    1.0000    500  25100                    1.0000    500
25433                    1.0000    500  263428                   1.0000    500
263510                   1.0000    500  264391                   1.0000    500
29728                    1.0000    500  38139                    1.0000    500
3897                     1.0000    500  41655                    1.0000    500
4891                     1.0000    500  5513                     1.0000    500
6123                     1.0000    500  7401                     1.0000    500
799                      1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/35/35.seqs.fa -oc motifs/35 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 17 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8500 N= 17 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.255 C 0.230 G 0.241 T 0.274 Background letter frequencies (from dataset with add-one prior applied): A 0.255 C 0.230 G 0.241 T 0.274 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 11 llr = 145 E-value = 2.3e-004 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :94:75:9557159:1 pos.-specific C a:4a::9114355:89 probability G ::1::31:2::3::2: matrix T :12:32::31:1:1:: bits 2.1 * * 1.9 * * 1.7 * * * * 1.5 ** * ** *** Relative 1.3 ** * ** *** Entropy 1.1 ** ** ** * **** (19.0 bits) 0.8 ** ** ** * **** 0.6 ** ***** ** **** 0.4 ** ***** ******* 0.2 **************** 0.0 ---------------- Multilevel CAACAACAAAACCACC consensus C TG TCCGA sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 38139 472 5.77e-10 TAAAGCAACA CACCAACAAAACAACC GACTACGAAC 25433 471 5.77e-10 
TAAAGCAACA CACCAACAAAACAACC GACTACGAAC 799 319 2.48e-08 GAATCAGACG CAACAGCAACCCCACC ACCATCCACA 41655 106 2.14e-07 CCCATCTCTG CACCAACAAACCATCC ATCCGCCACT 263510 460 3.40e-07 GAGCTTGCTG CACCTGCACACCCACC AGAGGCAGTT 7401 219 5.19e-07 TCATCATCAA CAACAACATTAACACC TTTGCGACAG 25100 415 1.18e-06 CAGTACAACA CAACAACATCATCACA CTTTGCTCAT 29728 385 1.35e-06 ACCACGTGAT CAACAAGAACAGAAGC CGTCTCATCC 264391 34 1.54e-06 GTCGTAGTAG CAGCAGCAGCAGCAGC GGTGGTAGTA 22378 433 1.87e-06 CTACTTTTCA CTTCTTCATAACCACC GCCACGATAT 3897 248 4.37e-06 CTAGTGCTTT CATCTTCCGAAGAACC GGATCTCCAC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 38139 5.8e-10 471_[+1]_13 25433 5.8e-10 470_[+1]_14 799 2.5e-08 318_[+1]_166 41655 2.1e-07 105_[+1]_379 263510 3.4e-07 459_[+1]_25 7401 5.2e-07 218_[+1]_266 25100 1.2e-06 414_[+1]_70 29728 1.4e-06 384_[+1]_100 264391 1.5e-06 33_[+1]_451 22378 1.9e-06 432_[+1]_52 3897 4.4e-06 247_[+1]_237 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=11 38139 ( 472) CACCAACAAAACAACC 1 25433 ( 471) CACCAACAAAACAACC 1 799 ( 319) CAACAGCAACCCCACC 1 41655 ( 106) CACCAACAAACCATCC 1 263510 ( 460) CACCTGCACACCCACC 1 7401 ( 219) CAACAACATTAACACC 1 25100 ( 415) CAACAACATCATCACA 1 29728 ( 385) CAACAAGAACAGAAGC 1 264391 ( 34) CAGCAGCAGCAGCAGC 1 22378 ( 433) CTTCTTCATAACCACC 1 3897 ( 248) CATCTTCCGAAGAACC 1 // -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 8245 bayes= 10.5754 E= 2.3e-004 -1010 212 -1010 -1010 183 -1010 -1010 -159 51 66 -140 -59 -1010 212 -1010 -1010 151 -1010 -1010 0 109 -1010 18 -59 -1010 198 -140 -1010 183 -134 -1010 -1010 83 -134 -41 0 109 66 -1010 -159 151 24 -1010 -1010 -149 124 18 -159 83 124 -1010 -1010 183 -1010 -1010 -159 -1010 183 -41 -1010 -149 198 -1010 -1010 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 2.3e-004 0.000000 1.000000 0.000000 0.000000 0.909091 0.000000 0.000000 0.090909 0.363636 0.363636 0.090909 0.181818 0.000000 1.000000 0.000000 0.000000 0.727273 0.000000 0.000000 0.272727 0.545455 0.000000 0.272727 0.181818 0.000000 0.909091 0.090909 0.000000 0.909091 0.090909 0.000000 0.000000 0.454545 0.090909 0.181818 0.272727 0.545455 0.363636 0.000000 0.090909 0.727273 0.272727 0.000000 0.000000 0.090909 0.545455 0.272727 0.090909 0.454545 0.545455 0.000000 0.000000 0.909091 0.000000 0.000000 0.090909 0.000000 0.818182 0.181818 0.000000 0.090909 0.909091 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- CA[AC]C[AT][AG]CA[AT][AC][AC][CG][CA]ACC -------------------------------------------------------------------------------- Time 2.72 secs. 
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 16 sites = 13 llr = 162 E-value = 9.8e-006 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :5218::72:4::5:5 pos.-specific C :11111:22:35:1a1 probability G a2:8:2a11a22a::4 matrix T :28:18::5:23:4:: bits 2.1 * * * * * 1.9 * * * * * 1.7 * * * * * 1.5 * * * * * Relative 1.3 * ** * * * * Entropy 1.1 * ** * * * * (18.0 bits) 0.8 * ****** * * * 0.6 * ****** * ***** 0.4 * ****** * ***** 0.2 **************** 0.0 ---------------- Multilevel GATGATGATGACGACA consensus G CA CT T G sequence T C -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 38139 333 7.69e-09 ACCACCTAAG GTTGATGATGACGACG GAGCGTTGCC 25433 332 7.69e-09 ACCACCTAAG GTTGATGATGACGACG GAGCGTTGCC 3897 172 1.12e-08 TGCAGACACG GATGATGATGAGGACA GATCAACTTC 41655 366 1.01e-07 ACGTCAGATT GGTGATGAAGCTGTCA GAGTTGAACC 20950 107 3.77e-07 CGTCTCCGAC GATGTTGAAGCCGACG ACGTGGCGGA 263510 191 5.06e-07 TTTGGGCTTG GATAATGATGCTGTCA AGTCACAGTT 23889 5 5.06e-07 CTTT GATGACGAAGATGACG AACTCAATGG 264391 130 7.29e-07 GTTGCCGTTT GTTGAGGACGTCGTCA GGTCACCATC 22378 301 1.53e-06 TCCTCTTTAT GGACATGATGACGTCA TTGAGGCATT 4891 143 3.34e-06 TTCATTTGAC GCCGATGCCGCCGACG AAGCCATCGT 5513 66 6.47e-06 TGGACTGTCG GATGATGCGGTGGCCA TGTTGTTGTT 25100 91 7.22e-06 TTGCTCGGCT GATGCTGGCGGTGTCA TTCCCTCATT 6123 86 8.88e-06 CGATATGATG GGAGAGGCTGGCGACC 
ATTGGTAGTC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 38139 7.7e-09 332_[+2]_152 25433 7.7e-09 331_[+2]_153 3897 1.1e-08 171_[+2]_313 41655 1e-07 365_[+2]_119 20950 3.8e-07 106_[+2]_378 263510 5.1e-07 190_[+2]_294 23889 5.1e-07 4_[+2]_480 264391 7.3e-07 129_[+2]_355 22378 1.5e-06 300_[+2]_184 4891 3.3e-06 142_[+2]_342 5513 6.5e-06 65_[+2]_419 25100 7.2e-06 90_[+2]_394 6123 8.9e-06 85_[+2]_399 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=16 seqs=13 38139 ( 333) GTTGATGATGACGACG 1 25433 ( 332) GTTGATGATGACGACG 1 3897 ( 172) GATGATGATGAGGACA 1 41655 ( 366) GGTGATGAAGCTGTCA 1 20950 ( 107) GATGTTGAAGCCGACG 1 263510 ( 191) GATAATGATGCTGTCA 1 23889 ( 5) GATGACGAAGATGACG 1 264391 ( 130) GTTGAGGACGTCGTCA 1 22378 ( 301) GGACATGATGACGTCA 1 4891 ( 143) GCCGATGCCGCCGACG 1 5513 ( 66) GATGATGCGGTGGCCA 1 25100 ( 91) GATGCTGGCGGTGTCA 1 6123 ( 86) GGAGAGGCTGGCGACC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 8245 bayes= 9.8378 E= 9.8e-006 -1035 -1035 205 -1035 85 -158 -6 -25 -73 -158 -1035 149 -173 -158 181 -1035 173 -158 -1035 -183 -1035 -158 -65 149 -1035 -1035 205 -1035 144 0 -164 -1035 -15 0 -164 75 -1035 -1035 205 -1035 59 42 -65 -83 -1035 123 -65 17 -1035 
-1035 205 -1035 108 -158 -1035 49 -1035 212 -1035 -1035 108 -158 67 -1035 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 9.8e-006 0.000000 0.000000 1.000000 0.000000 0.461538 0.076923 0.230769 0.230769 0.153846 0.076923 0.000000 0.769231 0.076923 0.076923 0.846154 0.000000 0.846154 0.076923 0.000000 0.076923 0.000000 0.076923 0.153846 0.769231 0.000000 0.000000 1.000000 0.000000 0.692308 0.230769 0.076923 0.000000 0.230769 0.230769 0.076923 0.461538 0.000000 0.000000 1.000000 0.000000 0.384615 0.307692 0.153846 0.153846 0.000000 0.538462 0.153846 0.307692 0.000000 0.000000 1.000000 0.000000 0.538462 0.076923 0.000000 0.384615 0.000000 1.000000 0.000000 0.000000 0.538462 0.076923 0.384615 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- G[AGT]TGATG[AC][TAC]G[AC][CT]G[AT]C[AG] -------------------------------------------------------------------------------- Time 5.29 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 6 llr = 117 E-value = 6.4e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 5275:3:a2:::3::a22258 pos.-specific C 57:58:a:3a:3:aa:87:52 probability G :23:27:::::33::::2::: matrix T ::::::::5:a33:::::8:: bits 2.1 * * ** 1.9 ** ** *** 1.7 ** ** *** 1.5 * ** ** **** Relative 1.3 * ** ** **** * * Entropy 1.1 * ****** ** **** *** (28.1 bits) 0.8 ******** ** ******** 0.6 ******** ** ******** 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel ACAACGCATCTCACCACCTAA consensus C GC A C GG C sequence TT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 41655 55 1.87e-11 CACGTGTAGA ACAACGCACCTTGCCACCTCA CGATGAAGTC 23889 477 1.87e-11 CACGTGTAGA ACAACGCACCTTGCCACCTCA CGA 38139 311 7.10e-11 TGCTATGTTC CCGCCACATCTCACCACCTAA GGTTGATGAT 25433 310 7.10e-11 TGCTATGTTC CCGCCACATCTCACCACCTAA GGTTGATGAT 7401 402 9.16e-09 CAGTGCGAAG CAACCGCATCTGTCCAAGAAA CTTGCTCGTT 799 70 1.28e-08 GCAAACAACA AGAAGGCAACTGTCCACATCC ACGAGGCAAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF 
DIAGRAM ------------- ---------------- ------------- 41655 1.9e-11 54_[+3]_425 23889 1.9e-11 476_[+3]_3 38139 7.1e-11 310_[+3]_169 25433 7.1e-11 309_[+3]_170 7401 9.2e-09 401_[+3]_78 799 1.3e-08 69_[+3]_410 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=6 41655 ( 55) ACAACGCACCTTGCCACCTCA 1 23889 ( 477) ACAACGCACCTTGCCACCTCA 1 38139 ( 311) CCGCCACATCTCACCACCTAA 1 25433 ( 310) CCGCCACATCTCACCACCTAA 1 7401 ( 402) CAACCGCATCTGTCCAAGAAA 1 799 ( 70) AGAAGGCAACTGTCCACATCC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 8160 bayes= 10.067 E= 6.4e-002 97 112 -923 -923 -61 153 -53 -923 138 -923 47 -923 97 112 -923 -923 -923 185 -53 -923 38 -923 147 -923 -923 212 -923 -923 197 -923 -923 -923 -61 53 -923 87 -923 212 -923 -923 -923 -923 -923 187 -923 53 47 28 38 -923 47 28 -923 212 -923 -923 -923 212 -923 -923 197 -923 -923 -923 -61 185 -923 -923 -61 153 -53 -923 -61 -923 -923 161 97 112 -923 -923 170 -46 -923 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 6.4e-002 0.500000 0.500000 0.000000 0.000000 0.166667 0.666667 0.166667 0.000000 0.666667 0.000000 0.333333 0.000000 0.500000 0.500000 0.000000 0.000000 0.000000 0.833333 0.166667 0.000000 0.333333 0.000000 
0.666667 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.166667 0.333333 0.000000 0.500000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.333333 0.333333 0.333333 0.333333 0.000000 0.333333 0.333333 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.166667 0.833333 0.000000 0.000000 0.166667 0.666667 0.166667 0.000000 0.166667 0.000000 0.000000 0.833333 0.500000 0.500000 0.000000 0.000000 0.833333 0.166667 0.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [AC]C[AG][AC]C[GA]CA[TC]CT[CGT][AGT]CCACCT[AC]A -------------------------------------------------------------------------------- Time 7.60 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 20950 5.96e-03 106_[+2(3.77e-07)]_378 22378 4.33e-05 300_[+2(1.53e-06)]_116_\ [+1(1.87e-06)]_52 23889 5.25e-10 4_[+2(5.06e-07)]_23_[+2(8.49e-05)]_\ 50_[+2(7.74e-05)]_351_[+3(1.87e-11)]_3 25100 5.35e-05 90_[+2(7.22e-06)]_308_\ [+1(1.18e-06)]_24_[+1(8.55e-05)]_30 25433 3.73e-17 309_[+3(7.10e-11)]_1_[+2(7.69e-09)]_\ 123_[+1(5.77e-10)]_14 263428 8.34e-01 500 263510 5.43e-06 190_[+2(5.06e-07)]_253_\ [+1(3.40e-07)]_25 264391 2.28e-05 
33_[+1(1.54e-06)]_80_[+2(7.29e-07)]_\ 355 29728 3.33e-03 384_[+1(1.35e-06)]_100 38139 3.73e-17 310_[+3(7.10e-11)]_1_[+2(7.69e-09)]_\ 123_[+1(5.77e-10)]_13 3897 6.64e-07 79_[+2(5.80e-05)]_76_[+2(1.12e-08)]_\ 60_[+1(4.37e-06)]_63_[+2(2.55e-06)]_3_[+2(9.34e-06)]_139 41655 3.42e-14 54_[+3(1.87e-11)]_30_[+1(2.14e-07)]_\ 50_[+1(3.17e-05)]_1_[+3(6.38e-05)]_156_[+2(1.01e-07)]_119 4891 2.92e-02 142_[+2(3.34e-06)]_342 5513 2.14e-02 65_[+2(6.47e-06)]_419 6123 8.26e-02 85_[+2(8.88e-06)]_399 7401 1.89e-07 218_[+1(5.19e-07)]_167_\ [+3(9.16e-09)]_78 799 1.98e-08 69_[+3(1.28e-08)]_58_[+3(8.19e-06)]_\ 149_[+1(2.48e-08)]_166 -------------------------------------------------------------------------------- ******************************************************************************** ******************************************************************************** Stopped because nmotifs = 3 reached. ******************************************************************************** CPU: seaotter.hsd1.wa.comcast.net ********************************************************************************