********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/319/319.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1526                     1.0000    500  16136                    1.0000    500
21866                    1.0000    500  23500                    1.0000    500
268202                   1.0000    500  31447                    1.0000    500
5101                     1.0000    500  5141                     1.0000    500
bd908                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/319/319.seqs.fa -oc motifs/319 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.240 C 0.222 G 0.268 T 0.270
Background letter frequencies (from dataset with add-one prior applied):
A 0.240 C 0.222 G 0.268 T 0.270
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 9 llr = 98 E-value = 2.4e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::::2:::1:7
pos.-specific     C  6::::19:669:
probability       G  1:a217::431:
matrix            T  3a:89:1a:::3

         bits    2.2
                 2.0  **    *
                 1.7  **   **  *
                 1.5  **   **  *
Relative         1.3  ** * **  *
Entropy          1.1  **** *** **
(15.8 bits)      0.9  **** *** **
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CTGTTGCTCCCA
consensus            T  G A  GG T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
bd908                       351  4.94e-08 TTGTTGTTGT CTGTTGCTCCCA TCGTGGCTTG
21866                       486  8.91e-07 CTCTCACTTC CTGTTCCTCCCA ACC
1526                        181  1.11e-06 TGTTGCACCG CTGGTGCTGCCA CCTGATTTGA
268202                       65  2.69e-06 GCTTTCAACG TTGTTGCTCACA TCTTTCGGAC
5141                        353  3.28e-06 GAGTGCGTTG TTGTTGCTGGCT GACTGCTGGT
23500                       238  7.48e-06 TCGGCCATTG CTGTTGCTGGGA GACGATGATG
16136                       363  1.06e-05 AAAGTTGCTT TTGTTGTTCCCT GTTCCTTTTC
5101                        387  1.37e-05 CGCTATCTCG CTGGTACTGGCT ATGAGCCAAG
31447                       337  2.36e-05 TAGAAATAAG GTGTGACTCCCA TGTTGGGATA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
bd908                             4.9e-08  350_[+1]_138
21866                             8.9e-07  485_[+1]_3
1526                              1.1e-06  180_[+1]_308
268202                            2.7e-06  64_[+1]_424
5141                              3.3e-06  352_[+1]_136
23500                             7.5e-06  237_[+1]_251
16136                             1.1e-05  362_[+1]_126
5101                              1.4e-05  386_[+1]_102
31447                             2.4e-05  336_[+1]_152
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
bd908                    (  351) CTGTTGCTCCCA  1
21866                    (  486) CTGTTCCTCCCA  1
1526                     (  181) CTGGTGCTGCCA  1
268202                   (   65) TTGTTGCTCACA  1
5141                     (  353) TTGTTGCTGGCT  1
23500                    (  238) CTGTTGCTGGGA  1
16136                    (  363) TTGTTGTTCCCT  1
5101                     (  387) CTGGTACTGGCT  1
31447                    (  337) GTGTGACTCCCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 9.0653 E= 2.4e+000
  -982    132   -127     31
  -982   -982   -982    189
  -982   -982    190   -982
  -982   -982    -27    153
  -982   -982   -127    172
   -11   -100    132   -982
  -982    200   -982   -128
  -982   -982   -982    189
  -982    132     73   -982
  -111    132     32   -982
  -982    200   -127   -982
   147   -982   -982     31
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 2.4e+000
 0.000000  0.555556  0.111111  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.000000  0.111111  0.888889
 0.222222  0.111111  0.666667  0.000000
 0.000000  0.888889  0.000000  0.111111
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.555556  0.444444  0.000000
 0.111111  0.555556  0.333333  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.666667  0.000000  0.000000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CT]TG[TG]T[GA]CT[CG][CG]C[AT]
--------------------------------------------------------------------------------

Time 0.84 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 6 llr = 74 E-value = 4.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  5:352::::a:2
pos.-specific     C  287:87:aa:a8
probability       G  :2:3::2:::::
matrix            T  3::2:38:::::

         bits    2.2        ** *
                 2.0        ****
                 1.7        ****
                 1.5  *  *  *****
Relative         1.3  ** * ******
Entropy          1.1  ** ********
(17.7 bits)      0.9  ** ********
                 0.7 *** ********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ACCACCTCCACC
consensus            T AG T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
268202                      353  2.48e-07 GGCCGACGCT TCAACCTCCACC CTCCAACTCC
bd908                       203  4.87e-07 CTCCATCGTA ACCAACTCCACC GATTGTTGTC
5141                         29  6.21e-07 AAGATAGACA ACATCCTCCACC TGAAGTAATT
1526                        404  1.03e-06 CGGTGGACGA CCCGCTTCCACC ACAACGACGT
21866                       245  1.79e-06 TTTTGACGCC ACCGCTGCCACC GCGGTTGATT
31447                       470  4.45e-06 CGGTGCACGT TGCACCTCCACA GACGATACGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
268202                            2.5e-07  352_[+2]_136
bd908                             4.9e-07  202_[+2]_286
5141                              6.2e-07  28_[+2]_460
1526                                1e-06  403_[+2]_85
21866                             1.8e-06  244_[+2]_244
31447                             4.5e-06  469_[+2]_19
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=6
268202                   (  353) TCAACCTCCACC  1
bd908                    (  203) ACCAACTCCACC  1
5141                     (   29) ACATCCTCCACC  1
1526                     (  404) CCCGCTTCCACC  1
21866                    (  245) ACCGCTGCCACC  1
31447                    (  470) TGCACCTCCACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 9.96467 E= 4.6e+001
   105    -42   -923     31
  -923    190    -68   -923
    47    158   -923   -923
   105   -923     32    -69
   -53    190   -923   -923
  -923    158   -923     31
  -923   -923    -68    163
  -923    217   -923   -923
  -923    217   -923   -923
   205   -923   -923   -923
  -923    217   -923   -923
   -53    190   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 4.6e+001
 0.500000  0.166667  0.000000  0.333333
 0.000000  0.833333  0.166667  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.500000  0.000000  0.333333  0.166667
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  0.166667  0.833333
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AT]C[CA][AG]C[CT]TCCACC
--------------------------------------------------------------------------------

Time 1.67 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 6 llr = 98 E-value = 7.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :a725235::2::2:2:22:
pos.-specific     C  8:37375288:82853a5:a
probability       G  2:::::2::2::::::::3:
matrix            T  :::222:32:828:55:35:

         bits    2.2                 *  *
                 2.0  *              *  *
                 1.7  *              *  *
                 1.5 **      ** * *  *  *
Relative         1.3 **      ******  *  *
Entropy          1.1 ***     ******* *  *
(23.7 bits)      0.9 **** *  ******* *  *
                 0.7 ****************** *
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CAACACCACCTCTCCTCCTC
consensus              C C AT      TC TG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
16136                       389  3.15e-12 CTTTTCAACA CAACACCACCTCTCCTCTTC GTCGTCCCAG
5101                        166  9.39e-09 CTTTCTGGTG CAACCCATCCATTCTCCCTC ATTTTCATCA
1526                        438  1.03e-08 CGTCGTCGTT CACTTCCTCCTCTCCTCATC GACCTACCCT
268202                      399  4.42e-08 ACCCTCGACA CACCCTCCCCTCCCTACCGC GAGGCCACTA
bd908                       224  6.54e-08 CGATTGTTGT CAACACGATGTCTACTCCGC TCGACGAGCA
21866                       449  7.87e-08 CCAACAGCCA GAAAAAAACCTCTCTCCTAC CCAGCCCCTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
16136                             3.2e-12  388_[+3]_92
5101                              9.4e-09  165_[+3]_315
1526                                1e-08  437_[+3]_43
268202                            4.4e-08  398_[+3]_82
bd908                             6.5e-08  223_[+3]_257
21866                             7.9e-08  448_[+3]_32
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=6
16136                    (  389) CAACACCACCTCTCCTCTTC  1
5101                     (  166) CAACCCATCCATTCTCCCTC  1
1526                     (  438) CACTTCCTCCTCTCCTCATC  1
268202                   (  399) CACCCTCCCCTCCCTACCGC  1
bd908                    (  224) CAACACGATGTCTACTCCGC  1
21866                    (  449) GAAAAAAACCTCTCTCCTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4329 bayes= 9.15128 E= 7.0e+001
  -923    190    -68   -923
   205   -923   -923   -923
   147     58   -923   -923
   -53    158   -923    -69
   105     58   -923    -69
   -53    158   -923    -69
    47    117    -68   -923
   105    -42   -923     31
  -923    190   -923    -69
  -923    190    -68   -923
   -53   -923   -923    163
  -923    190   -923    -69
  -923    -42   -923    163
   -53    190   -923   -923
  -923    117   -923     89
   -53     58   -923     89
  -923    217   -923   -923
   -53    117   -923     31
   -53   -923     32     89
  -923    217   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 7.0e+001
 0.000000  0.833333  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.166667  0.666667  0.000000  0.166667
 0.500000  0.333333  0.000000  0.166667
 0.166667  0.666667  0.000000  0.166667
 0.333333  0.500000  0.166667  0.000000
 0.500000  0.166667  0.000000  0.333333
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.833333  0.166667  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.166667  0.333333  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.500000  0.000000  0.333333
 0.166667  0.000000  0.333333  0.500000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CA[AC]C[AC]C[CA][AT]CCTCTC[CT][TC]C[CT][TG]C
--------------------------------------------------------------------------------

Time 2.40 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1526                             5.42e-10  110_[+1(2.50e-05)]_58_\
    [+1(1.11e-06)]_211_[+2(1.03e-06)]_22_[+3(1.03e-08)]_43
16136                            5.18e-10  362_[+1(1.06e-05)]_14_\
    [+3(3.15e-12)]_92
21866                            4.86e-09  244_[+2(1.79e-06)]_192_\
    [+3(7.87e-08)]_17_[+1(8.91e-07)]_3
23500                            6.93e-02  237_[+1(7.48e-06)]_251
268202                           1.28e-09  64_[+1(2.69e-06)]_276_\
    [+2(2.48e-07)]_15_[+2(3.32e-05)]_7_[+3(4.42e-08)]_31_[+3(2.12e-05)]_31
31447                            7.14e-04  336_[+1(2.36e-05)]_121_\
    [+2(4.45e-06)]_19
5101                             9.71e-07  165_[+3(9.39e-09)]_201_\
    [+1(1.37e-05)]_102
5141                             1.39e-05  28_[+2(6.21e-07)]_312_\
    [+1(3.28e-06)]_136
bd908                            8.34e-11  202_[+2(4.87e-07)]_9_[+3(6.54e-08)]_\
    107_[+1(4.94e-08)]_138
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************