********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/285/285.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
14708                    1.0000    500  2332                     1.0000    500
24989                    1.0000    500  27236                    1.0000    500
337                      1.0000    500  34197                    1.0000    500
5039                     1.0000    500  6258                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/285/285.seqs.fa -oc motifs/285 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.249 C 0.240 G 0.224 T 0.286
Background letter frequencies (from dataset with add-one prior applied):
A 0.250 C 0.240 G 0.224 T 0.286
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 18 sites = 8 llr = 119 E-value = 1.9e-004
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :31:1:1::143:3::::
pos.-specific     C  :::::3::81:1:4::::
probability       G  a41a319::6::a::933
matrix            T  :48:66:a3166:4a188

         bits    2.2 *  *        *
                 1.9 *  *        *
                 1.7 *  *   *    * *
                 1.5 *  *  **    * **
Relative         1.3 *  *  **    * **
Entropy          1.1 *  *  ***   * ****
(21.5 bits)      0.9 * **  *** * * ****
                 0.6 * ******* *** ****
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           GGTGTTGTCGTTGCTGTT
consensus             T  GC  T AA T  GG
sequence              A           A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
6258                         51  2.95e-09 GATCCATATT GTTGTTGTTGTTGTTGTT TGGTTCACCG
2332                        120  4.61e-09 CATCATTGAG GTTGTTGTTGTTGATGTT GTCGTTCGCT
14708                       131  9.04e-09 CAAATTTGTC GGTGTTGTCCATGATGTT GACCACTTTC
337                         237  1.08e-07 GGCTGTACAG GGAGGTGTCGAAGTTGTG TGGATTGTGA
24989                       410  1.17e-07 GACTGTGGGC GATGTGGTCTTTGCTGGT GGTAATTTGG
27236                       125  2.30e-07 TGATTACAGA GGTGTTGTCGTCGCTTGG GATCCGAGGC
34197                       453  2.46e-07 TAGTCCTGTA GTTGACATCGTAGCTGTT TATTACTTTA
5039                         78  3.19e-07 CGGCGAGGGA GAGGGCGTCAATGTTGTT GTTGTCACTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6258                                3e-09  50_[+1]_432
2332                              4.6e-09  119_[+1]_363
14708                               9e-09  130_[+1]_352
337                               1.1e-07  236_[+1]_246
24989                             1.2e-07  409_[+1]_73
27236                             2.3e-07  124_[+1]_358
34197                             2.5e-07  452_[+1]_30
5039                              3.2e-07  77_[+1]_405
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=8
6258                     (   51) GTTGTTGTTGTTGTTGTT  1
2332                     (  120) GTTGTTGTTGTTGATGTT  1
14708                    (  131) GGTGTTGTCCATGATGTT  1
337                      (  237) GGAGGTGTCGAAGTTGTG  1
24989                    (  410) GATGTGGTCTTTGCTGGT  1
27236                    (  125) GGTGTTGTCGTCGCTTGG  1
34197                    (  453) GTTGACATCGTAGCTGTT  1
5039                     (   78) GAGGGCGTCAATGTTGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 3864 bayes= 8.91289 E= 1.9e-004
  -965   -965    216   -965
     0   -965     74     39
  -100   -965    -84    139
  -965   -965    216   -965
  -100   -965     16    113
  -965      6    -84    113
  -100   -965    196   -965
  -965   -965   -965    180
  -965    164   -965    -19
  -100    -94    148   -119
    59   -965   -965    113
     0    -94   -965    113
  -965   -965    216   -965
     0     64   -965     39
  -965   -965   -965    180
  -965   -965    196   -119
  -965   -965     16    139
  -965   -965     16    139
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 8 E= 1.9e-004
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.375000  0.375000
 0.125000  0.000000  0.125000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.000000  0.250000  0.625000
 0.000000  0.250000  0.125000  0.625000
 0.125000  0.000000  0.875000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.000000  0.250000
 0.125000  0.125000  0.625000  0.125000
 0.375000  0.000000  0.000000  0.625000
 0.250000  0.125000  0.000000  0.625000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.375000  0.000000  0.375000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.875000  0.125000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.250000  0.750000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
G[GTA]TG[TG][TC]GT[CT]G[TA][TA]G[CTA]TG[TG][TG]
--------------------------------------------------------------------------------

Time 0.60 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 21 sites = 8 llr = 127 E-value = 5.7e-002
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  641:6::63:5::81::6:38
pos.-specific     C  :16a:3a188:3a:318:651
probability       G  :43::::3:3:::::::1:31
matrix            T  41::48::::58:369334::

         bits    2.2    *  *     *
                 1.9    *  *     *
                 1.7    *  *     *
                 1.5    *  *     *
Relative         1.3    *  * **  *  *
Entropy          1.1 *  **** ** *** ** *
(22.8 bits)      0.9 * ***** ****** ** * *
                 0.6 * *******************
                 0.4 * *******************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           AACCATCACCATCATTCACCA
consensus            TGG TC GAGTC TC TTTA
sequence                                G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
5039                        441  5.47e-12 TGTTCTCCTC AGCCTTCACCTTCATTCACCA CCGCTCAACT
34197                        19  5.47e-12 TGTTCTCCTC AGCCTTCACCTTCATTCACCA CCGCTCAACT
14708                       401  1.85e-09 GAGGTGCTCA AACCATCACCATCTATCATCA CATGAACATC
2332                         68  8.53e-09 GCAGGCATCT TGCCACCACCATCATTTTCAA TGGCATCGTC
337                         478  2.08e-08 AGCCGCCATC AACCATCGACATCACCCATCA CC
6258                        305  3.83e-07 GCAAAAATAG TAACACCCCCACCATTCATGC TATCAAACTA
27236                        70  5.50e-07 AGCACAAATC ACGCATCAAGTTCTCTCTCGG GCAAGTGGTG
24989                        10  6.12e-07  ACCATCACT TTGCTTCGCGTCCATTTGCAA ATATCGTCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5039                              5.5e-12  440_[+2]_39
34197                             5.5e-12  18_[+2]_461
14708                             1.8e-09  400_[+2]_79
2332                              8.5e-09  67_[+2]_412
337                               2.1e-08  477_[+2]_2
6258                              3.8e-07  304_[+2]_175
27236                             5.5e-07  69_[+2]_410
24989                             6.1e-07  9_[+2]_470
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
5039                     (  441) AGCCTTCACCTTCATTCACCA  1
34197                    (   19) AGCCTTCACCTTCATTCACCA  1
14708                    (  401) AACCATCACCATCTATCATCA  1
2332                     (   68) TGCCACCACCATCATTTTCAA  1
337                      (  478) AACCATCGACATCACCCATCA  1
6258                     (  305) TAACACCCCCACCATTCATGC  1
27236                    (   70) ACGCATCAAGTTCTCTCTCGG  1
24989                    (   10) TTGCTTCGCGTCCATTTGCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3840 bayes= 8.90388 E= 5.7e-002
   132   -965   -965     39
    59    -94     74   -119
  -100    138     16   -965
  -965    206   -965   -965
   132   -965   -965     39
  -965      6   -965    139
  -965    206   -965   -965
   132    -94     16   -965
     0    164   -965   -965
  -965    164     16   -965
   100   -965   -965     80
  -965      6   -965    139
  -965    206   -965   -965
   159   -965   -965    -19
  -100      6   -965    113
  -965    -94   -965    161
  -965    164   -965    -19
   132   -965    -84    -19
  -965    138   -965     39
     0    106     16   -965
   159    -94    -84   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 5.7e-002
 0.625000  0.000000  0.000000  0.375000
 0.375000  0.125000  0.375000  0.125000
 0.125000  0.625000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.625000  0.000000  0.000000  0.375000
 0.000000  0.250000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.625000  0.125000  0.250000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.250000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.125000  0.250000  0.000000  0.625000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.750000  0.000000  0.250000
 0.625000  0.000000  0.125000  0.250000
 0.000000  0.625000  0.000000  0.375000
 0.250000  0.500000  0.250000  0.000000
 0.750000  0.125000  0.125000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[AT][AG][CG]C[AT][TC]C[AG][CA][CG][AT][TC]C[AT][TC]T[CT][AT][CT][CAG]A
--------------------------------------------------------------------------------

Time 1.18 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 21 sites = 4 llr = 88 E-value = 5.9e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::35a::::::38338:8:
pos.-specific     C  a:aa85:5:a88a838::3:a
probability       G  :::::::3::33::::83:::
matrix            T  :a:::::3a:::::::::83:

         bits    2.2 * **     *  *       *
                 1.9 * **  *  *  *       *
                 1.7 ****  * **  *       *
                 1.5 ****  * **  *       *
Relative         1.3 ***** * **********  *
Entropy          1.1 ******* *************
(31.7 bits)      0.9 ******* *************
                 0.6 ******* *************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CTCCCAACTCCCCCACGATAC
consensus                AC G  GG ACAAGCT
sequence                    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
5039                        470  3.66e-13 CACCGCTCAA CTCCCAACTCCCCCACGATAC CCTGTGCACC
34197                        48  3.66e-13 CACCGCTCAA CTCCCAACTCCCCCACGATAC CCTGTGCACC
6258                        107  4.06e-10 TGTGGTGCCT CTCCACATTCCCCAACGACTC CCTCCATTAT
337                         383  8.12e-10 ACAAACCGGT CTCCCCAGTCGGCCCAAGTAC TCTCGCAATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
5039                              3.7e-13  469_[+3]_10
34197                             3.7e-13  47_[+3]_432
6258                              4.1e-10  106_[+3]_373
337                               8.1e-10  382_[+3]_97
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=21 seqs=4
5039                     (  470) CTCCCAACTCCCCCACGATAC  1
34197                    (   48) CTCCCAACTCCCCCACGATAC  1
6258                     (  107) CTCCACATTCCCCAACGACTC  1
337                      (  383) CTCCCCAGTCGGCCCAAGTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3840 bayes= 9.90539 E= 5.9e+000
  -865    206   -865   -865
  -865   -865   -865    180
  -865    206   -865   -865
  -865    206   -865   -865
     0    164   -865   -865
   100    106   -865   -865
   200   -865   -865   -865
  -865    106     16    -19
  -865   -865   -865    180
  -865    206   -865   -865
  -865    164     16   -865
  -865    164     16   -865
  -865    206   -865   -865
     0    164   -865   -865
   159      6   -865   -865
     0    164   -865   -865
     0   -865    174   -865
   159   -865     16   -865
  -865      6   -865    139
   159   -865   -865    -19
  -865    206   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 4 E= 5.9e+000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.750000  0.000000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
CTCC[CA][AC]A[CGT]TC[CG][CG]C[CA][AC][CA][GA][AG][TC][AT]C
--------------------------------------------------------------------------------

Time 1.70 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14708                            6.47e-10  130_[+1(9.04e-09)]_252_\
                                           [+2(1.85e-09)]_79
2332                             1.51e-09  67_[+2(8.53e-09)]_31_[+1(4.61e-09)]_\
                                           9_[+1(6.20e-05)]_84_[+1(1.73e-07)]_234
24989                            2.41e-06  9_[+2(6.12e-07)]_379_[+1(1.17e-07)]_\
                                           73
27236                            2.45e-06  69_[+2(5.50e-07)]_34_[+1(2.30e-07)]_\
                                           358
337                              1.40e-13  236_[+1(1.08e-07)]_128_\
                                           [+3(8.12e-10)]_74_[+2(2.08e-08)]_2
34197                            7.48e-20  18_[+2(5.47e-12)]_8_[+3(3.66e-13)]_\
                                           19_[+2(7.38e-05)]_344_[+1(2.46e-07)]_30
5039                             9.61e-20  25_[+2(2.81e-05)]_31_[+1(3.19e-07)]_\
                                           345_[+2(5.47e-12)]_8_[+3(3.66e-13)]_10
6258                             3.80e-14  50_[+1(2.95e-09)]_38_[+3(4.06e-10)]_\
                                           177_[+2(3.83e-07)]_175
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************