********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/278/278.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42467                    1.0000    500  42538                    1.0000    500
48253                    1.0000    500  40174                    1.0000    500
40278                    1.0000    500  16375                    1.0000    500
1769                     1.0000    500  45060                    1.0000    500
35647                    1.0000    500  44269                    1.0000    500
49758                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/278/278.seqs.fa -oc motifs/278 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.280 C 0.250 G 0.213 T 0.256
Background letter frequencies (from dataset with add-one prior applied):
A 0.280 C 0.250 G 0.213 T 0.256
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 15 sites = 7 llr = 93 E-value = 1.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::117::31::1::
pos.-specific     C  a193:::7:1:a49:
probability       G  :::6::a:63::3:9
matrix            T  :91:93:314a:111

         bits    2.2       *
                 2.0 *     *   **
                 1.8 *     *   **
                 1.6 *     *   **  *
Relative         1.3 *** * *   ** **
Entropy          1.1 *** ****  ** **
(19.3 bits)      0.9 *** ****  ** **
                 0.7 ********* ** **
                 0.4 ********* ** **
                 0.2 ***************
                 0.0 ---------------

Multilevel           CTCGTAGCGTTCCCG
consensus               C T TAG  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
49758                       120  7.44e-09 GCAATTGACT CTCGTAGCGCTCCCG ACCGAGAATG
48253                       306  2.52e-08 TTCCGCGTTG CTCGTAGCGTTCCTG ACGCTGCTGT
35647                       420  8.99e-08 CACTTGTTCG CTCATTGCGTTCGCG TTTTCTTGCC
16375                       412  1.26e-07 TCCAGAAGCC CTTCTAGCGTTCGCG CGGCTGGAGG
42467                       289  5.87e-07 TGCACTTGAC CTCGAAGCAGTCTCG CAGGCTGATT
40278                       218  2.40e-06 TTTCATGGGT CCCGTTGTAGTCACG TGCGTATATT
42538                       136  3.00e-06 ATCTGAATCA CTCCTAGTTATCCCT ATCAGAATAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49758                             7.4e-09  119_[+1]_366
48253                             2.5e-08  305_[+1]_180
35647                               9e-08  419_[+1]_66
16375                             1.3e-07  411_[+1]_74
42467                             5.9e-07  288_[+1]_197
40278                             2.4e-06  217_[+1]_268
42538                               3e-06  135_[+1]_350
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=7
49758                    (  120) CTCGTAGCGCTCCCG  1
48253                    (  306) CTCGTAGCGTTCCTG  1
35647                    (  420) CTCATTGCGTTCGCG  1
16375                    (  412) CTTCTAGCGTTCGCG  1
42467                    (  289) CTCGAAGCAGTCTCG  1
40278                    (  218) CCCGTTGTAGTCACG  1
42538                    (  136) CTCCTAGTTATCCCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 5346 bayes= 9.41866 E= 1.8e+001
  -945    200   -945   -945
  -945    -81   -945    174
  -945    178   -945    -84
   -97     19    142   -945
   -97   -945   -945    174
   135   -945   -945     16
  -945   -945    223   -945
  -945    151   -945     16
     3   -945    142    -84
   -97    -81     42     74
  -945   -945   -945    196
  -945    200   -945   -945
   -97     78     42    -84
  -945    178   -945    -84
  -945   -945    201    -84
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 7 E= 1.8e+001
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.857143  0.000000  0.142857
 0.142857  0.285714  0.571429  0.000000
 0.142857  0.000000  0.000000  0.857143
 0.714286  0.000000  0.000000  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.714286  0.000000  0.285714
 0.285714  0.000000  0.571429  0.142857
 0.142857  0.142857  0.285714  0.428571
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.428571  0.285714  0.142857
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.000000  0.857143  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CTC[GC]T[AT]G[CT][GA][TG]TC[CG]CG
--------------------------------------------------------------------------------

Time 1.18 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 9 llr = 96 E-value = 6.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::21::11:::
pos.-specific     C  2::329:2::a2
probability       G  :7124::7:::8
matrix            T  839221a:9a::

         bits    2.2
                 2.0       *  **
                 1.8       *  **
                 1.6   *  **  **
Relative         1.3   *  ** ****
Entropy          1.1 ***  ** ****
(15.4 bits)      0.9 ***  *******
                 0.7 ***  *******
                 0.4 ***  *******
                 0.2 *** ********
                 0.0 ------------

Multilevel           TGTCGCTGTTCG
consensus            CT AC  C   C
sequence                GT
                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
40174                        10  6.55e-08  GCCTCCGCG TGTGGCTGTTCG CCTGCACTCG
42538                       383  6.69e-07 GACCGCCAGC TGTTTCTGTTCG AGAAAATCTT
45060                       250  1.89e-06 CGTAGATTGC CTTCGCTGTTCG ACATCCTTCG
44269                       244  2.78e-06 TGAGAAAGCA TGGAGCTGTTCG GGATTGCAAC
1769                        442  2.78e-06 CTTCGCCCTG CGTCGCTCTTCG CCGCTGGAGC
16375                       320  7.47e-06 GACCCGACTT TTTGCCTGTTCC CTTCAACAAG
49758                       309  1.67e-05 ATTCCATGTA TGTAACTATTCG ATATTTTATA
42467                       255  2.72e-05 CGCAATCTTG TGTTTTTGTTCC TCGAGGTTCG
35647                       246  2.97e-05 CTTTCGGAGC TTTCCCTCATCG TCTTCATTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40174                             6.5e-08  9_[+2]_479
42538                             6.7e-07  382_[+2]_106
45060                             1.9e-06  249_[+2]_239
44269                             2.8e-06  243_[+2]_245
1769                              2.8e-06  441_[+2]_47
16375                             7.5e-06  319_[+2]_169
49758                             1.7e-05  308_[+2]_180
42467                             2.7e-05  254_[+2]_234
35647                               3e-05  245_[+2]_243
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=9
40174                    (   10) TGTGGCTGTTCG  1
42538                    (  383) TGTTTCTGTTCG  1
45060                    (  250) CTTCGCTGTTCG  1
44269                    (  244) TGGAGCTGTTCG  1
1769                     (  442) CGTCGCTCTTCG  1
16375                    (  320) TTTGCCTGTTCC  1
49758                    (  309) TGTAACTATTCG  1
42467                    (  255) TGTTTTTGTTCC  1
35647                    (  246) TTTCCCTCATCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 10.0699 E= 6.7e+001
  -982    -17   -982    160
  -982   -982    164     38
  -982   -982    -94    179
   -33     41      6    -20
  -133    -17    106    -20
  -982    183   -982   -120
  -982   -982   -982    196
  -133    -17    164   -982
  -133   -982   -982    179
  -982   -982   -982    196
  -982    200   -982   -982
  -982    -17    187   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 6.7e+001
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  0.111111  0.888889
 0.222222  0.333333  0.222222  0.222222
 0.111111  0.222222  0.444444  0.222222
 0.000000  0.888889  0.000000  0.111111
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.222222  0.666667  0.000000
 0.111111  0.000000  0.000000  0.888889
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.222222  0.777778  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TC][GT]T[CAGT][GCT]CT[GC]TTC[GC]
--------------------------------------------------------------------------------

Time 2.28 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 5 llr = 80 E-value = 7.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  22a::28:2::242:a
pos.-specific     C  :4:::6::2a::::::
probability       G  84:aa:::::2:48a:
matrix            T  :::::22a6:882:::

         bits    2.2    **         *
                 2.0    **  * *    *
                 1.8   ***  * *    **
                 1.6   ***  * *    **
Relative         1.3 * ***  * **  ***
Entropy          1.1 * *** ** *** ***
(23.0 bits)      0.9 * *** ** *** ***
                 0.7 * ********** ***
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GCAGGCATTCTTAGGA
consensus            AG   AT A GAGA
sequence              A   T  C   T

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
45060                        69  2.84e-09 ACGTCCCCAT GCAGGAATTCTTGGGA ACCATCCCCG
48253                       238  8.84e-09 GCGAAACGAC GCAGGCATCCTTTGGA TTCCGTTGGC
44269                       281  2.57e-08 GGTCCCCAGC GGAGGCATACTAAGGA GCGTACACTT
40278                       310  5.35e-08 TGGAAATATA GAAGGTATTCTTGAGA ATTCATGAAA
42467                       347  1.08e-07 CCGCGGTGAG AGAGGCTTTCGTAGGA GTGTGTGTAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45060                             2.8e-09  68_[+3]_416
48253                             8.8e-09  237_[+3]_247
44269                             2.6e-08  280_[+3]_204
40278                             5.3e-08  309_[+3]_175
42467                             1.1e-07  346_[+3]_138
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=5
45060                    (   69) GCAGGAATTCTTGGGA  1
48253                    (  238) GCAGGCATCCTTTGGA  1
44269                    (  281) GGAGGCATACTAAGGA  1
40278                    (  310) GAAGGTATTCTTGAGA  1
42467                    (  347) AGAGGCTTTCGTAGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5335 bayes= 10.3097 E= 7.6e+001
   -49   -897    191   -897
   -49     68     91   -897
   183   -897   -897   -897
  -897   -897    223   -897
  -897   -897    223   -897
   -49    126   -897    -36
   151   -897   -897    -36
  -897   -897   -897    196
   -49    -32   -897    123
  -897    200   -897   -897
  -897   -897     -9    164
   -49   -897   -897    164
    51   -897     91    -36
   -49   -897    191   -897
  -897   -897    223   -897
   183   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 7.6e+001
 0.200000  0.000000  0.800000  0.000000
 0.200000  0.400000  0.400000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.600000  0.000000  0.200000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.200000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.000000  0.000000  0.800000
 0.400000  0.000000  0.400000  0.200000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GA][CGA]AGG[CAT][AT]T[TAC]C[TG][TA][AGT][GA]GA
--------------------------------------------------------------------------------

Time 3.36 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42467                            5.38e-08  254_[+2(2.72e-05)]_22_\
    [+1(5.87e-07)]_43_[+3(1.08e-07)]_138
42538                            2.28e-05  135_[+1(3.00e-06)]_232_\
    [+2(6.69e-07)]_106
48253                            1.35e-08  237_[+3(8.84e-09)]_52_\
    [+1(2.52e-08)]_180
40174                            2.76e-04  9_[+2(6.55e-08)]_479
40278                            2.42e-06  217_[+1(2.40e-06)]_38_\
    [+3(3.58e-05)]_23_[+3(5.35e-08)]_175
16375                            2.02e-05  319_[+2(7.47e-06)]_80_\
    [+1(1.26e-07)]_74
1769                             1.96e-02  441_[+2(2.78e-06)]_47
45060                            8.17e-08  13_[+2(3.61e-05)]_43_[+3(2.84e-09)]_\
    165_[+2(1.89e-06)]_239
35647                            5.80e-05  245_[+2(2.97e-05)]_162_\
    [+1(8.99e-08)]_66
44269                            1.07e-07  31_[+1(5.14e-05)]_197_\
    [+2(2.78e-06)]_25_[+3(2.57e-08)]_204
49758                            1.88e-06  119_[+1(7.44e-09)]_174_\
    [+2(1.67e-05)]_180
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************