********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/107/107.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31927                    1.0000    500  54019                    1.0000    500
46460                    1.0000    500  47702                    1.0000    500
49038                    1.0000    500  49339                    1.0000    500
55079                    1.0000    500  31068                    1.0000    500
33525                    1.0000    500  50577                    1.0000    500
55230                    1.0000    500  12586                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/107/107.seqs.fa -oc motifs/107 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.258 C 0.244 G 0.245 T 0.254
Background letter frequencies (from dataset with add-one prior applied):
A 0.258 C 0.244 G 0.245 T 0.254
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 5 llr = 97 E-value = 6.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::2:2::a::::::::42::
pos.-specific     C  2aa2::22:626:826::8:a
probability       G  :::6:284:2:2:::::::2:
matrix            T  8:::a6:4:282a284a6:8:

         bits    2.0  ** *   *   *   *   *
                 1.8  ** *   *   *   *   *
                 1.6  ** *   *   *   *   *
                 1.4  ** *   *   *   *   *
Relative         1.2 *** * * * * *** * ***
Entropy          1.0 *** * * * * *********
(28.1 bits)      0.8 *** * * * * *********
                 0.6 ******* *************
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TCCGTTGGACTCTCTCTTCTC
consensus            C  A ACT GCG TCT AAG
sequence                C G C T T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
50577                       351  4.63e-11 CGTGGAGAGG TCCGTTGCACTTTCTCTACTC TCTCGATTCC
54019                       400  4.69e-10 GCGCGGTAAC CCCGTAGTACTCTCTCTTCGC TTTTCATCAC
46460                       417  8.59e-10 TTAGATCTAG TCCGTTGTATCCTCTTTTATC GGGAGTCCCT
55230                       246  1.45e-09 TCGTTTCCTT TCCCTGGGAGTCTCCTTTCTC CCCCTGTCCA
49038                       219  2.54e-09 GACGGGAACG TCCATTCGACTGTTTCTACTC TATAGAATAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50577                             4.6e-11  350_[+1]_129
54019                             4.7e-10  399_[+1]_80
46460                             8.6e-10  416_[+1]_63
55230                             1.5e-09  245_[+1]_234
49038                             2.5e-09  218_[+1]_261
--------------------------------------------------------------------------------
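
The block diagrams above encode site positions as alternating spacer lengths and motif occurrences, so `350_[+1]_129` reads as 350 unmatched bases, one occurrence of motif 1 on the + strand, then 129 more bases. A minimal Python sketch of that reading is given here. It is an illustration and not part of MEME: the helper name `parse_block_diagram` is hypothetical, and the motif widths are copied by hand from the MOTIF header lines of this file.

```python
import re

# Motif widths as reported in the MOTIF header lines of this file (assumed input).
MOTIF_WIDTHS = {1: 21, 2: 15, 3: 12}

# Either a motif occurrence such as [+1] or [+3(2.04e-05)], or a spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]*)\))?\]|(\d+)")

def parse_block_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for one block diagram string.

    sites is a list of (motif_id, strand, start) tuples with 1-based starts.
    """
    position = 0  # number of bases consumed so far
    sites = []
    for strand, motif, _pvalue, spacer in TOKEN.findall(diagram):
        if spacer:
            # A run of bases that belongs to no motif site.
            position += int(spacer)
        else:
            # A motif occurrence starts right after the bases consumed so far.
            motif_id = int(motif)
            sites.append((motif_id, strand, position + 1))
            position += widths[motif_id]
    return sites, position

sites, length = parse_block_diagram("350_[+1]_129")
print(sites, length)  # -> [(1, '+', 351)] 500
```

The spacers and motif widths add back up to the sequence length (350 + 21 + 129 = 500), which makes a convenient sanity check. Diagrams that are wrapped with a trailing `\` would need to be joined into a single string before parsing.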
--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=5
50577                    ( 351) TCCGTTGCACTTTCTCTACTC  1
54019                    ( 400) CCCGTAGTACTCTCTCTTCGC  1
46460                    ( 417) TCCGTTGTATCCTCTTTTATC  1
55230                    ( 246) TCCCTGGGAGTCTCCTTTCTC  1
49038                    ( 219) TCCATTCGACTGTTTCTACTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 10.4204 E= 6.2e+001
  -897    -28   -897    166
  -897    203   -897   -897
  -897    203   -897   -897
   -36    -28    129   -897
  -897   -897   -897    198
   -36   -897    -29    124
  -897    -28    171   -897
  -897    -28     71     66
   195   -897   -897   -897
  -897    130    -29    -34
  -897    -28   -897    166
  -897    130    -29    -34
  -897   -897   -897    198
  -897    171   -897    -34
  -897    -28   -897    166
  -897    130   -897     66
  -897   -897   -897    198
    63   -897   -897    124
   -36    171   -897   -897
  -897   -897    -29    166
  -897    203   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 6.2e+001
 0.000000  0.200000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.200000  0.600000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.200000  0.400000  0.400000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.600000  0.200000  0.200000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.600000  0.200000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.400000  0.000000  0.000000  0.600000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TC]CC[GAC]T[TAG][GC][GTC]A[CGT][TC][CGT]T[CT][TC][CT]T[TA][CA][TG]C
--------------------------------------------------------------------------------

Time 1.56 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 15 sites = 5 llr = 77 E-value = 9.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::2a2::44::::
pos.-specific     C  ::::::68::42aa:
probability       G  2:aa8:2:842:::a
matrix            T  8a:::::222:8:::

         bits    2.0  *** *      ***
                 1.8  *** *      ***
                 1.6  *** *      ***
                 1.4  *** *      ***
Relative         1.2 ****** **  ****
Entropy          1.0 ****** **  ****
(22.1 bits)      0.8 ****** **  ****
                 0.6 *********  ****
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           TTGGGACCGAATCCG
consensus            G   A ATTGCC
sequence                   G  TG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------
12586                         1  5.79e-09          . TTGGGACCGAGTCCG TCACGGTACC
54019                       364  2.16e-08 GCCGTTCATT TTGGGACCGACCCCG TTCCTTGTTT
55079                       385  2.85e-08 ACGACGAAGA TTGGGACCTGATCCG GACCAAACGA
49038                       485  1.65e-07 CTCTCACTCG TTGGGAATGTATCCG T
55230                       183  2.08e-07 GATGTGCTAC GTGGAAGCGGCTCCG CGCACCGTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12586                             5.8e-09  [+2]_485
54019                             2.2e-08  363_[+2]_122
55079                             2.8e-08  384_[+2]_101
49038                             1.6e-07  484_[+2]_1
55230                             2.1e-07  182_[+2]_303
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=5
12586                    (   1) TTGGGACCGAGTCCG  1
54019                    ( 364) TTGGGACCGACCCCG  1
55079                    ( 385) TTGGGACCTGATCCG  1
49038                    ( 485) TTGGGAATGTATCCG  1
55230                    ( 183) GTGGAAGCGGCTCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 5832 bayes= 11.1306 E= 9.8e+002
  -897   -897    -29    166
  -897   -897   -897    198
  -897   -897    203   -897
  -897   -897    203   -897
   -36   -897    171   -897
   195   -897   -897   -897
   -36    130    -29   -897
  -897    171   -897    -34
  -897   -897    171    -34
    63   -897     71    -34
    63     71    -29   -897
  -897    -28   -897    166
  -897    203   -897   -897
  -897    203   -897   -897
  -897   -897    203   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 9.8e+002
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.600000  0.200000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  0.800000  0.200000
 0.400000  0.000000  0.400000  0.200000
 0.400000  0.400000  0.200000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TG]TGG[GA]A[CAG][CT][GT][AGT][ACG][TC]CCG
--------------------------------------------------------------------------------

Time 3.11 secs.

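
The "Relative Entropy" figure in each motif description can be approximated from the letter-probability matrix and the background letter frequencies by summing p * log2(p / background) over every letter in every column. The Python sketch below does this for Motif 2. It assumes the rounded values printed in this file are a good enough stand-in for MEME's internal numbers, and the function names `column_bits` and `relative_entropy` are hypothetical helpers, not MEME APIs.

```python
import math

# Background frequencies from the COMMAND LINE SUMMARY section, order A, C, G, T.
BACKGROUND = [0.258, 0.244, 0.245, 0.254]

# Motif 2 letter-probability matrix, one row per motif position (A, C, G, T).
MOTIF2_PROBS = [
    [0.0, 0.0, 0.2, 0.8],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.2, 0.0, 0.8, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.2, 0.6, 0.2, 0.0],
    [0.0, 0.8, 0.0, 0.2],
    [0.0, 0.0, 0.8, 0.2],
    [0.4, 0.0, 0.4, 0.2],
    [0.4, 0.4, 0.2, 0.0],
    [0.0, 0.2, 0.0, 0.8],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

def column_bits(column, background=BACKGROUND):
    """Relative entropy (bits) of one column; zero probabilities contribute 0."""
    return sum(p * math.log2(p / b) for p, b in zip(column, background) if p > 0)

def relative_entropy(matrix, background=BACKGROUND):
    """Sum of per-column relative entropies over the whole motif."""
    return sum(column_bits(column, background) for column in matrix)

total = relative_entropy(MOTIF2_PROBS)
print(f"{total:.1f} bits")
```

Under these assumptions the total comes out at about 22.1 bits, in line with the value in the Motif 2 description. The per-column values are also what the star columns of the bits histogram appear to plot, so fully conserved positions sit near 2.0 bits and mixed positions such as 10 and 11 stay below 0.6.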
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 12 llr = 114 E-value = 2.0e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  2337611:1:::
pos.-specific     C  24:::3185::a
probability       G  138:::8:3:::
matrix            T  6::346:21aa:

         bits    2.0          ***
                 1.8          ***
                 1.6          ***
                 1.4        * ***
Relative         1.2   *   ** ***
Entropy          1.0   *** ** ***
(13.8 bits)      0.8   ****** ***
                 0.6   ****** ***
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCGAATGCCTTC
consensus             GATTC  G
sequence              A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
54019                       277  2.35e-07 TCGATCCCGC TCGAATGCGTTC TGGAAAAGCG
55230                       213  6.45e-07 CGTCGGTAGG TGGAACGCCTTC GCTCCAATGC
12586                        36  2.17e-06 GGCATCGGGC TGGTACGCCTTC TGCTTCGACA
33525                       476  6.90e-06 TGCCTCTCCA TCGAACGTCTTC GTGGGCCACA
31068                       267  1.56e-05 CGACAACCAT CGAATTGCCTTC CGCCCTGCCC
50577                       374  1.89e-05 CTCTACTCTC TCGATTCCGTTC CGATCCCACT
31927                         4  2.04e-05        TGA GGGTTTGCCTTC AATGGAACAG
49339                       237  2.62e-05 GATGCAAGCG CCGAACGTCTTC CACGTCAGAA
49038                       242  2.62e-05 TTCTACTCTA TAGAATACGTTC GTGAAGCCTT
46460                       241  4.07e-05 ACAGTTAATG TAAATTGCATTC TAGAAAACTG
55079                       352  6.09e-05 TTTGACAGTC AAATTTGCGTTC TGTTGACTCC
47702                        29  1.41e-04 GCTCCTTGGG ACGTAAGCTTTC GCTTTGGTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
54019                             2.4e-07  276_[+3]_212
55230                             6.5e-07  212_[+3]_276
12586                             2.2e-06  35_[+3]_453
33525                             6.9e-06  475_[+3]_13
31068                             1.6e-05  266_[+3]_222
50577                             1.9e-05  373_[+3]_115
31927                               2e-05  3_[+3]_485
49339                             2.6e-05  236_[+3]_252
49038                             2.6e-05  241_[+3]_247
46460                             4.1e-05  240_[+3]_248
55079                             6.1e-05  351_[+3]_137
47702                             0.00014  28_[+3]_460
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=12
54019                    ( 277) TCGAATGCGTTC  1
55230                    ( 213) TGGAACGCCTTC  1
12586                    (  36) TGGTACGCCTTC  1
33525                    ( 476) TCGAACGTCTTC  1
31068                    ( 267) CGAATTGCCTTC  1
50577                    ( 374) TCGATTCCGTTC  1
31927                    (   4) GGGTTTGCCTTC  1
49339                    ( 237) CCGAACGTCTTC  1
49038                    ( 242) TAGAATACGTTC  1
46460                    ( 241) TAAATTGCATTC  1
55079                    ( 352) AAATTTGCGTTC  1
47702                    (  29) ACGTAAGCTTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 8.93074 E= 2.0e+003
   -63    -55   -155    120
    -4     77     44  -1023
    -4  -1023    161  -1023
   137  -1023  -1023     39
   118  -1023  -1023     72
  -163     45  -1023    120
  -163   -155    177  -1023
 -1023    177  -1023    -61
  -163    104     44   -160
 -1023  -1023  -1023    198
 -1023  -1023  -1023    198
 -1023    204  -1023  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 2.0e+003
 0.166667  0.166667  0.083333  0.583333
 0.250000  0.416667  0.333333  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.583333  0.000000  0.000000  0.416667
 0.083333  0.333333  0.000000  0.583333
 0.083333  0.083333  0.833333  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.083333  0.500000  0.333333  0.083333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[CGA][GA][AT][AT][TC]GC[CG]TTC
--------------------------------------------------------------------------------

Time 4.63 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31927                            6.55e-02  3_[+3(2.04e-05)]_485
54019                            1.84e-13  276_[+3(2.35e-07)]_75_\
    [+2(2.16e-08)]_21_[+1(4.69e-10)]_80
46460                            3.67e-07  240_[+3(4.07e-05)]_164_\
    [+1(8.59e-10)]_63
47702                            1.63e-01  500
49038                            5.02e-10  218_[+1(2.54e-09)]_2_[+3(2.62e-05)]_\
    231_[+2(1.65e-07)]_1
49339                            1.41e-01  236_[+3(2.62e-05)]_252
55079                            3.86e-05  351_[+3(6.09e-05)]_21_\
    [+2(2.85e-08)]_101
31068                            1.24e-02  266_[+3(1.56e-05)]_222
33525                            6.09e-02  319_[+3(7.35e-05)]_144_\
    [+3(6.90e-06)]_13
50577                            2.28e-08  350_[+1(4.63e-11)]_2_[+3(1.89e-05)]_\
    115
55230                            1.17e-11  182_[+2(2.08e-07)]_15_\
    [+3(6.45e-07)]_21_[+1(1.45e-09)]_234
12586                            5.99e-07  [+2(5.79e-09)]_20_[+3(2.17e-06)]_\
    453
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
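
The regular expression reported for each motif is a simplified view of the probability matrix, so it does not necessarily match every site that contributed to the motif. The Python sketch below checks the Motif 3 expression against the twelve Motif 3 sites listed in the BLOCKS section of this file. It uses only the standard `re` module; the variable names are illustrative, and a regex match is a cruder test than scoring with the log-odds matrix.

```python
import re

# Motif 3 regular expression, copied from this file.
MOTIF3_REGEX = re.compile(r"T[CGA][GA][AT][AT][TC]GC[CG]TTC")

# The twelve Motif 3 sites, copied from the "Motif 3 in BLOCKS format" section.
MOTIF3_SITES = [
    "TCGAATGCGTTC", "TGGAACGCCTTC", "TGGTACGCCTTC", "TCGAACGTCTTC",
    "CGAATTGCCTTC", "TCGATTCCGTTC", "GGGTTTGCCTTC", "CCGAACGTCTTC",
    "TAGAATACGTTC", "TAAATTGCATTC", "AAATTTGCGTTC", "ACGTAAGCTTTC",
]

# Keep only the sites that the simplified expression matches end to end.
matched = [site for site in MOTIF3_SITES if MOTIF3_REGEX.fullmatch(site)]

print(f"{len(matched)} of {len(MOTIF3_SITES)} sites match the expression")
for site in matched:
    print(site)
```

Only three of the twelve sites match here, because the expression drops the less frequent letters at each position (for example, any site that does not start with T). For scanning new sequences, the log-odds matrix or a tool such as MAST is likely to be the more faithful choice.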