********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/106/106.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10337                    1.0000    500  1927                     1.0000    500
23359                    1.0000    500  23782                    1.0000    500
24506                    1.0000    500  24556                    1.0000    500
263992                   1.0000    500  268902                   1.0000    500
28487                    1.0000    500  32051                    1.0000    500
35180                    1.0000    500  4091                     1.0000    500
41566                    1.0000    500  5671                     1.0000    500
8227                     1.0000    500  9145                     1.0000    500
9258                     1.0000    500  9261                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/106/106.seqs.fa -oc motifs/106 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.262 C 0.235 G 0.236 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.262 C 0.235 G 0.236 T 0.267
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   9  llr = 142  E-value = 4.8e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  12834a719738:42:a::3a
pos.-specific     C  82262:281:6:2:88:977:
probability       G  14:13:11:21:64:2:11::
matrix            T  :1:::::::1:221::::2::

         bits    2.1
                 1.9      *          *   *
                 1.7      *          **  *
                 1.5      *  *       **  *
Relative         1.3   *  *  *  *  ****  *
Entropy          1.0 * *  * **  *  **** **
(22.7 bits)      0.8 * *  ***** *  *******
                 0.6 * ** ****************
                 0.4 * *******************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           CGACAAACAACAGACCACCCA
consensus             ACAG C  GATCGAG  TA
sequence              C  C       T

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
32051                       455  4.61e-10 CCTGGGCCTC CGACGAGCAACAGGCCACTCA CTTTGTACTC
9261                        333  6.89e-10 GCTGCAACAA CGCCAAACAGCAGGCGACCCA AAAAGGCGCG
9258                        451  6.89e-10 GCTGCAACAA CGCCAAACAGCAGGCGACCCA AAAAGGCGCG
28487                       377  3.30e-09 CATCCCATCA CCACAACCAAAACACCACTCA CAAGAACCTT
24556                       317  6.72e-08 GACCTCTTTC CAACGAACATAATTACACCCA ATGCCAAAGG
23359                       348  1.19e-07 CAGGTTGTCA GCAACAAGAACTGGCCACCAA ATTTCCAACC
23782                       455  1.36e-07 ACAACATTGC ATAACAACAACATACCAGCCA TAAAGACACC
41566                       219  1.88e-07 GTTTACAACA CGAAGAAAAAGAGAACACGAA TCTCTCCTCG
35180                       282  3.03e-07 AGGCTCTTGA CAAGAACCCAATCACCACCAA GGCTCTTACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32051                             4.6e-10  454_[+1]_25
9261                              6.9e-10  332_[+1]_147
9258                              6.9e-10  450_[+1]_29
28487                             3.3e-09  376_[+1]_103
24556                             6.7e-08  316_[+1]_163
23359                             1.2e-07  347_[+1]_132
23782                             1.4e-07  454_[+1]_25
41566                             1.9e-07  218_[+1]_261
35180                               3e-07  281_[+1]_198
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=9
32051                    (  455) CGACGAGCAACAGGCCACTCA  1
9261                     (  333) CGCCAAACAGCAGGCGACCCA  1
9258                     (  451) CGCCAAACAGCAGGCGACCCA  1
28487                    (  377) CCACAACCAAAACACCACTCA  1
24556                    (  317) CAACGAACATAATTACACCCA  1
23359                    (  348) GCAACAAGAACTGGCCACCAA  1
23782                    (  455) ATAACAACAACATACCAGCCA  1
41566                    (  219) CGAAGAAAAAGAGAACACGAA  1
35180                    (  282) CAAGAACCCAATCACCACCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 8640 bayes= 10.7541 E= 4.8e-001
  -124    173   -109   -982
   -24     -8     91   -126
   157     -8   -982   -982
    35    124   -109   -982
    76     -8     49   -982
   193   -982   -982   -982
   135     -8   -109   -982
  -124    173   -109   -982
   176   -108   -982   -982
   135   -982     -9   -126
    35    124   -109   -982
   157   -982   -982    -26
  -982     -8    123    -26
    76   -982     91   -126
   -24    173   -982   -982
  -982    173     -9   -982
   193   -982   -982   -982
  -982    192   -109   -982
  -982    151   -109    -26
    35    151   -982   -982
   193   -982   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 9 E= 4.8e-001
 0.111111  0.777778  0.111111  0.000000
 0.222222  0.222222  0.444444  0.111111
 0.777778  0.222222  0.000000  0.000000
 0.333333  0.555556  0.111111  0.000000
 0.444444  0.222222  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.222222  0.111111  0.000000
 0.111111  0.777778  0.111111  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.666667  0.000000  0.222222  0.111111
 0.333333  0.555556  0.111111  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.000000  0.222222  0.555556  0.222222
 0.444444  0.000000  0.444444  0.111111
 0.222222  0.777778  0.000000  0.000000
 0.000000  0.777778  0.222222  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.000000  0.666667  0.111111  0.222222
 0.333333  0.666667  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
C[GAC][AC][CA][AGC]A[AC]CA[AG][CA][AT][GCT][AG][CA][CG]AC[CT][CA]A
--------------------------------------------------------------------------------




Time  3.16 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =   8  llr = 114  E-value = 3.7e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :51:95:49:::::::
pos.-specific     C  a:4a:1a6:a65:63a
probability       G  :11:1::::::344::
matrix            T  :44::4::1:436:8:

         bits    2.1 *  *  *  *     *
                 1.9 *  *  *  *     *
                 1.7 *  *  *  *     *
                 1.5 *  ** * **     *
Relative         1.3 *  ** * **     *
Entropy          1.0 *  ** ***** ****
(20.5 bits)      0.8 *  ** ***** ****
                 0.6 ** *************
                 0.4 ** *************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CACCAACCACCCTCTC
consensus             TT  T A  TGGGC
sequence                        T

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
9258                        345  2.56e-09 GGTGCACGTC CTCCATCCACCCTCTC TTGCTGGATC
4091                        457  5.16e-09 ACTCCTCACT CATCATCCACCCGCTC CGTCGTCGCG
35180                       165  9.76e-08 AGCCTGCCCT CATCAACAACTCTCCC CTCCGCAAGA
24556                        54  1.09e-07 TTTGATTCTA CTCCACCCACCTTCTC GCTTTCAACA
23782                       368  2.94e-07 ATTTTCTCTC CTCCGACCACCGTGTC TTTTCATGTA
1927                        231  3.17e-07 GATGCTACGA CAACAACCACTGGGTC TATTTTCTTT
9145                        364  6.00e-07 GTCCATCGAT CATCAACATCTCGGTC TGCTTCTGCA
24506                       211  1.01e-06 TACACGGTTA CGGCATCAACCTTCCC TCGTTCTCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9258                              2.6e-09  344_[+2]_140
4091                              5.2e-09  456_[+2]_28
35180                             9.8e-08  164_[+2]_320
24556                             1.1e-07  53_[+2]_431
23782                             2.9e-07  367_[+2]_117
1927                              3.2e-07  230_[+2]_254
9145                                6e-07  363_[+2]_121
24506                               1e-06  210_[+2]_274
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=8
9258                     (  345) CTCCATCCACCCTCTC  1
4091                     (  457) CATCATCCACCCGCTC  1
35180                    (  165) CATCAACAACTCTCCC  1
24556                    (   54) CTCCACCCACCTTCTC  1
23782                    (  368) CTCCGACCACCGTGTC  1
1927                     (  231) CAACAACCACTGGGTC  1
9145                     (  364) CATCAACATCTCGGTC  1
24506                    (  211) CGGCATCAACCTTCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8730 bayes= 9.24139 E= 3.7e+000
  -965    209   -965   -965
    93   -965    -92     49
  -107     68    -92     49
  -965    209   -965   -965
   174   -965    -92   -965
    93    -91   -965     49
  -965    209   -965   -965
    52    141   -965   -965
   174   -965   -965   -109
  -965    209   -965   -965
  -965    141   -965     49
  -965    109      8     -9
  -965   -965     66    123
  -965    141     66   -965
  -965      9   -965    149
  -965    209   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 3.7e+000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.125000  0.375000
 0.125000  0.375000  0.125000  0.375000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.500000  0.125000  0.000000  0.375000
 0.000000  1.000000  0.000000  0.000000
 0.375000  0.625000  0.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.625000  0.000000  0.375000
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.000000  0.375000  0.625000
 0.000000  0.625000  0.375000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[AT][CT]CA[AT]C[CA]AC[CT][CGT][TG][CG][TC]C
--------------------------------------------------------------------------------




Time  5.84 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  14  sites =   8  llr = 106  E-value = 8.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :6::1::41:::98
pos.-specific     C  ::a::a:19::81:
probability       G  a::a4:94:583:3
matrix            T  :4::5:11:53:::

         bits    2.1 * ** *
                 1.9 * ** *
                 1.7 * ** *
                 1.5 * ** ** *   *
Relative         1.3 * ** ** * ****
Entropy          1.0 **** ** ******
(19.2 bits)      0.8 **** ** ******
                 0.6 ******* ******
                 0.4 ******* ******
                 0.2 **************
                 0.0 --------------

Multilevel           GACGTCGACGGCAA
consensus             T  G  G TTG G
sequence

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
9261                        316  1.86e-08 CAGGTGTACG GACGGCGGCTGCAA CAACGCCAAA
9258                        434  1.86e-08 CAGGTGTACG GACGGCGGCTGCAA CAACGCCAAA
1927                        142  1.35e-07 TCACGATGAC GACGACGACGGCAA CTCGACACAC
28487                       107  1.49e-07 GGTAGTTGAA GTCGTCGACGGCAG GTGTCAACTT
268902                      107  3.08e-07 TCGTGTTCTT GACGTCGAATGCAA TGTAGAAGAC
4091                        364  1.02e-06 CAATACCATT GACGGCTGCGTCAA ATCCAGCATC
24506                       155  1.51e-06 GTCGTTGAGT GTCGTCGTCTGGAG GAGTGAAGGT
24556                       413  4.23e-06 AGTACTGTTT GTCGTCGCCGTGCA CAAGAACGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9261                              1.9e-08  315_[+3]_171
9258                              1.9e-08  433_[+3]_53
1927                              1.3e-07  141_[+3]_345
28487                             1.5e-07  106_[+3]_380
268902                            3.1e-07  106_[+3]_380
4091                                1e-06  363_[+3]_123
24506                             1.5e-06  154_[+3]_332
24556                             4.2e-06  412_[+3]_74
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=14 seqs=8
9261                     (  316) GACGGCGGCTGCAA  1
9258                     (  434) GACGGCGGCTGCAA  1
1927                     (  142) GACGACGACGGCAA  1
28487                    (  107) GTCGTCGACGGCAG  1
268902                   (  107) GACGTCGAATGCAA  1
4091                     (  364) GACGGCTGCGTCAA  1
24506                    (  155) GTCGTCGTCTGGAG  1
24556                    (  413) GTCGTCGCCGTGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 8766 bayes= 10.8339 E= 8.9e+000
  -965   -965    208   -965
   125   -965   -965     49
  -965    209   -965   -965
  -965   -965    208   -965
  -107   -965     66     90
  -965    209   -965   -965
  -965   -965    189   -109
    52    -91     66   -109
  -107    190   -965   -965
  -965   -965    108     90
  -965   -965    166     -9
  -965    168      8   -965
   174    -91   -965   -965
   152   -965      8   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 8 E= 8.9e+000
 0.000000  0.000000  1.000000  0.000000
 0.625000  0.000000  0.000000  0.375000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.000000  0.375000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.875000  0.125000
 0.375000  0.125000  0.375000  0.125000
 0.125000  0.875000  0.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[AT]CG[TG]CG[AG]C[GT][GT][CG]A[AG]
--------------------------------------------------------------------------------




Time  9.38 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10337                            1.00e+00  500
1927                             8.22e-07  141_[+3(1.35e-07)]_75_\
    [+2(3.17e-07)]_254
23359                            5.83e-05  347_[+1(1.19e-07)]_132
23782                            1.75e-06  367_[+2(2.94e-07)]_71_\
    [+1(1.36e-07)]_25
24506                            2.24e-05  154_[+3(1.51e-06)]_42_\
    [+2(1.01e-06)]_142_[+3(5.57e-05)]_118
24556                            1.32e-09  53_[+2(1.09e-07)]_247_\
    [+1(6.72e-08)]_75_[+3(4.23e-06)]_74
263992                           6.07e-01  500
268902                           2.20e-03  106_[+3(3.08e-07)]_380
28487                            2.17e-09  48_[+1(8.78e-05)]_37_[+3(1.49e-07)]_\
    256_[+1(3.30e-09)]_43_[+1(7.79e-05)]_39
32051                            2.37e-06  454_[+1(4.61e-10)]_25
35180                            8.90e-07  164_[+2(9.76e-08)]_101_\
    [+1(3.03e-07)]_198
4091                             2.62e-07  363_[+3(1.02e-06)]_79_\
    [+2(5.16e-09)]_28
41566                            4.40e-03  218_[+1(1.88e-07)]_261
5671                             4.93e-02  51_[+2(9.55e-05)]_433
8227                             2.50e-01  500
9145                             3.03e-04  363_[+2(6.00e-07)]_121
9258                             3.15e-15  296_[+3(2.50e-06)]_34_\
    [+2(2.56e-09)]_73_[+3(1.86e-08)]_3_[+1(6.89e-10)]_29
9261                             6.70e-10  315_[+3(1.86e-08)]_3_[+1(6.89e-10)]_\
    147
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************