********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/445/445.seqs.fa
ALPHABET= ACGT
Sequence name   Weight Length  Sequence name   Weight Length
-------------   ------ ------  -------------   ------ ------
12186           1.0000    500  18407           1.0000    500
20782           1.0000    500  21453           1.0000    500
24610           1.0000    500  261629          1.0000    500
264051          1.0000    500  29075           1.0000    500
37574           1.0000    500  37923           1.0000    500
4923            1.0000    500  5191            1.0000    500
5595            1.0000    500  5833            1.0000    500
7503            1.0000    500  8917            1.0000    500
bd1771          1.0000    500  bd821           1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/445/445.seqs.fa -oc motifs/445 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.257 C 0.225 G 0.254 T 0.264
Background letter frequencies (from dataset with add-one prior applied):
A 0.257 C 0.225 G 0.254 T 0.264
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  18  llr = 173  E-value = 4.6e-005
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :893451a4129
pos.-specific     C  91:7:29::76:
probability       G  111:22::2:31
matrix            T  ::::31::42::

         bits    2.2
                 1.9 *      *
                 1.7 * *   **   *
                 1.5 * *   **   *
Relative         1.3 * **  **   *
Entropy          1.1 ****  ** * *
(13.9 bits)      0.9 ****  ** * *
                 0.6 ****  ** ***
                 0.4 ***** ******
                 0.2 ************
                 0.0 ------------

Multilevel           CAACAACAACCA
consensus               ATG  T G
sequence                 G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name   Start   P-value             Site
-------------   -----  --------            ------------
bd821             424  1.30e-07 AATTAGACGG CAACTACAACCA CCACCATCCG
5191              399  1.07e-06 GTGTTGACGC CAACTCCATCCA CTAAAGTCAA
bd1771            482  1.25e-06 TCATCACCTC CAACGACAACGA TAGAACG
261629             51  1.25e-06 TATTCTTTGG CAACAACATCAA TCTTGTTTCA
5595              457  1.71e-06 CAATCAGCGA CAACGACATCGA CAGCGGCAGT
7503              472  2.88e-06 GGGAGAAGCA CCACAACATCCA ACTACAGTAT
37574             419  3.26e-06 GCGGCTTCCA CAACAAAAACCA AGAGAGTGTA
12186             430  6.21e-06 CAGTCCACTT CAACTACAGCAA ACGAGCACAT
24610             483  7.73e-06 CGCTACTTCA CAACAACAACCG CCCAAC
8917              458  1.35e-05 CTCCTCTGAT CAAAATCAACGA ATTGTAGAGT
264051            478  1.35e-05 TGTCACCATG CAACTCAAACCA AACAATCTTA
18407             483  1.93e-05 GCCCTCGACC CAACAGCAATAA CTCAAC
29075             478  2.28e-05 GCCGCCGACA CAACTGCAAAGA GTTGAGCAGC
37923             436  2.49e-05 TTCCACGCCG CCAAACCATCCA AGTACCACGT
5833              256  4.86e-05 GGTTTGCCCT CAAATGCATTGA GTGTTTTGTA
20782             290  1.35e-04 GATCGATCGT CGACGACAGACA CCAACACGAA
21453             158  1.42e-04 TTGTTTTCGC CAGAGGCAGCCA GTTGAGAAAG
4923              411  2.31e-04 CAGCCGGTGT GAAAATCATTCA TGAACGCAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME   POSITION P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
bd821           1.3e-07           423_[+1]_65
5191            1.1e-06           398_[+1]_90
bd1771          1.3e-06           481_[+1]_7
261629          1.3e-06           50_[+1]_438
5595            1.7e-06           456_[+1]_32
7503            2.9e-06           471_[+1]_17
37574           3.3e-06           418_[+1]_70
12186           6.2e-06           429_[+1]_59
24610           7.7e-06           482_[+1]_6
8917            1.4e-05           457_[+1]_31
264051          1.4e-05           477_[+1]_11
18407           1.9e-05           482_[+1]_6
29075           2.3e-05           477_[+1]_11
37923           2.5e-05           435_[+1]_53
5833            4.9e-05           255_[+1]_233
20782           0.00013           289_[+1]_199
21453           0.00014           157_[+1]_331
4923            0.00023           410_[+1]_78
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=18
bd821           (  424) CAACTACAACCA  1
5191            (  399) CAACTCCATCCA  1
bd1771          (  482) CAACGACAACGA  1
261629          (   51) CAACAACATCAA  1
5595            (  457) CAACGACATCGA  1
7503            (  472) CCACAACATCCA  1
37574           (  419) CAACAAAAACCA  1
12186           (  430) CAACTACAGCAA  1
24610           (  483) CAACAACAACCG  1
8917            (  458) CAAAATCAACGA  1
264051          (  478) CAACTCAAACCA  1
18407           (  483) CAACAGCAATAA  1
29075           (  478) CAACTGCAAAGA  1
37923           (  436) CCAAACCATCCA  1
5833            (  256) CAAATGCATTGA  1
20782           (  290) CGACGACAGACA  1
21453           (  158) CAGAGGCAGCCA  1
4923            (  411) GAAAATCATTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 9.0653 E= 4.6e-005
 -1081    207   -219  -1081
   170   -102   -219  -1081
   188  -1081   -219  -1081
    11    168  -1081  -1081
    79  -1081    -19     33
    96    -43    -19   -125
  -121    198  -1081  -1081
   196  -1081  -1081  -1081
    79  -1081    -61     56
  -121    168  -1081    -67
   -62    130     13  -1081
   188  -1081   -219  -1081
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 18 E= 4.6e-005
 0.000000  0.944444  0.055556  0.000000
 0.833333  0.111111  0.055556  0.000000
 0.944444  0.000000  0.055556  0.000000
 0.277778  0.722222  0.000000  0.000000
 0.444444  0.000000  0.222222  0.333333
 0.500000  0.166667  0.222222  0.111111
 0.111111  0.888889  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.444444  0.000000  0.166667  0.388889
 0.111111  0.722222  0.000000  0.166667
 0.166667  0.555556  0.277778  0.000000
 0.944444  0.000000  0.055556  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CAA[CA][ATG][AG]CA[AT]C[CG]A
--------------------------------------------------------------------------------




Time  3.08 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  15  sites =  11  llr = 132  E-value = 6.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1::14151:5::51:
pos.-specific     C  :::31::::12::11
probability       G  :4a3291:a::9::9
matrix            T  96:44:59:58158:

         bits    2.2
                 1.9   *     *
                 1.7   *     *
                 1.5 * *  * **  *  *
Relative         1.3 * *  * ** **  *
Entropy          1.1 ***  * ** ** **
(17.3 bits)      0.9 ***  * ** *****
                 0.6 ***  **********
                 0.4 ***  **********
                 0.2 ***************
                 0.0 ---------------

Multilevel           TTGTAGATGATGTTG
consensus             G CT T  T  A
sequence                G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name   Start   P-value               Site
-------------   -----  --------            ---------------
37923             189  3.12e-08 ACTGTGAAAG TTGCTGATGTTGTTG CCAAGGTTGA
20782              84  4.96e-08 GCGCCCACTG TTGCAGATGTTGATG TGATGACAAA
24610             213  1.70e-07 CATGGTGGAA TTGGGGTTGTTGTTG GGGGAGGGTC
4923              174  2.22e-07 TAGGTGTCTG TGGTGGTTGATGATG CCATGATGAA
5833              222  4.13e-07 AAGATGATTC TTGGAGGTGATGTTG CTTTGGCTTG
37574              18  1.47e-06 TGCTACCGCT TTGTTGTTGCCGTTG GGAGGCGATT
261629            437  1.47e-06 ATTGCATTCT TGGTAGTTGATGATC TCTCACATTG
29075             105  2.16e-06 ACAGTGACAG TGGACGATGATGATG CGTGTGTCGA
21453             142  5.34e-06 TGGTGGTCGG TTGTTGTTGTTTTCG CCAGAGGCAG
5595               49  1.17e-05 TTAATTGAGA AGGGTAATGTTGATG GGTGACAATA
bd1771            374  1.34e-05 GGGTTGAGCG TTGCAGAAGACGTAG TATCTATCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME   POSITION P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
37923           3.1e-08           188_[+2]_297
20782           5e-08             83_[+2]_402
24610           1.7e-07           212_[+2]_273
4923            2.2e-07           173_[+2]_312
5833            4.1e-07           221_[+2]_264
37574           1.5e-06           17_[+2]_468
261629          1.5e-06           436_[+2]_49
29075           2.2e-06           104_[+2]_381
21453           5.3e-06           141_[+2]_344
5595            1.2e-05           48_[+2]_437
bd1771          1.3e-05           373_[+2]_112
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=11
37923           (  189) TTGCTGATGTTGTTG  1
20782           (   84) TTGCAGATGTTGATG  1
24610           (  213) TTGGGGTTGTTGTTG  1
4923            (  174) TGGTGGTTGATGATG  1
5833            (  222) TTGGAGGTGATGTTG  1
37574           (   18) TTGTTGTTGCCGTTG  1
261629          (  437) TGGTAGTTGATGATC  1
29075           (  105) TGGACGATGATGATG  1
21453           (  142) TTGTTGTTGTTTTCG  1
5595            (   49) AGGGTAATGTTGATG  1
bd1771          (  374) TTGCAGAAGACGTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8748 bayes= 9.98898 E= 6.9e+000
  -149  -1010  -1010    178
 -1010  -1010     52    127
 -1010  -1010    198  -1010
  -149     28     10     46
    50   -131    -48     46
  -149  -1010    184  -1010
    82  -1010   -148     78
  -149  -1010  -1010    178
 -1010  -1010    198  -1010
    82   -131  -1010     78
 -1010    -31  -1010    163
 -1010  -1010    184   -154
    82  -1010  -1010    104
  -149   -131  -1010    163
 -1010   -131    184  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 11 E= 6.9e+000
 0.090909  0.000000  0.000000  0.909091
 0.000000  0.000000  0.363636  0.636364
 0.000000  0.000000  1.000000  0.000000
 0.090909  0.272727  0.272727  0.363636
 0.363636  0.090909  0.181818  0.363636
 0.090909  0.000000  0.909091  0.000000
 0.454545  0.000000  0.090909  0.454545
 0.090909  0.000000  0.000000  0.909091
 0.000000  0.000000  1.000000  0.000000
 0.454545  0.090909  0.000000  0.454545
 0.000000  0.181818  0.000000  0.818182
 0.000000  0.000000  0.909091  0.090909
 0.454545  0.000000  0.000000  0.545455
 0.090909  0.090909  0.000000  0.818182
 0.000000  0.090909  0.909091  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[TG]G[TCG][AT]G[AT]TG[AT]TG[TA]TG
--------------------------------------------------------------------------------




Time  6.15 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  15  sites =   6  llr = 89  E-value = 3.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::3::::::::::::
pos.-specific     C  88:83a:aa:a2258
probability       G  22::2:2::::23::
matrix            T  ::725:8::a:7552

         bits    2.2      * ** *
                 1.9      * ****
                 1.7      * ****
                 1.5 ** * * ****   *
Relative         1.3 ** * ******   *
Entropy          1.1 **** ******  **
(21.5 bits)      0.9 **** ******  **
                 0.6 ************ **
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CCTCTCTCCTCTTCC
consensus              A C       GT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name   Start   P-value               Site
-------------   -----  --------            ---------------
5833              366  3.06e-09 TCGAAGTTGA CCTCTCTCCTCTGTC CAATAGCAAA
21453             394  2.26e-08 CGGCACCGGT CCTTTCTCCTCTTCC TCTCGGTCCT
18407             176  6.27e-08 TACTTCTCAG GCACTCTCCTCTTTC ACGCACTGCT
7503              404  1.10e-07 GCCTCCTCCT CCTCCCTCCTCTCTT CTCGCATCAC
29075             434  2.12e-07 CACCCCCCGT CCACCCGCCTCGTCC GCTGAACTTT
24610             444  3.03e-07 CACGAGCTTT CGTCGCTCCTCCGCC GCCGACGGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME   POSITION P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
5833            3.1e-09           365_[+3]_120
21453           2.3e-08           393_[+3]_92
18407           6.3e-08           175_[+3]_310
7503            1.1e-07           403_[+3]_82
29075           2.1e-07           433_[+3]_52
24610           3e-07             443_[+3]_42
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=6
5833            (  366) CCTCTCTCCTCTGTC  1
21453           (  394) CCTTTCTCCTCTTCC  1
18407           (  176) GCACTCTCCTCTTTC  1
7503            (  404) CCTCCCTCCTCTCTT  1
29075           (  434) CCACCCGCCTCGTCC  1
24610           (  444) CGTCGCTCCTCCGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 8748 bayes= 10.1675 E= 3.3e+001
  -923    189    -61   -923
  -923    189    -61   -923
    38   -923   -923    133
  -923    189   -923    -66
  -923     56    -61     92
  -923    215   -923   -923
  -923   -923    -61    165
  -923    215   -923   -923
  -923    215   -923   -923
  -923   -923   -923    192
  -923    215   -923   -923
  -923    -43    -61    133
  -923    -43     39     92
  -923    115   -923     92
  -923    189   -923    -66
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 3.3e+001
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.333333  0.166667  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.166667  0.166667  0.666667
 0.000000  0.166667  0.333333  0.500000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.833333  0.000000  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CC[TA]C[TC]CTCCTCT[TG][CT]C
--------------------------------------------------------------------------------




Time  8.82 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME   COMBINED P-VALUE  MOTIF DIAGRAM
-------------   ----------------  -------------
12186           4.11e-03          429_[+1(6.21e-06)]_59
18407           3.48e-05          175_[+3(6.27e-08)]_292_\
    [+1(1.93e-05)]_6
20782           1.25e-04          83_[+2(4.96e-08)]_2_[+2(6.72e-05)]_\
    385
21453           4.25e-07          141_[+2(5.34e-06)]_159_\
    [+3(2.58e-06)]_63_[+3(2.26e-08)]_92
24610           1.41e-08          164_[+2(5.34e-06)]_33_\
    [+2(1.70e-07)]_216_[+3(3.03e-07)]_24_[+1(7.73e-06)]_6
261629          3.68e-05          50_[+1(1.25e-06)]_374_\
    [+2(1.47e-06)]_49
264051          7.78e-03          477_[+1(1.35e-05)]_11
29075           2.79e-07          104_[+2(2.16e-06)]_314_\
    [+3(2.12e-07)]_29_[+1(2.28e-05)]_11
37574           1.12e-04          17_[+2(1.47e-06)]_386_\
    [+1(3.26e-06)]_70
37923           1.12e-05          53_[+2(8.72e-05)]_120_\
    [+2(3.12e-08)]_232_[+1(2.49e-05)]_53
4923            8.89e-04          173_[+2(2.22e-07)]_312
5191            7.24e-03          398_[+1(1.07e-06)]_90
5595            6.67e-06          48_[+2(1.17e-05)]_137_\
    [+3(1.79e-05)]_241_[+1(1.71e-06)]_32
5833            2.50e-09          221_[+2(4.13e-07)]_19_\
    [+1(4.86e-05)]_98_[+3(3.06e-09)]_120
7503            9.06e-06          271_[+1(1.35e-05)]_120_\
    [+3(1.10e-07)]_53_[+1(2.88e-06)]_17
8917            3.47e-02          457_[+1(1.35e-05)]_31
bd1771          1.66e-04          373_[+2(1.34e-05)]_93_\
    [+1(1.25e-06)]_7
bd821           1.26e-03          423_[+1(1.30e-07)]_65
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************