********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/75/75.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47536                    1.0000    500  37859                    1.0000    500
48250                    1.0000    500  48662                    1.0000    500
48723                    1.0000    500  49182                    1.0000    500
55108                    1.0000    500  49936                    1.0000    500
50005                    1.0000    500  40967                    1.0000    500
44121                    1.0000    500  43863                    1.0000    500
47736                    1.0000    500  50503                    1.0000    500
47916                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/75/75.seqs.fa -oc motifs/75 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       15    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7500    N=              15
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.267 C 0.221 G 0.228 T 0.284
Background letter frequencies (from dataset with add-one prior applied):
A 0.267 C 0.221 G 0.228 T 0.284
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  19  sites =   8  llr = 120  E-value = 5.3e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:13a16:5:436::::3a
pos.-specific     C  :4:1:9::16:::61354:
probability       G  95:6::1a:4:8::18:3:
matrix            T  :19:::3:4:6:448:51:

         bits    2.2        *
                 2.0     *  *          *
                 1.7     *  *          *
                 1.5 *   ** *          *
Relative         1.3 * * ** * * *   *  *
Entropy          1.1 * * ** * * * * ** *
(21.7 bits)      0.9 * **** * ******** *
                 0.7 ******** ******** *
                 0.4 ***************** *
                 0.2 *******************
                 0.0 -------------------

Multilevel           GGTGACAGACTGACTGCCA
consensus             C A  T TGAATT CTA
sequence                              G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
48250                       248  2.73e-09 CATCTGCATA GGAGACAGACTGACTGTGA GGTTACGATC
37859                       148  3.14e-09 ACTGTGACTG GCTGACTGACTGACTGCTA TGTAATGGTC
55108                       426  5.40e-09 CGTCAGACGG GCTGACTGTGTGACTGTGA CTGTCTGAGA
50503                       481  9.97e-09 CGAAGTTTAA GGTGACAGTCAATCTGCAA A
44121                        83  1.27e-07 GACGCACATT GTTAACAGCGTGATTGTCA TGTGTCTTTC
47916                       255  2.09e-07 ATGGTACATT GCTCACAGTCTAATTCCAA ATTGCTGACC
48662                       182  3.24e-07 CATACCCAAA GGTGAAAGAGAGTCGCTCA AACAAGAGGT
48723                       356  7.12e-07 TCCAATTCCC AGTAACGGACAGTTCGCCA TTGCCAAAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48250                             2.7e-09  247_[+1]_234
37859                             3.1e-09  147_[+1]_334
55108                             5.4e-09  425_[+1]_56
50503                               1e-08  480_[+1]_1
44121                             1.3e-07  82_[+1]_399
47916                             2.1e-07  254_[+1]_227
48662                             3.2e-07  181_[+1]_300
48723                             7.1e-07  355_[+1]_126
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=8
48250                    (  248) GGAGACAGACTGACTGTGA  1
37859                    (  148) GCTGACTGACTGACTGCTA  1
55108                    (  426) GCTGACTGTGTGACTGTGA  1
50503                    (  481) GGTGACAGTCAATCTGCAA  1
44121                    (   83) GTTAACAGCGTGATTGTCA  1
47916                    (  255) GCTCACAGTCTAATTCCAA  1
48662                    (  182) GGTGAAAGAGAGTCGCTCA  1
48723                    (  356) AGTAACGGACAGTTCGCCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 7230 bayes= 9.81818 E= 5.3e+001
  -109   -965    194   -965
  -965     76    113   -118
  -109   -965   -965    162
    -9    -82    145   -965
   191   -965   -965   -965
  -109    198   -965   -965
   123   -965    -87    -18
  -965   -965    213   -965
    91    -82   -965     40
  -965    150     72   -965
    49   -965   -965    114
    -9   -965    172   -965
   123   -965   -965     40
  -965    150   -965     40
  -965    -82    -87    140
  -965     17    172   -965
  -965    117   -965     82
    -9     76     13   -118
   191   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 8 E= 5.3e+001
 0.125000  0.000000  0.875000  0.000000
 0.000000  0.375000  0.500000  0.125000
 0.125000  0.000000  0.000000  0.875000
 0.250000  0.125000  0.625000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.625000  0.000000  0.125000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.125000  0.000000  0.375000
 0.000000  0.625000  0.375000  0.000000
 0.375000  0.000000  0.000000  0.625000
 0.250000  0.000000  0.750000  0.000000
 0.625000  0.000000  0.000000  0.375000
 0.000000  0.625000  0.000000  0.375000
 0.000000  0.125000  0.125000  0.750000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.250000  0.375000  0.250000  0.125000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
G[GC]T[GA]AC[AT]G[AT][CG][TA][GA][AT][CT]T[GC][CT][CAG]A
--------------------------------------------------------------------------------




Time  1.99 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  14  sites =   3  llr = 51  E-value = 4.3e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::3:3:::3:3::
pos.-specific     C  aaa7a3aa77a::a
probability       G  ::::::::3::7::
matrix            T  :::::3::::::a:

         bits    2.2 *** * **  *  *
                 2.0 *** * **  *  *
                 1.7 *** * **  * **
                 1.5 *** * **  * **
Relative         1.3 *** * *** * **
Entropy          1.1 ***** ********
(24.3 bits)      0.9 ***** ********
                 0.7 ***** ********
                 0.4 **************
                 0.2 **************
                 0.0 --------------

Multilevel           CCCCCACCCCCGTC
consensus               A C  GA A
sequence                  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
48723                        19  8.79e-10 TGGTAGCTCC CCCCCCCCCCCGTC AAATCTTAGT
40967                       436  1.73e-08 GGGTTTTCTA CCCCCTCCCACGTC ATCATTGGAC
50005                       380  5.19e-08 CCATCATCGT CCCACACCGCCATC CAACCTCGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48723                             8.8e-10  18_[+2]_468
40967                             1.7e-08  435_[+2]_51
50005                             5.2e-08  379_[+2]_107
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=3
48723                    (   19) CCCCCCCCCCCGTC  1
40967                    (  436) CCCCCTCCCACGTC  1
50005                    (  380) CCCACACCGCCATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 7305 bayes= 10.9079 E= 4.3e+002
  -823    217   -823   -823
  -823    217   -823   -823
  -823    217   -823   -823
    32    159   -823   -823
  -823    217   -823   -823
    32     59   -823     23
  -823    217   -823   -823
  -823    217   -823   -823
  -823    159     55   -823
    32    159   -823   -823
  -823    217   -823   -823
    32   -823    154   -823
  -823   -823   -823    181
  -823    217   -823   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 3 E= 4.3e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
CCC[CA]C[ACT]CC[CG][CA]C[GA]TC
--------------------------------------------------------------------------------




Time  3.73 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  12  sites =   6  llr = 76  E-value = 1.8e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:aa:25:::
pos.-specific     C  5::7:::8:7a7
probability       G  2a::::a:23:3
matrix            T  3::3::::3:::

         bits    2.2  *    *   *
                 2.0  ** ***   *
                 1.7  ** ***   *
                 1.5  ** ****  *
Relative         1.3  ** **** ***
Entropy          1.1  ******* ***
(18.3 bits)      0.9  ******* ***
                 0.7 ******** ***
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CGACAAGCACCC
consensus            T  T    TG G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44121                       111  2.32e-07 ATGTGTCTTT CGACAAGCGCCC GAATTTTAGG
49936                       163  4.55e-07 GAAGCGCTGT TGACAAGCACCG AATTGCTGCC
48662                       352  7.84e-07 CCCTTGCGAT CGATAAGCAGCC TTGTGGGTTA
47736                       111  9.74e-07 TGGCTTTAAA TGACAAGCTCCG CCGGTGTGCT
37859                        87  1.58e-06 TTTCGCTCCA GGACAAGCTGCC AGTCGACGGG
50005                        13  2.34e-06 GGCAATCATT CGATAAGAACCC GGATGAAGAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44121                             2.3e-07  110_[+3]_378
49936                             4.6e-07  162_[+3]_326
48662                             7.8e-07  351_[+3]_137
47736                             9.7e-07  110_[+3]_378
37859                             1.6e-06  86_[+3]_402
50005                             2.3e-06  12_[+3]_476
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=6
44121                    (  111) CGACAAGCGCCC  1
49936                    (  163) TGACAAGCACCG  1
48662                    (  352) CGATAAGCAGCC  1
47736                    (  111) TGACAAGCTCCG  1
37859                    (   87) GGACAAGCTGCC  1
50005                    (   13) CGATAAGAACCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7335 bayes= 9.91309 E= 1.8e+003
  -923    117    -45     23
  -923   -923    213   -923
   191   -923   -923   -923
  -923    159   -923     23
   191   -923   -923   -923
   191   -923   -923   -923
  -923   -923    213   -923
   -68    191   -923   -923
    91   -923    -45     23
  -923    159     55   -923
  -923    217   -923   -923
  -923    159     55   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 1.8e+003
 0.000000  0.500000  0.166667  0.333333
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.500000  0.000000  0.166667  0.333333
 0.000000  0.666667  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[CT]GA[CT]AAGC[AT][CG]C[CG]
--------------------------------------------------------------------------------




Time  5.66 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47536                            5.94e-01  500
37859                            1.43e-07  86_[+3(1.58e-06)]_49_[+1(3.14e-09)]_\
    89_[+1(6.52e-06)]_226
48250                            3.06e-05  247_[+1(2.73e-09)]_234
48662                            9.14e-06  181_[+1(3.24e-07)]_151_\
    [+3(7.84e-07)]_137
48723                            2.78e-08  18_[+2(8.79e-10)]_323_\
    [+1(7.12e-07)]_126
49182                            5.04e-01  500
55108                            4.63e-05  425_[+1(5.40e-09)]_56
49936                            1.86e-03  15_[+3(8.29e-05)]_135_\
    [+3(4.55e-07)]_326
50005                            1.29e-06  12_[+3(2.34e-06)]_355_\
    [+2(5.19e-08)]_107
40967                            4.36e-05  435_[+2(1.73e-08)]_51
44121                            2.66e-07  82_[+1(1.27e-07)]_9_[+3(2.32e-07)]_\
    378
43863                            7.51e-01  500
47736                            9.17e-03  110_[+3(9.74e-07)]_378
50503                            1.06e-04  480_[+1(9.97e-09)]_1
47916                            1.19e-03  254_[+1(2.09e-07)]_227
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************