********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/282/282.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8132                     1.0000    500  38130                    1.0000    500
14885                    1.0000    500  43439                    1.0000    500
7924                     1.0000    500  9829                     1.0000    500
9984                     1.0000    500  42322                    1.0000    500
33214                    1.0000    500  46022                    1.0000    500
47383                    1.0000    500  49974                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/282/282.seqs.fa -oc motifs/282 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.228 G 0.223 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.228 G 0.223 T 0.281
********************************************************************************


********************************************************************************
MOTIF  1 MEME  width =  16  sites =   7  llr = 95  E-value = 9.8e+002
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :91:a3:::443631: pos.-specific C a:4::::9:631361: probability G :14a:111a::6:179 matrix T :::::69:::3:1::1 bits 2.2 * * * 1.9 * ** * 1.7 * ** * 1.5 * ** ** * Relative 1.3 ** ** *** * Entropy 1.1 ** ** **** * (19.6 bits) 0.9 ** ** **** ** 0.6 ***** **** ***** 0.4 **************** 0.2 **************** 0.0 ---------------- Multilevel CACGATTCGCAGACGG consensus G A ACACA sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 38130 294 2.02e-08 TCTTGCGACC CACGATTCGCTGCAGG AAAACCGCAG 46022 307 4.42e-08 GCGAGGGGCA CACGAATCGCCGTCGG GCTATATATA 49974 478 4.95e-08 ATCAAGTTTT CAGGATTCGCTGACGT ATTAATT 9984 212 1.47e-07 CGTTCGCAGG CAAGAGTCGCAAACGG GTACGGTCCA 14885 17 6.61e-07 ATACAGAAAC CGGGAATCGAACCCGG GACCTACTGC 42322 360 1.05e-06 TTGTGCCCAA CAGGATTGGAAGAGCG GGACGGGAAA 33214 162 1.57e-06 ACATACCGTA CACGATGCGACAAAAG ACACGGAGAC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 38130 2e-08 293_[+1]_191 46022 4.4e-08 306_[+1]_178 49974 5e-08 477_[+1]_7 9984 1.5e-07 211_[+1]_273 14885 6.6e-07 16_[+1]_468 42322 1.1e-06 359_[+1]_125 33214 1.6e-06 161_[+1]_323 -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=7
38130                    (  294) CACGATTCGCTGCAGG  1
46022                    (  307) CACGAATCGCCGTCGG  1
49974                    (  478) CAGGATTCGCTGACGT  1
9984                     (  212) CAAGAGTCGCAAACGG  1
14885                    (   17) CGGGAATCGAACCCGG  1
42322                    (  360) CAGGATTGGAAGAGCG  1
33214                    (  162) CACGATGCGACAAAAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 10.304 E= 9.8e+002
  -945    213   -945   -945
   168   -945    -64   -945
   -91     91     94   -945
  -945   -945    216   -945
   190   -945   -945   -945
     9   -945    -64    102
  -945   -945    -64    161
  -945    191    -64   -945
  -945   -945    216   -945
    68    132   -945   -945
    68     33   -945      2
     9    -67    136   -945
   109     33   -945    -97
     9    132    -64   -945
   -91    -67    168   -945
  -945   -945    194    -97
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 9.8e+002
 0.000000  1.000000  0.000000  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.142857  0.428571  0.428571  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.000000  0.142857  0.571429
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.857143  0.142857  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.428571  0.285714  0.000000  0.285714
 0.285714  0.142857  0.571429  0.000000
 0.571429  0.285714  0.000000  0.142857
 0.285714  0.571429  0.142857  0.000000
 0.142857  0.142857  0.714286  0.000000
 0.000000  0.000000  0.857143  0.142857
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- CA[CG]GA[TA]TCG[CA][ACT][GA][AC][CA]GG -------------------------------------------------------------------------------- Time 1.24 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 20 sites = 7 llr = 109 E-value = 6.9e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::7:441a19:44474:1:a pos.-specific C a:1:614:3:13::14:43: probability G :9:a:43::19166:::46: matrix T :11:::1:6::1::11a:1: bits 2.2 * * 1.9 * * * * 1.7 * * * * * 1.5 ** * * * * * Relative 1.3 ** * * ** * * Entropy 1.1 ** ** * ** ** * * (22.6 bits) 0.9 ***** * ** *** * * 0.6 ****** **** *** **** 0.4 ****** **** ******** 0.2 ******************** 0.0 -------------------- Multilevel CGAGCACATAGAGGAATCGA consensus AGG C CAA C GC sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 8132 132 1.30e-10 ATTTGAAATA CGAGCGCACAGCAGACTGGA CCGGGCGGCG 9829 14 1.70e-08 CCTCACACCT CGAGCACAAAGAAAACTGTA TCAAAGACAA 42322 179 1.87e-08 TGTGAGAAAC CGAGAGAACAGCGGTCTCGA TCCCTCGCGG 46022 71 2.48e-08 AGAAGTAGAC CGAGACGATAGTGGATTCGA CGAGGAGTGG 47383 106 8.43e-08 TATGTGCGTG 
CGTGCGTATAGAAGAATACA GCTTGCTACA 38130 391 1.36e-07 GTGCTTGGAA CTCGCAGATAGAGACATCGA ACGTGTCGGG 43439 150 1.54e-07 ATCAATGCAA CGAGAACATGCGGAAATGCA GTCTCATTGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 8132 1.3e-10 131_[+2]_349 9829 1.7e-08 13_[+2]_467 42322 1.9e-08 178_[+2]_302 46022 2.5e-08 70_[+2]_410 47383 8.4e-08 105_[+2]_375 38130 1.4e-07 390_[+2]_90 43439 1.5e-07 149_[+2]_331 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=20 seqs=7 8132 ( 132) CGAGCGCACAGCAGACTGGA 1 9829 ( 14) CGAGCACAAAGAAAACTGTA 1 42322 ( 179) CGAGAGAACAGCGGTCTCGA 1 46022 ( 71) CGAGACGATAGTGGATTCGA 1 47383 ( 106) CGTGCGTATAGAAGAATACA 1 38130 ( 391) CTCGCAGATAGAGACATCGA 1 43439 ( 150) CGAGAACATGCGGAAATGCA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 5772 bayes= 10.2921 E= 6.9e+001 -945 213 -945 -945 -945 -945 194 -97 141 -67 -945 -97 -945 -945 216 -945 68 132 -945 -945 68 -67 94 -945 -91 91 36 -97 190 -945 -945 -945 -91 33 -945 102 168 -945 -64 -945 -945 -67 194 -945 68 33 -64 -97 68 -945 136 -945 68 -945 136 -945 141 -67 -945 -97 68 91 -945 -97 -945 -945 -945 183 -91 91 94 -945 -945 33 136 -97 190 -945 -945 -945 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 6.9e+001
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.857143  0.142857
 0.714286  0.142857  0.000000  0.142857
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.428571  0.142857  0.428571  0.000000
 0.142857  0.428571  0.285714  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.285714  0.000000  0.571429
 0.857143  0.000000  0.142857  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.428571  0.285714  0.142857  0.142857
 0.428571  0.000000  0.571429  0.000000
 0.428571  0.000000  0.571429  0.000000
 0.714286  0.142857  0.000000  0.142857
 0.428571  0.428571  0.000000  0.142857
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.428571  0.428571  0.000000
 0.000000  0.285714  0.571429  0.142857
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CGAG[CA][AG][CG]A[TC]AG[AC][GA][GA]A[AC]T[CG][GC]A
--------------------------------------------------------------------------------

Time  2.48 secs.
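The Motif 2 log-odds matrix and letter-probability matrix printed above appear to be related by score = 100 * log2(p / background), rounded to an integer, with a large negative floor (-945 here) wherever p is zero. The sketch below checks that reading on the first three rows of Motif 2. The scale factor of 100 and the handling of zeros are assumptions inferred from the numbers in this file, and `log_odds` and `convert` are illustrative names, not part of MEME. Because the background frequencies are printed to only three decimals, recomputed scores can differ from the printed ones by about one unit.

```python
import math

# Background letter frequencies as printed in this file (order A, C, G, T).
BACKGROUND = {"A": 0.268, "C": 0.228, "G": 0.223, "T": 0.281}


def log_odds(prob, letter, scale=100):
    """Return round(scale * log2(prob / background)), or None when prob is zero.

    The scale of 100 is an assumption inferred from the printed matrices.
    """
    if prob == 0.0:
        return None  # the file prints a large negative floor (-945) instead
    return round(scale * math.log2(prob / BACKGROUND[letter]))


# First three rows of the Motif 2 letter-probability matrix, copied from above.
PROB_ROWS = [
    (0.000000, 1.000000, 0.000000, 0.000000),
    (0.000000, 0.000000, 0.857143, 0.142857),
    (0.714286, 0.142857, 0.000000, 0.142857),
]


def convert(rows):
    """Convert probability rows (A, C, G, T) into rows of log-odds scores."""
    return [[log_odds(p, letter) for p, letter in zip(row, "ACGT")] for row in rows]


SCORES = convert(PROB_ROWS)
for row in SCORES:
    print(row)
```

Comparing the printed rows with the first three rows of the Motif 2 log-odds matrix (-945 213 -945 -945, and so on) gives a quick consistency check on a file that has been edited or re-extracted.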
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 11 llr = 121 E-value = 9.2e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A :::53:62324633a9 pos.-specific C 13:443:::21:2::1 probability G 5:a1:73:735:47:: matrix T 57::4:18:4:42::: bits 2.2 * 1.9 * * 1.7 * * 1.5 * ** Relative 1.3 * * * *** Entropy 1.1 ** * ** *** (15.9 bits) 0.9 ** * ** * *** 0.6 **** **** ** *** 0.4 ********* ** *** 0.2 ********* ** *** 0.0 ---------------- Multilevel GTGACGATGTGAGGAA consensus TC CTCG AGATAA sequence A -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 33214 346 1.49e-09 CGCCTACGTA GTGATGATGGGAGGAA AAGTAAACGA 7924 112 1.83e-08 GTGAAGTTAC TTGACGATGTGTGGAA CATACCTTGA 38130 341 7.21e-07 CGGGGGTTTC TCGCTGATAGGAGGAA ACTTGTTCCG 9984 274 8.39e-07 TGACATCAAA TCGACGATGCAACGAA GTAGAACCAC 14885 481 1.73e-06 ACTACTCTCT GTGAAGATGAATTGAA ATCG 42322 162 6.16e-06 TCGCCGACGC CTGACCGTGTGAGAAA CCGAGAGAAC 43439 67 9.80e-06 GTAGATTAGA GTGGAGGTGGAAAAAA TCAAGGCGTC 49974 221 1.29e-05 GTTGGGATTA GTGCCGAAGTGTCGAC AATATTCTTG 46022 157 1.87e-05 ATGAGACAAT TTGATCATACCATGAA CTGTGCACTG 9829 362 2.33e-05 CTAGGGAGAA GCGCACGAGTGAAAAA ATAATGTCCA 47383 311 2.58e-05 GCAAGTATAT TTGCTGTTAAATAGAA AGTAAAGTTC -------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
33214                             1.5e-09  345_[+3]_139
7924                              1.8e-08  111_[+3]_373
38130                             7.2e-07  340_[+3]_144
9984                              8.4e-07  273_[+3]_211
14885                             1.7e-06  480_[+3]_4
42322                             6.2e-06  161_[+3]_323
43439                             9.8e-06  66_[+3]_418
49974                             1.3e-05  220_[+3]_264
46022                             1.9e-05  156_[+3]_328
9829                              2.3e-05  361_[+3]_123
47383                             2.6e-05  310_[+3]_174
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=11
33214                    (  346) GTGATGATGGGAGGAA  1
7924                     (  112) TTGACGATGTGTGGAA  1
38130                    (  341) TCGCTGATAGGAGGAA  1
9984                     (  274) TCGACGATGCAACGAA  1
14885                    (  481) GTGAAGATGAATTGAA  1
42322                    (  162) CTGACCGTGTGAGAAA  1
43439                    (   67) GTGGAGGTGGAAAAAA  1
49974                    (  221) GTGCCGAAGTGTCGAC  1
46022                    (  157) TTGATCATACCATGAA  1
9829                     (  362) GCGCACGAGTGAAAAA  1
47383                    (  311) TTGCTGTTAAATAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 9.40033 E= 9.2e+002
 -1010   -132    103     69
 -1010     26  -1010    137
 -1010  -1010    216  -1010
   102     67   -129  -1010
     2     67  -1010     37
 -1010     26    171  -1010
   125  -1010     29   -163
   -56  -1010  -1010    154
     2  -1010    171  -1010
   -56    -33     29     37
    44   -132    129  -1010
   125  -1010  -1010     37
     2    -33     71    -63
     2  -1010    171  -1010
   190  -1010  -1010  -1010
   176   -132  -1010  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 9.2e+002
 0.000000  0.090909  0.454545  0.454545
 0.000000  0.272727  0.000000  0.727273
 0.000000  0.000000  1.000000  0.000000
 0.545455  0.363636  0.090909  0.000000
 0.272727  0.363636  0.000000  0.363636
 0.000000  0.272727  0.727273  0.000000
 0.636364  0.000000  0.272727  0.090909
 0.181818  0.000000  0.000000  0.818182
 0.272727  0.000000  0.727273  0.000000
 0.181818  0.181818  0.272727  0.363636
 0.363636  0.090909  0.545455  0.000000
 0.636364  0.000000  0.000000  0.363636
 0.272727  0.181818  0.363636  0.181818
 0.272727  0.000000  0.727273  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.909091  0.090909  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GT][TC]G[AC][CTA][GC][AG]T[GA][TG][GA][AT][GA][GA]AA
--------------------------------------------------------------------------------

Time  3.67 secs.
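The Motif 3 regular expression above can be used directly as a quick scan pattern. The sketch below applies it with Python's standard `re` module to three of the Motif 3 sites listed in this file. `find_hits` is an illustrative helper name, not part of MEME. The pattern lists only the more frequent letters at each position, so it is stricter than the probability matrix, and some of the reported sites do not match it.

```python
import re

# Regular expression printed for Motif 3 in this file.
MOTIF3_REGEX = re.compile(r"[GT][TC]G[AC][CTA][GC][AG]T[GA][TG][GA][AT][GA][GA]AA")

# Three Motif 3 sites copied from the "sites sorted by position p-value" table.
SITES = {
    "33214": "GTGATGATGGGAGGAA",
    "7924": "TTGACGATGTGTGGAA",
    "9984": "TCGACGATGCAACGAA",
}


def find_hits(sequence, pattern=MOTIF3_REGEX):
    """Return 1-based start positions of non-overlapping matches in sequence."""
    return [match.start() + 1 for match in pattern.finditer(sequence)]


HITS = {name: find_hits(site) for name, site in SITES.items()}
for name, hits in HITS.items():
    print(name, hits)
```

The site from sequence 9984 has a C at position 10, which the pattern's `[TG]` class excludes, so it produces no hit even though MEME reports it as a site. A regex scan of this kind is a coarse filter rather than a replacement for scoring with the matrix.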
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8132                             8.03e-06  131_[+2(1.30e-10)]_349
38130                            1.02e-10  293_[+1(2.02e-08)]_31_\
    [+3(7.21e-07)]_34_[+2(1.36e-07)]_90
14885                            2.12e-05  16_[+1(6.61e-07)]_448_\
    [+3(1.73e-06)]_4
43439                            1.15e-05  66_[+3(9.80e-06)]_67_[+2(1.54e-07)]_\
    331
7924                             1.23e-05  111_[+3(1.83e-08)]_373
9829                             8.10e-06  13_[+2(1.70e-08)]_328_\
    [+3(2.33e-05)]_123
9984                             3.24e-06  211_[+1(1.47e-07)]_46_\
    [+3(8.39e-07)]_211
42322                            4.64e-09  161_[+3(6.16e-06)]_1_[+2(1.87e-08)]_\
    161_[+1(1.05e-06)]_125
33214                            3.52e-08  161_[+1(1.57e-06)]_168_\
    [+3(1.49e-09)]_139
46022                            8.95e-10  70_[+2(2.48e-08)]_66_[+3(1.87e-05)]_\
    134_[+1(4.42e-08)]_178
47383                            5.52e-06  105_[+2(8.43e-08)]_185_\
    [+3(2.58e-05)]_174
49974                            1.22e-05  220_[+3(1.29e-05)]_241_\
    [+1(4.95e-08)]_7
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
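The combined block diagrams in the summary encode each sequence as spacer lengths and motif occurrences joined by underscores, for example 293_[+1(2.02e-08)]_31_[+3(7.21e-07)]_34_[+2(1.36e-07)]_90 for sequence 38130 once the backslash line continuation is removed. The sketch below turns such a string back into 1-based start positions. It assumes the motif widths reported in this file (16, 20, 16); `parse_diagram` is an illustrative name, not part of MEME.

```python
import re

# Motif widths as reported in the MOTIF header lines of this file.
MOTIF_WIDTHS = {1: 16, 2: 20, 3: 16}

# A token is either a spacer length or a motif occurrence such as [+1(2.02e-08)].
TOKEN = re.compile(r"(\d+)|\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Parse a combined block diagram into (motif, strand, start, p_value) tuples.

    Starts are 1-based. Also returns the sequence length implied by the diagram.
    """
    position = 1
    sites = []
    for token in diagram.split("_"):
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised token: {token!r}")
        if match.group(1) is not None:
            position += int(match.group(1))  # spacer: skip this many letters
        else:
            motif = int(match.group(3))
            sites.append((motif, match.group(2), position, float(match.group(4))))
            position += widths[motif]  # motif occurrence: skip its width
    return sites, position - 1


# Diagram for sequence 38130, with the line continuation joined.
DIAGRAM_38130 = "293_[+1(2.02e-08)]_31_[+3(7.21e-07)]_34_[+2(1.36e-07)]_90"
SITES_38130, LENGTH_38130 = parse_diagram(DIAGRAM_38130)
for site in SITES_38130:
    print(site)
print("implied length:", LENGTH_38130)
```

For sequence 38130 the recovered starts can be compared with the Start column of the per-motif site tables, and the implied length with the 500 listed in the training set, which gives a simple cross-check between sections of the file.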