********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/248/248.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11860                    1.0000    500  12022                    1.0000    500
1479                     1.0000    500  20796                    1.0000    500
23085                    1.0000    500  23254                    1.0000    500
25135                    1.0000    500  263240                   1.0000    500
37040                    1.0000    500  3890                     1.0000    500
4948                     1.0000    500  7076                     1.0000    500
8006                     1.0000    500  9514                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/248/248.seqs.fa -oc motifs/248 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.264 C 0.228 G 0.234 T 0.274
Background letter frequencies (from dataset with add-one prior applied):
A 0.264 C 0.228 G 0.234 T 0.274
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  12  llr = 152  E-value = 3.7e-005
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  886338:3a276255:
pos.-specific     C  224753a3:82:74:9
probability       G  ::::2::4::232151
matrix            T  :::::::::::1::::

         bits    2.1       *
                 1.9       * *
                 1.7       * *      *
                 1.5       * **     *
Relative         1.3 **    * **     *
Entropy          1.1 **** ** **    **
(18.2 bits)      0.9 **** ** **  * **
                 0.6 ******* ********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           AAACCACGACAACAAC
consensus              CAAC A   G CG
sequence                    C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
37040                       211  3.11e-09 CCCTCGTCAA AAACCACAACAACAAC CAAACGCAGC
23085                       469  6.92e-08 AGCTATTCAC AAACACCGACAGCCGC AATCCACCTG
23254                       471  7.93e-08 CTCAACAACA AAACCACCACAAAAGC ACCAGCCTGC
1479                        429  1.05e-07 AGGGAAAATC AAAACACGACAAGAGC CAACCTCTCT
20796                       414  1.96e-07 TCCACACACA AAACGACGACGACAAC GACGACGGAA
11860                        43  7.59e-07 AACATTGAAC AACAAACAACAGGCAC CAACATCTCC
4948                        236  9.92e-07 TTCCGTTCTC AACCAACCACAGCAGG CTCCACCAAA
263240                      459  1.73e-06 CGTATCCTCA AACCACCGACGACGGC GTCTGCCGCC
25135                       314  1.86e-06 ACGTCGTCAC CAAACACGAAAGCAAC GTGAACGTCC
9514                         62  2.15e-06 ACCTCCTCCT CCACCACCACCACCAC AAGAGGACGG
3890                        413  3.19e-06 GCAGCTCCCC AACCGCCAACCAACGC TCCGATACTT
12022                       119  7.09e-06 ACTCAACAAC ACCACACAAAATCCAC ATACTGCCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37040                             3.1e-09  210_[+1]_274
23085                             6.9e-08  468_[+1]_16
23254                             7.9e-08  470_[+1]_14
1479                                1e-07  428_[+1]_56
20796                               2e-07  413_[+1]_71
11860                             7.6e-07  42_[+1]_442
4948                              9.9e-07  235_[+1]_249
263240                            1.7e-06  458_[+1]_26
25135                             1.9e-06  313_[+1]_171
9514                              2.1e-06  61_[+1]_423
3890                              3.2e-06  412_[+1]_72
12022                             7.1e-06  118_[+1]_366
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=12
37040                    (  211) AAACCACAACAACAAC  1
23085                    (  469) AAACACCGACAGCCGC  1
23254                    (  471) AAACCACCACAAAAGC  1
1479                     (  429) AAAACACGACAAGAGC  1
20796                    (  414) AAACGACGACGACAAC  1
11860                    (   43) AACAAACAACAGGCAC  1
4948                     (  236) AACCAACCACAGCAGG  1
263240                   (  459) AACCACCGACGACGGC  1
25135                    (  314) CAAACACGAAAGCAAC  1
9514                     (   62) CCACCACCACCACCAC  1
3890                     (  413) AACCGCCAACCAACGC  1
12022                    (  119) ACCACACAAAATCCAC  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 10.2426 E= 3.7e-005
   166    -45  -1023  -1023
   166    -45  -1023  -1023
   114     87  -1023  -1023
    34    155  -1023  -1023
    34    113    -49  -1023
   151     13  -1023  -1023
 -1023    213  -1023  -1023
    34     13     83  -1023
   192  -1023  -1023  -1023
   -66    187  -1023  -1023
   134    -45    -49  -1023
   114  -1023     51   -172
   -66    155    -49  -1023
    92     87   -149  -1023
    92  -1023    109  -1023
 -1023    201   -149  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 3.7e-005
 0.833333  0.166667  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.583333  0.416667  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.333333  0.500000  0.166667  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.250000  0.416667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.666667  0.166667  0.166667  0.000000
 0.583333  0.000000  0.333333  0.083333
 0.166667  0.666667  0.166667  0.000000
 0.500000  0.416667  0.083333  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.916667  0.083333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
AA[AC][CA][CA][AC]C[GAC]ACA[AG]C[AC][AG]C
--------------------------------------------------------------------------------

Time  1.71 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =  11  llr = 145  E-value = 7.8e-005
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3:24229:159:75:
pos.-specific     C  111:5:::::::2:3:
probability       G  91681881a95:8227
matrix            T  :53:1::::::1:1:3

         bits    2.1         *
                 1.9         *
                 1.7 *       **
                 1.5 *      *** **
Relative         1.3 *  * ***** **  *
Entropy          1.1 *  * ********  *
(19.0 bits)      0.9 * ** ********* *
                 0.6 * ** ***********
                 0.4 * **************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTGGCGGAGGAAGAAG
consensus             AT A     G   CT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
23254                       345  3.18e-10 GCAAATGGTT GTGGCGGAGGAAGAAG GAAGTGAAAG
20796                       159  2.44e-08 GATCTCGTTC GTGGAGGAGGGAGAGT ATTGGTGGCA
263240                      105  3.61e-08 CTATTCGGAG GTTGAGGAGGGAGGAG GTGTGGATGA
25135                       452  6.94e-08 GCTCGCTTGC GACGCGGAGGGAGACG GCTTGGAGTT
7076                        438  1.95e-07 ATCTTTTGTA GAGGCGAAGGAAGACT CTAGTTGAGA
8006                        256  2.83e-07 TACGGAACTA CTGGAAGAGGAAGAAG TTTTGTCTGT
9514                        116  1.10e-06 AGATGGGGGA GATGAGGGGGGAGACT CGAAAAGACA
23085                       355  1.18e-06 ATATCGTATG GTGACGAAGGAACAGG GATAGGAAGG
4948                         22  1.27e-06 GATTGCTGGA GGTGGAGAGGAAGAAG GATTGTCGTC
1479                        174  2.93e-06 ACTTCGTAGG GTGGTGGAGAGACGAG TACTTCTACT
12022                       190  4.62e-06 GATGCGAGGT GCGACGGAGGATGTAG CGGGATACTT
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
23254                             3.2e-10  344_[+2]_140
20796                             2.4e-08  158_[+2]_326
263240                            3.6e-08  104_[+2]_380
25135                             6.9e-08  451_[+2]_33
7076                                2e-07  437_[+2]_47
8006                              2.8e-07  255_[+2]_229
9514                              1.1e-06  115_[+2]_369
23085                             1.2e-06  354_[+2]_130
4948                              1.3e-06  21_[+2]_463
1479                              2.9e-06  173_[+2]_311
12022                             4.6e-06  189_[+2]_295
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=11
23254                    (  345) GTGGCGGAGGAAGAAG  1
20796                    (  159) GTGGAGGAGGGAGAGT  1
263240                   (  105) GTTGAGGAGGGAGGAG  1
25135                    (  452) GACGCGGAGGGAGACG  1
7076                     (  438) GAGGCGAAGGAAGACT  1
8006                     (  256) CTGGAAGAGGAAGAAG  1
9514                     (  116) GATGAGGGGGGAGACT  1
23085                    (  355) GTGACGAAGGAACAGG  1
4948                     (   22) GGTGGAGAGGAAGAAG  1
1479                     (  174) GTGGTGGAGAGACGAG  1
12022                    (  190) GCGACGGAGGATGTAG  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 9.62303 E= 7.8e-005
 -1010   -132    196  -1010
     5   -132   -136     99
 -1010   -132    144     -1
   -54  -1010    180  -1010
    46    100   -136   -159
   -54  -1010    180  -1010
   -54  -1010    180  -1010
   178  -1010   -136  -1010
 -1010  -1010    209  -1010
  -153  -1010    196  -1010
   105  -1010     96  -1010
   178  -1010  -1010   -159
 -1010    -33    180  -1010
   146  -1010    -37   -159
   105     26    -37  -1010
 -1010  -1010    163     -1
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 7.8e-005
 0.000000  0.090909  0.909091  0.000000
 0.272727  0.090909  0.090909  0.545455
 0.000000  0.090909  0.636364  0.272727
 0.181818  0.000000  0.818182  0.000000
 0.363636  0.454545  0.090909  0.090909
 0.181818  0.000000  0.818182  0.000000
 0.181818  0.000000  0.818182  0.000000
 0.909091  0.000000  0.090909  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.090909  0.000000  0.909091  0.000000
 0.545455  0.000000  0.454545  0.000000
 0.909091  0.000000  0.000000  0.090909
 0.000000  0.181818  0.818182  0.000000
 0.727273  0.000000  0.181818  0.090909
 0.545455  0.272727  0.181818  0.000000
 0.000000  0.000000  0.727273  0.272727
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
G[TA][GT]G[CA]GGAGG[AG]AGA[AC][GT]
--------------------------------------------------------------------------------

Time  3.48 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   9  llr = 116  E-value = 3.9e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :64:a3:1124:::7:
pos.-specific     C  a31a::7:87:899:7
probability       G  ::2::21:::::1:22
matrix            T  :12::4291162:111

         bits    2.1 *  *
                 1.9 *  **
                 1.7 *  **       **
                 1.5 *  **       **
Relative         1.3 *  **  *   ***
Entropy          1.1 *  **  **  ***
(18.5 bits)      0.9 *  ** ******** *
                 0.6 ** ** **********
                 0.4 ** *************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CAACATCTCCTCCCAC
consensus             CG  AT  AAT  GG
sequence               T  G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
11860                       152  1.66e-10 CAACAGGCTC CAACATCTCCTCCCAC CACAAGTGAT
20796                       332  1.53e-07 AAACACAAAA CATCAACTCCTTCCAG AGTGACTTTC
3890                        231  1.88e-07 TTCGAAAATG CAGCATTTCCTCCCTC TTTCTTTGAC
8006                        388  2.78e-07 CTCTAGCTAA CCTCAGCTCAACCCGC AGCGCTTGAG
4948                        209  5.27e-07 TTTCCACTCC CTCCAACTCCTTCCAC TTTCCGTTCT
12022                        41  1.46e-06 CTCATTCACG CCACAACTACTCCCGT CACCTCCTCC
9514                        475  1.80e-06 TGCTAACACA CCACATGTTAACCCAC AGGACACAAG
23254                       233  2.06e-06 CAAGACACTT CAACAGTACTACCCAC TATTCTCACT
7076                        121  2.69e-06 GACAAAGGCA CAGCATCTCCACGTAG AATCACCGTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11860                             1.7e-10  151_[+3]_333
20796                             1.5e-07  331_[+3]_153
3890                              1.9e-07  230_[+3]_254
8006                              2.8e-07  387_[+3]_97
4948                              5.3e-07  208_[+3]_276
12022                             1.5e-06  40_[+3]_444
9514                              1.8e-06  474_[+3]_10
23254                             2.1e-06  232_[+3]_252
7076                              2.7e-06  120_[+3]_364
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=9
11860                    (  152) CAACATCTCCTCCCAC  1
20796                    (  332) CATCAACTCCTTCCAG  1
3890                     (  231) CAGCATTTCCTCCCTC  1
8006                     (  388) CCTCAGCTCAACCCGC  1
4948                     (  209) CTCCAACTCCTTCCAC  1
12022                    (   41) CCACAACTACTCCCGT  1
9514                     (  475) CCACATGTTAACCCAC  1
23254                    (  233) CAACAGTACTACCCAC  1
7076                     (  121) CAGCATCTCCACGTAG  1
//
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 10.4062 E= 3.9e+001
  -982    213   -982   -982
   107     55   -982   -130
    75   -103     -8    -30
  -982    213   -982   -982
   192   -982   -982   -982
    34   -982     -8     70
  -982    155   -107    -30
  -124   -982   -982    170
  -124    177   -982   -130
   -25    155   -982   -130
    75   -982   -982    102
  -982    177   -982    -30
  -982    196   -107   -982
  -982    196   -982   -130
   134   -982     -8   -130
  -982    155     -8   -130
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 3.9e+001
 0.000000  1.000000  0.000000  0.000000
 0.555556  0.333333  0.000000  0.111111
 0.444444  0.111111  0.222222  0.222222
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.222222  0.444444
 0.000000  0.666667  0.111111  0.222222
 0.111111  0.000000  0.000000  0.888889
 0.111111  0.777778  0.000000  0.111111
 0.222222  0.666667  0.000000  0.111111
 0.444444  0.000000  0.000000  0.555556
 0.000000  0.777778  0.000000  0.222222
 0.000000  0.888889  0.111111  0.000000
 0.000000  0.888889  0.000000  0.111111
 0.666667  0.000000  0.222222  0.111111
 0.000000  0.666667  0.222222  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
C[AC][AGT]CA[TAG][CT]TC[CA][TA][CT]CC[AG][CG]
--------------------------------------------------------------------------------

Time  5.25 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11860                            9.06e-09  42_[+1(7.59e-07)]_[+3(2.06e-06)]_61_\
    [+1(3.69e-05)]_[+3(1.66e-10)]_333
12022                            1.09e-06  40_[+3(1.46e-06)]_62_[+1(7.09e-06)]_\
    55_[+2(4.62e-06)]_295
1479                             8.96e-06  173_[+2(2.93e-06)]_239_\
    [+1(1.05e-07)]_56
20796                            4.05e-11  158_[+2(2.44e-08)]_157_\
    [+3(1.53e-07)]_66_[+1(1.96e-07)]_71
23085                            3.06e-06  354_[+2(1.18e-06)]_98_\
    [+1(6.92e-08)]_16
23254                            3.38e-12  10_[+2(7.33e-05)]_206_\
    [+3(2.06e-06)]_96_[+2(3.18e-10)]_110_[+1(7.93e-08)]_14
25135                            4.57e-06  313_[+1(1.86e-06)]_122_\
    [+2(6.94e-08)]_5_[+2(3.19e-05)]_12
263240                           5.39e-08  104_[+2(3.61e-08)]_317_\
    [+3(2.78e-05)]_5_[+1(1.73e-06)]_26
37040                            1.10e-05  210_[+1(3.11e-09)]_274
3890                             1.81e-05  230_[+3(1.88e-07)]_166_\
    [+1(3.19e-06)]_72
4948                             2.24e-08  21_[+2(1.27e-06)]_171_\
    [+3(5.27e-07)]_11_[+1(9.92e-07)]_249
7076                             1.34e-05  120_[+3(2.69e-06)]_301_\
    [+2(1.95e-07)]_47
8006                             2.88e-06  255_[+2(2.83e-07)]_116_\
    [+3(2.78e-07)]_97
9514                             1.23e-07  61_[+1(2.15e-06)]_38_[+2(1.10e-06)]_\
    343_[+3(1.80e-06)]_10
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************