********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/160/160.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11876                    1.0000    500  20220                    1.0000    500
22533                    1.0000    500  24182                    1.0000    500
261932                   1.0000    500  262668                   1.0000    500
268560                   1.0000    500  269666                   1.0000    500
31803                    1.0000    500  38788                    1.0000    500
6582                     1.0000    500  9472                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/160/160.seqs.fa -oc motifs/160 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.232 G 0.234 T 0.266
Background letter frequencies (from dataset with add-one prior applied):
A 0.267 C 0.232 G 0.234 T 0.266
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  12  llr = 122  E-value = 3.5e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  8:374:3:::2:
pos.-specific     C  :9833a::28:8
probability       G  :1::::7:21:2
matrix            T  2::13:1a728:

         bits    2.1      *
                 1.9      * *
                 1.7  *   * *
                 1.5  *   * *   *
Relative         1.3 ***  * *  **
Entropy          1.1 ***  * * ***
(14.7 bits)      0.8 **** *** ***
                 0.6 **** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ACCAACGTTCTC
consensus              ACC A
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
11876                       477  1.05e-07 TCATCTCATC ACCACCGTTCTC TCGCACAGCA
31803                        75  1.17e-06 TTCAAACAAC ACCAACGTTTTC ACCGCCTTCC
262668                      472  2.33e-06 TGACTGTAAC ACCACCTTTCTC TCCTATATCC
22533                       322  2.57e-06 CTTGTATTGG ACAAACATTCTC ATAAGTTGTC
24182                        97  4.98e-06 GGCCCTCTCG TCAAACGTTCTC TCGGAGCATC
6582                        481  5.74e-06 GGTGCATGGT ACCCTCGTTCAC CTAATGCA
9472                         50  8.61e-06 TGATGCACAG ACCATCGTGCTG CAGACGGTGG
268560                      262  9.44e-06 TGTTCGGACG ACCAACGTCGTC TTTTGTGTGT
269666                       31  1.19e-05 CAATCATCAT ACCCTCATGCTC CGATGGGACA
20220                       476  2.56e-05 CCTCACACAT AGCTCCGTTCTC CACCTTATCA
38788                       425  4.20e-05 CCTGCCTTCT TCAAACATCCTC TACCCTCTCC
261932                      372  6.22e-05 TAGATTCTTG ACCCCCGTTTAG ATAAAAACTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11876                             1.1e-07  476_[+1]_12
31803                             1.2e-06  74_[+1]_414
262668                            2.3e-06  471_[+1]_17
22533                             2.6e-06  321_[+1]_167
24182                               5e-06  96_[+1]_392
6582                              5.7e-06  480_[+1]_8
9472                              8.6e-06  49_[+1]_439
268560                            9.4e-06  261_[+1]_227
269666                            1.2e-05  30_[+1]_458
20220                             2.6e-05  475_[+1]_13
38788                             4.2e-05  424_[+1]_64
261932                            6.2e-05  371_[+1]_117
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=12
11876                    (  477) ACCACCGTTCTC  1
31803                    (   75) ACCAACGTTTTC  1
262668                   (  472) ACCACCTTTCTC  1
22533                    (  322) ACAAACATTCTC  1
24182                    (   97) TCAAACGTTCTC  1
6582                     (  481) ACCCTCGTTCAC  1
9472                     (   50) ACCATCGTGCTG  1
268560                   (  262) ACCAACGTCGTC  1
269666                   (   31) ACCCTCATGCTC  1
20220                    (  476) AGCTCCGTTCTC  1
38788                    (  425) TCAAACATCCTC  1
261932                   (  372) ACCCCCGTTTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 8.93074 E= 3.5e-001
   164  -1023  -1023    -67
 -1023    198   -149  -1023
   -10    169  -1023  -1023
   132     11  -1023   -167
    64     52  -1023     -9
 -1023    211  -1023  -1023
   -10  -1023    151   -167
 -1023  -1023  -1023    191
 -1023    -48    -49    132
 -1023    169   -149    -67
   -68  -1023  -1023    165
 -1023    184    -49  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 3.5e-001
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.916667  0.083333  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.666667  0.250000  0.000000  0.083333
 0.416667  0.333333  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.666667  0.083333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.166667  0.666667
 0.000000  0.750000  0.083333  0.166667
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.833333  0.166667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
AC[CA][AC][ACT]C[GA]TTCTC
--------------------------------------------------------------------------------

Time  1.71 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   6  llr = 107  E-value = 6.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  82:a8:87287333:a::237
pos.-specific     C  :77:2a227::752a:2a273
probability       G  2:2::::2:23::3::8:7::
matrix            T  :22:::::2:::22:::::::

         bits    2.1      *        *  *
                 1.9    * *        ** *
                 1.7    * *        ** *
                 1.5    * *        ****
Relative         1.3 *  ****  *    ****
Entropy          1.1 *  ****  ***  **** **
(25.6 bits)      0.8 ******* ****  *******
                 0.6 ************* *******
                 0.4 ************* *******
                 0.2 ************* *******
                 0.0 ---------------------

Multilevel           ACCAACAACAACCACAGCGCA
consensus                      GAAG     AC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
262668                      146  4.66e-12 ATCACCACCT ACCAACAACAACAACAGCGCC GCCACAAATC
38788                       454  1.75e-09 TCCTTCAATC AACAACCACAACCACAGCACA CAAACGACAA
31803                       156  1.75e-09 CCTTGATACA ACGAACAACAAACGCACCGAA TCTATATATC
261932                       39  1.07e-08 ATAAACGACA ACTAACAAAGACTTCAGCGCA TTTCTTACCC
6582                         47  1.31e-08 TCAAGTAACT ATCAACAGTAGAAGCAGCGCC ATGGTTCAAT
9472                        459  2.84e-08 TTCGACATAA GCCACCACCAGCCCCAGCCAA GAAGCTCACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
262668                            4.7e-12  145_[+2]_334
38788                             1.7e-09  453_[+2]_26
31803                             1.7e-09  155_[+2]_324
261932                            1.1e-08  38_[+2]_441
6582                              1.3e-08  46_[+2]_433
9472                              2.8e-08  458_[+2]_21
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
262668                   (  146) ACCAACAACAACAACAGCGCC  1
38788                    (  454) AACAACCACAACCACAGCACA  1
31803                    (  156) ACGAACAACAAACGCACCGAA  1
261932                   (   39) ACTAACAAAGACTTCAGCGCA  1
6582                     (   47) ATCAACAGTAGAAGCAGCGCC  1
9472                     (  459) GCCACCACCAGCCCCAGCCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 10.3532 E= 6.6e+001
   164   -923    -49   -923
   -68    152   -923    -67
  -923    152    -49    -67
   190   -923   -923   -923
   164    -48   -923   -923
  -923    210   -923   -923
   164    -48   -923   -923
   132    -48    -49   -923
   -68    152   -923    -67
   164   -923    -49   -923
   132   -923     51   -923
    32    152   -923   -923
    32    111   -923    -67
    32    -48     51    -67
  -923    210   -923   -923
   190   -923   -923   -923
  -923    -48    183   -923
  -923    210   -923   -923
   -68    -48    151   -923
    32    152   -923   -923
   132     52   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 6.6e+001
 0.833333  0.000000  0.166667  0.000000
 0.166667  0.666667  0.000000  0.166667
 0.000000  0.666667  0.166667  0.166667
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.666667  0.166667  0.166667  0.000000
 0.166667  0.666667  0.000000  0.166667
 0.833333  0.000000  0.166667  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.333333  0.500000  0.000000  0.166667
 0.333333  0.166667  0.333333  0.166667
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.166667  0.666667  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
ACCAACAACA[AG][CA][CA][AG]CAGCG[CA][AC]
--------------------------------------------------------------------------------

Time  3.23 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  15  sites =  12  llr = 128  E-value = 1.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :3:3:3:225:368:
pos.-specific     C  2:2:1::1:16::2:
probability       G  788:9154833821a
matrix            T  2::7:753:21:3::

         bits    2.1               *
                 1.9               *
                 1.7     *         *
                 1.5   * *   *     *
Relative         1.3  ** *   *  *  *
Entropy          1.1  **** * *  *  *
(15.4 bits)      0.8 ***** * * ** **
                 0.6 ******* * *****
                 0.4 ******* * *****
                 0.2 ***************
                 0.0 ---------------

Multilevel           GGGTGTGGGACGAAG
consensus             A A ATT GGAT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
269666                      145  6.57e-10 GGGAGGCGAT GGGTGTGGGACGAAG GCTATTTTGC
261932                       93  7.00e-08 ATTTGGAGTT GGCTGTTGGACGAAG CTGAGGTGGT
268560                       20  4.60e-07 CGCCGCGACG GGGTGATTGACGACG GGATGGAGAG
20220                       223  5.92e-07 CCACTCTCGC TGGTGTTTGGGGAAG AAGAGGTAGG
31803                       308  4.49e-06 CTCGCGGCAG TGGTGGTGGACGGAG AAAGTCGGGG
6582                        212  5.87e-06 GAGTGGCGAG GGGTGTGGAGGATAG GCTCACTCCT
9472                        190  6.39e-06 TGAACCAAGA GGGTGAGTGTGGAGG CGGAAACAGG
38788                        33  8.89e-06 AGAAGTGAAG GAGAGTTTGCCAAAG GTGGATGAAC
262668                      287  1.21e-05 CGCTTTGGGG GAGAGTGCGATGAAG AACGAGGAGG
22533                        23  3.30e-05 GTTGTATCTT CGGAGAGAGGCAGAG AGAGATAATG
24182                        59  4.14e-05 CGAGTGAGAA GAGAGTTAATGGTAG CGGTGTTTGT
11876                       205  4.86e-05 AGTGACGACC CGCTCTGGGACGTCG CGAAACGCAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
269666                            6.6e-10  144_[+3]_341
261932                              7e-08  92_[+3]_393
268560                            4.6e-07  19_[+3]_466
20220                             5.9e-07  222_[+3]_263
31803                             4.5e-06  307_[+3]_178
6582                              5.9e-06  211_[+3]_274
9472                              6.4e-06  189_[+3]_296
38788                             8.9e-06  32_[+3]_453
262668                            1.2e-05  286_[+3]_199
22533                             3.3e-05  22_[+3]_463
24182                             4.1e-05  58_[+3]_427
11876                             4.9e-05  204_[+3]_281
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=12
269666                   (  145) GGGTGTGGGACGAAG  1
261932                   (   93) GGCTGTTGGACGAAG  1
268560                   (   20) GGGTGATTGACGACG  1
20220                    (  223) TGGTGTTTGGGGAAG  1
31803                    (  308) TGGTGGTGGACGGAG  1
6582                     (  212) GGGTGTGGAGGATAG  1
9472                     (  190) GGGTGAGTGTGGAGG  1
38788                    (   33) GAGAGTTTGCCAAAG  1
262668                   (  287) GAGAGTGCGATGAAG  1
22533                    (   23) CGGAGAGAGGCAGAG  1
24182                    (   59) GAGAGTTAATGGTAG  1
11876                    (  205) CGCTCTGGGACGTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 5832 bayes= 8.92184 E= 1.6e+002
 -1023    -48    151    -67
   -10  -1023    168  -1023
 -1023    -48    183  -1023
    32  -1023  -1023    132
 -1023   -148    197  -1023
   -10  -1023   -149    132
 -1023  -1023    109     91
   -68   -148     83     33
   -68  -1023    183  -1023
    90   -148      9    -67
 -1023    133     51   -167
   -10  -1023    168  -1023
   112  -1023    -49     -9
   149    -48   -149  -1023
 -1023  -1023    209  -1023
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 12 E= 1.6e+002
 0.000000  0.166667  0.666667  0.166667
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.083333  0.916667  0.000000
 0.250000  0.000000  0.083333  0.666667
 0.000000  0.000000  0.500000  0.500000
 0.166667  0.083333  0.416667  0.333333
 0.166667  0.000000  0.833333  0.000000
 0.500000  0.083333  0.250000  0.166667
 0.000000  0.583333  0.333333  0.083333
 0.250000  0.000000  0.750000  0.000000
 0.583333  0.000000  0.166667  0.250000
 0.750000  0.166667  0.083333  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[GA]G[TA]G[TA][GT][GT]G[AG][CG][GA][AT]AG
--------------------------------------------------------------------------------

Time  4.74 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11876                            4.86e-05  204_[+3(4.86e-05)]_257_\
    [+1(1.05e-07)]_12
20220                            7.02e-05  222_[+3(5.92e-07)]_181_\
    [+3(8.20e-05)]_42_[+1(2.56e-05)]_13
22533                            1.40e-03  22_[+3(3.30e-05)]_284_\
    [+1(2.57e-06)]_167
24182                            1.38e-03  58_[+3(4.14e-05)]_23_[+1(4.98e-06)]_\
    392
261932                           1.90e-09  38_[+2(1.07e-08)]_33_[+3(7.00e-08)]_\
    264_[+1(6.22e-05)]_117
262668                           8.06e-12  58_[+2(5.89e-05)]_66_[+2(4.66e-12)]_\
    39_[+2(6.16e-05)]_60_[+3(1.21e-05)]_170_[+1(2.33e-06)]_17
268560                           7.85e-05  19_[+3(4.60e-07)]_227_\
    [+1(9.44e-06)]_227
269666                           1.15e-07  30_[+1(1.19e-05)]_102_\
    [+3(6.57e-10)]_341
31803                            4.29e-10  74_[+1(1.17e-06)]_69_[+2(1.75e-09)]_\
    62_[+2(3.89e-05)]_48_[+3(4.49e-06)]_178
38788                            2.18e-08  32_[+3(8.89e-06)]_377_\
    [+1(4.20e-05)]_17_[+2(1.75e-09)]_26
6582                             1.54e-08  46_[+2(1.31e-08)]_144_\
    [+3(5.87e-06)]_254_[+1(5.74e-06)]_8
9472                             4.89e-08  49_[+1(8.61e-06)]_128_\
    [+3(6.39e-06)]_204_[+2(6.16e-05)]_29_[+2(2.84e-08)]_21
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************