********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/89/89.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43019                    1.0000    500  2782                     1.0000    500
46344                    1.0000    500  46526                    1.0000    500
46692                    1.0000    500  47987                    1.0000    500
43552                    1.0000    500  43654                    1.0000    500
43692                    1.0000    500  43697                    1.0000    500
39523                    1.0000    500  39921                    1.0000    500
49626                    1.0000    500  41109                    1.0000    500
41265                    1.0000    500  33661                    1.0000    500
44421                    1.0000    500  12311                    1.0000    500
31881                    1.0000    500  37027                    1.0000    500
39891                    1.0000    500  37868                    1.0000    500
49070                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/89/89.seqs.fa -oc motifs/89 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       23    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           11500    N=              23
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.273 C 0.236 G 0.222 T 0.269
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.236 G 0.222 T 0.269
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  14  llr = 155  E-value = 1.9e-003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:::a361::83
pos.-specific     C  :::::6::1::1
probability       G  :7:a:117:a:5
matrix            T  93a:::429:21

         bits    2.2    *     *
                 2.0   ***    *
                 1.7   ***    *
                 1.5   ***   **
Relative         1.3 *****   **
Entropy          1.1 *****  ****
(16.0 bits)      0.9 *****  ****
                 0.7 ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGTGACAGTGAG
consensus             T   ATT  TA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
46692                       280  5.04e-08 TTTGACTGAC TGTGACAGTGAG ACCACTCCGT
44421                       418  1.62e-07 ATGAATTCAC TGTGACAGTGAA TAACCCCTGC
39921                       352  1.62e-07 TAGTATCTGA TGTGACAGTGAA TTAACTGTAA
43697                       357  6.19e-07 CATGAATTTG TGTGACAGTGAT ACAGAAAACG
43692                        14  6.79e-07 CACCAACAGG TTTGACTGTGAG AGAGCCTACT
46344                       166  1.20e-06 ATGAGATGTG TTTGAAAGTGAG ATACCTAGCT
39523                       113  1.85e-06 ATTGACAAGA TGTGAAATTGAG AATAATAGCT
43019                       151  3.93e-06 TAGTCTGTGA ATTGACAGTGAG TCGTCAAACA
37027                       267  4.26e-06 ACGGACTGTG TGTGAGTGTGTG TGTGTGAGAG
43654                       311  7.92e-06 ATAAGCTCAC TGTGACTATGAA CAACACTCCG
47987                        91  1.46e-05 GTGTGTTGCA AGTGAGAGTGTG TGTTGTGGTA
37868                        48  2.22e-05 TCCCTCCCTT TTTGAATGTGTT TCAAACAATT
43552                        15  2.49e-05 CTAGTAGTGT TGTGACGTTGAC TGATCCTCGT
41109                       349  3.55e-05 ACAACCTGCA TGTGAATTCGAA ACGGTGACCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46692                               5e-08  279_[+1]_209
44421                             1.6e-07  417_[+1]_71
39921                             1.6e-07  351_[+1]_137
43697                             6.2e-07  356_[+1]_132
43692                             6.8e-07  13_[+1]_475
46344                             1.2e-06  165_[+1]_323
39523                             1.8e-06  112_[+1]_376
43019                             3.9e-06  150_[+1]_338
37027                             4.3e-06  266_[+1]_222
43654                             7.9e-06  310_[+1]_178
47987                             1.5e-05  90_[+1]_398
37868                             2.2e-05  47_[+1]_441
43552                             2.5e-05  14_[+1]_474
41109                             3.6e-05  348_[+1]_140
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=14
46692                    (  280) TGTGACAGTGAG  1
44421                    (  418) TGTGACAGTGAA  1
39921                    (  352) TGTGACAGTGAA  1
43697                    (  357) TGTGACAGTGAT  1
43692                    (   14) TTTGACTGTGAG  1
46344                    (  166) TTTGAAAGTGAG  1
39523                    (  113) TGTGAAATTGAG  1
43019                    (  151) ATTGACAGTGAG  1
37027                    (  267) TGTGAGTGTGTG  1
43654                    (  311) TGTGACTATGAA  1
47987                    (   91) AGTGAGAGTGTG  1
37868                    (   48) TTTGAATGTGTT  1
43552                    (   15) TGTGACGTTGAC  1
41109                    (  349) TGTGAATTCGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 11247 bayes= 10.2544 E= 1.9e-003
   -93  -1045  -1045    167
 -1045  -1045    169      9
 -1045  -1045  -1045    189
 -1045  -1045    217  -1045
   187  -1045  -1045  -1045
     7    127    -63  -1045
   107  -1045   -163     41
  -193  -1045    169    -33
 -1045   -172  -1045    179
 -1045  -1045    217  -1045
   152  -1045  -1045    -33
     7   -172    117    -91
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 1.9e-003
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  0.714286  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.571429  0.142857  0.000000
 0.571429  0.000000  0.071429  0.357143
 0.071429  0.000000  0.714286  0.214286
 0.000000  0.071429  0.000000  0.928571
 0.000000  0.000000  1.000000  0.000000
 0.785714  0.000000  0.000000  0.214286
 0.285714  0.071429  0.500000  0.142857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[GT]TGA[CA][AT][GT]TG[AT][GA]
--------------------------------------------------------------------------------




Time  4.70 secs.

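Note on the two Motif 1 matrices: the log-odds entries appear to be
100 * log2(p / f), where p is the letter probability at that position and
f is the background frequency of the letter. The sketch below checks this
against the first row of Motif 1. It is an illustration under stated
assumptions, not MEME's own code: the names BACKGROUND and log_odds are
hypothetical, and the background values are the three-decimal figures
printed in the command line summary, so results can differ from the
printed scores by about one unit. Zero probabilities are printed by MEME
as a large negative score (-1045 here); the sketch returns None for them
rather than guessing how that value is derived.

```python
import math

# Background frequencies as printed in the report (rounded to 3 decimals).
BACKGROUND = {"A": 0.273, "C": 0.236, "G": 0.222, "T": 0.269}


def log_odds(prob, letter, background=BACKGROUND):
    """Return 100 * log2(p / f), or None when the probability is zero."""
    if prob <= 0.0:
        return None  # MEME prints a large negative value here instead
    return 100.0 * math.log2(prob / background[letter])


# First row of the Motif 1 letter-probability matrix (columns A, C, G, T).
row = dict(zip("ACGT", [0.142857, 0.0, 0.0, 0.857143]))
scores = {letter: log_odds(p, letter) for letter, p in row.items()}

for letter in "ACGT":
    print(letter, scores[letter])
```

The printed values for A and T can be compared with the first row of the
log-odds matrix (-93 and 167).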
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   5  llr = 94  E-value = 2.9e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::44::::4a6::2::862:
pos.-specific     C  4:a224::8:::4::a42222
probability       G  6a:24:aa24:::a8:4:268
matrix            T  :::2:6:::2:46:::2::::

         bits    2.2  **   **     * *
                 2.0  **   **  *  * *
                 1.7  **   **  *  * *
                 1.5  **   **  *  * *    *
Relative         1.3  **   *** *  ***    *
Entropy          1.1 ***  **** * **** *  *
(27.1 bits)      0.9 ***  **** ****** *  *
                 0.7 ***  **** ***********
                 0.4 *** *****************
                 0.2 *** *****************
                 0.0 ---------------------

Multilevel           GGCAATGGCAAATGGCCAAGG
consensus            C  CGC  GG TC A GCCAC
sequence                GC    T      T GC
                        T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
12311                       339  1.51e-10 GCAAGCCCGC GGCAGCGGCAAATGGCCAAGC TACGAAGGTT
39523                       207  5.54e-10 TGAATCGCTA CGCTATGGCGAACGGCCAACG AGGACAGCTT
43697                       123  1.04e-09 TCCGCCCGCT CGCGGTGGCTATCGGCTAAGG CTTCGGTCAG
46692                       437  2.33e-09 ACGGTCGGTC GGCAATGGGAAATGGCGAGAG CATTTGGCAC
37868                       411  9.91e-09 TCCAATCGTA GGCCCCGGCGATTGACGCCGG CGCGCCGAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12311                             1.5e-10  338_[+2]_141
39523                             5.5e-10  206_[+2]_273
43697                               1e-09  122_[+2]_357
46692                             2.3e-09  436_[+2]_43
37868                             9.9e-09  410_[+2]_69
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=5
12311                    (  339) GGCAGCGGCAAATGGCCAAGC  1
39523                    (  207) CGCTATGGCGAACGGCCAACG  1
43697                    (  123) CGCGGTGGCTATCGGCTAAGG  1
46692                    (  437) GGCAATGGGAAATGGCGAGAG  1
37868                    (  411) GGCCCCGGCGATTGACGCCGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 11040 bayes= 11.3595 E= 2.9e+003
  -897     76    144   -897
  -897   -897    217   -897
  -897    208   -897   -897
    55    -24    -15    -43
    55    -24     85   -897
  -897     76   -897    115
  -897   -897    217   -897
  -897   -897    217   -897
  -897    176    -15   -897
    55   -897     85    -43
   187   -897   -897   -897
   113   -897   -897     57
  -897     76   -897    115
  -897   -897    217   -897
   -45   -897    185   -897
  -897    208   -897   -897
  -897     76     85    -43
   155    -24   -897   -897
   113    -24    -15   -897
   -45    -24    144   -897
  -897    -24    185   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 2.9e+003
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.400000  0.200000  0.400000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.400000  0.000000  0.400000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.400000  0.200000
 0.800000  0.200000  0.000000  0.000000
 0.600000  0.200000  0.200000  0.000000
 0.200000  0.200000  0.600000  0.000000
 0.000000  0.200000  0.800000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GC]GC[ACGT][AGC][TC]GG[CG][AGT]A[AT][TC]G[GA]C[CGT][AC][ACG][GAC][GC]
--------------------------------------------------------------------------------




Time  8.94 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  20  sites =   5  llr = 92  E-value = 3.3e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  226::8::4::::a22::4:
pos.-specific     C  8:::a:::68::::222:::
probability       G  :648::68::::a:2:8::8
matrix            T  :2:2:242:2aa::46:a62

         bits    2.2     *       *
                 2.0     *     ****   *
                 1.7     *     ****   *
                 1.5     *     ****  **
Relative         1.3 *  **  * *****  ** *
Entropy          1.1 * ************  ** *
(26.6 bits)      0.9 * ************  ****
                 0.7 ************** *****
                 0.4 ************** *****
                 0.2 ************** *****
                 0.0 --------------------

Multilevel           CGAGCAGGCCTTGATTGTTG
consensus            AAGT TTTAT    AAC AT
sequence              T            CC
                                   G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
47987                        10  5.55e-11  GTTGGGTCG CGAGCATGCCTTGAATGTAG TGCTTGCCGT
46526                       223  4.41e-10 TCGCACGATT CGGGCATGACTTGACCGTTG GTGGGGATTC
39523                       423  2.31e-09 CATTTCAATC CTAGCAGTCCTTGATTGTTT CTCAGTGTCC
43697                       319  3.67e-09 TCCCTTCTCA CAGTCAGGCCTTGAGAGTTG CTCGAGTTCA
43552                       345  1.86e-08 CCAGATTCGT AGAGCTGGATTTGATTCTAG TCTCTTGAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47987                             5.6e-11  9_[+3]_471
46526                             4.4e-10  222_[+3]_258
39523                             2.3e-09  422_[+3]_58
43697                             3.7e-09  318_[+3]_162
43552                             1.9e-08  344_[+3]_136
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=5
47987                    (   10) CGAGCATGCCTTGAATGTAG  1
46526                    (  223) CGGGCATGACTTGACCGTTG  1
39523                    (  423) CTAGCAGTCCTTGATTGTTT  1
43697                    (  319) CAGTCAGGCCTTGAGAGTTG  1
43552                    (  345) AGAGCTGGATTTGATTCTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 11063 bayes= 10.545 E= 3.3e+003
   -45    176   -897   -897
   -45   -897    144    -43
   113   -897     85   -897
  -897   -897    185    -43
  -897    208   -897   -897
   155   -897   -897    -43
  -897   -897    144     57
  -897   -897    185    -43
    55    134   -897   -897
  -897    176   -897    -43
  -897   -897   -897    189
  -897   -897   -897    189
  -897   -897    217   -897
   187   -897   -897   -897
   -45    -24    -15     57
   -45    -24   -897    115
  -897    -24    185   -897
  -897   -897   -897    189
    55   -897   -897    115
  -897   -897    185    -43
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 3.3e+003
 0.200000  0.800000  0.000000  0.000000
 0.200000  0.000000  0.600000  0.200000
 0.600000  0.000000  0.400000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  0.600000  0.400000
 0.000000  0.000000  0.800000  0.200000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.200000  0.200000  0.400000
 0.200000  0.200000  0.000000  0.600000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.000000  0.800000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CA][GAT][AG][GT]C[AT][GT][GT][CA][CT]TTGA[TACG][TAC][GC]T[TA][GT]
--------------------------------------------------------------------------------




Time 13.50 secs.

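Note on the Motif 3 regular expression: it can be used directly to scan a
sequence for candidate sites. The sketch below is a minimal illustration,
not part of MEME. The names MOTIF3, window, and find_starts are
hypothetical. The test string is only the flank + site + flank window
listed for sequence 47987 in the Motif 3 sites table, and it is assumed to
begin at sequence position 1 because the left flank is nine bases long and
the reported start is 10. A regular expression ignores the per-position
probabilities, so on full sequences it is a coarser filter than the
log-odds matrix.

```python
import re

# Regular expression copied from the "Motif 3 regular expression" section.
MOTIF3 = re.compile(
    r"[CA][GAT][AG][GT]C[AT][GT][GT][CA][CT]TTGA[TACG][TAC][GC]T[TA][GT]"
)

# Left flank + site + right flank for sequence 47987 (Motif 3 sites table).
# Assumption: this window starts at position 1 of the sequence.
window = "GTTGGGTCG" + "CGAGCATGCCTTGAATGTAG" + "TGCTTGCCGT"


def find_starts(pattern, sequence):
    """Return 1-based start positions of non-overlapping matches."""
    return [match.start() + 1 for match in pattern.finditer(sequence)]


starts = find_starts(MOTIF3, window)
print(starts)
```

The result can be compared with the Start column for 47987 in the sites
table.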
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43019                            2.27e-02  150_[+1(3.93e-06)]_338
2782                             4.31e-01  500
46344                            1.64e-03  165_[+1(1.20e-06)]_102_\
    [+1(7.92e-06)]_177_[+3(9.50e-05)]_12
46526                            1.24e-05  222_[+3(4.41e-10)]_258
46692                            4.19e-09  279_[+1(5.04e-08)]_145_\
    [+2(2.33e-09)]_43
47987                            2.77e-08  9_[+3(5.55e-11)]_46_[+1(4.15e-05)]_\
    3_[+1(1.46e-05)]_398
43552                            9.91e-06  14_[+1(2.49e-05)]_318_\
    [+3(1.86e-08)]_136
43654                            5.07e-03  310_[+1(7.92e-06)]_151_\
    [+3(8.19e-05)]_7
43692                            2.50e-03  13_[+1(6.79e-07)]_475
43697                            1.82e-13  122_[+2(1.04e-09)]_175_\
    [+3(3.67e-09)]_18_[+1(6.19e-07)]_132
39523                            1.81e-13  112_[+1(1.85e-06)]_82_\
    [+2(5.54e-10)]_195_[+3(2.31e-09)]_58
39921                            1.30e-03  351_[+1(1.62e-07)]_25_\
    [+1(9.37e-05)]_100
49626                            6.75e-01  500
41109                            1.75e-01  348_[+1(3.55e-05)]_140
41265                            6.19e-01  500
33661                            4.52e-01  500
44421                            3.89e-04  417_[+1(1.62e-07)]_71
12311                            6.77e-07  27_[+1(9.23e-05)]_299_\
    [+2(1.51e-10)]_141
31881                            4.31e-01  500
37027                            1.09e-02  266_[+1(4.26e-06)]_222
39891                            1.37e-01  86_[+3(7.26e-05)]_394
37868                            1.12e-06  47_[+1(2.22e-05)]_351_\
    [+2(9.91e-09)]_69
49070                            6.34e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
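
Note on reading the motif diagrams: each diagram alternates spacer lengths
with motif occurrences such as [+1(5.04e-08)], joined by underscores, and a
trailing backslash marks a continued line. A site's 1-based start is the
number of bases before it plus one. The sketch below is a minimal
illustration under stated assumptions, not MEME or MAST code. The names
MOTIF_WIDTHS, TOKEN, and parse_diagram are hypothetical, and the widths are
taken from the three MOTIF header lines of this report.

```python
import re

# Motif widths from the MOTIF header lines of this report.
MOTIF_WIDTHS = {1: 12, 2: 21, 3: 20}

# Either a motif occurrence like [+1] or [+1(5.04e-08)], or a spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)(?:\(([^)]*)\))?\]|(\d+)")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, 1-based start), ...], total length) for a diagram."""
    # Join continuation lines: drop backslashes, spaces, and newlines.
    cleaned = diagram.replace("\\", "").replace(" ", "").replace("\n", "")
    position = 0  # number of bases consumed so far
    sites = []
    for part in cleaned.split("_"):
        if not part:
            continue
        match = TOKEN.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognised diagram element: {part!r}")
        if match.group(4) is not None:
            position += int(match.group(4))  # spacer
        else:
            motif = int(match.group(2))
            sites.append((motif, position + 1))
            position += widths[motif]
    return sites, position


sites, length = parse_diagram("279_[+1(5.04e-08)]_145_\\\n    [+2(2.33e-09)]_43")
print(sites, length)
```

For the 46692 row used above, the starts can be checked against the
Motif 1 and Motif 2 sites tables, and the total against the sequence
length in the training set table.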