********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/51/51.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42694                    1.0000    500  43023                    1.0000    500
47268                    1.0000    500  47271                    1.0000    500
47845                    1.0000    500  43350                    1.0000    500
32833                    1.0000    500  39918                    1.0000    500
43869                    1.0000    500  43990                    1.0000    500
50622                    1.0000    500  6635                     1.0000    500
45391                    1.0000    500  20192                    1.0000    500
42003                    1.0000    500  27696                    1.0000    500
47546                    1.0000    500  47570                    1.0000    500
47574                    1.0000    500  47972                    1.0000    500
50241                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/51/51.seqs.fa -oc motifs/51 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       21    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           10500    N=              21
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.236 G 0.223 T 0.270
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.236 G 0.223 T 0.270
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 21 sites = 6 llr = 121 E-value = 7.0e-002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  23:3::a75:237:::::a88
pos.-specific     C  :3a3a::3585::a:3a3:2:
probability       G  8::::a:::2:::::::7:::
matrix            T  :3:3::::::373:a7::::2

         bits    2.2   * **       *  *
                 1.9   * ***      ** * *
                 1.7   * ***      ** * *
                 1.5 * * ***  *   ** * *
Relative         1.3 * * ***  *   ** *****
Entropy          1.1 * * ******   ********
(29.0 bits)      0.9 * * ****** **********
                 0.6 * * ****** **********
                 0.4 *********************
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GACACGAAACCTACTTCGAAA
consensus             C C   CC TAT  C C
sequence              T T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
47574                       146  1.96e-13 TATCATATTA GCCCCGAACCCTACTTCGAAA AGATAGTTTT
47570                        89  1.96e-13 TATCATATTA GCCCCGAACCCTACTTCGAAA AGATGGTTTT
39918                       116  3.13e-10 ACTCCCACTT GTCACGAAAGTTACTTCGAAA TGATGAGCTT
50241                       342  1.00e-09 GAACCCAAGA GTCTCGACACAAACTTCCAAA ACTTCATAGT
45391                       217  1.38e-09 TCTTCGGCTA AACACGAAACCATCTCCGAAA TACGTTATGT
43350                       449  4.49e-09 CACATTATTA GACTCGACCCTTTCTCCCACT GTCAATACAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47574                               2e-13  145_[+1]_334
47570                               2e-13  88_[+1]_391
39918                             3.1e-10  115_[+1]_364
50241                               1e-09  341_[+1]_138
45391                             1.4e-09  216_[+1]_263
43350                             4.5e-09  448_[+1]_31
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=21 seqs=6
47574                    (  146) GCCCCGAACCCTACTTCGAAA  1
47570                    (   89) GCCCCGAACCCTACTTCGAAA  1
39918                    (  116) GTCACGAAAGTTACTTCGAAA  1
50241                    (  342) GTCTCGACACAAACTTCCAAA  1
45391                    (  217) AACACGAAACCATCTCCGAAA  1
43350                    (  449) GACTCGACCCTTTCTCCCACT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 10080 bayes= 11.1611 E= 7.0e-002
   -70   -923    190   -923
    30     50   -923     30
  -923    208   -923   -923
    30     50   -923     30
  -923    208   -923   -923
  -923   -923    216   -923
   188   -923   -923   -923
   130     50   -923   -923
    88    108   -923   -923
  -923    182    -42   -923
   -70    108   -923     30
    30   -923   -923    130
   130   -923   -923     30
  -923    208   -923   -923
  -923   -923   -923    189
  -923     50   -923    130
  -923    208   -923   -923
  -923     50    158   -923
   188   -923   -923   -923
   162    -50   -923   -923
   162   -923   -923    -69
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 7.0e-002
 0.166667  0.000000  0.833333  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.333333  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.166667  0.500000  0.000000  0.333333
 0.333333  0.000000  0.000000  0.666667
 0.666667  0.000000  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.333333  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
G[ACT]C[ACT]CGA[AC][AC]C[CT][TA][AT]CT[TC]C[GC]AAA
--------------------------------------------------------------------------------

Time  4.15 secs.

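Note on the Motif 1 regular expression: it keeps only the more frequent letters at each position, so it is stricter than the probability matrix. The following is a minimal Python sketch, not part of the MEME output, that tests the six Motif 1 sites listed above against that expression. The names MOTIF1_REGEX, SITES and matching_sites are hypothetical and chosen only for illustration.

```python
import re

# Regular expression reported for Motif 1 in the section above.
MOTIF1_REGEX = re.compile(r"G[ACT]C[ACT]CGA[AC][AC]C[CT][TA][AT]CT[TC]C[GC]AAA")

# Motif 1 sites copied from the "sites sorted by position p-value" table.
SITES = {
    "47574": "GCCCCGAACCCTACTTCGAAA",
    "47570": "GCCCCGAACCCTACTTCGAAA",
    "39918": "GTCACGAAAGTTACTTCGAAA",
    "50241": "GTCTCGACACAAACTTCCAAA",
    "45391": "AACACGAAACCATCTCCGAAA",
    "43350": "GACTCGACCCTTTCTCCCACT",
}


def matching_sites(pattern, sites):
    """Return the names of the sites that the pattern matches in full."""
    return [name for name, seq in sites.items() if pattern.fullmatch(seq)]


matched = matching_sites(MOTIF1_REGEX, SITES)
print("sites matched by the regular expression:", matched)
print("sites not matched:", [name for name in SITES if name not in matched])
```

Under these assumptions only the two strongest sites match the expression in full, so a plain regular-expression scan would probably under-report this motif compared with scoring against the matrix.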
********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 16 llr = 163 E-value = 1.0e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  8:a1::959682
pos.-specific     C  :9::821:1::1
probability       G  1::9:615:437
matrix            T  11::33::1:::

         bits    2.2
                 1.9   *
                 1.7  **
                 1.5  ***
Relative         1.3  **** * *
Entropy          1.1 ***** *****
(14.7 bits)      0.9 ***** ******
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ACAGCGAAAAAG
consensus                TT G GG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47546                       250  4.23e-07 CGGTCTCTCT ACAGCTAGAAAG ATCCTGTCGT
47574                       327  7.70e-07 TGATTTCTGT ACAGTGAGAAAG AATCCAGAGA
47570                       270  7.70e-07 TGATTTCTGT ACAGTGAGAAAG AATCCAGAGA
47268                       489  7.70e-07 TCAATTCTTG ACAGCTAAAAAG
50622                       352  2.36e-06 AGCGACGAAC ACAGCCAAAAGG CGTGCGCCTA
27696                       164  3.74e-06 TCCAACAAGA ACAGCTAGAAAA CCGGTAGACT
42003                       193  4.31e-06 ATGTAAAGCC ACAGTCAAAGAG AATCGACGCG
32833                        23  5.71e-06 TGAAGGCTGA ACAGCGAAAAGC CGACAAGGAG
47845                        79  7.95e-06 AACTCTAGTA GCAGCGAGAGAA TTTCTACTGC
50241                       394  9.05e-06 TTCTACATAA ACAGCGGGAGGG AGCTGATTTC
43023                       160  1.50e-05 GTCAACATAT GCAGCCAAAAGG CCACCCAAAG
47271                        73  1.65e-05 AGAAACACCA ATAGCTAGAGAG CCTTAAATGA
43869                       296  1.79e-05 ACGATCTTTC ACAGCGCGAAAA AGCTTTGAGA
43990                        86  1.92e-05 ACTTGATGTG ACAGTGAATGAG CAGACTATCT
47972                       423  3.57e-05 TCTGAGAAGG TCAACGAAAGAG ATACGCCCGT
39918                        73  1.14e-04 GAATTGTTAC ACAACGAACGAC TCGCATTTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47546                             4.2e-07  249_[+2]_239
47574                             7.7e-07  326_[+2]_162
47570                             7.7e-07  269_[+2]_219
47268                             7.7e-07  488_[+2]
50622                             2.4e-06  351_[+2]_137
27696                             3.7e-06  163_[+2]_325
42003                             4.3e-06  192_[+2]_296
32833                             5.7e-06  22_[+2]_466
47845                               8e-06  78_[+2]_410
50241                               9e-06  393_[+2]_95
43023                             1.5e-05  159_[+2]_329
47271                             1.6e-05  72_[+2]_416
43869                             1.8e-05  295_[+2]_193
43990                             1.9e-05  85_[+2]_403
47972                             3.6e-05  422_[+2]_66
39918                             0.00011  72_[+2]_416
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=16
47546                    (  250) ACAGCTAGAAAG  1
47574                    (  327) ACAGTGAGAAAG  1
47570                    (  270) ACAGTGAGAAAG  1
47268                    (  489) ACAGCTAAAAAG  1
50622                    (  352) ACAGCCAAAAGG  1
27696                    (  164) ACAGCTAGAAAA  1
42003                    (  193) ACAGTCAAAGAG  1
32833                    (   23) ACAGCGAAAAGC  1
47845                    (   79) GCAGCGAGAGAA  1
50241                    (  394) ACAGCGGGAGGG  1
43023                    (  160) GCAGCCAAAAGG  1
47271                    (   73) ATAGCTAGAGAG  1
43869                    (  296) ACAGCGCGAAAA  1
43990                    (   86) ACAGTGAATGAG  1
47972                    (  423) TCAACGAAAGAG  1
39918                    (   73) ACAACGAACGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10269 bayes= 9.32376 E= 1.0e-001
   159  -1064    -84   -211
 -1064    199  -1064   -211
   189  -1064  -1064  -1064
  -111  -1064    197  -1064
 -1064    167  -1064    -11
 -1064    -33    133    -11
   169   -191   -184  -1064
    89  -1064    116  -1064
   169   -191  -1064   -211
   106  -1064     97  -1064
   147  -1064     16  -1064
   -53    -92    162  -1064
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 1.0e-001
 0.812500  0.000000  0.125000  0.062500
 0.000000  0.937500  0.000000  0.062500
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.000000  0.875000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.187500  0.562500  0.250000
 0.875000  0.062500  0.062500  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.875000  0.062500  0.000000  0.062500
 0.562500  0.000000  0.437500  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.187500  0.125000  0.687500  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
ACAG[CT][GT]A[AG]A[AG][AG]G
--------------------------------------------------------------------------------

Time  8.50 secs.

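Note on the two Motif 2 matrices: the non-floor entries of the log-odds matrix appear consistent with 100 x log2(p / f), where p is the entry of the letter-probability matrix and f is the background frequency of that letter, up to rounding of the printed background values. The following is a minimal Python sketch, not part of the MEME output, that checks this for the first Motif 2 position. The names BACKGROUND and log_odds are hypothetical. Entries with p = 0 are returned as None, because the large negative value MEME prints for them (-1064 here) depends on internals that this sketch does not try to reproduce.

```python
import math

# Background letter frequencies as printed in the command line summary.
BACKGROUND = {"A": 0.271, "C": 0.236, "G": 0.223, "T": 0.270}


def log_odds(prob, background):
    """Return round(100 * log2(prob / background)), or None when prob is 0."""
    if prob == 0.0:
        return None  # MEME prints a large negative floor value here instead
    return round(100 * math.log2(prob / background))


# First row of the Motif 2 letter-probability matrix and of its log-odds matrix.
row = {"A": 0.8125, "C": 0.0, "G": 0.125, "T": 0.0625}
reported = {"A": 159, "C": -1064, "G": -84, "T": -211}

scores = {base: log_odds(row[base], BACKGROUND[base]) for base in "ACGT"}
for base in "ACGT":
    print(base, "computed:", scores[base], "reported:", reported[base])
```

Small differences of one unit are to be expected, since the background frequencies are printed to only three decimal places.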
********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 9 llr = 128 E-value = 1.8e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :21283:1:3::::3:
pos.-specific     C  28::21:6:6:a:82a
probability       G  ::98:::3a13:223:
matrix            T  8::::6a:::7:8:1:

         bits    2.2         *  *   *
                 1.9       * *  *   *
                 1.7   *   * *  *   *
                 1.5   *   * *  *   *
Relative         1.3  ***  * *  * * *
Entropy          1.1 ***** * * **** *
(20.5 bits)      0.9 ***** * * **** *
                 0.6 ************** *
                 0.4 ************** *
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCGGATTCGCTCTCAC
consensus            CA ACA G AG GGG
sequence                           C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
43990                       449  1.27e-08 GCGGGTCTAA TCGGATTAGCTCTCGC TCAGTATCAT
47574                       297  4.66e-08 GAGCAAACTT TCGGAATCGATCGCAC ACGATGATTT
47570                       240  4.66e-08 GAGCGAACTT TCGGAATCGATCGCAC ACGATGATTT
32833                       370  7.28e-08 ATTAAAGCCT TCGGATTGGCTCTGTC GAAACCATCT
45391                       321  1.32e-07 TATATCACTG CAGGATTGGCGCTCGC GAGAGCCGGC
50622                       319  1.32e-07 TATATCACTG CAGGATTGGCGCTCGC TAGAGCCAGC
20192                       254  4.08e-07 CTCGCTGTTC TCGGCCTCGCGCTGAC GCCTCAGCGC
43350                       400  4.08e-07 CCAACCAATA TCGACTTCGGTCTCCC AAACGCCAAA
47972                       470  8.08e-07 CTCGCCCACT TCAAAATCGATCTCCC AATCGAAAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43990                             1.3e-08  448_[+3]_36
47574                             4.7e-08  296_[+3]_188
47570                             4.7e-08  239_[+3]_245
32833                             7.3e-08  369_[+3]_115
45391                             1.3e-07  320_[+3]_164
50622                             1.3e-07  318_[+3]_166
20192                             4.1e-07  253_[+3]_231
43350                             4.1e-07  399_[+3]_85
47972                             8.1e-07  469_[+3]_15
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=9
43990                    (  449) TCGGATTAGCTCTCGC  1
47574                    (  297) TCGGAATCGATCGCAC  1
47570                    (  240) TCGGAATCGATCGCAC  1
32833                    (  370) TCGGATTGGCTCTGTC  1
45391                    (  321) CAGGATTGGCGCTCGC  1
50622                    (  319) CAGGATTGGCGCTCGC  1
20192                    (  254) TCGGCCTCGCGCTGAC  1
43350                    (  400) TCGACTTCGGTCTCCC  1
47972                    (  470) TCAAAATCGATCTCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 10185 bayes= 10.2774 E= 1.8e-001
  -982     -9   -982    152
   -28    172   -982   -982
  -128   -982    199   -982
   -28   -982    180   -982
   152     -9   -982   -982
    30   -109   -982    104
  -982   -982   -982    189
  -128    123     58   -982
  -982   -982    216   -982
    30    123   -101   -982
  -982   -982     58    130
  -982    208   -982   -982
  -982   -982     -1    152
  -982    172     -1   -982
    30     -9     58   -128
  -982    208   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 1.8e-001
 0.000000  0.222222  0.000000  0.777778
 0.222222  0.777778  0.000000  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.222222  0.000000  0.777778  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.333333  0.111111  0.000000  0.555556
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.555556  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.555556  0.111111  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.777778  0.222222  0.000000
 0.333333  0.222222  0.333333  0.111111
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[TC][CA]G[GA][AC][TA]T[CG]G[CA][TG]C[TG][CG][AGC]C
--------------------------------------------------------------------------------

Time 12.41 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42694                            7.31e-01  500
43023                            1.91e-02  159_[+2(1.50e-05)]_329
47268                            1.43e-02  488_[+2(7.70e-07)]
47271                            1.13e-01  72_[+2(1.65e-05)]_330_\
                                           [+2(9.75e-05)]_74
47845                            2.24e-02  78_[+2(7.95e-06)]_410
43350                            2.09e-08  399_[+3(4.08e-07)]_33_\
                                           [+1(4.49e-09)]_31
32833                            1.52e-06  22_[+2(5.71e-06)]_335_\
                                           [+3(7.28e-08)]_115
39918                            1.27e-06  115_[+1(3.13e-10)]_364
43869                            6.27e-02  295_[+2(1.79e-05)]_193
43990                            1.39e-06  85_[+2(1.92e-05)]_17_[+3(7.30e-05)]_\
                                           318_[+3(1.27e-08)]_36
50622                            1.08e-05  318_[+3(1.32e-07)]_17_\
                                           [+2(2.36e-06)]_137
6635                             5.29e-01  500
45391                            1.14e-08  216_[+1(1.38e-09)]_83_\
                                           [+3(1.32e-07)]_164
20192                            5.29e-03  253_[+3(4.08e-07)]_231
42003                            2.74e-02  192_[+2(4.31e-06)]_296
27696                            9.91e-03  163_[+2(3.74e-06)]_325
47546                            2.57e-03  249_[+2(4.23e-07)]_239
47570                            7.30e-16  88_[+1(1.96e-13)]_130_\
                                           [+3(4.66e-08)]_14_[+2(7.70e-07)]_219
47574                            7.30e-16  145_[+1(1.96e-13)]_130_\
                                           [+3(4.66e-08)]_14_[+2(7.70e-07)]_162
47972                            4.90e-04  422_[+2(3.57e-05)]_35_\
                                           [+3(8.08e-07)]_15
50241                            2.66e-07  341_[+1(1.00e-09)]_31_\
                                           [+2(9.05e-06)]_95
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
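
Note on reading the motif diagrams: each diagram alternates spacer lengths with bracketed motif occurrences, so spacer lengths plus motif widths should add up to the sequence length (500 for every sequence in this training set). The following is a minimal Python sketch, not part of the MEME output, that checks this. The names MOTIF_WIDTHS and diagram_length are hypothetical, and a diagram continued with a trailing backslash is assumed to have been joined into one string first.

```python
import re

# Motif widths taken from the three MOTIF header lines of this report.
MOTIF_WIDTHS = {1: 21, 2: 12, 3: 16}

# One bracketed occurrence, e.g. "[+1]" or "[+3(4.66e-08)]".
OCCURRENCE = re.compile(r"\[([+-])(\d+)(?:\([^)]*\))?\]")


def diagram_length(diagram, widths):
    """Sum spacer lengths and motif widths encoded in a block diagram."""
    total = 0
    for token in diagram.split("_"):
        hit = OCCURRENCE.fullmatch(token)
        if hit:
            total += widths[int(hit.group(2))]  # width of the named motif
        else:
            total += int(token)  # plain spacer length
    return total


# Combined diagram for sequence 47574, with the continuation line joined.
combined = "145_[+1(1.96e-13)]_130_[+3(4.66e-08)]_14_[+2(7.70e-07)]_162"
print(diagram_length(combined, MOTIF_WIDTHS))
print(diagram_length("488_[+2]", MOTIF_WIDTHS))
```

Both calls should give the sequence length of 500; a different total would suggest a transcription error in the diagram or a wrong motif width.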