********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/259/259.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
6580                     1.0000    500  46616                    1.0000    500
42100                    1.0000    500  32604                    1.0000    500
10011                    1.0000    500  49471                    1.0000    500
50113                    1.0000    500  34441                    1.0000    500
11192                    1.0000    500  45009                    1.0000    500
46158                    1.0000    500  46833                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/259/259.seqs.fa -oc motifs/259 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.248 G 0.236 T 0.245
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.248 G 0.236 T 0.245
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   8  llr = 91  E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::3:3::8::::
pos.-specific     C  11:6::::1::3
probability       G  :9:13::::1a1
matrix            T  9:835aa399:6

         bits    2.1      **   *
                 1.9      **   *
                 1.7      **   *
                 1.5 **   ** ***
Relative         1.3 ***  ** ***
Entropy          1.0 ***  ******
(16.4 bits)      0.8 ***  *******
                 0.6 **** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGTCTTTATTGT
consensus              ATA  T   C
sequence                 G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
45009                       396  4.86e-08 GATTCCGGTG TGTCTTTATTGT GTATTTCAGA
10011                        94  9.20e-07 ATCTTGTTCC TGTTTTTATTGC CGACTACGAC
42100                       329  3.31e-06 CGCGGAACGG TGTCTTTTCTGT CGTCGAACAC
46616                        94  3.43e-06 CGAACCGCAC TGACATTTTTGT TGATCCAACT
49471                       414  4.48e-06 AAGCGGCGCG TGTGTTTATGGT TAAAGAAAGC
46158                        91  5.08e-06 AGCTACCGTT TCTTGTTATTGT TATGACGAGA
50113                       404  5.52e-06 ACATACTCGC CGTCGTTATTGC CGTTGCTGTC
6580                          3  5.52e-06         TC TGACATTATTGG ATTGGTTGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45009                             4.9e-08  395_[+1]_93
10011                             9.2e-07  93_[+1]_395
42100                             3.3e-06  328_[+1]_160
46616                             3.4e-06  93_[+1]_395
49471                             4.5e-06  413_[+1]_75
46158                             5.1e-06  90_[+1]_398
50113                             5.5e-06  403_[+1]_85
6580                              5.5e-06  2_[+1]_486
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=8
45009                    (  396) TGTCTTTATTGT  1
10011                    (   94) TGTTTTTATTGC  1
42100                    (  329) TGTCTTTTCTGT  1
46616                    (   94) TGACATTTTTGT  1
49471                    (  414) TGTGTTTATGGT  1
46158                    (   91) TCTTGTTATTGT  1
50113                    (  404) CGTCGTTATTGC  1
6580                     (    3) TGACATTATTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.51668 E= 1.5e+002
  -965    -98   -965    183
  -965    -98    189   -965
   -12   -965   -965    161
  -965    133    -91      3
   -12   -965      9    103
  -965   -965   -965    203
  -965   -965   -965    203
   147   -965   -965      3
  -965    -98   -965    183
  -965   -965    -91    183
  -965   -965    208   -965
  -965      1    -91    135
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 8 E= 1.5e+002
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.125000  0.875000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.625000  0.125000  0.250000
 0.250000  0.000000  0.250000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.000000  0.125000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.125000  0.625000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
TG[TA][CT][TAG]TT[AT]TTG[TC]
--------------------------------------------------------------------------------

Time 1.33 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =   4  llr = 70  E-value = 4.8e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::::33::aa
pos.-specific     C  aa:3::38::35::::
probability       G  ::::a::::8:38a::
matrix            T  ::a8:a83a35:3:::

         bits    2.1 *** **  *    *
                 1.9 *** **  *    ***
                 1.7 *** **  *    ***
                 1.5 *** **  *    ***
Relative         1.3 **********  ****
Entropy          1.0 **********  ****
(25.2 bits)      0.8 **********  ****
                 0.6 **********  ****
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CCTTGTTCTGTCGGAA
consensus               C  CT TAAT
sequence                       CG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
10011                       465  2.49e-09 ATTTTTGTCG CCTTGTTCTGAGGGAA AGACGTTCAA
46158                       338  8.60e-09 TGGTCGTCTA CCTCGTCCTGTCGGAA ACAACCTTGG
46616                       406  8.60e-09 TGGTGGTTGC CCTTGTTCTTTCTGAA TGTAAGAAGC
11192                       252  1.16e-08 AGAATTCTTA CCTTGTTTTGCAGGAA TTGGGGCAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10011                             2.5e-09  464_[+2]_20
46158                             8.6e-09  337_[+2]_147
46616                             8.6e-09  405_[+2]_79
11192                             1.2e-08  251_[+2]_233
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=4
10011                    (  465) CCTTGTTCTGAGGGAA  1
46158                    (  338) CCTCGTCCTGTCGGAA  1
46616                    (  406) CCTTGTTCTTTCTGAA  1
11192                    (  252) CCTTGTTTTGCAGGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 10.5058 E= 4.8e+002
  -865    201   -865   -865
  -865    201   -865   -865
  -865   -865   -865    202
  -865      1   -865    161
  -865   -865    208   -865
  -865   -865   -865    202
  -865      1   -865    161
  -865    160   -865      3
  -865   -865   -865    202
  -865   -865    167      3
   -12      1   -865    103
   -12    101      8   -865
  -865   -865    167      3
  -865   -865    208   -865
   188   -865   -865   -865
   188   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 4 E= 4.8e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.250000  0.000000  0.500000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
CCT[TC]GT[TC][CT]T[GT][TAC][CAG][GT]GAA
--------------------------------------------------------------------------------

Time 2.63 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   6  llr = 88  E-value = 4.0e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::22:::::::::::
pos.-specific     C  8::::::2:8:77a78
probability       G  :a83:7:38:2:2:32
matrix            T  2:2583a522832:::

         bits    2.1  *    *      *
                 1.9  *    *      *
                 1.7  *    *      *
                 1.5 ***   * ***  * *
Relative         1.3 *** *** ***  * *
Entropy          1.0 *** *** **** ***
(21.1 bits)      0.8 *** *** ********
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CGGTTGTTGCTCCCCC
consensus               G T G   T  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
45009                       269  4.62e-10 CGAGCAGTGG CGGTTGTGGCTCCCCC ACCGTACCAA
46616                       482  1.61e-08 GCACGAAAAA CGGTTGTTTCTTCCCC AAT
50113                       446  7.80e-08 CTGCTGTATT CGTTTGTTGCTCTCGC TGTGACCGTC
32604                       124  2.48e-07 TATTATAAGT CGGGAGTCGTTCCCCC CTTTTTCGAC
11192                       286  3.20e-07 GCTCTAGCTA TGGGTTTGGCTCGCGC CGGTGCGAGC
34441                       201  5.72e-07 ATAGTGCCCT CGGATTTTGCGTCCCG GTATGCGTCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45009                             4.6e-10  268_[+3]_216
46616                             1.6e-08  481_[+3]_3
50113                             7.8e-08  445_[+3]_39
32604                             2.5e-07  123_[+3]_361
11192                             3.2e-07  285_[+3]_199
34441                             5.7e-07  200_[+3]_284
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=6
45009                    (  269) CGGTTGTGGCTCCCCC  1
46616                    (  482) CGGTTGTTTCTTCCCC  1
50113                    (  446) CGTTTGTTGCTCTCGC  1
32604                    (  124) CGGGAGTCGTTCCCCC  1
11192                    (  286) TGGGTTTGGCTCGCGC  1
34441                    (  201) CGGATTTTGCGTCCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 10.3682 E= 4.0e+002
  -923    175   -923    -56
  -923   -923    208   -923
  -923   -923    182    -56
   -70   -923     50    103
   -70   -923   -923    176
  -923   -923    150     44
  -923   -923   -923    203
  -923    -57     50    103
  -923   -923    182    -56
  -923    175   -923    -56
  -923   -923    -50    176
  -923    143   -923     44
  -923    143    -50    -56
  -923    201   -923   -923
  -923    143     50   -923
  -923    175    -50   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 4.0e+002
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.166667  0.000000  0.333333  0.500000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  0.666667  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.333333  0.500000
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.666667  0.166667  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.833333  0.166667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
CGG[TG]T[GT]T[TG]GCT[CT]CC[CG]C
--------------------------------------------------------------------------------

Time 3.89 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6580                             3.50e-02  2_[+1(5.52e-06)]_486
46616                            2.72e-11  93_[+1(3.43e-06)]_300_\
    [+2(8.60e-09)]_60_[+3(1.61e-08)]_3
42100                            5.09e-03  328_[+1(3.31e-06)]_160
32604                            1.24e-03  123_[+3(2.48e-07)]_336_\
    [+3(2.39e-05)]_9
10011                            3.84e-08  93_[+1(9.20e-07)]_359_\
    [+2(2.49e-09)]_20
49471                            1.11e-02  413_[+1(4.48e-06)]_75
50113                            4.59e-06  403_[+1(5.52e-06)]_30_\
    [+3(7.80e-08)]_39
34441                            2.48e-03  200_[+3(5.72e-07)]_284
11192                            1.90e-07  251_[+2(1.16e-08)]_18_\
    [+3(3.20e-07)]_199
45009                            1.76e-09  268_[+3(4.62e-10)]_111_\
    [+1(4.86e-08)]_93
46158                            3.57e-07  90_[+1(5.08e-06)]_235_\
    [+2(8.60e-09)]_147
46833                            2.23e-02  31_[+3(6.26e-05)]_341_\
    [+1(7.06e-05)]_100
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************