********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/458/458.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
50052                    1.0000    500  50623                    1.0000    500
5537                     1.0000    500  45207                    1.0000    500
54493                    1.0000    500  38712                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/458/458.seqs.fa -oc motifs/458 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        6    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3000    N=               6
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.259 C 0.248 G 0.205 T 0.288
Background letter frequencies (from dataset with add-one prior applied):
A 0.259 C 0.248 G 0.205 T 0.288
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 21 sites = 6 llr = 96 E-value = 1.1e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
-------------------------------------------------------------------------------- Simplified A :3::::::2:::5::75222: pos.-specific C 82a28:28852:::5::7:3: probability G 2::3273:::52:a532235a matrix T :5:5:352:5385:::3:5:: bits 2.3 * * 2.1 * * * 1.8 * * * 1.6 * * * Relative 1.4 * * * ** * * Entropy 1.1 * * ** ** * *** * (23.2 bits) 0.9 * * ** *** ***** * 0.7 * * ** ********* * ** 0.5 ********************* 0.2 ********************* 0.0 --------------------- Multilevel CTCTCGTCCCGTAGCAACTGG consensus A G TG TT T GGT GC sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 54493 60 4.07e-10 ATCCATTTCC CACTCGTCCTTTTGGATCGGG CTAGATGCTA 50052 344 6.72e-10 CTGATTCATT CCCTCGGCCTGTTGCAACGCG AAGCAACCGG 45207 182 2.17e-09 ATTTTGACAC CACGCTTCCCGTAGGGGCTGG CATTTTTTGT 38712 37 9.90e-08 ACATACACAA CTCTGGTTCCGTTGCATGAGG CGCATAAGAC 5537 205 1.40e-07 CAAGCCGTGC GTCGCGGCATCGAGCAACTCG GGCTCACCAC 50623 253 1.81e-07 ACTCGGGGAT CTCCCTCCCCTTAGGGAATAG TGCATTAGCT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 54493 4.1e-10 59_[+1]_420 50052 6.7e-10 343_[+1]_136 45207 2.2e-09 181_[+1]_298 38712 9.9e-08 36_[+1]_443 5537 1.4e-07 204_[+1]_275 50623 1.8e-07 252_[+1]_227 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format 
-------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=6 54493 ( 60) CACTCGTCCTTTTGGATCGGG 1 50052 ( 344) CCCTCGGCCTGTTGCAACGCG 1 45207 ( 182) CACGCTTCCCGTAGGGGCTGG 1 38712 ( 37) CTCTGGTTCCGTTGCATGAGG 1 5537 ( 205) GTCGCGGCATCGAGCAACTCG 1 50623 ( 253) CTCCCTCCCCTTAGGGAATAG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 2880 bayes= 9.35214 E= 1.1e+002 -923 175 -30 -923 37 -57 -923 79 -923 201 -923 -923 -923 -57 70 79 -923 175 -30 -923 -923 -923 170 21 -923 -57 70 79 -923 175 -923 -79 -63 175 -923 -923 -923 101 -923 79 -923 -57 128 21 -923 -923 -30 153 95 -923 -923 79 -923 -923 228 -923 -923 101 128 -923 136 -923 70 -923 95 -923 -30 21 -63 143 -30 -923 -63 -923 70 79 -63 43 128 -923 -923 -923 228 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 1.1e+002 0.000000 0.833333 0.166667 0.000000 0.333333 0.166667 0.000000 0.500000 0.000000 1.000000 0.000000 0.000000 0.000000 0.166667 0.333333 0.500000 0.000000 0.833333 0.166667 0.000000 0.000000 0.000000 0.666667 0.333333 0.000000 0.166667 0.333333 0.500000 0.000000 0.833333 0.000000 0.166667 0.166667 0.833333 0.000000 0.000000 0.000000 0.500000 0.000000 0.500000 0.000000 0.166667 0.500000 0.333333 0.000000 0.000000 0.166667 0.833333 0.500000 0.000000 0.000000 0.500000 0.000000 0.000000 1.000000 0.000000 0.000000 0.500000 0.500000 0.000000 0.666667 0.000000 0.333333 0.000000 0.500000 0.000000 0.166667 
0.333333 0.166667 0.666667 0.166667 0.000000 0.166667 0.000000 0.333333 0.500000 0.166667 0.333333 0.500000 0.000000 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- C[TA]C[TG]C[GT][TG]CC[CT][GT]T[AT]G[CG][AG][AT]C[TG][GC]G -------------------------------------------------------------------------------- Time 0.38 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 20 sites = 6 llr = 92 E-value = 1.0e+003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A 8:2:5832:73:::2a7::: pos.-specific C :888::3273:8:3::2::: probability G :2::32:72:5::5:::372 matrix T 2::22:3:2:22a28:2738 bits 2.3 2.1 * 1.8 * * 1.6 * * Relative 1.4 **** * ** * Entropy 1.1 **** * * ** ** *** (22.1 bits) 0.9 **** * * * ** ** *** 0.7 ****** ************* 0.5 ****** ************* 0.2 ******************** 0.0 -------------------- Multilevel ACCCAAAGCAGCTGTAATGT consensus G C CA C GT sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 38712 456 1.45e-10 GTTTACTCAG ACCCAACGCAGCTCTAATTT TTGAGTTTTG 5537 251 1.09e-08 CGGCACTTTC ACCCGACGCCGCTCTACGGG ACGACCCGTA 50052 387 4.16e-08 
AGAAAGTCCT ACCTAATGTAACTTTAATGT AAATAATAGC 50623 309 1.11e-07 TACGTACCTG ACCCTAACCCTCTGTATTGT CACCCGGATA 45207 72 1.76e-07 TGATGACCTT TCCCGGAGCAATTGTAAGTT GCTCCGTGGC 54493 178 2.26e-07 CATTGGCTAT AGACAATAGAGCTGAAATGT AAGCCTAACA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 38712 1.5e-10 455_[+2]_25 5537 1.1e-08 250_[+2]_230 50052 4.2e-08 386_[+2]_94 50623 1.1e-07 308_[+2]_172 45207 1.8e-07 71_[+2]_409 54493 2.3e-07 177_[+2]_303 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=20 seqs=6 38712 ( 456) ACCCAACGCAGCTCTAATTT 1 5537 ( 251) ACCCGACGCCGCTCTACGGG 1 50052 ( 387) ACCTAATGTAACTTTAATGT 1 50623 ( 309) ACCCTAACCCTCTGTATTGT 1 45207 ( 72) TCCCGGAGCAATTGTAAGTT 1 54493 ( 178) AGACAATAGAGCTGAAATGT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 2886 bayes= 8.90689 E= 1.0e+003 169 -923 -923 -79 -923 175 -30 -923 -63 175 -923 -923 -923 175 -923 -79 95 -923 70 -79 169 -923 -30 -923 37 43 -923 21 -63 -57 170 -923 -923 143 -30 -79 136 43 -923 -923 37 -923 128 -79 -923 175 -923 -79 -923 -923 -923 179 -923 43 128 -79 -63 -923 -923 153 195 -923 -923 -923 136 -57 -923 -79 -923 -923 70 121 -923 -923 170 21 -923 -923 -30 153 
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 10.0e+002 0.833333 0.000000 0.000000 0.166667 0.000000 0.833333 0.166667 0.000000 0.166667 0.833333 0.000000 0.000000 0.000000 0.833333 0.000000 0.166667 0.500000 0.000000 0.333333 0.166667 0.833333 0.000000 0.166667 0.000000 0.333333 0.333333 0.000000 0.333333 0.166667 0.166667 0.666667 0.000000 0.000000 0.666667 0.166667 0.166667 0.666667 0.333333 0.000000 0.000000 0.333333 0.000000 0.500000 0.166667 0.000000 0.833333 0.000000 0.166667 0.000000 0.000000 0.000000 1.000000 0.000000 0.333333 0.500000 0.166667 0.166667 0.000000 0.000000 0.833333 1.000000 0.000000 0.000000 0.000000 0.666667 0.166667 0.000000 0.166667 0.000000 0.000000 0.333333 0.666667 0.000000 0.000000 0.666667 0.333333 0.000000 0.000000 0.166667 0.833333 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- ACCC[AG]A[ACT]GC[AC][GA]CT[GC]TAA[TG][GT]T -------------------------------------------------------------------------------- Time 0.73 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 12 sites = 6 llr = 67 E-value = 1.4e+003 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::3:a:::::28 pos.-specific C :::a:57:72:: probability G ::5::22::872 matrix T aa2::32a3:2: bits 2.3 2.1 ** 1.8 ** ** * 1.6 ** ** * * Relative 1.4 ** ** * * * Entropy 1.1 ** ** * * * (16.1 bits) 0.9 ** ** ***** 0.7 ***** ****** 0.5 ************ 0.2 ************ 0.0 ------------ Multilevel TTGCACCTCGGA consensus A T T sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 50052 39 2.83e-07 CTGTACGATG TTGCAGCTCGGA CCTTTTTCTC 50623 128 1.14e-06 GAGAAAATGT TTACACGTCGGA TAGTGATCCC 5537 465 1.27e-06 CGATATTCGC TTGCATCTCGGG ATTGCGGGCG 54493 10 2.28e-06 AATATTTTC TTACACCTCGTA TTGACTATGA 45207 487 1.36e-05 TTAAATAGAT TTGCACCTTCAA GC 38712 297 1.45e-05 ACACAAAAGG TTTCATTTTGGA ATCTATATTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 50052 2.8e-07 38_[+3]_450 50623 1.1e-06 127_[+3]_361 5537 1.3e-06 464_[+3]_24 54493 2.3e-06 9_[+3]_479 45207 1.4e-05 486_[+3]_2 38712 1.4e-05 
296_[+3]_192 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=12 seqs=6 50052 ( 39) TTGCAGCTCGGA 1 50623 ( 128) TTACACGTCGGA 1 5537 ( 465) TTGCATCTCGGG 1 54493 ( 10) TTACACCTCGTA 1 45207 ( 487) TTGCACCTTCAA 1 38712 ( 297) TTTCATTTTGGA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 12 n= 2934 bayes= 9.37898 E= 1.4e+003 -923 -923 -923 179 -923 -923 -923 179 37 -923 128 -79 -923 201 -923 -923 195 -923 -923 -923 -923 101 -30 21 -923 143 -30 -79 -923 -923 -923 179 -923 143 -923 21 -923 -57 202 -923 -63 -923 170 -79 169 -923 -30 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 1.4e+003 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.333333 0.000000 0.500000 0.166667 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.500000 0.166667 0.333333 0.000000 0.666667 0.166667 0.166667 0.000000 0.000000 0.000000 1.000000 0.000000 0.666667 0.000000 0.333333 0.000000 0.166667 0.833333 0.000000 0.166667 0.000000 0.666667 0.166667 0.833333 0.000000 0.166667 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 
regular expression
--------------------------------------------------------------------------------
TT[GA]CA[CT]CT[CT]GGA
--------------------------------------------------------------------------------

Time 1.13 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50052                            5.68e-13  38_[+3(2.83e-07)]_293_\
    [+1(6.72e-10)]_22_[+2(4.16e-08)]_94
50623                            9.85e-10  127_[+3(1.14e-06)]_113_\
    [+1(1.81e-07)]_35_[+2(1.11e-07)]_172
5537                             9.90e-11  204_[+1(1.40e-07)]_25_\
    [+2(1.09e-08)]_194_[+3(1.27e-06)]_24
45207                            2.49e-10  71_[+2(1.76e-07)]_90_[+1(2.17e-09)]_\
    284_[+3(1.36e-05)]_2
54493                            1.24e-11  9_[+3(2.28e-06)]_38_[+1(4.07e-10)]_\
    97_[+2(2.26e-07)]_303
38712                            1.23e-11  36_[+1(9.90e-08)]_239_\
    [+3(1.45e-05)]_147_[+2(1.45e-10)]_25
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************