********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/187/187.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
28533                    1.0000    500  47407                    1.0000    500
21821                    1.0000    500  48623                    1.0000    500
15349                    1.0000    500  49012                    1.0000    500
50240                    1.0000    500  45230                    1.0000    500
12612                    1.0000    500  47531                    1.0000    500
36098                    1.0000    500  44213                    1.0000    500
49931                    1.0000    500  48992                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/187/187.seqs.fa -oc motifs/187 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.239 G 0.218 T 0.275
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.239 G 0.218 T 0.275
********************************************************************************


********************************************************************************
MOTIF  1 MEME  width =  18  sites =   5  llr = 83  E-value = 3.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :2:8:4:4:6:4:2:62:
pos.-specific     C  :::2:2:4::a::22:4a
probability       G  a86:a4a2a4:4a48:2:
matrix            T  ::4::::::::2:2:42:

         bits    2.2 *   * * *   *
                 2.0 *   * * * * *    *
                 1.8 *   * * * * *    *
                 1.5 *   * * * * * *  *
Relative         1.3 ** ** * * * * *  *
Entropy          1.1 ***** * *** * *  *
(24.0 bits)      0.9 ***** * *** * ** *
                 0.7 ***** * *** * ** *
                 0.4 ************* ** *
                 0.2 **************** *
                 0.0 ------------------

Multilevel           GGGAGAGAGACAGGGACC
consensus             ATC G C G G ACTA
sequence                  C G   T C  G
                                  T  T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
21821                       425  2.65e-10 TTCCAGCGAA GGGAGCGCGACAGGGACC TCACGGCTCG
49931                        42  4.62e-09 GCCATGGAGG GGTAGGGCGACGGCGATC AACAATACCA
12612                        40  1.07e-08 TGATGTTGAC GGGAGGGGGGCGGGCTCC GTCCGTGTTC
49012                       181  5.10e-08 ACAGGATGCA GGTCGAGAGGCAGAGAAC GAACTGTCAC
47531                       339  7.27e-08 TTTATACAGA GAGAGAGAGACTGTGTGC TCGTGTGTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21821                             2.6e-10  424_[+1]_58
49931                             4.6e-09  41_[+1]_441
12612                             1.1e-08  39_[+1]_443
49012                             5.1e-08  180_[+1]_302
47531                             7.3e-08  338_[+1]_144
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=5
21821                    (  425) GGGAGCGCGACAGGGACC  1
49931                    (   42) GGTAGGGCGACGGCGATC  1
12612                    (   40) GGGAGGGGGGCGGGCTCC  1
49012                    (  181) GGTCGAGAGGCAGAGAAC  1
47531                    (  339) GAGAGAGAGACTGTGTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 6762 bayes= 11.3442 E= 3.0e+002
  -897   -897    219   -897
   -42   -897    187   -897
  -897   -897    146     54
   157    -25   -897   -897
  -897   -897    219   -897
    57    -25     87   -897
  -897   -897    219   -897
    57     74    -13   -897
  -897   -897    219   -897
   116   -897     87   -897
  -897    206   -897   -897
    57   -897     87    -46
  -897   -897    219   -897
   -42    -25     87    -46
  -897    -25    187   -897
   116   -897   -897     54
   -42     74    -13    -46
  -897    206   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 5 E= 3.0e+002
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.000000  0.600000  0.400000
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.200000  0.400000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.400000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.000000  0.400000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.200000  0.400000  0.200000
 0.000000  0.200000  0.800000  0.000000
 0.600000  0.000000  0.000000  0.400000
 0.200000  0.400000  0.200000  0.200000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[GA][GT][AC]G[AGC]G[ACG]G[AG]C[AGT]G[GACT][GC][AT][CAGT]C
--------------------------------------------------------------------------------




Time  1.71 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME  width =  16  sites =   6  llr = 91  E-value = 3.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::288:57373:::::
pos.-specific     C  :a:2:73:::788a5a
probability       G  ::8::32273:22:5:
matrix            T  a:::2::2::::::::

         bits    2.2
                 2.0  *           * *
                 1.8 **           * *
                 1.5 ***        *** *
Relative         1.3 *****      *** *
Entropy          1.1 ******  ********
(21.9 bits)      0.9 ******  ********
                 0.7 ****** *********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCGAACAAGACCCCCC
consensus                 GC AGA   G
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
28533                       406  7.59e-09 ATAGTACAGC TCGAACAAGACCGCGC CGCAGCGAAG
45230                       427  1.43e-08 AGAACTAAAG TCGAAGCAAACCCCCC GAGTAGAGTC
15349                       192  1.77e-08 ACGTAAAACT TCGAAGCGGACCCCGC TGTACGCAGC
47407                       354  2.37e-08 CTTCAAGTCT TCGAACGAGGACCCCC TTTGATGACT
49931                       275  2.89e-07 TAGTTTAGAG TCGCTCAAAAACCCGC GGGGTCGACT
12612                       482  3.68e-07 GAGCAACGGA TCAAACATGGCGCCCC ACG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
28533                             7.6e-09  405_[+2]_79
45230                             1.4e-08  426_[+2]_58
15349                             1.8e-08  191_[+2]_293
47407                             2.4e-08  353_[+2]_131
49931                             2.9e-07  274_[+2]_210
12612                             3.7e-07  481_[+2]_3
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=6
28533                    (  406) TCGAACAAGACCGCGC  1
45230                    (  427) TCGAAGCAAACCCCCC  1
15349                    (  192) TCGAAGCGGACCCCGC  1
47407                    (  354) TCGAACGAGGACCCCC  1
49931                    (  275) TCGCTCAAAAACCCGC  1
12612                    (  482) TCAAACATGGCGCCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 10.5908 E= 3.8e+002
  -923   -923   -923    186
  -923    206   -923   -923
   -69   -923    193   -923
   163    -52   -923   -923
   163   -923   -923    -72
  -923    148     61   -923
    90     48    -39   -923
   131   -923    -39    -72
    31   -923    161   -923
   131   -923     61   -923
    31    148   -923   -923
  -923    180    -39   -923
  -923    180    -39   -923
  -923    206   -923   -923
  -923    107    119   -923
  -923    206   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 3.8e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.666667  0.333333  0.000000
 0.500000  0.333333  0.166667  0.000000
 0.666667  0.000000  0.166667  0.166667
 0.333333  0.000000  0.666667  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
TCGAA[CG][AC]A[GA][AG][CA]CCC[CG]C
--------------------------------------------------------------------------------




Time  3.39 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME  width =  19  sites =   5  llr = 86  E-value = 1.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::4a::264::::2:88
pos.-specific     C  :::a:::84228222::::
probability       G  aaa:6:a2222:8:2282:
matrix            T  ::::::::2:22:8662:2

         bits    2.2 ***   *
                 2.0 **** **
                 1.8 **** **
                 1.5 **** **     *
Relative         1.3 **** ***   **   **
Entropy          1.1 ********   ***  ***
(24.9 bits)      0.9 ********   ***  ***
                 0.7 ******** * ********
                 0.4 ******** * ********
                 0.2 ******** * ********
                 0.0 -------------------

Multilevel           GGGCGAGCCAACGTTTGAA
consensus                A  GACCTCCCATGT
sequence                     GGG   GG
                             T T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
44213                       442  1.12e-09 CAAGGCGGAA GGGCAAGCCAACGCCTGAA GTAGGAGAGT
47531                       429  1.12e-09 CCGGTCAACT GGGCGAGCCGCCGTTTGAT TGCGAATATC
28533                       213  3.00e-09 TTCTGCTTTC GGGCGAGCTAACGTTATAA AGGAGTCTAC
12612                       315  2.11e-08 GAGAAAGACG GGGCGAGCAATTCTTGGAA AGAAGCCCTG
48623                        36  3.58e-08 TGTGGATCTT GGGCAAGGGCGCGTGTGGA TTTAGCTAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44213                             1.1e-09  441_[+3]_40
47531                             1.1e-09  428_[+3]_53
28533                               3e-09  212_[+3]_269
12612                             2.1e-08  314_[+3]_167
48623                             3.6e-08  35_[+3]_446
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=5
44213                    (  442) GGGCAAGCCAACGCCTGAA  1
47531                    (  429) GGGCGAGCCGCCGTTTGAT  1
28533                    (  213) GGGCGAGCTAACGTTATAA  1
12612                    (  315) GGGCGAGCAATTCTTGGAA  1
48623                    (   36) GGGCAAGGGCGCGTGTGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 6748 bayes= 10.649 E= 1.6e+002
  -897   -897    219   -897
  -897   -897    219   -897
  -897   -897    219   -897
  -897    206   -897   -897
    57   -897    146   -897
   190   -897   -897   -897
  -897   -897    219   -897
  -897    174    -13   -897
   -42     74    -13    -46
   116    -25    -13   -897
    57    -25    -13    -46
  -897    174   -897    -46
  -897    -25    187   -897
  -897    -25   -897    154
  -897    -25    -13    113
   -42   -897    -13    113
  -897   -897    187    -46
   157   -897    -13   -897
   157   -897   -897    -46
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 5 E= 1.6e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.200000  0.400000  0.200000  0.200000
 0.600000  0.200000  0.200000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.200000  0.200000  0.600000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.000000  0.800000  0.200000
 0.800000  0.000000  0.200000  0.000000
 0.800000  0.000000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GGGC[GA]AG[CG][CAGT][ACG][ACGT][CT][GC][TC][TCG][TAG][GT][AG][AT]
--------------------------------------------------------------------------------




Time  5.00 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
28533                            4.67e-10  212_[+3(3.00e-09)]_174_\
    [+2(7.59e-09)]_79
47407                            4.20e-05  353_[+2(2.37e-08)]_131
21821                            8.24e-06  424_[+1(2.65e-10)]_58
48623                            8.70e-04  35_[+3(3.58e-08)]_446
15349                            1.35e-04  191_[+2(1.77e-08)]_293
49012                            1.21e-03  180_[+1(5.10e-08)]_302
50240                            8.28e-01  500
45230                            2.50e-04  426_[+2(1.43e-08)]_58
12612                            5.20e-12  39_[+1(1.07e-08)]_257_\
    [+3(2.11e-08)]_148_[+2(3.68e-07)]_3
47531                            4.73e-09  338_[+1(7.27e-08)]_72_\
    [+3(1.12e-09)]_53
36098                            8.19e-01  500
44213                            3.46e-05  441_[+3(1.12e-09)]_40
49931                            6.35e-08  41_[+1(4.62e-09)]_215_\
    [+2(2.89e-07)]_210
48992                            4.00e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************