********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/456/456.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
6300                     1.0000    500  37231                    1.0000    500
4339                     1.0000    500  44864                    1.0000    500
45494                    1.0000    500  45633                    1.0000    500
48377                    1.0000    500  31912                    1.0000    500
39244                    1.0000    500  39390                    1.0000    500
35714                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/456/456.seqs.fa -oc motifs/456 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.283 C 0.224 G 0.226 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.283 C 0.224 G 0.226 T 0.267
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =   6  llr = 89  E-value = 2.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::55332322:3::a
pos.-specific     C  a:a::::827:8::2:
probability       G  :::5377:5:8::88:
matrix            T  :a::2::::2:272::

         bits    2.2 * *
                 1.9 ***
                 1.7 ***            *
                 1.5 ***    *  ** ***
Relative         1.3 ***    *  ** ***
Entropy          1.1 **** ***  ** ***
(21.4 bits)      0.9 **** *** *******
                 0.6 **** ***********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CTCAAGGCGCGCTGGA
consensus               GGAA A   A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
39390                       315  2.23e-09 GATCAAGCGA CTCAAGACGCGCTGGA AACCTATCAG
45494                       331  8.05e-09 CGCGTCTTGT CTCGTAGCGCGCTGGA CCATGGCGAA
44864                       166  4.15e-08 CTCTTTTTAC CTCGAGACACGCTGCA AATCCTAGTT
45633                       476  1.81e-07 CGGTCTGTAT CTCGGGGAGCGTAGGA GACATCGGG
6300                        398  2.55e-07 ATTCGAGCAA CTCAAGGCATACAGGA AAGAGCACGG
39244                       432  4.59e-07 GCCTGCCAAC CTCAGAGCCAGCTTGA TACTGTATGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39390                             2.2e-09  314_[+1]_170
45494                               8e-09  330_[+1]_154
44864                             4.1e-08  165_[+1]_319
45633                             1.8e-07  475_[+1]_9
6300                              2.5e-07  397_[+1]_87
39244                             4.6e-07  431_[+1]_53
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=6
39390                    (  315) CTCAAGACGCGCTGGA  1
45494                    (  331) CTCGTAGCGCGCTGGA  1
44864                    (  166) CTCGAGACACGCTGCA  1
45633                    (  476) CTCGGGGAGCGTAGGA  1
6300                     (  398) CTCAAGGCATACAGGA  1
39244                    (  432) CTCAGAGCCAGCTTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5335 bayes= 10.2426 E= 2.5e+002
  -923   216  -923  -923
  -923  -923  -923   191
  -923   216  -923  -923
    82  -923   114  -923
    82  -923    56   -68
    23  -923   156  -923
    23  -923   156  -923
   -76   189  -923  -923
    23   -42   114  -923
   -76   157  -923   -68
   -76  -923   188  -923
  -923   189  -923   -68
    23  -923  -923   132
  -923  -923   188   -68
  -923   -42   188  -923
   182  -923  -923  -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 2.5e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.500000  0.000000  0.333333  0.166667
 0.333333  0.000000  0.666667  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.333333  0.166667  0.500000  0.000000
 0.166667  0.666667  0.000000  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.333333  0.000000  0.000000  0.666667
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.166667  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CTC[AG][AG][GA][GA]C[GA]CGC[TA]GGA
--------------------------------------------------------------------------------


Time  1.19 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  19  sites =   8  llr = 114  E-value = 4.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  5a3:a9535398:1935::
pos.-specific     C  :::6:145:::119:::59
probability       G  5:13::1156119:1811:
matrix            T  ::61:::1:1::::::441

         bits    2.2
                 1.9
                 1.7  *  *
                 1.5  *  *       **    *
Relative         1.3  *  **    * ****  *
Entropy          1.1 **  **  * * ****  *
(20.6 bits)      0.9 ** ***  * ******  *
                 0.6 ******* ******** **
                 0.4 ******* ***********
                 0.2 *******************
                 0.0 -------------------

Multilevel           AATCAAACAGAAGCAGACC
consensus            G AG  CAGA     ATT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
6300                         83  6.14e-09 GCTTCGTATG GATCAAGCAGAAGCAGTGC ATGAACGACT
45494                       381  1.28e-08 CTTCCAATCG AATCAAACGGAAGCAGGCT GTTTCAGGCC
39390                       394  2.21e-08 GAATCAATTC GAACAACAAGACGCAGACC TCATTCAACT
48377                        24  9.33e-08 TTCCTTCGGG AATCAACCGAGAGCAAATC AGAGGGAAAC
45633                       165  2.11e-07 TACAGCGCAA GATCACACGTAACCAGATC AACCTTCCTG
39244                       145  2.64e-07 GTCTAAAGAG AAGGAAAAAGAGGCAGTTC CTAAGTCATC
44864                       307  4.06e-07 GACGAGACCG GAAGAAATGGAAGCGAACC CAAAACACTA
35714                       341  7.27e-07 ACGATAGCTT AATTAACGAAAAGAAGTCC ACTGGGGCAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6300                              6.1e-09  82_[+2]_399
45494                             1.3e-08  380_[+2]_101
39390                             2.2e-08  393_[+2]_88
48377                             9.3e-08  23_[+2]_458
45633                             2.1e-07  164_[+2]_317
39244                             2.6e-07  144_[+2]_337
44864                             4.1e-07  306_[+2]_175
35714                             7.3e-07  340_[+2]_141
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=8
6300                     (   83) GATCAAGCAGAAGCAGTGC  1
45494                    (  381) AATCAAACGGAAGCAGGCT  1
39390                    (  394) GAACAACAAGACGCAGACC  1
48377                    (   24) AATCAACCGAGAGCAAATC  1
45633                    (  165) GATCACACGTAACCAGATC  1
39244                    (  145) AAGGAAAAAGAGGCAGTTC  1
44864                    (  307) GAAGAAATGGAAGCGAACC  1
35714                    (  341) AATTAACGAAAAGAAGTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 5302 bayes= 9.37014 E= 4.5e+002
    82  -965   114  -965
   182  -965  -965  -965
   -18  -965   -86   123
  -965   148    14  -109
   182  -965  -965  -965
   163   -84  -965  -965
    82    74   -86  -965
   -18   116   -86  -109
    82  -965   114  -965
   -18  -965   146  -109
   163  -965   -86  -965
   140   -84   -86  -965
  -965   -84   195  -965
  -118   197  -965  -965
   163  -965   -86  -965
   -18  -965   173  -965
    82  -965   -86    49
  -965   116   -86    49
  -965   197  -965  -109
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 8 E= 4.5e+002
 0.500000  0.000000  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.125000  0.625000
 0.000000  0.625000  0.250000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.500000  0.375000  0.125000  0.000000
 0.250000  0.500000  0.125000  0.125000
 0.500000  0.000000  0.500000  0.000000
 0.250000  0.000000  0.625000  0.125000
 0.875000  0.000000  0.125000  0.000000
 0.750000  0.125000  0.125000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.500000  0.000000  0.125000  0.375000
 0.000000  0.500000  0.125000  0.375000
 0.000000  0.875000  0.000000  0.125000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[AG]A[TA][CG]AA[AC][CA][AG][GA]AAGCA[GA][AT][CT]C
--------------------------------------------------------------------------------


Time  2.36 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  19  sites =   5  llr = 85  E-value = 6.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::82::6:6468:4:22
pos.-specific     C  28282:a::a:24:a:a4:
probability       G  82:2:2:a4:42:2:2:48
matrix            T  ::8::6:::::2:::4:::

         bits    2.2       ** *    * *
                 1.9       ** *    * *
                 1.7       ** *    * *
                 1.5 ** *  ** *    * *
Relative         1.3 ****  ** *    * * *
Entropy          1.1 ***** ***** *** * *
(24.5 bits)      0.9 ***** ***** *** * *
                 0.6 *********** *** ***
                 0.4 *********** *******
                 0.2 *********** *******
                 0.0 -------------------

Multilevel           GCTCATCGACAAAACACCG
consensus            CGCGCA  G GCCG T GA
sequence                  G     G   G A
                                T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            -------------------
39390                       185  1.23e-09 TCATTCAGTC GCTCATCGACACAGCACGG ATTAGAGTTT
6300                        131  1.93e-09 TCGGTACGTA GCTCAACGACATCACTCCG AGTTGTCCCT
4339                        190  3.08e-09 AAGAACTTGA GCCCATCGGCGGCACACGG TAATCCCGGT
35714                       141  1.72e-08 ACAGTGTATT GCTGATCGGCAAAACGCCA TGAGTGGCTG
31912                       266  8.15e-08 TATATGCGCA CGTCCGCGACGAAACTCAG CTCCAAGGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39390                             1.2e-09  184_[+3]_297
6300                              1.9e-09  130_[+3]_351
4339                              3.1e-09  189_[+3]_292
35714                             1.7e-08  140_[+3]_341
31912                             8.2e-08  265_[+3]_216
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=5
39390                    (  185) GCTCATCGACACAGCACGG  1
6300                     (  131) GCTCAACGACATCACTCCG  1
4339                     (  190) GCCCATCGGCGGCACACGG  1
35714                    (  141) GCTGATCGGCAAAACGCCA  1
31912                    (  266) CGTCCGCGACGAAACTCAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 5302 bayes= 10.3008 E= 6.2e+002
  -897   -16   182  -897
  -897   184   -18  -897
  -897   -16  -897   158
  -897   184   -18  -897
   150   -16  -897  -897
   -50  -897   -18   117
  -897   216  -897  -897
  -897  -897   214  -897
   108  -897    82  -897
  -897   216  -897  -897
   108  -897    82  -897
    50   -16   -18   -41
   108    84  -897  -897
   150  -897   -18  -897
  -897   216  -897  -897
    50  -897   -18    58
  -897   216  -897  -897
   -50    84    82  -897
   -50  -897   182  -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 5 E= 6.2e+002
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.800000  0.200000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.200000  0.000000  0.200000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.400000  0.200000  0.200000  0.200000
 0.600000  0.400000  0.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.000000  0.200000  0.400000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.400000  0.400000  0.000000
 0.200000  0.000000  0.800000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC][CG][TC][CG][AC][TAG]CG[AG]C[AG][ACGT][AC][AG]C[ATG]C[CGA][GA]
--------------------------------------------------------------------------------


Time  3.59 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
6300                             2.28e-13  82_[+2(6.14e-09)]_29_[+3(1.93e-09)]_\
    248_[+1(2.55e-07)]_87
37231                            7.96e-01  500
4339                             7.01e-05  189_[+3(3.08e-09)]_292
44864                            5.46e-07  165_[+1(4.15e-08)]_125_\
    [+2(4.06e-07)]_175
45494                            5.64e-09  330_[+1(8.05e-09)]_34_\
    [+2(1.28e-08)]_101
45633                            9.31e-07  164_[+2(2.11e-07)]_292_\
    [+1(1.81e-07)]_9
48377                            2.03e-03  23_[+2(9.33e-08)]_458
31912                            1.29e-04  265_[+3(8.15e-08)]_216
39244                            3.02e-06  144_[+2(2.64e-07)]_268_\
    [+1(4.59e-07)]_53
39390                            5.59e-15  184_[+3(1.23e-09)]_111_\
    [+1(2.23e-09)]_63_[+2(2.21e-08)]_88
35714                            2.54e-07  140_[+3(1.72e-08)]_181_\
    [+2(7.27e-07)]_141
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************