********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/443/443.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37811                    1.0000    500  38165                    1.0000    500
49343                    1.0000    500  40214                    1.0000    500
39057                    1.0000    500  33555                    1.0000    500
39218                    1.0000    500  49121                    1.0000    500
37769                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/443/443.seqs.fa -oc motifs/443 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.262 C 0.231 G 0.237 T 0.270
Background letter frequencies (from dataset with add-one prior applied):
A 0.262 C 0.231 G 0.237 T 0.270
********************************************************************************


********************************************************************************
MOTIF  1 MEME  width =  15  sites =   8  llr = 99  E-value = 3.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::83::4:a5:6311
pos.-specific     C  a::1a:1::18:1:3
probability       G  ::11:55::134::6
matrix            T  :a15:5:a:3::69:

         bits    2.1 *   *
                 1.9 **  *  **
                 1.7 **  *  **
                 1.5 **  *  **
Relative         1.3 **  *  ** *  *
Entropy          1.1 **  ** ** ** *
(17.9 bits)      0.8 *** ** ** ** **
                 0.6 *** ***** *****
                 0.4 *** ***** *****
                 0.2 ***************
                 0.0 ---------------

Multilevel           CTATCGGTAACATTG
consensus               A TA  TGGA C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
40214                       280  2.45e-08 AGTAGAGTAG CTATCGGTATCGTTG ACATTTGCTT
39218                       114  2.75e-07 ACCAATATCT CTATCTATAACATAG GACAATGATA
49343                       362  3.24e-07 AATGAGGAAG CTATCGATACCATTC TTTGCTGCAC
33555                       369  8.00e-07 CACGATTGGG CTAACGGTAAGACTG CCCAGGTCAG
37811                       213  8.94e-07 AGTGACGGGC CTACCTGTATCGATG GCCAAAACTC
38165                       298  1.09e-06 TGGTGAATTC CTTTCGATAGCATTG ACCGATTGAT
37769                       434  2.37e-06 CTGATAGTGA CTGACTGTAACAATC TATCTCGTCT
39057                       213  7.20e-06 GGTATTTCTT CTAGCTCTAAGGTTA GACAGAAATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40214                             2.4e-08  279_[+1]_206
39218                             2.8e-07  113_[+1]_372
49343                             3.2e-07  361_[+1]_124
33555                               8e-07  368_[+1]_117
37811                             8.9e-07  212_[+1]_273
38165                             1.1e-06  297_[+1]_188
37769                             2.4e-06  433_[+1]_52
39057                             7.2e-06  212_[+1]_273
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=8
40214                    (  280) CTATCGGTATCGTTG  1
39218                    (  114) CTATCTATAACATAG  1
49343                    (  362) CTATCGATACCATTC  1
33555                    (  369) CTAACGGTAAGACTG  1
37811                    (  213) CTACCTGTATCGATG  1
38165                    (  298) CTTTCGATAGCATTG  1
37769                    (  434) CTGACTGTAACAATC  1
39057                    (  213) CTAGCTCTAAGGTTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4374 bayes= 9.0921 E= 3.3e+001
  -965    211   -965   -965
  -965   -965   -965    189
   151   -965    -92   -111
    -7    -88    -92     89
  -965    211   -965   -965
  -965   -965    107     89
    52    -88    107   -965
  -965   -965   -965    189
   193   -965   -965   -965
    93    -88    -92    -11
  -965    170      7   -965
   125   -965     66   -965
    -7    -88   -965    121
  -107   -965   -965    170
  -107     12    140   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 8 E= 3.3e+001
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.750000  0.000000  0.125000  0.125000
 0.250000  0.125000  0.125000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.375000  0.125000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.125000  0.125000  0.250000
 0.000000  0.750000  0.250000  0.000000
 0.625000  0.000000  0.375000  0.000000
 0.250000  0.125000  0.000000  0.625000
 0.125000  0.000000  0.000000  0.875000
 0.125000  0.250000  0.625000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CTA[TA]C[GT][GA]TA[AT][CG][AG][TA]T[GC]
--------------------------------------------------------------------------------




Time  0.87 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME  width =  18  sites =   4  llr = 75  E-value = 5.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::a83:5::::a3:::aa
pos.-specific     C  8:::8:5:a5a:8583::
probability       G  38:::a:::3::::38::
matrix            T  :3:3:::a:3:::5::::

         bits    2.1      *  * *
                 1.9   *  * ** **    **
                 1.7   *  * ** **    **
                 1.5   *  * ** **    **
Relative         1.3 *** ** ** *** ****
Entropy          1.1 ********* ********
(27.2 bits)      0.8 ********* ********
                 0.6 ******************
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           CGAACGATCCCACCCGAA
consensus            GT TA C  G  ATGC
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
49121                       258  7.63e-11 AGCGCCACGA CGAACGATCGCACCCGAA TCCGCATGCT
38165                       357  1.19e-09 ACAGCCTTTC GGAACGCTCCCAACCGAA CTTGTATAAA
40214                       384  1.59e-09 ACCCTGTGAT CGATCGATCCCACTCCAA TTTACCTTGC
37769                       266  9.29e-09 GTCAATGTGT CTAAAGCTCTCACTGGAA GGCATTGTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49121                             7.6e-11  257_[+2]_225
38165                             1.2e-09  356_[+2]_126
40214                             1.6e-09  383_[+2]_99
37769                             9.3e-09  265_[+2]_217
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=4
49121                    (  258) CGAACGATCGCACCCGAA  1
38165                    (  357) GGAACGCTCCCAACCGAA  1
40214                    (  384) CGATCGATCCCACTCCAA  1
37769                    (  266) CTAAAGCTCTCACTGGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 4347 bayes= 10.0845 E= 5.8e+001
  -865    170      7   -865
  -865   -865    166    -11
   193   -865   -865   -865
   151   -865   -865    -11
    -7    170   -865   -865
  -865   -865    207   -865
    93    111   -865   -865
  -865   -865   -865    189
  -865    211   -865   -865
  -865    111      7    -11
  -865    211   -865   -865
   193   -865   -865   -865
    -7    170   -865   -865
  -865    111   -865     89
  -865    170      7   -865
  -865     12    166   -865
   193   -865   -865   -865
   193   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 4 E= 5.8e+001
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  0.750000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.250000  0.750000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CG][GT]A[AT][CA]G[AC]TC[CGT]CA[CA][CT][CG][GC]AA
--------------------------------------------------------------------------------




Time  1.80 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME  width =  12  sites =   9  llr = 94  E-value = 9.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  28::29:8:13a
pos.-specific     C  2272::8:::::
probability       G  6:118:2:a:7:
matrix            T  ::27:1:2:9::

         bits    2.1         *
                 1.9         *  *
                 1.7         *  *
                 1.5      *  ** *
Relative         1.3  *  *** ** *
Entropy          1.1  *  ********
(15.1 bits)      0.8  ** ********
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GACTGACAGTGA
consensus            ACTCA GT  A
sequence             C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
37769                       470  5.93e-07 GTCCGATCGT GACTGACTGTGA CGTACTCGAA
33555                       475  8.91e-07 TTTGAAAATC GAGTGACAGTGA AAAATCTCAC
40214                       363  2.67e-06 CCGTGAATTT GATTGAGAGTGA CCCTGTGATC
49343                        79  3.94e-06 AGCCAAGCCA ACCTGACAGTAA AAAATGGGAA
39057                       293  5.34e-06 CTCAAATACA GACCGACAGAGA ACAGACGGTA
38165                       335  7.04e-06 ACTAATCGAC GATTAACAGTAA ACAGCCTTTC
39218                        74  1.49e-05 AAAAACGGAA CCCTAACAGTAA TATCAGGGGC
49121                         7  2.16e-05     GCATGG AACCGAGTGTGA GACTCCACTG
37811                       196  2.45e-05 GCTTGTATCG CACGGTCAGTGA CGGGCCTACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37769                             5.9e-07  469_[+3]_19
33555                             8.9e-07  474_[+3]_14
40214                             2.7e-06  362_[+3]_126
49343                             3.9e-06  78_[+3]_410
39057                             5.3e-06  292_[+3]_196
38165                               7e-06  334_[+3]_154
39218                             1.5e-05  73_[+3]_415
49121                             2.2e-05  6_[+3]_482
37811                             2.5e-05  195_[+3]_293
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
37769                    (  470) GACTGACTGTGA  1
33555                    (  475) GAGTGACAGTGA  1
40214                    (  363) GATTGAGAGTGA  1
49343                    (   79) ACCTGACAGTAA  1
39057                    (  293) GACCGACAGAGA  1
38165                    (  335) GATTAACAGTAA  1
39218                    (   74) CCCTAACAGTAA  1
49121                    (    7) AACCGAGTGTGA  1
37811                    (  196) CACGGTCAGTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 8.93074 E= 9.5e+001
   -24     -5    123   -982
   157     -5   -982   -982
  -982    153   -109    -28
  -982     -5   -109    130
   -24   -982    171   -982
   176   -982   -982   -128
  -982    175     -9   -982
   157   -982   -982    -28
  -982   -982    207   -982
  -124   -982   -982    172
    35   -982    149   -982
   193   -982   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 9.5e+001
 0.222222  0.222222  0.555556  0.000000
 0.777778  0.222222  0.000000  0.000000
 0.000000  0.666667  0.111111  0.222222
 0.000000  0.222222  0.111111  0.666667
 0.222222  0.000000  0.777778  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.000000  0.777778  0.222222  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.000000  0.000000  0.888889
 0.333333  0.000000  0.666667  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GAC][AC][CT][TC][GA]A[CG][AT]GT[GA]A
--------------------------------------------------------------------------------




Time  2.48 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37811                            2.39e-04  195_[+3(2.45e-05)]_5_[+1(8.94e-07)]_\
    273
38165                            4.27e-10  297_[+1(1.09e-06)]_22_\
    [+3(7.04e-06)]_10_[+2(1.19e-09)]_126
49343                            2.17e-05  78_[+3(3.94e-06)]_271_\
    [+1(3.24e-07)]_124
40214                            6.52e-12  279_[+1(2.45e-08)]_68_\
    [+3(2.67e-06)]_9_[+2(1.59e-09)]_99
39057                            1.98e-04  212_[+1(7.20e-06)]_65_\
    [+3(5.34e-06)]_196
33555                            2.00e-05  368_[+1(8.00e-07)]_91_\
    [+3(8.91e-07)]_14
39218                            5.61e-05  73_[+3(1.49e-05)]_28_[+1(2.75e-07)]_\
    372
49121                            3.83e-08  6_[+3(2.16e-05)]_239_[+2(7.63e-11)]_\
    225
37769                            5.99e-10  265_[+2(9.29e-09)]_148_\
    [+3(1.49e-06)]_26_[+3(5.93e-07)]_19
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************