********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/397/397.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
14202                    1.0000    500  48325                    1.0000    500
48511                    1.0000    500  34119                    1.0000    500
45199                    1.0000    500  45968                    1.0000    500
31702                    1.0000    500  42668                    1.0000    500
45211                    1.0000    500  34683                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/397/397.seqs.fa -oc motifs/397 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.260 C 0.254 G 0.223 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.260 C 0.254 G 0.223 T 0.263
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 15 sites = 9 llr = 106 E-value = 2.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :82:91:79848:4a
pos.-specific     C  ::34:7:1::2::2:
probability       G  a2:1:26::222a3:
matrix            T  ::441:421:1::::

         bits    2.2 *           *
                 1.9 *           * *
                 1.7 *           * *
                 1.5 *   *   *   * *
Relative         1.3 **  *   ** ** *
Entropy          1.1 **  * * ** ** *
(17.1 bits)      0.9 **  *** ** ** *
                 0.6 ** ******* ** *
                 0.4 ********** ****
                 0.2 ***************
                 0.0 ---------------

Multilevel           GATCACGAAAAAGAA
consensus             GCT GTT GCG G
sequence               A       G  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
31702                       136  9.35e-08 TTCTGTTGGG GATCACGAAATAGAA CGGACTGTGA
45968                       417  2.00e-07 AAGAACCTTT GATTAGGAAACAGAA CATTGTCTCA
14202                       243  2.45e-07 GGATGAGCCT GAACACGAAAAGGGA CAGCCACCGT
34119                       106  7.18e-07 TACTACCGGT GACGACGAAAAGGAA GAACAATCTG
45211                       179  2.10e-06 GTTATGTTTC GATTAATTAAAAGGA ATTTCGTAAC
34683                       243  2.66e-06 GAAAATACAT GGACAGTAAAGAGGA GCTCTCTACT
45199                       102  2.66e-06 CTCGTCGGTA GACCACGTAGCAGCA TCCCCATCCT
48325                       221  4.53e-06 CTGCGTTTCT GATTTCTAAGAAGCA AAACACAAGT
42668                       116  1.49e-05 GTCTCTCTTT GGCTACTCTAGAGAA TAGGTTTGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31702                             9.4e-08  135_[+1]_350
45968                               2e-07  416_[+1]_69
14202                             2.4e-07  242_[+1]_243
34119                             7.2e-07  105_[+1]_380
45211                             2.1e-06  178_[+1]_307
34683                             2.7e-06  242_[+1]_243
45199                             2.7e-06  101_[+1]_384
48325                             4.5e-06  220_[+1]_265
42668                             1.5e-05  115_[+1]_370
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=9
31702                    (  136) GATCACGAAATAGAA  1
45968                    (  417) GATTAGGAAACAGAA  1
14202                    (  243) GAACACGAAAAGGGA  1
34119                    (  106) GACGACGAAAAGGAA  1
45211                    (  179) GATTAATTAAAAGGA  1
34683                    (  243) GGACAGTAAAGAGGA  1
45199                    (  102) GACCACGTAGCAGCA  1
48325                    (  221) GATTTCTAAGAAGCA  1
42668                    (  116) GGCTACTCTAGAGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4860 bayes= 9.20868 E= 2.8e+001
  -982   -982    216   -982
   158   -982     -1   -982
   -23     39   -982     76
  -982     80   -100     76
   177   -982   -982   -124
  -122    139     -1   -982
  -982   -982    131     76
   136   -119   -982    -24
   177   -982   -982   -124
   158   -982     -1   -982
    77    -19     -1   -124
   158   -982     -1   -982
  -982   -982    216   -982
    77    -19     58   -982
   194   -982   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 9 E= 2.8e+001
 0.000000  0.000000  1.000000  0.000000
 0.777778  0.000000  0.222222  0.000000
 0.222222  0.333333  0.000000  0.444444
 0.000000  0.444444  0.111111  0.444444
 0.888889  0.000000  0.000000  0.111111
 0.111111  0.666667  0.222222  0.000000
 0.000000  0.000000  0.555556  0.444444
 0.666667  0.111111  0.000000  0.222222
 0.888889  0.000000  0.000000  0.111111
 0.777778  0.000000  0.222222  0.000000
 0.444444  0.222222  0.222222  0.111111
 0.777778  0.000000  0.222222  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.444444  0.222222  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[AG][TCA][CT]A[CG][GT][AT]A[AG][ACG][AG]G[AGC]A
--------------------------------------------------------------------------------

Time 1.00 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 21 sites = 6 llr = 100 E-value = 2.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a5:::33:2:8::::2:77:3
pos.-specific     C  :22:5:2::3:28a:333:8:
probability       G  :23a:23a83282:855:3:7
matrix            T  :25:552::3::::2:2::2:

         bits    2.2    *   *
                 1.9 *  *   *     *
                 1.7 *  *   *     *
                 1.5 *  *   **  * **
Relative         1.3 *  *   ** *****    *
Entropy          1.1 *  *   ** *****  ****
(24.0 bits)      0.9 *  **  ** *****  ****
                 0.6 * ***  ** ***********
                 0.4 * **** **************
                 0.2 ****** **************
                 0.0 ---------------------

Multilevel           AATGCTAGGCAGCCGGGAACG
consensus              G TAG  G     CCCG A
sequence                      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
31702                       369  2.79e-11 TGAGAGAAGG AATGTAAGGGAGCCGCGAACG TGCCCACCTT
48511                       190  5.31e-09 CTGTCGAAGG AATGCTGGGTGGCCTCGAACG ATTCTGACTC
42668                        33  5.86e-09 GTCGTTAGCC AGGGCTCGGGAGCCGGTAACA ACAATTTAGA
48325                        19  1.98e-08 TACGCAGAAC AACGTTGGACAGCCGGCCACA CCTCCGCATT
34683                        55  6.52e-08 TTCGTTACGG ACTGCGAGGCACGCGGCAGCG TGGGTATTGT
45211                        19  8.55e-08 CCGACATTGA ATGGTATGGTAGCCGAGCGTG CGGTACGGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31702                             2.8e-11  368_[+2]_111
48511                             5.3e-09  189_[+2]_290
42668                             5.9e-09  32_[+2]_447
48325                               2e-08  18_[+2]_461
34683                             6.5e-08  54_[+2]_425
45211                             8.5e-08  18_[+2]_461
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=6
31702                    (  369) AATGTAAGGGAGCCGCGAACG  1
48511                    (  190) AATGCTGGGTGGCCTCGAACG  1
42668                    (   33) AGGGCTCGGGAGCCGGTAACA  1
48325                    (   19) AACGTTGGACAGCCGGCCACA  1
34683                    (   55) ACTGCGAGGCACGCGGCAGCG  1
45211                    (   19) ATGGTATGGTAGCCGAGCGTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 10.09 E= 2.3e+002
   194   -923   -923   -923
    94    -61    -42    -65
  -923    -61     58     93
  -923   -923    216   -923
  -923     97   -923     93
    36   -923    -42     93
    36    -61     58    -65
  -923   -923    216   -923
   -64   -923    190   -923
  -923     39     58     34
   168   -923    -42   -923
  -923    -61    190   -923
  -923    171    -42   -923
  -923    197   -923   -923
  -923   -923    190    -65
   -64     39    116   -923
  -923     39    116    -65
   136     39   -923   -923
   136   -923     58   -923
  -923    171   -923    -65
    36   -923    158   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.3e+002
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.166667  0.166667  0.166667
 0.000000  0.166667  0.333333  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.333333  0.000000  0.166667  0.500000
 0.333333  0.166667  0.333333  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.333333  0.333333  0.333333
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.833333  0.166667
 0.166667  0.333333  0.500000  0.000000
 0.000000  0.333333  0.500000  0.166667
 0.666667  0.333333  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.333333  0.000000  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
AA[TG]G[CT][TA][AG]GG[CGT]AGCCG[GC][GC][AC][AG]C[GA]
--------------------------------------------------------------------------------

Time 1.90 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 10 llr = 96 E-value = 1.6e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::82:982:186
pos.-specific     C  9:164:12a:21
probability       G  19116113:9::
matrix            T  :1:1:::3:::3

         bits    2.2
                 1.9         *
                 1.7  *      **
                 1.5 **   *  **
Relative         1.3 **   *  ***
Entropy          1.1 *** *** ***
(13.9 bits)      0.9 *** *** ***
                 0.6 *** *** ****
                 0.4 ******* ****
                 0.2 ******* ****
                 0.0 ------------

Multilevel           CGACGAAGCGAA
consensus               AC  T  CT
sequence                    A
                            C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
48511                       439  5.68e-07 TTGGATTGCT CGACCAAACGAA TCGCTATTGC
34683                       117  1.81e-06 TACGTTTTGC CGACGAGGCGAA AGCTCGAATT
34119                        54  2.42e-06 AGTAGAAAGG CGGCGAATCGAA ACCAGGATCC
45211                       250  5.79e-06 AGACTTTCCT CGATGAATCGAT GCCTTTTCGT
45968                       286  8.12e-06 GATTCTGCTT CGACGAAGCAAT CGTGCGTTTT
42668                         9  1.13e-05   ATTCTTTC GGAAGAATCGAA TCGTCGTTAG
14202                         9  2.67e-05   TTGAAAAA CGACCGAGCGAC TTATCGAGTG
31702                       420  2.97e-05 GATCTCCTGG CGCACAACCGAA CAAAACCTAT
45199                        40  4.33e-05 ACTCCGGTGG CGACGACCCGCT CTCCGCATCA
48325                       152  1.47e-04 TACTCGACTA CTAGCAAACGCA CGGCGTCGGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48511                             5.7e-07  438_[+3]_50
34683                             1.8e-06  116_[+3]_372
34119                             2.4e-06  53_[+3]_435
45211                             5.8e-06  249_[+3]_239
45968                             8.1e-06  285_[+3]_203
42668                             1.1e-05  8_[+3]_480
14202                             2.7e-05  8_[+3]_480
31702                               3e-05  419_[+3]_69
45199                             4.3e-05  39_[+3]_449
48325                             0.00015  151_[+3]_337
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=10
48511                    (  439) CGACCAAACGAA  1
34683                    (  117) CGACGAGGCGAA  1
34119                    (   54) CGGCGAATCGAA  1
45211                    (  250) CGATGAATCGAT  1
45968                    (  286) CGACGAAGCAAT  1
42668                    (    9) GGAAGAATCGAA  1
14202                    (    9) CGACCGAGCGAC  1
31702                    (  420) CGCACAACCGAA  1
45199                    (   40) CGACGACCCGCT  1
48325                    (  152) CTAGCAAACGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 8.93074 E= 1.6e+003
  -997    182   -116   -997
  -997   -997    201   -139
   162   -134   -116   -997
   -38    124   -116   -139
  -997     65    143   -997
   179   -997   -116   -997
   162   -134   -116   -997
   -38    -35     43     19
  -997    197   -997   -997
  -138   -997    201   -997
   162    -35   -997   -997
   121   -134   -997     19
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 1.6e+003
 0.000000  0.900000  0.100000  0.000000
 0.000000  0.000000  0.900000  0.100000
 0.800000  0.100000  0.100000  0.000000
 0.200000  0.600000  0.100000  0.100000
 0.000000  0.400000  0.600000  0.000000
 0.900000  0.000000  0.100000  0.000000
 0.800000  0.100000  0.100000  0.000000
 0.200000  0.200000  0.300000  0.300000
 0.000000  1.000000  0.000000  0.000000
 0.100000  0.000000  0.900000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.600000  0.100000  0.000000  0.300000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CGA[CA][GC]AA[GTAC]CG[AC][AT]
--------------------------------------------------------------------------------

Time 2.81 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14202                            1.48e-04  8_[+3(2.67e-05)]_222_[+1(2.45e-07)]_\
    243
48325                            2.21e-07  18_[+2(1.98e-08)]_181_\
    [+1(4.53e-06)]_265
48511                            1.59e-07  189_[+2(5.31e-09)]_228_\
    [+3(5.68e-07)]_50
34119                            5.06e-05  53_[+3(2.42e-06)]_40_[+1(7.18e-07)]_\
    380
45199                            1.70e-03  39_[+3(4.33e-05)]_50_[+1(2.66e-06)]_\
    384
45968                            3.76e-05  285_[+3(8.12e-06)]_119_\
    [+1(2.00e-07)]_69
31702                            4.89e-12  135_[+1(9.35e-08)]_218_\
    [+2(2.79e-11)]_30_[+3(2.97e-05)]_69
42668                            3.19e-08  8_[+3(1.13e-05)]_12_[+2(5.86e-09)]_\
    62_[+1(1.49e-05)]_370
45211                            3.38e-08  18_[+2(8.55e-08)]_139_\
    [+1(2.10e-06)]_56_[+3(5.79e-06)]_239
34683                            1.12e-08  54_[+2(6.52e-08)]_41_[+3(1.81e-06)]_\
    114_[+1(2.66e-06)]_243
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************