********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/470/470.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46346                    1.0000    500  48904                    1.0000    500
33473                    1.0000    500  35055                    1.0000    500
7942                     1.0000    500  39610                    1.0000    500
34698                    1.0000    500  49984                    1.0000    500
45031                    1.0000    500  45771                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/470/470.seqs.fa -oc motifs/470 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.225 G 0.226 T 0.288
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.225 G 0.226 T 0.288
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 5 llr = 81 E-value = 1.0e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::62:::a:::::48a
pos.-specific     C  2::24:4:a:::84::
probability       G  8a246:6:::a:222:
matrix            T  ::22:a:::a:a::::

         bits    2.2  *      * *
                 1.9  *     ** *    *
                 1.7  *   * *****   *
                 1.5 **   * ******  *
Relative         1.3 **   * ****** **
Entropy          1.1 **  ********* **
(23.4 bits)      0.9 **  ********* **
                 0.6 *** ************
                 0.4 *** ************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GGAGGTGACTGTCAAA
consensus            C GAC C     GCG
sequence               TC         G
                        T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
7942                         17  7.01e-09 GAGTACTCAA GGAGGTGACTGTCAGA AACGATAGAA
46346                       109  1.03e-08 AAAGTCGGGA GGAACTCACTGTCCAA GCCATGGCAA
45771                         1  3.20e-08          . CGACGTCACTGTCCAA TATCTGTCAT
33473                        14  4.45e-08 TTTCACACCA GGTTGTGACTGTCGAA GGCGAAGGTC
48904                       219  4.82e-08 GGCTGTTACT GGGGCTGACTGTGAAA CCAATTATGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7942                                7e-09  16_[+1]_468
46346                               1e-08  108_[+1]_376
45771                             3.2e-08  [+1]_484
33473                             4.5e-08  13_[+1]_471
48904                             4.8e-08  218_[+1]_266
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=5
7942                     (   17) GGAGGTGACTGTCAGA  1
46346                    (  109) GGAACTCACTGTCCAA  1
45771                    (    1) CGACGTCACTGTCCAA  1
33473                    (   14) GGTTGTGACTGTCGAA  1
48904                    (  219) GGGGCTGACTGTGAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 10.1721 E= 1.0e+001
  -897    -17    182   -897
  -897   -897    214   -897
   120   -897    -18    -52
   -38    -17     82    -52
  -897     83    141   -897
  -897   -897   -897    179
  -897     83    141   -897
   193   -897   -897   -897
  -897    215   -897   -897
  -897   -897   -897    179
  -897   -897    214   -897
  -897   -897   -897    179
  -897    183    -18   -897
    61     83    -18   -897
   161   -897    -18   -897
   193   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 1.0e+001
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.600000  0.000000  0.200000  0.200000
 0.200000  0.200000  0.400000  0.200000
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.400000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.800000  0.200000  0.000000
 0.400000  0.400000  0.200000  0.000000
 0.800000  0.000000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[GC]G[AGT][GACT][GC]T[GC]ACTGT[CG][ACG][AG]A
--------------------------------------------------------------------------------

Time 1.09 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 10 llr = 101 E-value = 4.9e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  829a9613:94a
pos.-specific     C  :7::142:::5:
probability       G  21::::459:::
matrix            T  ::1:::32111:

         bits    2.2
                 1.9    *       *
                 1.7    *    *  *
                 1.5   ***   ** *
Relative         1.3 * ***   ** *
Entropy          1.1 * ****  ** *
(14.5 bits)      0.9 ******  ** *
                 0.6 ******  ****
                 0.4 ****** *****
                 0.2 ************
                 0.0 ------------

Multilevel           ACAAAAGGGACA
consensus            GA   CTA  A
sequence                   CT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
7942                        113  1.46e-07 GAGTATGCTG ACAAAAGGGAAA AGCGTAGAAA
35055                       269  1.73e-06 TCCTATATCG AAAAACGGGACA TAGCATGGTG
34698                       165  3.43e-06 GAAAGCACGG ACAAAAAAGACA ACCTTGCGTG
46346                       131  3.43e-06 CCAAGCCATG GCAAAATGGAAA ATGGCTGACT
45031                        20  6.25e-06 CAAGAGCACT ACAACAGAGACA ATGCTGGAAT
49984                       127  1.61e-05 GATCAGTTCC AGAAACCAGACA TTTGGATGCT
33473                       390  1.61e-05 ACAAGATGCC ACAAACTGTAAA CTGGCTACCT
39610                       311  2.12e-05 TCACTCATCA ACAAAATTGTCA GTGAGGACCG
45771                       125  3.34e-05 TGTTTGCTTG GCAAACGTGATA CGTACTGTCC
48904                       138  4.03e-05 ATTCGTTAGT AATAAACGGAAA GAAATTCTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
7942                              1.5e-07  112_[+2]_376
35055                             1.7e-06  268_[+2]_220
34698                             3.4e-06  164_[+2]_324
46346                             3.4e-06  130_[+2]_358
45031                             6.2e-06  19_[+2]_469
49984                             1.6e-05  126_[+2]_362
33473                             1.6e-05  389_[+2]_99
39610                             2.1e-05  310_[+2]_178
45771                             3.3e-05  124_[+2]_364
48904                               4e-05  137_[+2]_351
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=10
7942                     (  113) ACAAAAGGGAAA  1
35055                    (  269) AAAAACGGGACA  1
34698                    (  165) ACAAAAAAGACA  1
46346                    (  131) GCAAAATGGAAA  1
45031                    (   20) ACAACAGAGACA  1
49984                    (  127) AGAAACCAGACA  1
33473                    (  390) ACAAACTGTAAA  1
39610                    (  311) ACAAAATTGTCA  1
45771                    (  125) GCAAACGTGATA  1
48904                    (  138) AATAAACGGAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 8.93074 E= 4.9e+001
   161   -997    -18   -997
   -38    164   -118   -997
   178   -997   -997   -152
   194   -997   -997   -997
   178   -117   -997   -997
   120     83   -997   -997
  -138    -17     82      6
    20   -997    114    -53
  -997   -997    199   -152
   178   -997   -997   -152
    61    115   -997   -152
   194   -997   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 4.9e+001
 0.800000  0.000000  0.200000  0.000000
 0.200000  0.700000  0.100000  0.000000
 0.900000  0.000000  0.000000  0.100000
 1.000000  0.000000  0.000000  0.000000
 0.900000  0.100000  0.000000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.100000  0.200000  0.400000  0.300000
 0.300000  0.000000  0.500000  0.200000
 0.000000  0.000000  0.900000  0.100000
 0.900000  0.000000  0.000000  0.100000
 0.400000  0.500000  0.000000  0.100000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[AG][CA]AAA[AC][GTC][GAT]GA[CA]A
--------------------------------------------------------------------------------

Time 2.04 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 8 llr = 100 E-value = 2.0e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :95a9::1364:::11
pos.-specific     C  9:1:::846::a1438
probability       G  113:1:3414:::531
matrix            T  ::1::a:1::6:914:

         bits    2.2            *
                 1.9    *       *
                 1.7    * *     *
                 1.5 ** ***     *
Relative         1.3 ** ****    **
Entropy          1.1 ** ****  * **  *
(18.1 bits)      0.9 ** **** *****  *
                 0.6 ** **** ****** *
                 0.4 ** **** ****** *
                 0.2 ************** *
                 0.0 ----------------

Multilevel           CAAAATCCCATCTGTC
consensus              G   GGAGA  CC
sequence                           G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
48904                       311  1.68e-07 CAATCCTTAG CATAATCCAATCTGTC CGCTTTCCTG
34698                       217  1.86e-07 CTGTAAGATT CAAAATGGAATCTCGC TTTGGTCTAT
7942                        266  2.80e-07 CGAAAGAGTC CAGAATCCCAACTGTA AATGGAAATA
45031                       112  6.35e-07 TTTGAGAACA CAAAGTCTCGTCTGCC AAATTTGCCC
35055                       352  7.62e-07 ACAGACAATT CAAAATCACGTCTCGG CCTGGCCTGG
45771                       216  9.82e-07 AGCTCGTACA GAAAATCCCAACTTCC CCTCCTCGGA
39610                       281  1.15e-06 CGGTCATGTT CGGAATCGGATCTCTC ACAGTCACTC
46346                       267  4.26e-06 ACTTCGGTCA CACAATGGCGACCGAC ACAACGTGCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48904                             1.7e-07  310_[+3]_174
34698                             1.9e-07  216_[+3]_268
7942                              2.8e-07  265_[+3]_219
45031                             6.3e-07  111_[+3]_373
35055                             7.6e-07  351_[+3]_133
45771                             9.8e-07  215_[+3]_269
39610                             1.1e-06  280_[+3]_204
46346                             4.3e-06  266_[+3]_218
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
48904                    (  311) CATAATCCAATCTGTC  1
34698                    (  217) CAAAATGGAATCTCGC  1
7942                     (  266) CAGAATCCCAACTGTA  1
45031                    (  112) CAAAGTCTCGTCTGCC  1
35055                    (  352) CAAAATCACGTCTCGG  1
45771                    (  216) GAAAATCCCAACTTCC  1
39610                    (  281) CGGAATCGGATCTCTC  1
46346                    (  267) CACAATGGCGACCGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 9.24139 E= 2.0e+002
  -965    196    -85   -965
   174   -965    -85   -965
    94    -84     14   -120
   194   -965   -965   -965
   174   -965    -85   -965
  -965   -965   -965    179
  -965    174     14   -965
  -106     74     73   -120
    -6    148    -85   -965
   126   -965     73   -965
    52   -965   -965    112
  -965    215   -965   -965
  -965    -84   -965    160
  -965     74    114   -120
  -106     15     14     38
  -106    174    -85   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 2.0e+002
 0.000000  0.875000  0.125000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.500000  0.125000  0.250000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.125000  0.375000  0.375000  0.125000
 0.250000  0.625000  0.125000  0.000000
 0.625000  0.000000  0.375000  0.000000
 0.375000  0.000000  0.000000  0.625000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.375000  0.500000  0.125000
 0.125000  0.250000  0.250000  0.375000
 0.125000  0.750000  0.125000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
CA[AG]AAT[CG][CG][CA][AG][TA]CT[GC][TCG]C
--------------------------------------------------------------------------------

Time 2.99 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46346                            5.75e-09  108_[+1(1.03e-08)]_6_[+2(3.43e-06)]_\
    124_[+3(4.26e-06)]_218
48904                            1.17e-08  137_[+2(4.03e-05)]_69_\
    [+1(4.82e-08)]_76_[+3(1.68e-07)]_174
33473                            1.02e-06  13_[+1(4.45e-08)]_360_\
    [+2(1.61e-05)]_99
35055                            2.21e-05  268_[+2(1.73e-06)]_71_\
    [+3(7.62e-07)]_133
7942                             1.69e-11  16_[+1(7.01e-09)]_80_[+2(1.46e-07)]_\
    141_[+3(2.80e-07)]_219
39610                            3.46e-04  280_[+3(1.15e-06)]_14_\
    [+2(2.12e-05)]_178
34698                            1.22e-05  164_[+2(3.43e-06)]_40_\
    [+3(1.86e-07)]_268
49984                            3.21e-02  126_[+2(1.61e-05)]_362
45031                            2.51e-05  19_[+2(6.25e-06)]_80_[+3(6.35e-07)]_\
    373
45771                            3.41e-08  [+1(3.20e-08)]_108_[+2(3.34e-05)]_\
    79_[+3(9.82e-07)]_269
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************