********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/426/426.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47041                    1.0000    500  47218                    1.0000    500
47503                    1.0000    500  51073                    1.0000    500
47704                    1.0000    500  49081                    1.0000    500
23314                    1.0000    500  43839                    1.0000    500
45058                    1.0000    500  468                      1.0000    500
47000                    1.0000    500  44846                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/426/426.seqs.fa -oc motifs/426 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.247 C 0.249 G 0.239 T 0.264
Background letter frequencies (from dataset with add-one prior applied):
A 0.248 C 0.249 G 0.239 T 0.264
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =   12   sites =  10   llr = 112   E-value = 2.5e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :9:::716:::9
pos.-specific     C  8:4:1:81::::
probability       G  2::1911:9:9:
matrix            T  :169:2:31a11

         bits    2.1
                 1.9          *
                 1.7     *   ***
                 1.4  * **   ****
Relative         1.2 ** **   ****
Entropy          1.0 ***** * ****
(16.1 bits)      0.8 ******* ****
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CATTGACAGTGA
consensus            G C  T T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
43839                       225  1.77e-07 AAAGCGAGTC CATTGACTGTGA ACGCGGATTG
47218                       388  3.56e-07 CTCAGGCCAA GATTGACAGTGA ATACTGGGAT
47000                       285  9.35e-07 TCGCTCACAA GATTGACTGTGA TTAGTCGGTT
44846                       408  1.18e-06 TGACTTGTAG CTTTGACAGTGA ATCGACCTCC
51073                       409  1.76e-06 CAGTCCGTTG CACTCACAGTGA GCCCACAGCA
23314                       441  1.94e-06 CACCGGTGCA CACTGACATTGA CTAATTCCCC
49081                        18  3.40e-06 CATAGTTGAT CATTGTAAGTGA AGACCCACAG
47704                        40  6.18e-06 GAGACACATA CACTGGCCGTGA CAGGGGGCAT
47041                       342  1.28e-05 TTTGAATATT CACGGACAGTTA GGGTGTGGTC
468                          41  3.57e-05 TGTGAGTACC CATTGTGTGTGT GTGTGTGTGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43839                             1.8e-07  224_[+1]_264
47218                             3.6e-07  387_[+1]_101
47000                             9.4e-07  284_[+1]_204
44846                             1.2e-06  407_[+1]_81
51073                             1.8e-06  408_[+1]_80
23314                             1.9e-06  440_[+1]_48
49081                             3.4e-06  17_[+1]_471
47704                             6.2e-06  39_[+1]_449
47041                             1.3e-05  341_[+1]_147
468                               3.6e-05  40_[+1]_448
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=10
43839                    (  225) CATTGACTGTGA  1
47218                    (  388) GATTGACAGTGA  1
47000                    (  285) GATTGACTGTGA  1
44846                    (  408) CTTTGACAGTGA  1
51073                    (  409) CACTCACAGTGA  1
23314                    (  441) CACTGACATTGA  1
49081                    (   18) CATTGTAAGTGA  1
47704                    (   40) CACTGGCCGTGA  1
47041                    (  342) CACGGACAGTTA  1
468                      (   41) CATTGTGTGTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 9.4462 E= 2.5e-001
  -997    168    -26   -997
   186   -997   -997   -140
  -997     68   -997    118
  -997   -997   -126    177
  -997   -132    191   -997
   150   -997   -126    -40
  -131    168   -126   -997
   128   -132   -997     18
  -997   -997    191   -140
  -997   -997   -997    192
  -997   -997    191   -140
   186   -997   -997   -140
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 10 E= 2.5e-001
 0.000000  0.800000  0.200000  0.000000
 0.900000  0.000000  0.000000  0.100000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.000000  0.100000  0.900000
 0.000000  0.100000  0.900000  0.000000
 0.700000  0.000000  0.100000  0.200000
 0.100000  0.800000  0.100000  0.000000
 0.600000  0.100000  0.000000  0.300000
 0.000000  0.000000  0.900000  0.100000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.900000  0.100000
 0.900000  0.000000  0.000000  0.100000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG]A[TC]TG[AT]C[AT]GTGA
--------------------------------------------------------------------------------




Time  1.38 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =   12   sites =  12   llr = 114   E-value = 5.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3:1138:81:1
pos.-specific     C  37:7:72a::13
probability       G  4:22::::39:2
matrix            T  31819:::::94

         bits    2.1        *
                 1.9        *
                 1.7        * *
                 1.4     * ** **
Relative         1.2   * * *****
Entropy          1.0   * *******
(13.7 bits)      0.8  ** *******
                 0.6  **********
                 0.4 ***********
                 0.2 ************
                 0.0 ------------

Multilevel           GCTCTCACAGTT
consensus            TA   A  G  C
sequence             C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
44846                        42  8.50e-07 GGCTTGGCTG GCTCTCACGGTT TGAATGTCAA
47000                       214  1.58e-06 CCATCCACGG TCTCTCACGGTT ACTGTGAGTT
51073                        86  1.58e-06 GTTCGGAAAG TATCTCACAGTC AGTTATTTTC
23314                        63  2.27e-06 TTCCGCGGTT GCGCTCACAGTC AATCGTGATC
47041                       269  3.67e-06 TTTTCAGACA CCTGTCACAGTC AGGTGCAAAA
49081                       406  1.02e-05 CTTTCTTTCT CCTCTACCAGTT CGTGCGTCCG
468                         237  1.24e-05 GCACTGTGTT TCTCTCACAATC GTTTGCGCGC
43839                       379  4.41e-05 GTATCGAAGC GTTCTAACAGTA AATACAGTAG
47704                        92  4.41e-05 TCGAACGGTA CCTCTAACAGCG CGGTGCCGCC
45058                        45  7.46e-05 AGGCGTTTCG GATGTCCCGGTT TTCCAACGGG
47218                        45  8.41e-05 CCAAAACCAG GCTAACACAGTG AAAAGCATAT
47503                       304  1.12e-04 GACACGTTTA TAGTTAACAGTT GCCTCGCTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44846                             8.5e-07  41_[+2]_447
47000                             1.6e-06  213_[+2]_275
51073                             1.6e-06  85_[+2]_403
23314                             2.3e-06  62_[+2]_426
47041                             3.7e-06  268_[+2]_220
49081                               1e-05  405_[+2]_83
468                               1.2e-05  236_[+2]_252
43839                             4.4e-05  378_[+2]_110
47704                             4.4e-05  91_[+2]_397
45058                             7.5e-05  44_[+2]_444
47218                             8.4e-05  44_[+2]_444
47503                             0.00011  303_[+2]_185
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=12
44846                    (   42) GCTCTCACGGTT  1
47000                    (  214) TCTCTCACGGTT  1
51073                    (   86) TATCTCACAGTC  1
23314                    (   63) GCGCTCACAGTC  1
47041                    (  269) CCTGTCACAGTC  1
49081                    (  406) CCTCTACCAGTT  1
468                      (  237) TCTCTCACAATC  1
43839                    (  379) GTTCTAACAGTA  1
47704                    (   92) CCTCTAACAGCG  1
45058                    (   45) GATGTCCCGGTT  1
47218                    (   45) GCTAACACAGTG  1
47503                    (  304) TAGTTAACAGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5868 bayes= 8.93074 E= 5.0e+002
 -1023      0     80     34
     1    142  -1023   -166
 -1023  -1023    -52    166
  -157    142    -52   -166
  -157  -1023  -1023    179
    43    142  -1023  -1023
   175    -58  -1023  -1023
 -1023    200  -1023  -1023
   160  -1023      6  -1023
  -157  -1023    194  -1023
 -1023   -158  -1023    179
  -157     42    -52     66
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 5.0e+002
 0.000000  0.250000  0.416667  0.333333
 0.250000  0.666667  0.000000  0.083333
 0.000000  0.000000  0.166667  0.833333
 0.083333  0.666667  0.166667  0.083333
 0.083333  0.000000  0.000000  0.916667
 0.333333  0.666667  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.750000  0.000000  0.250000  0.000000
 0.083333  0.000000  0.916667  0.000000
 0.000000  0.083333  0.000000  0.916667
 0.083333  0.333333  0.166667  0.416667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[GTC][CA]TCT[CA]AC[AG]GT[TC]
--------------------------------------------------------------------------------




Time  2.59 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =   20   sites =   4   llr = 82   E-value = 1.8e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  3:::3:::::::::83a:a8
pos.-specific     C  :::5::::533:::33:3::
probability       G  8:5:8:a:5:8:a:::::::
matrix            T  :a55:a:a:8:a:a:5:8:3

         bits    2.1       *     *   * *
                 1.9  *   ***   ***  * *
                 1.7  *   ***   ***  * *
                 1.4  *   ***   ***  * *
Relative         1.2 **  ****  ***** * **
Entropy          1.0 *************** ****
(29.6 bits)      0.8 *************** ****
                 0.6 *************** ****
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GTGCGTGTCTGTGTATATAA
consensus            A TTA   GCC   CA C T
sequence                            C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
468                          89  6.45e-11 GTGTGTGTGT GTGTGTGTGTGTGTATATAT GTATACGTGT
45058                       272  1.61e-10 GAAACTGTTG GTTCGTGTCCGTGTACATAA ATACGTTGCG
47704                       460  4.61e-10 CAGAGAGAGA GTGTGTGTGTGTGTCAACAA CAATCGCAGA
23314                       198  1.02e-09 GGATTGGTCC ATTCATGTCTCTGTATATAA GTAGGGGAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
468                               6.4e-11  88_[+3]_392
45058                             1.6e-10  271_[+3]_209
47704                             4.6e-10  459_[+3]_21
23314                               1e-09  197_[+3]_283
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=4
468                      (   89) GTGTGTGTGTGTGTATATAT  1
45058                    (  272) GTTCGTGTCCGTGTACATAA  1
47704                    (  460) GTGTGTGTGTGTGTCAACAA  1
23314                    (  198) ATTCATGTCTCTGTATATAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5772 bayes= 10.4939 E= 1.8e+003
     1   -865    165   -865
  -865   -865   -865    192
  -865   -865    106     92
  -865    100   -865     92
     1   -865    165   -865
  -865   -865   -865    192
  -865   -865    206   -865
  -865   -865   -865    192
  -865    100    106   -865
  -865      0   -865    150
  -865      0    165   -865
  -865   -865   -865    192
  -865   -865    206   -865
  -865   -865   -865    192
   160      0   -865   -865
     1      0   -865     92
   201   -865   -865   -865
  -865      0   -865    150
   201   -865   -865   -865
   160   -865   -865     -8
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 1.8e+003
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.500000  0.000000  0.500000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.250000  0.000000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.000000  0.000000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GA]T[GT][CT][GA]TGT[CG][TC][GC]TGT[AC][TAC]A[TC]A[AT]
--------------------------------------------------------------------------------




Time  3.80 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47041                            6.54e-04  268_[+2(3.67e-06)]_61_\
    [+1(1.28e-05)]_147
47218                            1.93e-04  44_[+2(8.41e-05)]_331_\
    [+1(3.56e-07)]_101
47503                            3.22e-01  500
51073                            3.31e-05  85_[+2(1.58e-06)]_311_\
    [+1(1.76e-06)]_37_[+2(3.26e-05)]_31
47704                            4.83e-09  39_[+1(6.18e-06)]_40_[+2(4.41e-05)]_\
    356_[+3(4.61e-10)]_21
49081                            4.74e-04  17_[+1(3.40e-06)]_376_\
    [+2(1.02e-05)]_83
23314                            2.23e-10  62_[+2(2.27e-06)]_84_[+1(8.60e-05)]_\
    27_[+3(1.02e-09)]_223_[+1(1.94e-06)]_48
43839                            1.33e-04  224_[+1(1.77e-07)]_142_\
    [+2(4.41e-05)]_110
45058                            5.19e-07  44_[+2(7.46e-05)]_215_\
    [+3(1.61e-10)]_209
468                              1.23e-09  44_[+3(3.31e-06)]_[+3(3.31e-06)]_4_\
    [+3(6.45e-11)]_128_[+2(1.24e-05)]_252
47000                            2.37e-05  213_[+2(1.58e-06)]_59_\
    [+1(9.35e-07)]_204
44846                            1.15e-05  9_[+2(2.14e-06)]_20_[+2(8.50e-07)]_\
    354_[+1(1.18e-06)]_81
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************