********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/446/446.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
37482                    1.0000    500  48291                    1.0000    500
48640                    1.0000    500  49156                    1.0000    500
43830                    1.0000    500  50602                    1.0000    500
50164                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/446/446.seqs.fa -oc motifs/446 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.303 C 0.214 G 0.209 T 0.275
Background letter frequencies (from dataset with add-one prior applied):
A 0.303 C 0.214 G 0.209 T 0.275
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  17  sites =   4  llr = 72  E-value = 5.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :3::::::::3::3:::
pos.-specific     C  ::::::3:::8a88:85
probability       G  8:::a::58a::3:5:5
matrix            T  38aa:a853:::::53:

         bits    2.3     *    * *
                 2.0     *    * *
                 1.8   ****   * *
                 1.6   ****   * *
Relative         1.4 * ****  ****** **
Entropy          1.1 *****************
(25.9 bits)      0.9 *****************
                 0.7 *****************
                 0.5 *****************
                 0.2 *****************
                 0.0 -----------------

Multilevel           GTTTGTTGGGCCCCGCC
consensus            TA    CTT A GATTG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            -----------------
37482                       404  2.49e-11 CGACGGAACT GTTTGTTGGGCCCCGCC TGATTGACAG
50602                       288  2.40e-09 AAAAAAAATG GATTGTTTGGCCGCGCC GAGAACTAGA
48640                       317  1.02e-08 CAAAGTCGTT GTTTGTCGTGCCCATCG AAGTTATTGA
43830                       377  1.92e-08 AGATCCAGAC TTTTGTTTGGACCCTTG ACACTGCTTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37482                             2.5e-11  403_[+1]_80
50602                             2.4e-09  287_[+1]_196
48640                               1e-08  316_[+1]_167
43830                             1.9e-08  376_[+1]_107
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=17 seqs=4
37482                    (  404) GTTTGTTGGGCCCCGCC  1
50602                    (  288) GATTGTTTGGCCGCGCC  1
48640                    (  317) GTTTGTCGTGCCCATCG  1
43830                    (  377) TTTTGTTTGGACCCTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 3388 bayes= 10.4622 E= 5.5e+001
  -865   -865    184    -13
   -28   -865   -865    145
  -865   -865   -865    186
  -865   -865   -865    186
  -865   -865    226   -865
  -865   -865   -865    186
  -865     23   -865    145
  -865   -865    126     86
  -865   -865    184    -13
  -865   -865    226   -865
   -28    181   -865   -865
  -865    222   -865   -865
  -865    181     26   -865
   -28    181   -865   -865
  -865   -865    126     86
  -865    181   -865    -13
  -865    122    126   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 4 E= 5.5e+001
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.500000  0.500000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GT][TA]TTGT[TC][GT][GT]G[CA]C[CG][CA][GT][CT][CG]
--------------------------------------------------------------------------------




Time  0.45 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  14  sites =   5  llr = 74  E-value = 1.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a:6:44:::2::2:
pos.-specific     C  :a48:28::6a:::
probability       G  :::::4::::::8a
matrix            T  :::26:2aa2:a::

         bits    2.3  *        *  *
                 2.0  *        *  *
                 1.8 **     ** ** *
                 1.6 **     ** ** *
Relative         1.4 ** *  *** ****
Entropy          1.1 ** *  *** ****
(21.3 bits)      0.9 ***** *** ****
                 0.7 ***** ********
                 0.5 **************
                 0.2 **************
                 0.0 --------------

Multilevel           ACACTACTTCCTGG
consensus              CTAGT  A  A
sequence                  C   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
50602                        50  2.27e-08 ATACGCCGAC ACCCAACTTCCTGG ATGAAAGTGC
43830                       394  3.46e-08 TGGACCCTTG ACACTGCTTACTGG TTTGTTTATT
49156                       337  1.03e-07 CCACCACCAA ACACTACTTCCTAG CTAGGTCCTT
37482                       330  1.27e-07 TTTCAAGTCT ACCCAGTTTCCTGG CATAAAATTA
48291                       235  3.77e-07 GCTTTTCCGG ACATTCCTTTCTGG TAGGGATATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50602                             2.3e-08  49_[+2]_437
43830                             3.5e-08  393_[+2]_93
49156                               1e-07  336_[+2]_150
37482                             1.3e-07  329_[+2]_157
48291                             3.8e-07  234_[+2]_252
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=5
50602                    (   50) ACCCAACTTCCTGG  1
43830                    (  394) ACACTGCTTACTGG  1
49156                    (  337) ACACTACTTCCTAG  1
37482                    (  330) ACCCAGTTTCCTGG  1
48291                    (  235) ACATTCCTTTCTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 3409 bayes= 9.66297 E= 1.1e+001
   172   -897   -897   -897
  -897    222   -897   -897
    99     90   -897   -897
  -897    190   -897    -46
    40   -897   -897    113
    40    -10     94   -897
  -897    190   -897    -46
  -897   -897   -897    186
  -897   -897   -897    186
   -60    149   -897    -46
  -897    222   -897   -897
  -897   -897   -897    186
   -60   -897    194   -897
  -897   -897    226   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 5 E= 1.1e+001
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.400000  0.000000  0.000000  0.600000
 0.400000  0.200000  0.400000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.600000  0.000000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
AC[AC][CT][TA][AGC][CT]TT[CAT]CT[GA]G
--------------------------------------------------------------------------------




Time  0.94 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   5  llr = 76  E-value = 9.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a:::6:::::6::4aa
pos.-specific     C  ::6:::::462624::
probability       G  :a::2::a::2422::
matrix            T  ::4a2aa:64::6:::

         bits    2.3  *     *
                 2.0  *     *
                 1.8 ** * ***      **
                 1.6 ** * ***      **
Relative         1.4 ** * ***   *  **
Entropy          1.1 **** ***** *  **
(22.0 bits)      0.9 **** ***** *  **
                 0.7 **** ***** ** **
                 0.5 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           AGCTATTGTCACTAAA
consensus              T G   CTCGCC
sequence                 T     G GG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
50602                       361  5.54e-09 CTACACAAGT AGCTATTGTCAGTAAA GATAGTTTTA
50164                       309  1.69e-08 GACTTCGAGA AGCTATTGCCAGTGAA CCGAGTTCAC
37482                       437  2.48e-08 ACAGTGAAGC AGCTATTGTTCCTCAA AGTTGACAGT
48640                        29  1.87e-07 CTACAGGCTG AGTTTTTGTCACCAAA TCAAATATGC
43830                       429  3.11e-07 TGACGGTTAC AGTTGTTGCTGCGCAA GCCACGAGCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50602                             5.5e-09  360_[+3]_124
50164                             1.7e-08  308_[+3]_176
37482                             2.5e-08  436_[+3]_48
48640                             1.9e-07  28_[+3]_456
43830                             3.1e-07  428_[+3]_56
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=5
50602                    (  361) AGCTATTGTCAGTAAA  1
50164                    (  309) AGCTATTGCCAGTGAA  1
37482                    (  437) AGCTATTGTTCCTCAA  1
48640                    (   29) AGTTTTTGTCACCAAA  1
43830                    (  429) AGTTGTTGCTGCGCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 3395 bayes= 9.65702 E= 9.3e+001
   172   -897   -897   -897
  -897   -897    226   -897
  -897    149   -897     54
  -897   -897   -897    186
    99   -897     -6    -46
  -897   -897   -897    186
  -897   -897   -897    186
  -897   -897    226   -897
  -897     90   -897    113
  -897    149   -897     54
    99    -10     -6   -897
  -897    149     94   -897
  -897    -10     -6    113
    40     90     -6   -897
   172   -897   -897   -897
   172   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 9.3e+001
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.600000  0.000000  0.200000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.600000  0.000000  0.400000
 0.600000  0.200000  0.200000  0.000000
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.200000  0.200000  0.600000
 0.400000  0.400000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AG[CT]T[AGT]TTG[TC][CT][ACG][CG][TCG][ACG]AA
--------------------------------------------------------------------------------




Time  1.38 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37482                            7.24e-15  329_[+2(1.27e-07)]_60_\
    [+1(2.49e-11)]_16_[+3(2.48e-08)]_48
48291                            2.77e-03  234_[+2(3.77e-07)]_252
48640                            7.24e-08  28_[+3(1.87e-07)]_272_\
    [+1(1.02e-08)]_167
49156                            1.91e-03  336_[+2(1.03e-07)]_150
43830                            1.24e-11  376_[+1(1.92e-08)]_[+2(3.46e-08)]_\
    21_[+3(3.11e-07)]_56
50602                            2.61e-14  49_[+2(2.27e-08)]_224_\
    [+1(2.40e-09)]_56_[+3(5.54e-09)]_124
50164                            2.66e-04  308_[+3(1.69e-08)]_176
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************