********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/398/398.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
31431                    1.0000    500  42512                    1.0000    500
31877                    1.0000    500  37300                    1.0000    500
47324                    1.0000    500  44574                    1.0000    500
54879                    1.0000    500  47022                    1.0000    500
34625                    1.0000    500  44091                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/398/398.seqs.fa -oc motifs/398 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.226 G 0.221 T 0.287
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.226 G 0.221 T 0.287
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 4 llr = 70 E-value = 1.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::3::a3:3:3::a8
pos.-specific     C  a::3:a::a:::8::3
probability       G  ::::a::5:3:8:a::
matrix            T  :aa5:::3:5a:3:::

         bits    2.2 *   **  *    *
                 2.0 *   *** *    **
                 1.7 *** *** * *  **
                 1.5 *** *** * *  **
Relative         1.3 *** *** * *****
Entropy          1.1 *** *** * ******
(25.1 bits)      0.9 *** *** * ******
                 0.7 *** *** * ******
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CTTTGCAGCTTGCGAA
consensus               A   A A AT  C
sequence                C   T G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
47022                       309  2.07e-09 CTCCGTTGAG CTTTGCATCTTGCGAA GTCCGATAAA
54879                       312  9.22e-09 GAGTAGAATT CTTCGCAACATGCGAA TCAAAGTAAT
37300                        32  1.22e-08 TTATGCTCTA CTTTGCAGCTTACGAC GTTGCGTACG
31877                       422  1.36e-08 ATTTTCTTCC CTTAGCAGCGTGTGAA AAACGAACAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47022                             2.1e-09  308_[+1]_176
54879                             9.2e-09  311_[+1]_173
37300                             1.2e-08  31_[+1]_453
31877                             1.4e-08  421_[+1]_63
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=4
47022                    (  309) CTTTGCATCTTGCGAA  1
54879                    (  312) CTTCGCAACATGCGAA  1
37300                    (   32) CTTTGCAGCTTACGAC  1
31877                    (  422) CTTAGCAGCGTGTGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 10.2426 E= 1.9e+002
  -865    214   -865   -865
  -865   -865   -865    180
  -865   -865   -865    180
    -9     15   -865     80
  -865   -865    217   -865
  -865    214   -865   -865
   191   -865   -865   -865
    -9   -865    117    -20
  -865    214   -865   -865
    -9   -865     18     80
  -865   -865   -865    180
    -9   -865    176   -865
  -865    173   -865    -20
  -865   -865    217   -865
   191   -865   -865   -865
   150     15   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 4 E= 1.9e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.250000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.250000  0.000000  0.250000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
CTT[TAC]GCA[GAT]C[TAG]T[GA][CT]GA[AC]
--------------------------------------------------------------------------------




Time  0.98 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 17 sites = 5 llr = 83 E-value = 2.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::2::::4:4::4:
pos.-specific     C  8:62:4:4:a:66:::2
probability       G  2a4:a:64::6::a:68
matrix            T  :::8:442a::4::a::

         bits    2.2  *  *    *   *
                 2.0  *  *    *   *
                 1.7  *  *   **   **
                 1.5 **  *   **   ** *
Relative         1.3 **  *   **   ** *
Entropy          1.1 ***** * *********
(23.9 bits)      0.9 ***** * *********
                 0.7 ***** ***********
                 0.4 *****************
                 0.2 *****************
                 0.0 -----------------

Multilevel           CGCTGCGCTCGCCGTGG
consensus            G GC TTG  ATA  AC
sequence                  A T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            -----------------
44574                       272  1.66e-09 AGTGCGCATG CGCTGCTCTCGCAGTGG AATCGACAAT
31877                        68  1.01e-08 AGACGGGAAG CGGTGCTGTCATCGTGG TATGGTCGTT
42512                       341  1.76e-08 GCAGGACTGC CGCCGTGCTCGTCGTAG CGTACTACTA
47022                        71  3.41e-08 CCATAGTAAT GGGTGAGGTCGCCGTAG CTTTCTCTAT
37300                        65  4.88e-08 ACGAATCATG CGCTGTGTTCACAGTGC AGCAGTCTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44574                             1.7e-09  271_[+2]_212
31877                               1e-08  67_[+2]_416
42512                             1.8e-08  340_[+2]_143
47022                             3.4e-08  70_[+2]_413
37300                             4.9e-08  64_[+2]_419
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=17 seqs=5
44574                    (  272) CGCTGCTCTCGCAGTGG  1
31877                    (   68) CGGTGCTGTCATCGTGG  1
42512                    (  341) CGCCGTGCTCGTCGTAG  1
47022                    (   71) GGGTGAGGTCGCCGTAG  1
37300                    (   65) CGCTGTGTTCACAGTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 4840 bayes= 10.1691 E= 2.1e+002
  -897    182    -15   -897
  -897   -897    217   -897
  -897    141     85   -897
  -897    -17   -897    148
  -897   -897    217   -897
   -41     82   -897     48
  -897   -897    144     48
  -897     82     85    -52
  -897   -897   -897    180
  -897    214   -897   -897
    59   -897    144   -897
  -897    141   -897     48
    59    141   -897   -897
  -897   -897    217   -897
  -897   -897   -897    180
    59   -897    144   -897
  -897    -17    185   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 5 E= 2.1e+002
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.400000  0.000000  0.400000
 0.000000  0.000000  0.600000  0.400000
 0.000000  0.400000  0.400000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.600000  0.000000  0.400000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.200000  0.800000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CG]G[CG][TC]G[CTA][GT][CGT]TC[GA][CT][CA]GT[GA][GC]
--------------------------------------------------------------------------------




Time  2.12 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 19 sites = 5 llr = 88 E-value = 1.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :a62a46:::::::2:26:
pos.-specific     C  2:42:44a22::::82:::
probability       G  8::6:2::64::2a:284a
matrix            T  ::::::::24aa8::6:::

         bits    2.2        *     *    *
                 2.0  *  *  *     *    *
                 1.7  *  *  *  ** *    *
                 1.5 **  *  *  ** *    *
Relative         1.3 **  *  *  ** ** * *
Entropy          1.1 *** * **  ***** ***
(25.5 bits)      0.9 *** * **  ***** ***
                 0.7 ***** *** *********
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           GAAGAAACGGTTTGCTGAG
consensus            C CA CC CT  G ACAG
sequence                C G  TC     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
44091                       285  1.97e-09 AGGGCCAACC GAAGAGACGCTTTGCCGGG ATACTAATTT
42512                       196  1.97e-09 GGCCTAAGTG GACGACCCGTTTTGATGGG AGCGGTTTTC
31877                       442  4.97e-09 TGTGAAAAAC GAACAAACCGTTTGCGGAG TTCTTTACTT
31431                        81  6.01e-09 GTTTCTAACT GAAGACCCTTTTTGCTAAG CTAAGAGTAA
54879                       286  1.35e-08 TAAGACTTGC CACAAAACGGTTGGCTGAG TAGAATTCTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44091                               2e-09  284_[+3]_197
42512                               2e-09  195_[+3]_286
31877                               5e-09  441_[+3]_40
31431                               6e-09  80_[+3]_401
54879                             1.4e-08  285_[+3]_196
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=5
44091                    (  285) GAAGAGACGCTTTGCCGGG  1
42512                    (  196) GACGACCCGTTTTGATGGG  1
31877                    (  442) GAACAAACCGTTTGCGGAG  1
31431                    (   81) GAAGACCCTTTTTGCTAAG  1
54879                    (  286) CACAAAACGGTTGGCTGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 4820 bayes= 10.1632 E= 1.8e+002
  -897    -17    185   -897
   191   -897   -897   -897
   117     82   -897   -897
   -41    -17    144   -897
   191   -897   -897   -897
    59     82    -15   -897
   117     82   -897   -897
  -897    214   -897   -897
  -897    -17    144    -52
  -897    -17     85     48
  -897   -897   -897    180
  -897   -897   -897    180
  -897   -897    -15    148
  -897   -897    217   -897
   -41    182   -897   -897
  -897    -17    -15    106
   -41   -897    185   -897
   117   -897     85   -897
  -897   -897    217   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 5 E= 1.8e+002
 0.000000  0.200000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.200000  0.200000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.400000  0.200000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.600000  0.200000
 0.000000  0.200000  0.400000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.200000  0.200000  0.600000
 0.200000  0.000000  0.800000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GC]A[AC][GAC]A[ACG][AC]C[GCT][GTC]TT[TG]G[CA][TCG][GA][AG]G
--------------------------------------------------------------------------------




Time  3.12 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
31431                            8.35e-05  80_[+3(6.01e-09)]_401
42512                            3.05e-10  195_[+3(1.97e-09)]_126_\
    [+2(1.76e-08)]_143
31877                            5.61e-14  67_[+2(1.01e-08)]_337_\
    [+1(1.36e-08)]_4_[+3(4.97e-09)]_40
37300                            5.75e-09  31_[+1(1.22e-08)]_17_[+2(4.88e-08)]_\
    419
47324                            9.26e-01  500
44574                            7.79e-05  271_[+2(1.66e-09)]_212
54879                            3.39e-09  285_[+3(1.35e-08)]_7_[+1(9.22e-09)]_\
    173
47022                            2.61e-09  70_[+2(3.41e-08)]_221_\
    [+1(2.07e-09)]_176
34625                            9.14e-01  500
44091                            2.10e-05  284_[+3(1.97e-09)]_197
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************