********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/110/110.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
20585                    1.0000    500  261556                   1.0000    500
264824                   1.0000    500  4629                     1.0000    500
ThpsCp072                1.0000    500  ThpsCt011                1.0000    500
ThpsCt012                1.0000    500  ThpsCt014                1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/110/110.seqs.fa -oc motifs/110 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4000    N=               8
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.329 C 0.167 G 0.185 T 0.318
Background letter frequencies (from dataset with add-one prior applied):
A 0.329 C 0.167 G 0.186 T 0.318
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  18  sites =   4  llr = 77  E-value = 6.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::a3:5:::5:5
pos.-specific     C  33::8:88:3a3:583a5
probability       G  :8a:3a3:::::853:::
matrix            T  8::a:::3:5:33::3::

         bits    2.6           *     *
                 2.3   *  *    *     *
                 2.1   *  *    *     *
                 1.8   * ***   *   * *
Relative         1.5  ******** * *** *
Entropy          1.3  ******** * *** *
(27.9 bits)      1.0 ********* * *** **
                 0.8 ********* * *** **
                 0.5 ********* * *** **
                 0.3 ******************
                 0.0 ------------------

Multilevel           TGGTCGCCATCAGCCACA
consensus            CC  G GT A CTGGC C
sequence                      C T   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
20585                       375  4.97e-12 TGCCTCAATA TGGTCGCCACCAGCCACC GAGGAGACAT
261556                      471  2.61e-10 AGACCCCGAC CGGTCGCCAACCGGCACA TCTCAATTGG
264824                      182  2.77e-09 CAACAACACA TGGTGGGTATCAGGCCCC TTAACGTGGG
4629                         91  7.91e-09 TACAACGTCT TCGTCGCCATCTTCGTCA ACGTTCATCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20585                               5e-12  374_[+1]_108
261556                            2.6e-10  470_[+1]_12
264824                            2.8e-09  181_[+1]_301
4629                              7.9e-09  90_[+1]_392
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=4
20585                    (  375) TGGTCGCCACCAGCCACC  1
261556                   (  471) CGGTCGCCAACCGGCACA  1
264824                   (  182) TGGTGGGTATCAGGCCCC  1
4629                     (   91) TCGTCGCCATCTTCGTCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 3864 bayes= 9.91439 E= 6.4e-001
  -865     58   -865    123
  -865     58    201   -865
  -865   -865    243   -865
  -865   -865   -865    165
  -865    216     43   -865
  -865   -865    243   -865
  -865    216     43   -865
  -865    216   -865    -35
   160   -865   -865   -865
   -40     58   -865     65
  -865    258   -865   -865
    60     58   -865    -35
  -865   -865    201    -35
  -865    158    143   -865
  -865    216     43   -865
    60     58   -865    -35
  -865    258   -865   -865
    60    158   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 4 E= 6.4e-001
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.750000  0.000000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.250000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.250000  0.000000  0.250000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.500000  0.250000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TC][GC]GT[CG]G[CG][CT]A[TAC]C[ACT][GT][CG][CG][ACT]C[AC]
--------------------------------------------------------------------------------

Time 0.76 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  12  sites =   6  llr = 74  E-value = 3.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :5:::a::2::7
pos.-specific     C  8::2a:a55:::
probability       G  2:3::::::a73
matrix            T  :578:::53:3:

         bits    2.6     * *
                 2.3     * *  *
                 2.1     * *  *
                 1.8 *   * *  *
Relative         1.5 *   ***  *
Entropy          1.3 *   ***  **
(17.9 bits)      1.0 * ****** ***
                 0.8 * **********
                 0.5 ************
                 0.3 ************
                 0.0 ------------

Multilevel           CATTCACCCGGA
consensus             TG    TT TG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
ThpsCt012                     4  3.16e-07        TCC CATTCACCTGGA GTAGCTAAAG
264824                       59  5.72e-07 AGCGAGTGGT CTTCCACTCGGA CGAGTGAATC
ThpsCt011                   312  9.74e-07 ATTATTCGGG CTTTCACTTGGG TTCGCGTTAT
261556                      431  1.16e-06 GCAGACAATC GAGTCACCCGGA ACTCATTTCT
4629                        406  1.41e-06 TTATGTATTA CATTCACTCGTG GAGCTTCTGG
20585                       348  3.59e-06 ACTAATGATT CTGTCACCAGTA GTAACTGCCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ThpsCt012                         3.2e-07  3_[+2]_485
264824                            5.7e-07  58_[+2]_430
ThpsCt011                         9.7e-07  311_[+2]_177
261556                            1.2e-06  430_[+2]_58
4629                              1.4e-06  405_[+2]_83
20585                             3.6e-06  347_[+2]_141
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=6
ThpsCt012                (    4) CATTCACCTGGA  1
264824                   (   59) CTTCCACTCGGA  1
ThpsCt011                (  312) CTTTCACTTGGG  1
261556                   (  431) GAGTCACCCGGA  1
4629                     (  406) CATTCACTCGTG  1
20585                    (  348) CTGTCACCAGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3912 bayes= 9.79456 E= 3.5e+001
  -923    232    -15   -923
    60   -923   -923     65
  -923   -923     84    107
  -923      0   -923    139
  -923    258   -923   -923
   160   -923   -923   -923
  -923    258   -923   -923
  -923    158   -923     65
   -98    158   -923      7
  -923   -923    243   -923
  -923   -923    184      7
   102   -923     84   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 3.5e+001
 0.000000  0.833333  0.166667  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.166667  0.000000  0.833333
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.166667  0.500000  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.666667  0.000000  0.333333  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[AT][TG]TCAC[CT][CT]G[GT][AG]
--------------------------------------------------------------------------------

Time 1.49 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  17  sites =   2  llr = 46  E-value = 1.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::5::::::5::
pos.-specific     C  ::aa5::::aaaa::aa
probability       G  :a:::55:a::::::::
matrix            T  a:::5555:::::a5::

         bits    2.6   **     ****  **
                 2.3  ***    *****  **
                 2.1  ***    *****  **
                 1.8  ***    *****  **
Relative         1.5 ****    ****** **
Entropy          1.3 ****    ****** **
(33.3 bits)      1.0 ******* ****** **
                 0.8 ******* ****** **
                 0.5 *****************
                 0.3 *****************
                 0.0 -----------------

Multilevel           TGCCCGGAGCCCCTACC
consensus                TTTT      T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            -----------------
20585                       442  2.26e-11 ACCTCTCTGG TGCCCTGAGCCCCTTCC CCAACCATTC
4629                        432  8.07e-11 TTCTGGGTCT TGCCTGTTGCCCCTACC AGCTCATAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20585                             2.3e-11  441_[+3]_42
4629                              8.1e-11  431_[+3]_52
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=17 seqs=2
20585                    (  442) TGCCCTGAGCCCCTTCC  1
4629                     (  432) TGCCTGTTGCCCCTACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 3872 bayes= 10.9181 E= 1.6e+002
  -765   -765   -765    165
  -765   -765    242   -765
  -765    258   -765   -765
  -765    258   -765   -765
  -765    158   -765     65
  -765   -765    143     65
  -765   -765    143     65
    60   -765   -765     65
  -765   -765    242   -765
  -765    258   -765   -765
  -765    258   -765   -765
  -765    258   -765   -765
  -765    258   -765   -765
  -765   -765   -765    165
    60   -765   -765     65
  -765    258   -765   -765
  -765    258   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 2 E= 1.6e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.500000  0.500000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TGCC[CT][GT][GT][AT]GCCCCT[AT]CC
--------------------------------------------------------------------------------

Time 2.25 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
20585                            4.78e-17  347_[+2(3.59e-06)]_15_\
    [+1(4.97e-12)]_49_[+3(2.26e-11)]_42
261556                           5.75e-09  430_[+2(1.16e-06)]_28_\
    [+1(2.61e-10)]_12
264824                           3.98e-08  58_[+2(5.72e-07)]_111_\
    [+1(2.77e-09)]_301
4629                             7.35e-14  90_[+1(7.91e-09)]_297_\
    [+2(1.41e-06)]_14_[+3(8.07e-11)]_52
ThpsCp072                        9.12e-01  500
ThpsCt011                        1.50e-02  311_[+2(9.74e-07)]_177
ThpsCt012                        3.52e-03  3_[+2(3.16e-07)]_485
ThpsCt014                        4.98e-01  500
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************