********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/170/170.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
50645                    1.0000    500  42758                    1.0000    500
47388                    1.0000    500  47401                    1.0000    500
54091                    1.0000    500  43639                    1.0000    500
49293                    1.0000    500  30334                    1.0000    500
49595                    1.0000    500  49960                    1.0000    500
12354                    1.0000    500  46825                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/170/170.seqs.fa -oc motifs/170 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.241 C 0.254 G 0.254 T 0.251
Background letter frequencies (from dataset with add-one prior applied):
A 0.241 C 0.254 G 0.254 T 0.251
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  15  sites =   8  llr = 103  E-value = 8.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :4a1:5a93::86::
pos.-specific     C  6::9::::661:491
probability       G  35::a4::1481:::
matrix            T  11:::1:1::11:19

         bits    2.1   * * *
                 1.8   * * *
                 1.6   * * *
                 1.4   *** **     **
Relative         1.2   *** **     **
Entropy          1.0   *** ** * ****
(18.5 bits)      0.8   *** ** ******
                 0.6 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           CGACGAAACCGAACT
consensus            GA   G  AG  C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------
50645                       385  5.07e-08 GCTTCCCTAC CGACGGAAGCGAACT CCTGCTTCTC
47388                       389  6.36e-08 GTCGCGAAGC GAACGGAACGGAACT TTCCAACGGG
42758                       224  1.21e-07 CGCAGTAGAG CAACGGAACCGAACC AACGACGGCA
49293                         8  5.43e-07    ACTATGC CGACGAAAAGGGCCT ACACACCAAG
49595                       145  6.53e-07 TAATGGAAAT CAACGTAACGTAACT ATACCTAGTT
47401                        18  1.04e-06 AAGGACAATC CGAAGAATCCGACCT CGTTGCCCAT
43639                       301  1.83e-06 ACAATTTTCT GGACGAAACCCTCCT TGCTGGAACA
12354                       419  3.21e-06 GTCGGACCTT TTACGAAAACGAATT CTACTACCGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50645                             5.1e-08  384_[+1]_101
47388                             6.4e-08  388_[+1]_97
42758                             1.2e-07  223_[+1]_262
49293                             5.4e-07  7_[+1]_478
49595                             6.5e-07  144_[+1]_341
47401                               1e-06  17_[+1]_468
43639                             1.8e-06  300_[+1]_185
12354                             3.2e-06  418_[+1]_67
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=8
50645                    (  385) CGACGGAAGCGAACT  1
47388                    (  389) GAACGGAACGGAACT  1
42758                    (  224) CAACGGAACCGAACC  1
49293                    (    8) CGACGAAAAGGGCCT  1
49595                    (  145) CAACGTAACGTAACT  1
47401                    (   18) CGAAGAATCCGACCT  1
43639                    (  301) GGACGAAACCCTCCT  1
12354                    (  419) TTACGAAAACGAATT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 5832 bayes= 9.50779 E= 8.0e+002
  -965    130     -2   -101
    64   -965     98   -101
   205   -965   -965   -965
   -94    178   -965   -965
  -965   -965    198   -965
   105   -965     56   -101
   205   -965   -965   -965
   186   -965   -965   -101
     5    130   -102   -965
  -965    130     56   -965
  -965   -102    156   -101
   164   -965   -102   -101
   137     56   -965   -965
  -965    178   -965   -101
  -965   -102   -965    180
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 8 E= 8.0e+002
 0.000000  0.625000  0.250000  0.125000
 0.375000  0.000000  0.500000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.375000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.250000  0.625000  0.125000  0.000000
 0.000000  0.625000  0.375000  0.000000
 0.000000  0.125000  0.750000  0.125000
 0.750000  0.000000  0.125000  0.125000
 0.625000  0.375000  0.000000  0.000000
 0.000000  0.875000  0.000000  0.125000
 0.000000  0.125000  0.000000  0.875000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG][GA]ACG[AG]AA[CA][CG]GA[AC]CT
--------------------------------------------------------------------------------




Time  1.35 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =  12  llr = 126  E-value = 1.2e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1::718319:832a57
pos.-specific     C  53:3523513:53:11
probability       G  45::::32:8225:3:
matrix            T  :3a:4:13::::::23

         bits    2.1   *          *
                 1.8   *          *
                 1.6   *     *    *
                 1.4   *  *  * *  *
Relative         1.2   *  *  ***  *
Entropy          1.0   ** *  ***  *
(15.2 bits)      0.8   ** *  ***  * *
                 0.6 * ****  ****** *
                 0.4 ******  ****** *
                 0.2 ****************
                 0.0 ----------------

Multilevel           CGTACACCAGACGAAA
consensus            GC CT GT C AC GT
sequence              T    A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ----------------
47388                       447  3.02e-09 GAAACAGCGT CGTACACCAGACCAAA CGGTAGATTC
43639                       485  5.20e-07 TAGTGGACGA GGTACATCAGACAAAA
46825                       182  1.30e-06 ATCCGAGGGC CGTAAACGAGAAGAAA TCTGTTACAA
12354                       172  3.46e-06 TCGATTCCAT CGTATCGTAGAAGAAT CATGCAAGCT
42758                       182  6.09e-06 CATATTCGGA CCTACCACAGACGACA GATACGTGTC
50645                       323  6.09e-06 GTCGGCCGCC GCTACAGCCGAGGAAA CCGACAACAG
54091                       281  6.65e-06 GAGCGCCTTC CTTCCAACACACGAGT CGGGTATTCT
47401                       437  6.65e-06 TGTCGTGGTA GCTATAGTAGGAGAGA TAGCGGGCTA
49960                       472  8.62e-06 TCCTGGAGTT GGTATACTACAGCATA GTACACCAGC
49293                       117  1.10e-05 CGTCGTGAAG GTTCTACGAGACAATA TGTAATGGAA
30334                       291  3.88e-05 GTTGGCAGGT ATTCTAGCAGACCAGC ACAATCCCCA
49595                       360  5.57e-05 TCGCTCGTTC CGTCCAAAACGACAAT TGTCGGATGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47388                               3e-09  446_[+2]_38
43639                             5.2e-07  484_[+2]
46825                             1.3e-06  181_[+2]_303
12354                             3.5e-06  171_[+2]_313
42758                             6.1e-06  181_[+2]_303
50645                             6.1e-06  322_[+2]_162
54091                             6.7e-06  280_[+2]_204
47401                             6.7e-06  436_[+2]_48
49960                             8.6e-06  471_[+2]_13
49293                             1.1e-05  116_[+2]_368
30334                             3.9e-05  290_[+2]_194
49595                             5.6e-05  359_[+2]_125
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=12
47388                    (  447) CGTACACCAGACCAAA  1
43639                    (  485) GGTACATCAGACAAAA  1
46825                    (  182) CGTAAACGAGAAGAAA  1
12354                    (  172) CGTATCGTAGAAGAAT  1
42758                    (  182) CCTACCACAGACGACA  1
50645                    (  323) GCTACAGCCGAGGAAA  1
54091                    (  281) CTTCCAACACACGAGT  1
47401                    (  437) GCTATAGTAGGAGAGA  1
49960                    (  472) GGTATACTACAGCATA  1
49293                    (  117) GTTCTACGAGACAATA  1
30334                    (  291) ATTCTAGCAGACCAGC  1
49595                    (  360) CGTCCAAAACGACAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 10.02 E= 1.2e+003
  -153     98     71  -1023
 -1023     -2     98     -1
 -1023  -1023  -1023    199
   147     39  -1023  -1023
  -153     98  -1023     73
   179    -61  -1023  -1023
     5     39     39   -159
  -153     98    -61     -1
   193   -160  -1023  -1023
 -1023     -2    156  -1023
   179  -1023    -61  -1023
    47     98    -61  -1023
   -53     39     98  -1023
   205  -1023  -1023  -1023
   105   -160     -2    -59
   147   -160  -1023     -1
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 1.2e+003
 0.083333  0.500000  0.416667  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.666667  0.333333  0.000000  0.000000
 0.083333  0.500000  0.000000  0.416667
 0.833333  0.166667  0.000000  0.000000
 0.250000  0.333333  0.333333  0.083333
 0.083333  0.500000  0.166667  0.250000
 0.916667  0.083333  0.000000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.333333  0.500000  0.166667  0.000000
 0.166667  0.333333  0.500000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.083333  0.250000  0.166667
 0.666667  0.083333  0.000000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CG][GCT]T[AC][CT]A[CGA][CT]A[GC]A[CA][GC]A[AG][AT]
--------------------------------------------------------------------------------




Time  2.84 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  19  sites =   4  llr = 78  E-value = 6.9e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :3:8::::::3:::::::3
pos.-specific     C  :53:a::::::::::3:::
probability       G  ::5::33::8:a5:a:a:8
matrix            T  a333:88aa38:5a:8:a:

         bits    2.1 *   *  **  * ** **
                 1.8 *   *  **  * ** **
                 1.6 *   *  **  * ** **
                 1.4 *   *  **  * ** **
Relative         1.2 *  ********* ******
Entropy          1.0 *  ****************
(28.1 bits)      0.8 *  ****************
                 0.6 *  ****************
                 0.4 *******************
                 0.2 *******************
                 0.0 -------------------

Multilevel           TCGACTTTTGTGGTGTGTG
consensus             ACT GG  TA T  C  A
sequence              TT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            -------------------
47388                       338  9.05e-11 TCCTTGTCTC TCGACTGTTGTGTTGTGTG TAGTAGACGA
46825                        54  4.31e-10 CCGACTTCGA TCGACGTTTGAGGTGTGTG GGATGCGTTC
54091                       301  7.40e-10 ACGAGTCGGG TATTCTTTTGTGTTGTGTG AGCAGTGCAC
42758                       463  4.53e-09 CACACGATTA TTCACTTTTTTGGTGCGTA ATTGGTTGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47388                             9.1e-11  337_[+3]_144
46825                             4.3e-10  53_[+3]_428
54091                             7.4e-10  300_[+3]_181
42758                             4.5e-09  462_[+3]_19
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=4
47388                    (  338) TCGACTGTTGTGTTGTGTG  1
46825                    (   54) TCGACGTTTGAGGTGTGTG  1
54091                    (  301) TATTCTTTTGTGTTGTGTG  1
42758                    (  463) TTCACTTTTTTGGTGCGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 5784 bayes= 9.64806 E= 6.9e+003
  -865   -865   -865    199
     5     98   -865     -1
  -865     -2     98     -1
   164   -865   -865     -1
  -865    198   -865   -865
  -865   -865     -2    157
  -865   -865     -2    157
  -865   -865   -865    199
  -865   -865   -865    199
  -865   -865    156     -1
     5   -865   -865    157
  -865   -865    197   -865
  -865   -865     98     99
  -865   -865   -865    199
  -865   -865    197   -865
  -865     -2   -865    157
  -865   -865    197   -865
  -865   -865   -865    199
     5   -865    156   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 6.9e+003
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.500000  0.000000  0.250000
 0.000000  0.250000  0.500000  0.250000
 0.750000  0.000000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.250000  0.000000  0.750000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[CAT][GCT][AT]C[TG][TG]TT[GT][TA]G[GT]TG[TC]GT[GA]
--------------------------------------------------------------------------------




Time  4.06 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50645                            7.93e-06  322_[+2(6.09e-06)]_46_\
    [+1(5.07e-08)]_101
42758                            1.66e-10  181_[+2(6.09e-06)]_26_\
    [+1(1.21e-07)]_224_[+3(4.53e-09)]_19
47388                            1.72e-15  337_[+3(9.05e-11)]_32_\
    [+1(6.36e-08)]_43_[+2(3.02e-09)]_38
47401                            2.60e-05  17_[+1(1.04e-06)]_404_\
    [+2(6.65e-06)]_48
54091                            1.55e-07  280_[+2(6.65e-06)]_4_[+3(7.40e-10)]_\
    181
43639                            1.68e-05  300_[+1(1.83e-06)]_11_\
    [+2(8.98e-05)]_142_[+2(5.20e-07)]
49293                            2.24e-05  7_[+1(5.43e-07)]_94_[+2(1.10e-05)]_\
    368
30334                            1.78e-02  290_[+2(3.88e-05)]_194
49595                            2.93e-04  144_[+1(6.53e-07)]_200_\
    [+2(5.57e-05)]_27_[+1(9.47e-05)]_83
49960                            2.31e-02  428_[+2(5.05e-05)]_27_\
    [+2(8.62e-06)]_13
12354                            1.54e-04  171_[+2(3.46e-06)]_231_\
    [+1(3.21e-06)]_67
46825                            2.04e-08  53_[+3(4.31e-10)]_109_\
    [+2(1.30e-06)]_303
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************