********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/431/431.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
18258                    1.0000    500  44917                    1.0000    500
45222                    1.0000    500  35424                    1.0000    500
38908                    1.0000    500  41506                    1.0000    500
35145                    1.0000    500  34479                    1.0000    500
47142                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/431/431.seqs.fa -oc motifs/431 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.290 C 0.227 G 0.222 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.290 C 0.227 G 0.222 T 0.262
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 6 llr = 87 E-value = 6.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::::2:::33:738:
pos.-specific     C  :8:25:::::::37:7
probability       G  :25:5::8372a::2:
matrix            T  a:58:8a27:5::::3

         bits    2.2            *
                 2.0 *     *    *
                 1.7 *     *    *
                 1.5 **    **   *
Relative         1.3 ** * ***   *  *
Entropy          1.1 ********** *****
(20.9 bits)      0.9 ********** *****
                 0.7 ********** *****
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCGTCTTGTGTGACAC
consensus              T G   GAA CA T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
18258                       106  4.95e-09 GATGGATCGG TCGTGTTGTGGGACAC CGTTGCAGAG
47142                       216  6.19e-08 TTTGTGGAAC TCTTCTTGTATGACGC ATACCAGTCC
41506                       419  6.19e-08 TTCGACAGGC TCGCCTTGGGTGCCAC GACCCAAGTA
35145                       472  8.73e-08 TTTCGACAAA TCGTCTTTTGTGAAAC CCCATCGAGC
35424                       365  1.48e-07 GCCGTGGATT TCTTGATGTGAGACAT CAGTTGTGGA
45222                       210  1.01e-06 ATCGCAAGTT TGTTGTTGGAAGCAAT CCTTCTACTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
18258                             4.9e-09  105_[+1]_379
47142                             6.2e-08  215_[+1]_269
41506                             6.2e-08  418_[+1]_66
35145                             8.7e-08  471_[+1]_13
35424                             1.5e-07  364_[+1]_120
45222                               1e-06  209_[+1]_275
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=6
18258                    (  106) TCGTGTTGTGGGACAC  1
47142                    (  216) TCTTCTTGTATGACGC  1
41506                    (  419) TCGCCTTGGGTGCCAC  1
35145                    (  472) TCGTCTTTTGTGAAAC  1
35424                    (  365) TCTTGATGTGAGACAT  1
45222                    (  210) TGTTGTTGGAAGCAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4365 bayes= 9.95281 E= 6.0e+002
  -923   -923   -923    193
  -923    188    -41   -923
  -923   -923    117     93
  -923    -44   -923    167
  -923    114    117   -923
   -80   -923   -923    167
  -923   -923   -923    193
  -923   -923    191    -65
  -923   -923     59    135
    20   -923    159   -923
    20   -923    -41     93
  -923   -923    217   -923
   120     56   -923   -923
    20    155   -923   -923
   152   -923    -41   -923
  -923    155   -923     35
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 6.0e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.500000  0.500000  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.833333  0.166667
 0.000000  0.000000  0.333333  0.666667
 0.333333  0.000000  0.666667  0.000000
 0.333333  0.000000  0.166667  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.666667  0.000000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TC[GT]T[CG]TTG[TG][GA][TA]G[AC][CA]A[CT]
--------------------------------------------------------------------------------

Time 0.88 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 5 llr = 77 E-value = 2.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::2:::62:::::2:
pos.-specific     C  ::2262::82::2::2
probability       G  a:8:44a4::::6:28
matrix            T  :a:6:4:::8aa2a6:

         bits    2.2 *     *
                 2.0 **    *   ** *
                 1.7 **    *   ** *
                 1.5 ***   *   ** * *
Relative         1.3 ***   * **** * *
Entropy          1.1 *** * * **** * *
(22.2 bits)      0.9 *** * ****** * *
                 0.7 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTGTCGGACTTTGTTG
consensus              CAGT GAC  C AC
sequence                C C      T G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
45222                       379  1.26e-08 TTTAGAAGAT GTGACCGACTTTGTTG AACAAGTATC
41506                       374  4.15e-08 AAAACCATCC GTGTCGGGATTTGTGG CCAAACACAC
38908                       107  5.08e-08 ATTGACTCAG GTCCGTGACTTTGTTG TCAGACGGAA
47142                       105  7.09e-08 TTTCCGGTCA GTGTGGGGCTTTTTTC CGACGCCTGA
34479                       320  1.06e-07 GCGATTGTTA GTGTCTGACCTTCTAG CCGTATTTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45222                             1.3e-08  378_[+2]_106
41506                             4.2e-08  373_[+2]_111
38908                             5.1e-08  106_[+2]_378
47142                             7.1e-08  104_[+2]_380
34479                             1.1e-07  319_[+2]_165
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=5
45222                    (  379) GTGACCGACTTTGTTG  1
41506                    (  374) GTGTCGGGATTTGTGG  1
38908                    (  107) GTCCGTGACTTTGTTG  1
47142                    (  105) GTGTGGGGCTTTTTTC  1
34479                    (  320) GTGTCTGACCTTCTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4365 bayes= 10.02 E= 2.0e+002
  -897   -897    217   -897
  -897   -897   -897    193
  -897    -18    185   -897
   -53    -18   -897    119
  -897    140     85   -897
  -897    -18     85     61
  -897   -897    217   -897
   105   -897     85   -897
   -53    182   -897   -897
  -897    -18   -897    161
  -897   -897   -897    193
  -897   -897   -897    193
  -897    -18    144    -39
  -897   -897   -897    193
   -53   -897    -15    119
  -897    -18    185   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 2.0e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.800000  0.000000
 0.200000  0.200000  0.000000  0.600000
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.200000  0.400000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.600000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.200000  0.800000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GT[GC][TAC][CG][GTC]G[AG][CA][TC]TT[GCT]T[TAG][GC]
--------------------------------------------------------------------------------

Time 1.60 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 8 llr = 96 E-value = 1.4e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1a:::35a4465:19:
pos.-specific     C  9:::144:513:851:
probability       G  ::3a91::11::34:6
matrix            T  ::8::31::415:::4

         bits    2.2    *
                 2.0    *
                 1.7  * *   *
                 1.5 ** **  *
Relative         1.3 ** **  *    * *
Entropy          1.1 *****  *    * **
(17.4 bits)      0.9 *****  *   ** **
                 0.7 *****  ** ******
                 0.4 ***** *** ******
                 0.2 ***** **********
                 0.0 ----------------

Multilevel           CATGGCAACAAACCAG
consensus              G  AC ATCTGG T
sequence                  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
41506                       294  2.82e-09 TGCTCCTTAG CATGGCAACAATCGAG TCCCGAAACA
18258                        24  4.88e-08 GCTGAACGGG CATGGCCAATCACCAG TAAGGTCGAT
38908                       363  8.00e-07 GTCCGCATTT CATGGCAACTCTCCCT TTAACAGATA
35424                        52  9.68e-07 AAACGAGCGA CATGGTCAGAAAGCAG GAAGGCTCTG
34479                       411  1.65e-06 CCGAAAACTT CATGGACAAAAACAAT GCCGGAAAAA
47142                        17  2.64e-06 TATAAATCGG CAGGGGAAACATCCAT TTCCGGCTTC
44917                        68  8.39e-06 TTCTGATCGA AAGGGAAACTTACGAG GAGCTAACTG
35145                       160  9.69e-06 ACTCCATGCA CATGCTTACGATGGAG ATCAAAATCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
41506                             2.8e-09  293_[+3]_191
18258                             4.9e-08  23_[+3]_461
38908                               8e-07  362_[+3]_122
35424                             9.7e-07  51_[+3]_433
34479                             1.6e-06  410_[+3]_74
47142                             2.6e-06  16_[+3]_468
44917                             8.4e-06  67_[+3]_417
35145                             9.7e-06  159_[+3]_325
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=8
41506                    (  294) CATGGCAACAATCGAG  1
18258                    (   24) CATGGCCAATCACCAG  1
38908                    (  363) CATGGCAACTCTCCCT  1
35424                    (   52) CATGGTCAGAAAGCAG  1
34479                    (  411) CATGGACAAAAACAAT  1
47142                    (   17) CAGGGGAAACATCCAT  1
44917                    (   68) AAGGGAAACTTACGAG  1
35145                    (  160) CATGCTTACGATGGAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4365 bayes= 9.08912 E= 1.4e+003
  -121    195   -965   -965
   179   -965   -965   -965
  -965   -965     17    152
  -965   -965    217   -965
  -965    -86    198   -965
   -21     73    -82     -7
    79     73   -965   -107
   179   -965   -965   -965
    37    114    -82   -965
    37    -86    -82     52
   111     14   -965   -107
    79   -965   -965     93
  -965    172     17   -965
  -121    114     76   -965
   159    -86   -965   -965
  -965   -965    149     52
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 1.4e+003
 0.125000  0.875000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.125000  0.875000  0.000000
 0.250000  0.375000  0.125000  0.250000
 0.500000  0.375000  0.000000  0.125000
 1.000000  0.000000  0.000000  0.000000
 0.375000  0.500000  0.125000  0.000000
 0.375000  0.125000  0.125000  0.375000
 0.625000  0.250000  0.000000  0.125000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.750000  0.250000  0.000000
 0.125000  0.500000  0.375000  0.000000
 0.875000  0.125000  0.000000  0.000000
 0.000000  0.000000  0.625000  0.375000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CA[TG]GG[CAT][AC]A[CA][AT][AC][AT][CG][CG]A[GT]
--------------------------------------------------------------------------------

Time 2.30 secs.

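The "Relative Entropy" figure reported in each motif description can be cross-checked from the letter-probability matrix and the background letter frequencies. The following is a minimal Python sketch, not part of MEME itself. It assumes the figure is the sum over positions and letters of p * log2(p / background); the names BACKGROUND, MOTIF3_PPM and relative_entropy are hypothetical and introduced only for illustration. The numbers are copied from the Motif 3 matrix and the command line summary in this file.

```python
import math

# Background letter frequencies, copied from the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.290, "C": 0.227, "G": 0.222, "T": 0.262}

# Motif 3 letter-probability matrix: one row per motif position,
# columns in alphabet order A, C, G, T.
MOTIF3_PPM = [
    [0.125, 0.875, 0.000, 0.000],
    [1.000, 0.000, 0.000, 0.000],
    [0.000, 0.000, 0.250, 0.750],
    [0.000, 0.000, 1.000, 0.000],
    [0.000, 0.125, 0.875, 0.000],
    [0.250, 0.375, 0.125, 0.250],
    [0.500, 0.375, 0.000, 0.125],
    [1.000, 0.000, 0.000, 0.000],
    [0.375, 0.500, 0.125, 0.000],
    [0.375, 0.125, 0.125, 0.375],
    [0.625, 0.250, 0.000, 0.125],
    [0.500, 0.000, 0.000, 0.500],
    [0.000, 0.750, 0.250, 0.000],
    [0.125, 0.500, 0.375, 0.000],
    [0.875, 0.125, 0.000, 0.000],
    [0.000, 0.000, 0.625, 0.375],
]


def relative_entropy(ppm, background, alphabet="ACGT"):
    """Return the motif's relative entropy in bits against the background.

    Assumed definition: sum over positions and letters of
    p * log2(p / background[letter]).
    """
    total = 0.0
    for row in ppm:
        for letter, p in zip(alphabet, row):
            if p > 0.0:  # a zero probability contributes nothing to the sum
                total += p * math.log2(p / background[letter])
    return total


print(f"Motif 3 relative entropy: {relative_entropy(MOTIF3_PPM, BACKGROUND):.2f} bits")
```

Under these assumptions the result should land close to the 17.4 bits shown in the Motif 3 description; small differences are expected because the background frequencies above are rounded to three decimals.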
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
18258                            2.14e-09  23_[+3(4.88e-08)]_66_[+1(4.95e-09)]_\
    379
44917                            6.22e-02  67_[+3(8.39e-06)]_417
45222                            5.97e-07  209_[+1(1.01e-06)]_153_\
    [+2(1.26e-08)]_106
35424                            5.91e-08  51_[+3(9.68e-07)]_297_\
    [+1(1.48e-07)]_120
38908                            8.20e-07  106_[+2(5.08e-08)]_240_\
    [+3(8.00e-07)]_122
41506                            5.28e-13  293_[+3(2.82e-09)]_64_\
    [+2(4.15e-08)]_29_[+1(6.19e-08)]_66
35145                            1.74e-05  159_[+3(9.69e-06)]_296_\
    [+1(8.73e-08)]_13
34479                            1.16e-06  319_[+2(1.06e-07)]_75_\
    [+3(1.65e-06)]_74
47142                            5.33e-10  16_[+3(2.64e-06)]_72_[+2(7.09e-08)]_\
    95_[+1(6.19e-08)]_269
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
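
The motif diagrams in the summary above encode, for each sequence, alternating spacer lengths and motif occurrences of the form [strand motif-number (p-value)], with a trailing backslash marking a wrapped line. The following is a minimal Python sketch of how such a diagram could be decoded into 1-based start positions. It is not part of MEME; the function name parse_block_diagram, the SITE pattern and the motif_widths argument are hypothetical, and all three motif widths are taken as 16 from the motif headers in this file.

```python
import re

# One motif occurrence such as "[+3(4.88e-08)]" or "[+1]":
# strand, motif number, and an optional p-value in parentheses.
SITE = re.compile(r"\[([+-])(\d+)(?:\(([^)]+)\))?\]")


def parse_block_diagram(diagram, motif_widths):
    """Decode a MEME block diagram string.

    Returns (sites, total_length), where sites is a list of
    (motif, strand, start, p_value) tuples with 1-based start positions
    and total_length is the sequence length implied by the diagram.
    """
    # Drop line-continuation backslashes and any whitespace from wrapping.
    joined = "".join(diagram.replace("\\", "").split())
    sites = []
    offset = 0  # number of sequence positions consumed so far
    for token in joined.split("_"):
        if not token:
            continue
        match = SITE.fullmatch(token)
        if match:
            strand = match.group(1)
            motif = int(match.group(2))
            p_value = float(match.group(3)) if match.group(3) else None
            sites.append((motif, strand, offset + 1, p_value))
            offset += motif_widths[motif]
        else:
            offset += int(token)  # a spacer of this many positions
    return sites, offset


widths = {1: 16, 2: 16, 3: 16}  # widths from the MOTIF 1-3 header lines
sites, length = parse_block_diagram(
    r"23_[+3(4.88e-08)]_66_[+1(4.95e-09)]_\ 379", widths
)
for motif, strand, start, p_value in sites:
    print(f"motif {motif} strand {strand} start {start} p-value {p_value}")
print("implied sequence length:", length)
```

For sequence 18258 this yields starts 24 (motif 3) and 106 (motif 1), matching the per-motif site tables, and an implied length of 500, matching the training set listing.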