********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/389/389.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9601                     1.0000    500  13520                    1.0000    500
36995                    1.0000    500  37588                    1.0000    500
14561                    1.0000    500  48122                    1.0000    500
30154                    1.0000    500  54164                    1.0000    500
19692                    1.0000    500  45218                    1.0000    500
35171                    1.0000    500  45533                    1.0000    500
45667                    1.0000    500  45123                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/389/389.seqs.fa -oc motifs/389 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.291 C 0.228 G 0.217 T 0.264
Background letter frequencies (from dataset with add-one prior applied):
A 0.291 C 0.228 G 0.217 T 0.264
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   6  llr = 107  E-value = 3.2e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::22:2:2::::5::::5::
pos.-specific     C  7::87325:a83:3:732::
probability       G  :78::2::::27:2a:7::a
matrix            T  33::3383a:::55:3:3a:

Relative Entropy     [per-position bits plot, scale 0.0 to 2.2 bits]
(25.7 bits)

Multilevel           CGGCCCTCTCCGATGCGATG
consensus            TT  TT T   CTC TCT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
45533                       160  3.38e-10 GAAAGCACGG TGGCCCTTTCCCTCGCGATG TGATGCTGCT
45218                       149  4.59e-10 ATCACTGGCG CGACCTTCTCCGATGCGTTG GACATAGGAT
14561                       379  2.80e-09 TTAGTTGTCA CGGCTCTTTCGGTTGTGATG TTTGGTGTCA
54164                        15  6.16e-09 TACACCTTGA TGGATATCTCCGATGCGATG AACTTGCAAT
36995                        39  1.03e-08 AAGTTTGCAT CTGCCGTCTCCGAGGTCCTG ACCATTGGAG
13520                       189  1.92e-08 TCCCCACAAG CTGCCTCATCCCTCGCCTTG CCTTGGTTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45533                             3.4e-10  159_[+1]_321
45218                             4.6e-10  148_[+1]_332
14561                             2.8e-09  378_[+1]_102
54164                             6.2e-09  14_[+1]_466
36995                               1e-08  38_[+1]_442
13520                             1.9e-08  188_[+1]_292
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=6
45533                    (  160) TGGCCCTTTCCCTCGCGATG  1
45218                    (  149) CGACCTTCTCCGATGCGTTG  1
14561                    (  379) CGGCTCTTTCGGTTGTGATG  1
54164                    (   15) TGGATATCTCCGATGCGATG  1
36995                    (   39) CTGCCGTCTCCGAGGTCCTG  1
13520                    (  189) CTGCCTCATCCCTCGCCTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6734 bayes= 10.5788 E= 3.2e+000
  -923    155   -923     34
  -923   -923    162     34
   -80   -923    194   -923
   -80    187   -923   -923
  -923    155   -923     34
   -80     55    -38     34
  -923    -45   -923    166
   -80    113   -923     34
  -923   -923   -923    192
  -923    213   -923   -923
  -923    187    -38   -923
  -923     55    162   -923
    78   -923   -923     92
  -923     55    -38     92
  -923   -923    220   -923
  -923    155   -923     34
  -923     55    162   -923
    78    -45   -923     34
  -923   -923   -923    192
  -923   -923    220   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 3.2e+000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  0.666667  0.333333
 0.166667  0.000000  0.833333  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.166667  0.333333  0.166667  0.333333
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.500000  0.000000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.333333  0.666667  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.333333  0.166667  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.333333  0.666667  0.000000
 0.500000  0.166667  0.000000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CT][GT]GC[CT][CT]T[CT]TCC[GC][AT][TC]G[CT][GC][AT]TG
--------------------------------------------------------------------------------




Time  1.82 secs.
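
The Motif 1 log-odds and letter-probability matrices above appear to be related by a scaled log ratio against the background letter frequencies. The sketch below illustrates that relationship. It assumes the score is 100 * log2(p / background) rounded to the nearest integer, and it uses the 3-decimal background values printed in the command line summary. The function name `log_odds_row` is hypothetical, and the handling of zero probabilities (printed as -923 in this report) is not modelled; such entries are returned as None. Because the printed background is rounded, some entries may differ by one unit from the report.

```python
import math

# Background letter frequencies as printed in this report (rounded to 3 decimals).
BACKGROUND = {"A": 0.291, "C": 0.228, "G": 0.217, "T": 0.264}


def log_odds_row(probs, background=BACKGROUND):
    """Convert one row of A, C, G, T probabilities to scaled log-odds scores.

    Assumed formula (not taken from the MEME source): round(100 * log2(p / bg)).
    Zero probabilities yield None; the report prints a large negative value
    there (for example -923), which this sketch does not try to reproduce.
    """
    row = []
    for letter, p in zip("ACGT", probs):
        if p == 0.0:
            row.append(None)
        else:
            row.append(round(100 * math.log2(p / background[letter])))
    return row


# First position of the Motif 1 letter-probability matrix.
print(log_odds_row([0.000000, 0.666667, 0.000000, 0.333333]))
# → [None, 155, None, 34]
```

The two non-zero scores agree with the first row of the Motif 1 log-odds matrix (155 for C, 34 for T).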
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =   7  llr = 100  E-value = 2.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  aaa:1331::a141:1
pos.-specific     C  :::3371::::::3:3
probability       G  :::41:::aa:466a6
matrix            T  :::34:69:::4::::

Relative Entropy     [per-position bits plot, scale 0.0 to 2.2 bits]
(20.6 bits)

Multilevel           AAAGTCTTGGAGGGGG
consensus               CCAA    TAC C
sequence                T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
45123                        48  4.94e-09 ACAAAAACAA AAATCCTTGGATGGGG TGCAAGCACC
35171                       470  3.05e-08 TCTTGTTTTA AAACTCATGGAGGGGC TCTATTCCTC
13520                       327  3.05e-08 AAAGACAACG AAAGCCATGGATGCGG CAGATCCCGA
45533                       104  9.08e-08 TTGTAGCGGT AAAGTCCTGGAAGGGG GGTTGATAGA
9601                        301  1.94e-07 CTCATTGAAG AAACGCTTGGAGACGC CCCGAAGAAA
37588                       179  5.28e-07 CATGCAAGGT AAATTATAGGATAGGG GATAAATCAG
30154                       234  1.38e-06 AAATCTGGAG AAAGAATTGGAGAAGA GAGACGCAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45123                             4.9e-09  47_[+2]_437
35171                             3.1e-08  469_[+2]_15
13520                             3.1e-08  326_[+2]_158
45533                             9.1e-08  103_[+2]_381
9601                              1.9e-07  300_[+2]_184
37588                             5.3e-07  178_[+2]_306
30154                             1.4e-06  233_[+2]_251
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=7
45123                    (   48) AAATCCTTGGATGGGG  1
35171                    (  470) AAACTCATGGAGGGGC  1
13520                    (  327) AAAGCCATGGATGCGG  1
45533                    (  104) AAAGTCCTGGAAGGGG  1
9601                     (  301) AAACGCTTGGAGACGC  1
37588                    (  179) AAATTATAGGATAGGG  1
30154                    (  234) AAAGAATTGGAGAAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 10.5266 E= 2.3e+001
   178   -945   -945   -945
   178   -945   -945   -945
   178   -945   -945   -945
  -945     33     98     11
  -102     33    -60     70
    -3    165   -945   -945
    -3    -67   -945    111
  -102   -945   -945    170
  -945   -945    220   -945
  -945   -945    220   -945
   178   -945   -945   -945
  -102   -945     98     70
    56   -945    140   -945
  -102     33    140   -945
  -945   -945    220   -945
  -102     33    140   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 2.3e+001
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.285714  0.428571  0.285714
 0.142857  0.285714  0.142857  0.428571
 0.285714  0.714286  0.000000  0.000000
 0.285714  0.142857  0.000000  0.571429
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.142857  0.000000  0.428571  0.428571
 0.428571  0.000000  0.571429  0.000000
 0.142857  0.285714  0.571429  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.142857  0.285714  0.571429  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
AAA[GCT][TC][CA][TA]TGGA[GT][GA][GC]G[GC]
--------------------------------------------------------------------------------




Time  3.48 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  20  sites =   5  llr = 93  E-value = 1.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :4::::::::::2:4:::4:
pos.-specific     C  :::424:a226a2a644:2:
probability       G  a6:222a::82:::::68:a
matrix            T  ::a464::8:2:6::6:24:

Relative Entropy     [per-position bits plot, scale 0.0 to 2.2 bits]
(26.8 bits)

Multilevel           GGTCTCGCTGCCTCCTGGAG
consensus             A TCT  CCG A ACCTT
sequence                GGG    T C     C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
45123                       271  1.36e-10 GTAACCGAAA GGTCTTGCTGTCTCCCGGAG TCCATTTCTT
35171                       258  3.08e-10 CCGAATTAGT GATTTTGCTGCCTCATCGAG GCCGGATTAG
9601                        401  1.16e-09 CGTAAGGAGG GATCGCGCTGCCCCCTGGCG TTACTTTCCT
13520                       440  5.69e-09 GCAGAGTTGT GGTGCCGCCGGCTCCCCGTG CCGCCGAGTT
19692                       322  1.26e-08 ATGGTGCGAT GGTTTGGCTCCCACATGTTG CTGCCAAAGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45123                             1.4e-10  270_[+3]_210
35171                             3.1e-10  257_[+3]_223
9601                              1.2e-09  400_[+3]_80
13520                             5.7e-09  439_[+3]_41
19692                             1.3e-08  321_[+3]_159
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=5
45123                    (  271) GGTCTTGCTGTCTCCCGGAG  1
35171                    (  258) GATTTTGCTGCCTCATCGAG  1
9601                     (  401) GATCGCGCTGCCCCCTGGCG  1
13520                    (  440) GGTGCCGCCGGCTCCCCGTG  1
19692                    (  322) GGTTTGGCTCCCACATGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6734 bayes= 11.3382 E= 1.4e+002
  -897   -897    220   -897
    46   -897    147   -897
  -897   -897   -897    192
  -897     81    -12     60
  -897    -19    -12    118
  -897     81    -12     60
  -897   -897    220   -897
  -897    213   -897   -897
  -897    -19   -897    160
  -897    -19    188   -897
  -897    139    -12    -40
  -897    213   -897   -897
   -54    -19   -897    118
  -897    213   -897   -897
    46    139   -897   -897
  -897     81   -897    118
  -897     81    147   -897
  -897   -897    188    -40
    46    -19   -897     60
  -897   -897    220   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 1.4e+002
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.400000  0.200000  0.400000
 0.000000  0.200000  0.200000  0.600000
 0.000000  0.400000  0.200000  0.400000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.600000  0.200000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.200000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.600000  0.000000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.400000  0.600000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.400000  0.200000  0.000000  0.400000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
G[GA]T[CTG][TCG][CTG]GC[TC][GC][CGT]C[TAC]C[CA][TC][GC][GT][ATC]G
--------------------------------------------------------------------------------




Time  5.19 secs.
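
The regular expressions reported for each motif can be used directly as a coarse site scanner. The sketch below applies the Motif 3 expression to the highest-scoring Motif 3 site together with its flanking bases as listed in the sites table (sequence 45123). The helper name `find_sites` is hypothetical. A regular expression match is an all-or-nothing test and ignores the position-specific scores, so it should be read as a rough filter rather than a substitute for MAST.

```python
import re

# Regular expression copied from the "Motif 3 regular expression" section.
MOTIF3_REGEX = "G[GA]T[CTG][TCG][CTG]GC[TC][GC][CGT]C[TAC]C[CA][TC][GC][GT][ATC]G"


def find_sites(pattern, sequence):
    """Return (0-based offset, matched text) for every match of pattern.

    A lookahead wrapper is used so that overlapping matches are also reported.
    """
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]


# Left flank + site + right flank for sequence 45123, as printed in the report.
fragment = "GTAACCGAAA" + "GGTCTTGCTGTCTCCCGGAG" + "TCCATTTCTT"
print(find_sites(MOTIF3_REGEX, fragment))
# → [(10, 'GGTCTTGCTGTCTCCCGGAG')]
```

The offset of 10 is relative to the 40-base fragment; the report lists the site at 1-based position 271 of the full 500-base sequence.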
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9601                             7.72e-09  300_[+2(1.94e-07)]_84_\
    [+3(1.16e-09)]_80
13520                            2.49e-13  188_[+1(1.92e-08)]_17_\
    [+3(9.68e-05)]_81_[+2(3.05e-08)]_97_[+3(5.69e-09)]_41
36995                            5.70e-05  38_[+1(1.03e-08)]_442
37588                            7.61e-04  178_[+2(5.28e-07)]_306
14561                            5.19e-05  378_[+1(2.80e-09)]_102
48122                            6.30e-01  500
30154                            1.58e-02  233_[+2(1.38e-06)]_17_\
    [+2(5.42e-05)]_218
54164                            1.24e-04  14_[+1(6.16e-09)]_466
19692                            2.60e-04  321_[+3(1.26e-08)]_159
45218                            3.65e-06  148_[+1(4.59e-10)]_332
35171                            7.66e-10  257_[+3(3.08e-10)]_192_\
    [+2(3.05e-08)]_15
45533                            3.48e-10  103_[+2(9.08e-08)]_40_\
    [+1(3.38e-10)]_321
45667                            5.50e-01  500
45123                            3.27e-11  47_[+2(4.94e-09)]_207_\
    [+3(1.36e-10)]_210
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
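
The combined block diagrams encode each sequence as alternating spacer lengths and motif occurrences, for example `188_[+1(1.92e-08)]_17_...`. The sketch below decodes one such diagram into 1-based start positions. It assumes that plain numbers are spacer lengths, that each bracketed token is one motif occurrence with its p-value, and that motif widths are those reported in this file (20, 16 and 20). The names `MOTIF_WIDTHS` and `parse_diagram` are hypothetical.

```python
import re

# Motif widths as reported in the MOTIF 1, 2 and 3 header lines of this file.
MOTIF_WIDTHS = {1: 20, 2: 16, 3: 20}

# One motif occurrence, e.g. "[+3(9.68e-05)]": strand, motif number, p-value.
SITE_TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, 1-based start, p-value), ...], total sequence length)."""
    # Drop line-continuation backslashes and any whitespace from wrapped diagrams.
    diagram = diagram.replace("\\", "").replace(" ", "")
    position = 1
    sites = []
    for token in diagram.split("_"):
        match = SITE_TOKEN.fullmatch(token)
        if match:
            motif = int(match.group(2))
            sites.append((motif, position, float(match.group(3))))
            position += widths[motif]
        else:
            position += int(token)  # spacer of unmatched bases
    return sites, position - 1


# Diagram for sequence 13520 from the summary table, with its wrapped line joined.
sites, length = parse_diagram(
    "188_[+1(1.92e-08)]_17_\\ [+3(9.68e-05)]_81_[+2(3.05e-08)]_97_[+3(5.69e-09)]_41"
)
print(sites)
print(length)
```

For sequence 13520 the decoded starts for the Motif 1, Motif 2 and second Motif 3 occurrences (189, 327 and 440) agree with the per-motif site tables, and the spacers and widths sum to the 500-base sequence length.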