********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/361/361.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47432                    1.0000    500  47894                    1.0000    500
54915                    1.0000    500  43378                    1.0000    500
43636                    1.0000    500  39605                    1.0000    500
41069                    1.0000    500  44133                    1.0000    500
45185                    1.0000    500  3843                     1.0000    500
50163                    1.0000    500  46779                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/361/361.seqs.fa -oc motifs/361 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6000    N=              12
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.274 C 0.239 G 0.225 T 0.263
Background letter frequencies (from dataset with add-one prior applied):
A 0.273 C 0.239 G 0.225 T 0.263
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  20  sites =   4  llr = 80  E-value = 3.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3:::::5::3:::5::::::
pos.-specific     C  :5::33358::::::3::3:
probability       G  8::::8:::8:a83::a:8a
matrix            T  :5aa8:353:a:33a8:a::

         bits    2.2            *    *  *
                 1.9   **      **  * ** *
                 1.7   **      **  * ** *
                 1.5   **      **  * ** *
Relative         1.3 * ** *  ***** * ****
Entropy          1.1 ****** ****** ******
(29.0 bits)      0.9 ****** ****** ******
                 0.6 ****** ****** ******
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GCTTTGACCGTGGATTGTGG
consensus            AT  CCCTTA  TG C  C
sequence                   T      T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
39605                       334  3.10e-11 TTCCCAGTCC GTTTTGACTGTGGATTGTGG GTCGCTGTAT
47894                       121  2.98e-10 ATGGGATACT GCTTTGTTCGTGTTTTGTGG ACAGATCACA
54915                        93  5.80e-10 TAGGTAGATT GTTTCGATCGTGGATCGTCG TGATCAAAAA
3843                        195  2.11e-09 ACACATACAC ACTTTCCCCATGGGTTGTGG AGCATCCACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39605                             3.1e-11  333_[+1]_147
47894                               3e-10  120_[+1]_360
54915                             5.8e-10  92_[+1]_388
3843                              2.1e-09  194_[+1]_286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=4
39605                    (  334) GTTTTGACTGTGGATTGTGG  1
47894                    (  121) GCTTTGTTCGTGTTTTGTGG  1
54915                    (   93) GTTTCGATCGTGGATCGTCG  1
3843                     (  195) ACTTTCCCCATGGGTTGTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5772 bayes= 10.4939 E= 3.0e+002
   -13   -865    174   -865
  -865    107   -865     92
  -865   -865   -865    192
  -865   -865   -865    192
  -865      7   -865    151
  -865      7    174   -865
    87      7   -865     -7
  -865    107   -865     92
  -865    165   -865     -7
   -13   -865    174   -865
  -865   -865   -865    192
  -865   -865    215   -865
  -865   -865    174     -7
    87   -865     15     -7
  -865   -865   -865    192
  -865      7   -865    151
  -865   -865    215   -865
  -865   -865   -865    192
  -865      7    174   -865
  -865   -865    215   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 4 E= 3.0e+002
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.250000  0.750000  0.000000
 0.500000  0.250000  0.000000  0.250000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.500000  0.000000  0.250000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GA][CT]TT[TC][GC][ACT][CT][CT][GA]TG[GT][AGT]T[TC]GT[GC]G
--------------------------------------------------------------------------------

Time  1.27 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   8  llr = 120  E-value = 3.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :1:31:66:31353:31:13:
pos.-specific     C  :::15:4:94:64:68139:5
probability       G  a:a443::1:6:::3:3:::5
matrix            T  :9:3:8:4:431181:58:8:

         bits    2.2 * *
                 1.9 * *
                 1.7 * *
                 1.5 * *     *         *
Relative         1.3 ***     *      *  *
Entropy          1.1 ***  ** *    * * ****
(21.7 bits)      0.9 ***  **** *  *** ****
                 0.6 *** ***** ****** ****
                 0.4 *** ************ ****
                 0.2 *********************
                 0.0 ---------------------

Multilevel           GTGGCTAACCGCATCCTTCTC
consensus               AGGCT TTACAGAGC AG
sequence                T     A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
46779                       467  2.38e-11 AGCACCAATG GTGGCGAACCGCATCCTTCTC CCTAATTAAA
50163                       395  5.66e-09 AGTCCGAGAT GTGACTCTCTGCATGCCTCTG TCGTACGAAG
44133                       231  6.44e-08 GGCAAAAAGG GTGGGTAAGCGCCTCATCCAG CTGCGCCCGC
39605                       469  7.61e-08 GCACTGGAGT GTGTGTATCCTATTCCTCCTC TTGGTCTCGT
47432                       263  9.69e-08 GCATGTGTAC GTGCATATCTGCAATCTTCTG TTTAGATAGA
3843                         89  1.76e-07 TTGTTGTTCC GTGACTAACAAACAGCGTCTC TTTCGATACC
43378                       443  1.88e-07 GGACTTTAGG GTGGGGCACATCATCCATATC CGTTTTACGT
54915                       257  4.03e-07 TGGAATTCTC GAGTCTCACTGTCTCAGTCAG TCAAACCAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46779                             2.4e-11  466_[+2]_13
50163                             5.7e-09  394_[+2]_85
44133                             6.4e-08  230_[+2]_249
39605                             7.6e-08  468_[+2]_11
47432                             9.7e-08  262_[+2]_217
3843                              1.8e-07  88_[+2]_391
43378                             1.9e-07  442_[+2]_37
54915                               4e-07  256_[+2]_223
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=8
46779                    (  467) GTGGCGAACCGCATCCTTCTC  1
50163                    (  395) GTGACTCTCTGCATGCCTCTG  1
44133                    (  231) GTGGGTAAGCGCCTCATCCAG  1
39605                    (  469) GTGTGTATCCTATTCCTCCTC  1
47432                    (  263) GTGCATATCTGCAATCTTCTG  1
3843                     (   89) GTGACTAACAAACAGCGTCTC  1
43378                    (  443) GTGGGGCACATCATCCATATC  1
54915                    (  257) GAGTCTCACTGTCTCAGTCAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 5760 bayes= 9.48985 E= 3.6e+002
  -965   -965    215   -965
  -113   -965   -965    173
  -965   -965    215   -965
   -13    -93     74     -7
  -113    107     74   -965
  -965   -965     15    151
   119     65   -965   -965
   119   -965   -965     51
  -965    187    -84   -965
   -13     65   -965     51
  -113   -965    147     -7
   -13    139   -965   -107
    87     65   -965   -107
   -13   -965   -965    151
  -965    139     15   -107
   -13    165   -965   -965
  -113    -93     15     93
  -965      7   -965    151
  -113    187   -965   -965
   -13   -965   -965    151
  -965    107    115   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 8 E= 3.6e+002
 0.000000  0.000000  1.000000  0.000000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.125000  0.375000  0.250000
 0.125000  0.500000  0.375000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.625000  0.375000  0.000000  0.000000
 0.625000  0.000000  0.000000  0.375000
 0.000000  0.875000  0.125000  0.000000
 0.250000  0.375000  0.000000  0.375000
 0.125000  0.000000  0.625000  0.250000
 0.250000  0.625000  0.000000  0.125000
 0.500000  0.375000  0.000000  0.125000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.625000  0.250000  0.125000
 0.250000  0.750000  0.000000  0.000000
 0.125000  0.125000  0.250000  0.500000
 0.000000  0.250000  0.000000  0.750000
 0.125000  0.875000  0.000000  0.000000
 0.250000  0.000000  0.000000  0.750000
 0.000000  0.500000  0.500000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
GTG[GAT][CG][TG][AC][AT]C[CTA][GT][CA][AC][TA][CG][CA][TG][TC]C[TA][CG]
--------------------------------------------------------------------------------

Time  2.50 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   6  llr = 85  E-value = 6.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :2aa2:2a:22:22:3
pos.-specific     C  a:::3:::852227a:
probability       G  :8::3a8::32:22:7
matrix            T  ::::2:::2:585:::

         bits    2.2 *    *        *
                 1.9 * ** * *      *
                 1.7 * ** * *      *
                 1.5 **** ***      *
Relative         1.3 **** ****  *  *
Entropy          1.1 **** ****  *  **
(20.5 bits)      0.9 **** ****  * ***
                 0.6 **** ***** * ***
                 0.4 **** ***** * ***
                 0.2 ****************
                 0.0 ----------------

Multilevel           CGAACGGACCTTTCCG
consensus                G    G     A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
46779                       390  6.66e-10 AGCCGTAGTG CGAACGGACGTTTCCG ATTCCGGTTT
43378                        43  1.42e-07 TGGGATGAAA CAAAAGGACCTTGCCG AGTAGTGGCT
41069                       114  1.98e-07 ACCGAAAACT CGAATGGATGCTTCCG CCAAGAAATG
39605                        54  2.35e-07 ATCAAAAATG CGAAGGGACCTCAACG TCTGTCCACG
3843                        400  2.59e-07 ACGGGTCCAC CGAACGGACCGTCGCA CGGCATGCCC
47894                       356  5.37e-07 TCTTTGACTG CGAAGGAACAATTCCA TTTTGGGAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46779                             6.7e-10  389_[+3]_95
43378                             1.4e-07  42_[+3]_442
41069                               2e-07  113_[+3]_371
39605                             2.3e-07  53_[+3]_431
3843                              2.6e-07  399_[+3]_85
47894                             5.4e-07  355_[+3]_129
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=6
46779                    (  390) CGAACGGACGTTTCCG  1
43378                    (   43) CAAAAGGACCTTGCCG  1
41069                    (  114) CGAATGGATGCTTCCG  1
39605                    (   54) CGAAGGGACCTCAACG  1
3843                     (  400) CGAACGGACCGTCGCA  1
47894                    (  356) CGAAGGAACAATTCCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5820 bayes= 10.3682 E= 6.8e+002
  -923    207   -923   -923
   -71   -923    189   -923
   187   -923   -923   -923
   187   -923   -923   -923
   -71     48     57    -66
  -923   -923    215   -923
   -71   -923    189   -923
   187   -923   -923   -923
  -923    180   -923    -66
   -71    107     57   -923
   -71    -52    -43     92
  -923    -52   -923    166
   -71    -52    -43     92
   -71    148    -43   -923
  -923    207   -923   -923
    29   -923    157   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 6.8e+002
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.333333  0.333333  0.166667
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.833333  0.000000  0.166667
 0.166667  0.500000  0.333333  0.000000
 0.166667  0.166667  0.166667  0.500000
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.166667  0.166667  0.500000
 0.166667  0.666667  0.166667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CGAA[CG]GGAC[CG]TTTCC[GA]
--------------------------------------------------------------------------------

Time  3.73 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47432                            1.99e-03  262_[+2(9.69e-08)]_217
47894                            9.35e-09  120_[+1(2.98e-10)]_215_\
    [+3(5.37e-07)]_129
54915                            4.02e-09  92_[+1(5.80e-10)]_144_\
    [+2(4.03e-07)]_223
43378                            3.71e-07  42_[+3(1.42e-07)]_384_\
    [+2(1.88e-07)]_37
43636                            3.18e-01  500
39605                            4.56e-14  53_[+3(2.35e-07)]_264_\
    [+1(3.10e-11)]_115_[+2(7.61e-08)]_11
41069                            3.23e-03  113_[+3(1.98e-07)]_371
44133                            6.75e-04  230_[+2(6.44e-08)]_249
45185                            3.16e-01  500
3843                             5.92e-12  88_[+2(1.76e-07)]_85_[+1(2.11e-09)]_\
    185_[+3(2.59e-07)]_85
50163                            5.96e-06  394_[+2(5.66e-09)]_85
46779                            1.19e-12  389_[+3(6.66e-10)]_61_\
    [+2(2.38e-11)]_13
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************