********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/344/344.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
14986                    1.0000    500  39081                    1.0000    500
40280                    1.0000    500  12456                    1.0000    500
31402                    1.0000    500  49922                    1.0000    500
33150                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/344/344.seqs.fa -oc motifs/344 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.282 C 0.234 G 0.203 T 0.280
Background letter frequencies (from dataset with add-one prior applied):
A 0.282 C 0.234 G 0.203 T 0.280
********************************************************************************

********************************************************************************
MOTIF  1 MEME  width =  20  sites =   5  llr = 86  E-value = 3.0e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::242::a::::::46:2::
pos.-specific     C  :8::2:4:44:2:42::2a:
probability       G  a:624a6:22a6:64:a2:8
matrix            T  :2242:::44:2a::4:4:2

         bits    2.3 *    *    *     *
                 2.1 *    *    *     * *
                 1.8 *    * *  * *   * *
                 1.6 *    * *  * *   * *
Relative         1.4 **   * *  * *   * **
Entropy          1.1 **   ***  * **  * **
(24.8 bits)      0.9 **   ***  * ** ** **
                 0.7 ***  ***  **** ** **
                 0.5 **** ************ **
                 0.2 ***************** **
                 0.0 --------------------

Multilevel           GCGAGGGACCGGTGAAGTCG
consensus             TATA C TT C CGT A T
sequence               TGC   GG T  C  C
                         T            G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
40280                       322  1.05e-10 GGTTAGTCAT GCGAGGGAGTGGTGAAGGCG TTCGAGAGTC
12456                       102  2.07e-09 TCTTGTTTTC GCGGAGCATCGGTCGTGTCG CCGACAGCGA
14986                       182  7.22e-09 CAGGTCCATC GCATGGCACCGGTCCAGACG GATCCATCGC
49922                        75  1.51e-08 AGTCATTCTT GTGTTGGATTGCTGGAGCCG TTGGTTTGAG
39081                        76  6.22e-08 GCTCCAATCT GCTACGGACGGTTGATGTCT TGGACTATCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40280                               1e-10  321_[+1]_159
12456                             2.1e-09  101_[+1]_379
14986                             7.2e-09  181_[+1]_299
49922                             1.5e-08  74_[+1]_406
39081                             6.2e-08  75_[+1]_405
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=20 seqs=5
40280                    (  322) GCGAGGGAGTGGTGAAGGCG  1
12456                    (  102) GCGGAGCATCGGTCGTGTCG  1
14986                    (  182) GCATGGCACCGGTCCAGACG  1
49922                    (   75) GTGTTGGATTGCTGGAGCCG  1
39081                    (   76) GCTACGGACGGTTGATGTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3367 bayes= 10.3376 E= 3.0e+001
  -897  -897   229  -897
  -897   177  -897   -49
   -49  -897   156   -49
    50  -897    -2    51
   -49   -23    97   -49
  -897  -897   229  -897
  -897    77   156  -897
   182  -897  -897  -897
  -897    77    -2    51
  -897    77    -2    51
  -897  -897   229  -897
  -897   -23   156   -49
  -897  -897  -897   183
  -897    77   156  -897
    50   -23    97  -897
   109  -897  -897    51
  -897  -897   229  -897
   -49   -23    -2    51
  -897   209  -897  -897
  -897  -897   197   -49
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 3.0e+001
0.000000  0.000000  1.000000  0.000000
0.000000  0.800000  0.000000  0.200000
0.200000  0.000000  0.600000  0.200000
0.400000  0.000000  0.200000  0.400000
0.200000  0.200000  0.400000  0.200000
0.000000  0.000000  1.000000  0.000000
0.000000  0.400000  0.600000  0.000000
1.000000  0.000000  0.000000  0.000000
0.000000  0.400000  0.200000  0.400000
0.000000  0.400000  0.200000  0.400000
0.000000  0.000000  1.000000  0.000000
0.000000  0.200000  0.600000  0.200000
0.000000  0.000000  0.000000  1.000000
0.000000  0.400000  0.600000  0.000000
0.400000  0.200000  0.400000  0.000000
0.600000  0.000000  0.000000  0.400000
0.000000  0.000000  1.000000  0.000000
0.200000  0.200000  0.200000  0.400000
0.000000  1.000000  0.000000  0.000000
0.000000  0.000000  0.800000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
G[CT][GAT][ATG][GACT]G[GC]A[CTG][CTG]G[GCT]T[GC][AGC][AT]G[TACG]C[GT]
--------------------------------------------------------------------------------

Time 0.46 secs.

********************************************************************************

********************************************************************************
MOTIF  2 MEME  width =  20  sites =   6  llr = 95  E-value = 7.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::7:5::285532:8:55:
pos.-specific     C  aa::3::a:252:3::a252
probability       G  ::a22:8:5::23282:3:2
matrix            T  :::2552:3::2332::::7

         bits    2.3   *
                 2.1 ***    *        *
                 1.8 ***    *        *
                 1.6 ***   **      * *
Relative         1.4 ***   **      * *
Entropy          1.1 ***   ** *    ***
(22.7 bits)      0.9 ***  *** **   *** *
                 0.7 **** ******   *** **
                 0.5 *********** * ******
                 0.2 ************* ******
                 0.0 --------------------

Multilevel           CCGATAGCGAAAACGACAAT
consensus                CT  T C GT   GC
sequence                         T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
40280                       196  2.38e-09 TGGTCGTTTG CCGACTGCGAATACGACGAT CTCGCATTTC
49922                       172  4.74e-09 TGATAGTATT CCGTTAGCTACATTGACACT GTAACGGACT
31402                       348  9.77e-09 CAAACTCACA CCGATTGCTACCGCGACAAC CGAACCCGAT
14986                       430  5.30e-08 ACGAATATGG CCGATTTCAACAAGGACGAT TTCTTGTCGT
33150                        37  1.14e-07 CCGCTTTTGG CCGGGAGCGCAGTAGACACT TTTTTGTTTC
12456                       122  1.52e-07 GGTCGTGTCG CCGACAGCGAAAGTTGCCCG GAGTCTGTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
40280                             2.4e-09  195_[+2]_285
49922                             4.7e-09  171_[+2]_309
31402                             9.8e-09  347_[+2]_133
14986                             5.3e-08  429_[+2]_51
33150                             1.1e-07  36_[+2]_444
12456                             1.5e-07  121_[+2]_359
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=6
40280                    (  196) CCGACTGCGAATACGACGAT  1
49922                    (  172) CCGTTAGCTACATTGACACT  1
31402                    (  348) CCGATTGCTACCGCGACAAC  1
14986                    (  430) CCGATTTCAACAAGGACGAT  1
33150                    (   37) CCGGGAGCGCAGTAGACACT  1
12456                    (  122) CCGACAGCGAAAGTTGCCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 3367 bayes= 9.57786 E= 7.1e+001
  -923   209  -923  -923
  -923   209  -923  -923
  -923  -923   230  -923
   124  -923   -29   -75
  -923    51   -29    83
    83  -923  -923    83
  -923  -923   203   -75
  -923   209  -923  -923
   -76  -923   130    25
   156   -49  -923  -923
    83   109  -923  -923
    83   -49   -29   -75
    24  -923    71    25
   -76    51   -29    25
  -923  -923   203   -75
   156  -923   -29  -923
  -923   209  -923  -923
    83   -49    71  -923
    83   109  -923  -923
  -923   -49   -29   125
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 7.1e+001
0.000000  1.000000  0.000000  0.000000
0.000000  1.000000  0.000000  0.000000
0.000000  0.000000  1.000000  0.000000
0.666667  0.000000  0.166667  0.166667
0.000000  0.333333  0.166667  0.500000
0.500000  0.000000  0.000000  0.500000
0.000000  0.000000  0.833333  0.166667
0.000000  1.000000  0.000000  0.000000
0.166667  0.000000  0.500000  0.333333
0.833333  0.166667  0.000000  0.000000
0.500000  0.500000  0.000000  0.000000
0.500000  0.166667  0.166667  0.166667
0.333333  0.000000  0.333333  0.333333
0.166667  0.333333  0.166667  0.333333
0.000000  0.000000  0.833333  0.166667
0.833333  0.000000  0.166667  0.000000
0.000000  1.000000  0.000000  0.000000
0.500000  0.166667  0.333333  0.000000
0.500000  0.500000  0.000000  0.000000
0.000000  0.166667  0.166667  0.666667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CCGA[TC][AT]GC[GT]A[AC]A[AGT][CT]GAC[AG][AC]T
--------------------------------------------------------------------------------

Time 0.87 secs.
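
The Motif 2 letter-probability matrix can be recomputed from the six aligned sites listed in the Motif 2 BLOCKS section. The Python sketch below is one way to do that check. It is not MEME code: the function names are hypothetical, no pseudocounts are applied, and ties are broken in A, C, G, T order, which is an assumption that happens to reproduce the first row of the multilevel consensus for this motif.

```python
from collections import Counter

ALPHABET = "ACGT"

# Motif 2 sites, copied from the "Motif 2 in BLOCKS format" section.
MOTIF2_SITES = [
    "CCGACTGCGAATACGACGAT",  # 40280
    "CCGTTAGCTACATTGACACT",  # 49922
    "CCGATTGCTACCGCGACAAC",  # 31402
    "CCGATTTCAACAAGGACGAT",  # 14986
    "CCGGGAGCGCAGTAGACACT",  # 33150
    "CCGACAGCGAAAGTTGCCCG",  # 12456
]


def site_frequencies(sites):
    """Return one {letter: frequency} dict per alignment column."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same width")
    matrix = []
    for column in zip(*sites):
        counts = Counter(column)
        matrix.append({base: counts[base] / len(sites) for base in ALPHABET})
    return matrix


def consensus(matrix):
    """Most frequent letter per column; ties resolve in ACGT order (assumption)."""
    # max() returns the first maximal item, so iteration order decides ties.
    return "".join(max(ALPHABET, key=row.get) for row in matrix)


if __name__ == "__main__":
    ppm = site_frequencies(MOTIF2_SITES)
    for row in ppm:
        print("  ".join(f"{row[base]:.6f}" for base in ALPHABET))
    print(consensus(ppm))
```

Because the sketch uses raw frequencies, it is comparable to the letter-probability matrix only; the log-odds matrix additionally depends on the background frequencies and prior reported in the command line summary.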
********************************************************************************

********************************************************************************
MOTIF  3 MEME  width =  16  sites =   7  llr = 87  E-value = 1.6e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::9331:::::::::
pos.-specific     C  :47:464:1:1a:3::
probability       G  13::1::1::3:34a7
matrix            T  933111499a6:73:3

         bits    2.3               *
                 2.1            *  *
                 1.8          * *  *
                 1.6          * *  *
Relative         1.4 *      *** *  **
Entropy          1.1 * **   *** ** **
(17.9 bits)      0.9 * **   *** ** **
                 0.7 * ** * ****** **
                 0.5 **** ***********
                 0.2 ****************
                 0.0 ----------------

Multilevel           TCCACCCTTTTCTGGG
consensus             GT AAT   G GC T
sequence              T           T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
49922                       206  9.75e-09 CGGACTGGAT TGCAACCTTTGCTGGG ACCAGAATGA
40280                       435  2.51e-07 TTCACAGGCA TTCACATTTTTCGTGG GACTGAAGTT
14986                       342  5.43e-07 GTACAGGACA TCCATTCTTTTCTCGG CGTTGGCTAC
39081                       285  7.99e-07 CAACAACTGA TGTACAATTTGCTGGG AAATGCTGCC
31402                       460  1.05e-06 GGCCAACACC GTCACCTTTTTCTCGT AATTTCCGAA
33150                         3  4.34e-06         GC TCTTGCCTTTCCGGGG TGAGGATTCC
12456                        82  4.85e-06 TTTACAGTCA TCCAACTGCTTCTTGT TTTCGCGGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49922                             9.7e-09  205_[+3]_279
40280                             2.5e-07  434_[+3]_50
14986                             5.4e-07  341_[+3]_143
39081                               8e-07  284_[+3]_200
31402                               1e-06  459_[+3]_25
33150                             4.3e-06  2_[+3]_482
12456                             4.9e-06  81_[+3]_403
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=7
49922                    (  206) TGCAACCTTTGCTGGG  1
40280                    (  435) TTCACATTTTTCGTGG  1
14986                    (  342) TCCATTCTTTTCTCGG  1
39081                    (  285) TGTACAATTTGCTGGG  1
31402                    (  460) GTCACCTTTTTCTCGT  1
33150                    (    3) TCTTGCCTTTCCGGGG  1
12456                    (   82) TCCAACTGCTTCTTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 3395 bayes= 9.52561 E= 1.6e+003
  -945  -945   -51   161
  -945    87    49     3
  -945   161  -945     3
   160  -945  -945   -97
     2    87   -51   -97
     2   128  -945   -97
   -98    87  -945    61
  -945  -945   -51   161
  -945   -71  -945   161
  -945  -945  -945   183
  -945   -71    49   103
  -945   209  -945  -945
  -945  -945    49   135
  -945    29   107     3
  -945  -945   230  -945
  -945  -945   181     3
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 1.6e+003
0.000000  0.000000  0.142857  0.857143
0.000000  0.428571  0.285714  0.285714
0.000000  0.714286  0.000000  0.285714
0.857143  0.000000  0.000000  0.142857
0.285714  0.428571  0.142857  0.142857
0.285714  0.571429  0.000000  0.142857
0.142857  0.428571  0.000000  0.428571
0.000000  0.000000  0.142857  0.857143
0.000000  0.142857  0.000000  0.857143
0.000000  0.000000  0.000000  1.000000
0.000000  0.142857  0.285714  0.571429
0.000000  1.000000  0.000000  0.000000
0.000000  0.000000  0.285714  0.714286
0.000000  0.285714  0.428571  0.285714
0.000000  0.000000  1.000000  0.000000
0.000000  0.000000  0.714286  0.285714
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[CGT][CT]A[CA][CA][CT]TTT[TG]C[TG][GCT]G[GT]
--------------------------------------------------------------------------------

Time 1.28 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
14986                            1.22e-11  181_[+1(7.22e-09)]_8_[+1(8.65e-05)]_\
                                           112_[+3(5.43e-07)]_72_[+2(5.30e-08)]_51
39081                            1.14e-06  75_[+1(6.22e-08)]_189_\
                                           [+3(7.99e-07)]_200
40280                            5.77e-15  195_[+2(2.38e-09)]_106_\
                                           [+1(1.05e-10)]_93_[+3(2.51e-07)]_50
12456                            7.93e-11  81_[+3(4.85e-06)]_4_[+1(2.07e-09)]_\
                                           [+2(1.52e-07)]_359
31402                            5.11e-07  269_[+2(6.29e-05)]_58_\
                                           [+2(9.77e-09)]_92_[+3(1.05e-06)]_25
49922                            5.69e-14  74_[+1(1.51e-08)]_77_[+2(4.74e-09)]_\
                                           14_[+3(9.75e-09)]_279
33150                            1.42e-05  2_[+3(4.34e-06)]_18_[+2(1.14e-07)]_\
                                           444
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
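
The block diagrams encode site positions as spacer lengths separated by underscores, with each site written as [strand motif(p-value)]. The Python sketch below shows one way to turn a diagram back into 1-based start positions and a total sequence length. The grammar is inferred from the diagrams in this file rather than taken from a format specification, the function name is hypothetical, and the motif widths are the ones reported in the three MOTIF header lines.

```python
import re

# Motif widths as reported in the MOTIF header lines of this file.
MOTIF_WIDTHS = {1: 20, 2: 20, 3: 16}

# One site token: [+1] or [+1(7.22e-09)]; the p-value part is optional.
SITE = re.compile(r"\[([+-])(\d+)(?:\([^)]*\))?\]")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, length); each site is (motif, strand, 1-based start)."""
    # Remove line-continuation backslashes and any surrounding whitespace.
    compact = re.sub(r"[\\\s]", "", diagram)
    position = 0
    sites = []
    for token in compact.split("_"):
        if not token:
            continue
        match = SITE.fullmatch(token)
        if match:
            motif = int(match.group(2))
            sites.append((motif, match.group(1), position + 1))
            position += widths[motif]  # a site spans the full motif width
        else:
            position += int(token)  # plain number: spacer length
    return sites, position


if __name__ == "__main__":
    # Combined diagram for sequence 12456, including its line continuation.
    row = r"81_[+3(4.85e-06)]_4_[+1(2.07e-09)]_\ [+2(1.52e-07)]_359"
    sites, length = parse_diagram(row)
    for motif, strand, start in sites:
        print(f"motif {motif} strand {strand} start {start}")
    print(f"sequence length {length}")
```

A quick consistency check is that the returned length equals the 500 listed for every sequence in the TRAINING SET section, and that each start agrees with the Start column of the corresponding sites table.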