********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/480/480.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9180                     1.0000    500  6934                     1.0000    500
54015                    1.0000    500  14198                    1.0000    500
47730                    1.0000    500  22187                    1.0000    500
38526                    1.0000    500  1484                     1.0000    500
32923                    1.0000    500  15764                    1.0000    500
49550                    1.0000    500  16490                    1.0000    500
16854                    1.0000    500  44153                    1.0000    500
4209                     1.0000    500  50804                    1.0000    500
51703                    1.0000    500  44796                    1.0000    500
35627                    1.0000    500  35659                    1.0000    500
42543                    1.0000    500  38929                    1.0000    500
42845                    1.0000    500  40791                    1.0000    500
34790                    1.0000    500  46064                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/480/480.seqs.fa -oc motifs/480 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       26    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           13000    N=              26
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.256 C 0.259 G 0.236 T 0.248
Background letter frequencies (from dataset with add-one prior applied):
A 0.256 C 0.259 G 0.236 T 0.248
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  15  sites =  10  llr = 134  E-value = 2.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a:712:a22:3:a1:
pos.-specific     C  :a:::7:8:12a:2:
probability       G  ::1911::26:::5:
matrix            T  ::2:72::635::2a

         bits    2.1               *
                 1.9 **    *    ** *
                 1.7 ** *  *    ** *
                 1.5 ** *  *    ** *
Relative         1.2 ** *  **   ** *
Entropy          1.0 ** *  **   ** *
(19.3 bits)      0.8 ******** * ** *
                 0.6 ********** ** *
                 0.4 ************* *
                 0.2 ***************
                 0.0 ---------------

Multilevel           ACAGTCACTGTCAGT
consensus              T AT AATA  C
sequence                     G C  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
34790                       462  9.79e-09 ATGTTAAGGT ACAGTCACAGTCAGT CAATGACTGT
49550                       156  3.58e-08 ATGTAATATA ACAGTCAATGACAGT GAAGATTTAT
38929                       244  7.00e-08 CATTTGTTTC ACAGTCACTCACAGT CAGACTGAAA
35627                       281  1.55e-07 GCATTTTCTC ACAGTGACAGTCAGT GAAATACTGT
44153                       474  2.79e-07 ATCATCGGAC ACTGACACTGCCAGT GCGTGAGGTG
32923                       299  5.20e-07 ACAGCGAGAA ACAGGTACTGTCACT CTCCGACGGC
54015                       440  5.20e-07 ACCAAATCCC ACAGTCACGTACAAT TCTATTCTCT
42845                        27  7.56e-07 CCAAATGCTT ACAGTTAATGCCATT GCTACATAGG
1484                         32  1.32e-06 GATATTCTCG ACTATCACTTTCACT GGAAGACAGT
14198                       478  2.24e-06 GCCGCGGAAC ACGGACACGTTCATT TGGTGGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
34790                             9.8e-09  461_[+1]_24
49550                             3.6e-08  155_[+1]_330
38929                               7e-08  243_[+1]_242
35627                             1.6e-07  280_[+1]_205
44153                             2.8e-07  473_[+1]_12
32923                             5.2e-07  298_[+1]_187
54015                             5.2e-07  439_[+1]_46
42845                             7.6e-07  26_[+1]_459
1484                              1.3e-06  31_[+1]_454
14198                             2.2e-06  477_[+1]_8
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=15 seqs=10
34790                    (  462) ACAGTCACAGTCAGT  1
49550                    (  156) ACAGTCAATGACAGT  1
38929                    (  244) ACAGTCACTCACAGT  1
35627                    (  281) ACAGTGACAGTCAGT  1
44153                    (  474) ACTGACACTGCCAGT  1
32923                    (  299) ACAGGTACTGTCACT  1
54015                    (  440) ACAGTCACGTACAAT  1
42845                    (   27) ACAGTTAATGCCATT  1
1484                     (   32) ACTATCACTTTCACT  1
14198                    (  478) ACGGACACGTTCATT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 12636 bayes= 10.5539 E= 2.6e+000
   196   -997   -997   -997
  -997    195   -997   -997
   145   -997   -124    -31
  -135   -997    193   -997
   -36   -997   -124    150
  -997    143   -124    -31
   196   -997   -997   -997
   -36    162   -997   -997
   -36   -997    -24    127
  -997   -137    134     27
    23    -37   -997    101
  -997    195   -997   -997
   196   -997   -997   -997
  -135    -37    108    -31
  -997   -997   -997    201
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 10 E= 2.6e+000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.700000  0.000000  0.100000  0.200000
 0.100000  0.000000  0.900000  0.000000
 0.200000  0.000000  0.100000  0.700000
 0.000000  0.700000  0.100000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.800000  0.000000  0.000000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.100000  0.600000  0.300000
 0.300000  0.200000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.100000  0.200000  0.500000  0.200000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
AC[AT]G[TA][CT]A[CA][TAG][GT][TAC]CA[GCT]T
--------------------------------------------------------------------------------




Time  5.85 secs.

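Note on the two Motif 1 matrices: the log-odds entries printed above look close to 100 * log2(p / b), where p is the letter probability and b is the background frequency from the command line summary. The sketch below checks that reading against one row of the matrix. It is an illustration under that assumption only. The function name approx_log_odds is hypothetical, and the small prior that MEME applies before taking logs is ignored here, so results may differ from the printed scores by about one unit.

```python
import math

# Background letter frequencies as printed in the command line summary.
BACKGROUND = {"A": 0.256, "C": 0.259, "G": 0.236, "T": 0.248}


def approx_log_odds(prob, letter, background=BACKGROUND):
    """Approximate a log-odds score as round(100 * log2(prob / background)).

    Returns None for a zero probability, where the report prints a large
    negative sentinel (for example -997) instead of a finite score.
    """
    if prob <= 0.0:
        return None
    return round(100 * math.log2(prob / background[letter]))


# Third row of the Motif 1 letter-probability matrix (order A, C, G, T).
row3 = {"A": 0.7, "C": 0.0, "G": 0.1, "T": 0.2}
scores = {letter: approx_log_odds(p, letter) for letter, p in row3.items()}
print(scores)
```

The printed log-odds row for the same position is 145 -997 -124 -31, so the nonzero entries can be compared directly.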
********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   5  llr = 95  E-value = 2.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::82:::::42:24::2:4a
pos.-specific     C  :::8:::::4:6:4::::::
probability       G  :8::84:822648::a8:6:
matrix            T  a22:26a28:2::2a::a::

         bits    2.1 *     *       ** *
                 1.9 *     *       ** * *
                 1.7 *     *       ** * *
                 1.5 *     *       ** * *
Relative         1.2 ***** ***   * **** *
Entropy          1.0 *********  ** ******
(27.3 bits)      0.8 *********  ** ******
                 0.6 ********* *** ******
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           TGACGTTGTAGCGATGGTGA
consensus             TTATG TGCAGAC  A A
sequence                      GT  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            --------------------
15764                       302  6.85e-11 GTTTCATCGC TGACGTTGTCGCAATGGTGA CAATACCGGT
9180                        108  1.16e-10 ATTCCCTTGT TGACGTTGTAGGGATGATGA CTTGGTGGCT
32923                       189  2.31e-09 CTTTCGTAAT TGACTGTGTATCGTTGGTAA AAAAAAACGA
49550                        71  3.57e-09 CTGGGGGAAA TGAAGTTTGCGCGCTGGTAA AAAAGTGATG
44153                       312  6.15e-09 AACAATCTCT TTTCGGTGTGAGGCTGGTGA AGAACGGGCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
15764                             6.8e-11  301_[+2]_179
9180                              1.2e-10  107_[+2]_373
32923                             2.3e-09  188_[+2]_292
49550                             3.6e-09  70_[+2]_410
44153                             6.2e-09  311_[+2]_169
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=5
15764                    (  302) TGACGTTGTCGCAATGGTGA  1
9180                     (  108) TGACGTTGTAGGGATGATGA  1
32923                    (  189) TGACTGTGTATCGTTGGTAA  1
49550                    (   71) TGAAGTTTGCGCGCTGGTAA  1
44153                    (  312) TTTCGGTGTGAGGCTGGTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 12506 bayes= 11.5395 E= 2.7e+002
  -897   -897   -897    201
  -897   -897    176    -31
   164   -897   -897    -31
   -36    162   -897   -897
  -897   -897    176    -31
  -897   -897     76    127
  -897   -897   -897    201
  -897   -897    176    -31
  -897   -897    -24    169
    64     62    -24   -897
   -36   -897    134    -31
  -897    121     76   -897
   -36   -897    176   -897
    64     62   -897    -31
  -897   -897   -897    201
  -897   -897    208   -897
   -36   -897    176   -897
  -897   -897   -897    201
    64   -897    134   -897
   196   -897   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 5 E= 2.7e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.800000  0.200000
 0.800000  0.000000  0.000000  0.200000
 0.200000  0.800000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  0.400000  0.600000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  0.200000  0.800000
 0.400000  0.400000  0.200000  0.000000
 0.200000  0.000000  0.600000  0.200000
 0.000000  0.600000  0.400000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.400000  0.400000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.400000  0.000000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[GT][AT][CA][GT][TG]T[GT][TG][ACG][GAT][CG][GA][ACT]TG[GA]T[GA]A
--------------------------------------------------------------------------------




Time 11.86 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =  11  llr = 139  E-value = 3.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :1:46842a5218:a8
pos.-specific     C  :1:21255::8::3::
probability       G  185:1:23:3:9:7::
matrix            T  9:552::::2::2::2

         bits    2.1
                 1.9         *     *
                 1.7 *       *  *  *
                 1.5 *       *  *  *
Relative         1.2 **   *  * ******
Entropy          1.0 ***  *  * ******
(18.3 bits)      0.8 ***  *  * ******
                 0.6 ***  * *********
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TGTTAACCAACGAGAA
consensus              GA  AG G   C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
32923                       283  9.03e-09 ACAGACGTGG TGGAAAACAGCGAGAA ACAGGTACTG
44796                       313  1.60e-07 TCACAAATAT TGGTACGGAACGAGAA TTTCCTGTTA
42543                       390  1.78e-07 AATCCCGTCC TGGAACAAAACGAGAA CTTTGCGAAC
9180                        249  1.78e-07 TCGATCTAAC TGTAAACGAGCGAGAT ATAATCCTGC
35627                       149  6.63e-07 TTCGAGTCAC TGTCAAGCATCGTGAA TTTAGAGTAC
46064                       340  7.25e-07 GTCACGCACG TCGTGACCAACGAGAA GACACGCTAC
51703                       253  7.84e-07 CTCTGGTCGG TGGTCAACAGAGAGAA CGTGAGTGAA
50804                       352  8.52e-07 CGACAACGAG TATCAAACAACGACAA GAAGCCGTAT
42845                       352  1.17e-06 GAATGACGCA GGTTAACAAAAGAGAA ACTTGATGAC
22187                       197  1.81e-06 GGAAAATGAG TGTTTACCAACGTCAT CATTTTGTCG
44153                       291  5.18e-06 ACACAATACG TGTATACGATCAACAA TCTCTTTTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32923                               9e-09  282_[+3]_202
44796                             1.6e-07  312_[+3]_172
42543                             1.8e-07  389_[+3]_95
9180                              1.8e-07  248_[+3]_236
35627                             6.6e-07  148_[+3]_336
46064                             7.2e-07  339_[+3]_145
51703                             7.8e-07  252_[+3]_232
50804                             8.5e-07  351_[+3]_133
42845                             1.2e-06  351_[+3]_133
22187                             1.8e-06  196_[+3]_288
44153                             5.2e-06  290_[+3]_194
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=11
32923                    (  283) TGGAAAACAGCGAGAA  1
44796                    (  313) TGGTACGGAACGAGAA  1
42543                    (  390) TGGAACAAAACGAGAA  1
9180                     (  249) TGTAAACGAGCGAGAT  1
35627                    (  149) TGTCAAGCATCGTGAA  1
46064                    (  340) TCGTGACCAACGAGAA  1
51703                    (  253) TGGTCAACAGAGAGAA  1
50804                    (  352) TATCAAACAACGACAA  1
42845                    (  352) GGTTAACAAAAGAGAA  1
22187                    (  197) TGTTTACCAACGTCAT  1
44153                    (  291) TGTATACGATCAACAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 12610 bayes= 10.517 E= 3.2e+002
 -1010  -1010   -138    187
  -149   -151    179  -1010
 -1010  -1010     94    114
    51    -51  -1010     87
   131   -151   -138    -45
   167    -51  -1010  -1010
    51     81    -38  -1010
   -49    107     21  -1010
   196  -1010  -1010  -1010
   109  -1010     21    -45
   -49    166  -1010  -1010
  -149  -1010    194  -1010
   167  -1010  -1010    -45
 -1010      7    162  -1010
   196  -1010  -1010  -1010
   167  -1010  -1010    -45
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 3.2e+002
 0.000000  0.000000  0.090909  0.909091
 0.090909  0.090909  0.818182  0.000000
 0.000000  0.000000  0.454545  0.545455
 0.363636  0.181818  0.000000  0.454545
 0.636364  0.090909  0.090909  0.181818
 0.818182  0.181818  0.000000  0.000000
 0.363636  0.454545  0.181818  0.000000
 0.181818  0.545455  0.272727  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.545455  0.000000  0.272727  0.181818
 0.181818  0.818182  0.000000  0.000000
 0.090909  0.000000  0.909091  0.000000
 0.818182  0.000000  0.000000  0.181818
 0.000000  0.272727  0.727273  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.818182  0.000000  0.000000  0.181818
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TG[TG][TA]AA[CA][CG]A[AG]CGA[GC]AA
--------------------------------------------------------------------------------




Time 17.68 secs.

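Note on the "Relative Entropy" figure: the value reported in the Motif 3 description (18.3 bits) appears to be recoverable by summing p * log2(p / b) over every cell of the letter-probability matrix, with b taken from the background frequencies in the command line summary. The sketch below is a hedged check of that reading, not MEME's own code. The names MOTIF3 and relative_entropy_bits are hypothetical, and the matrix is copied from this report.

```python
import math

# Background letter frequencies (A, C, G, T) from the command line summary.
BACKGROUND = (0.256, 0.259, 0.236, 0.248)

# Motif 3 letter-probability matrix, one row per motif position (A, C, G, T).
MOTIF3 = [
    (0.000000, 0.000000, 0.090909, 0.909091),
    (0.090909, 0.090909, 0.818182, 0.000000),
    (0.000000, 0.000000, 0.454545, 0.545455),
    (0.363636, 0.181818, 0.000000, 0.454545),
    (0.636364, 0.090909, 0.090909, 0.181818),
    (0.818182, 0.181818, 0.000000, 0.000000),
    (0.363636, 0.454545, 0.181818, 0.000000),
    (0.181818, 0.545455, 0.272727, 0.000000),
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.545455, 0.000000, 0.272727, 0.181818),
    (0.181818, 0.818182, 0.000000, 0.000000),
    (0.090909, 0.000000, 0.909091, 0.000000),
    (0.818182, 0.000000, 0.000000, 0.181818),
    (0.000000, 0.272727, 0.727273, 0.000000),
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.818182, 0.000000, 0.000000, 0.181818),
]


def relative_entropy_bits(matrix, background=BACKGROUND):
    """Sum p * log2(p / b) over all cells; zero-probability cells add nothing."""
    total = 0.0
    for row in matrix:
        for p, b in zip(row, background):
            if p > 0.0:
                total += p * math.log2(p / b)
    return total


print(f"{relative_entropy_bits(MOTIF3):.2f} bits")
```

The result should land close to the 18.3 bits shown in the description block. Small differences are expected because the report rounds its probabilities and applies a prior.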
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9180                             7.80e-10  107_[+2(1.16e-10)]_121_\
    [+3(1.78e-07)]_236
6934                             4.30e-01  500
54015                            8.15e-03  3_[+1(3.41e-05)]_85_[+1(8.25e-06)]_\
    321_[+1(5.20e-07)]_46
14198                            1.09e-02  477_[+1(2.24e-06)]_8
47730                            9.02e-01  500
22187                            1.34e-02  196_[+3(1.81e-06)]_288
38526                            8.63e-01  500
1484                             2.76e-03  31_[+1(1.32e-06)]_454
32923                            7.67e-13  188_[+2(2.31e-09)]_74_\
    [+3(9.03e-09)]_[+1(5.20e-07)]_187
15764                            7.93e-08  301_[+2(6.85e-11)]_179
49550                            2.53e-09  70_[+2(3.57e-09)]_65_[+1(3.58e-08)]_\
    330
16490                            9.23e-01  500
16854                            6.75e-01  500
44153                            4.12e-10  290_[+3(5.18e-06)]_5_[+2(6.15e-09)]_\
    142_[+1(2.79e-07)]_12
4209                             8.37e-01  500
50804                            1.65e-03  351_[+3(8.52e-07)]_133
51703                            6.07e-03  252_[+3(7.84e-07)]_232
44796                            8.05e-04  312_[+3(1.60e-07)]_172
35627                            1.84e-06  148_[+3(6.63e-07)]_116_\
    [+1(1.55e-07)]_205
35659                            2.05e-01  500
42543                            4.21e-04  362_[+3(3.03e-05)]_11_\
    [+3(1.78e-07)]_95
38929                            1.05e-03  243_[+1(7.00e-08)]_242
42845                            7.81e-06  26_[+1(7.56e-07)]_310_\
    [+3(1.17e-06)]_133
40791                            3.84e-01  500
34790                            4.33e-05  461_[+1(9.79e-09)]_24
46064                            1.03e-03  339_[+3(7.25e-07)]_145
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
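
Note on reading the block diagrams: in each MOTIF DIAGRAM string, the plain numbers are spacer lengths and each bracketed token such as [+1] or [+2(2.31e-09)] stands for one motif occurrence. Spacers plus motif widths should therefore add up to the sequence length (500 for every sequence in this training set). The sketch below is a small consistency check under that reading. The name diagram_length is hypothetical, the widths come from the three MOTIF header lines, and any diagram wrapped with a trailing backslash has to be joined into one string first.

```python
import re

# Motif widths as given in the MOTIF 1, 2 and 3 header lines of this report.
MOTIF_WIDTHS = {1: 15, 2: 20, 3: 16}

# A motif token: strand sign, motif number, optional p-value in parentheses.
MOTIF_TOKEN = re.compile(r"\[[+-](\d+)(?:\([^)]*\))?\]")


def diagram_length(diagram, widths=MOTIF_WIDTHS):
    """Return total sequence length implied by a block diagram string."""
    total = 0
    for token in diagram.split("_"):
        if not token:
            continue  # tolerate a stray empty piece
        match = MOTIF_TOKEN.fullmatch(token)
        if match:
            total += widths[int(match.group(1))]  # motif occurrence
        else:
            total += int(token)  # spacer of unmatched positions
    return total


# Sequence 32923, with its two diagram lines joined.
combined = "188_[+2(2.31e-09)]_74_[+3(9.03e-09)]_[+1(5.20e-07)]_187"
print(diagram_length(combined))
```

A result other than the sequence length listed in the TRAINING SET table would suggest a copy error in the diagram or a wrong motif width.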