********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/93/93.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
36312                    1.0000    500  46579                    1.0000    500
21608                    1.0000    500  5527                     1.0000    500
48645                    1.0000    500  48660                    1.0000    500
4593                     1.0000    500  16421                    1.0000    500
16719                    1.0000    500  45088                    1.0000    500
11781                    1.0000    500  45912                    1.0000    500
7107                     1.0000    500  48792                    1.0000    500
36311                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/93/93.seqs.fa -oc motifs/93 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 15 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 7500 N= 15 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.271 C 0.237 G 0.223 T 0.269 Background letter frequencies (from dataset with add-one prior applied): A 0.271 C 0.237 G 0.223 T 0.269 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 16 sites = 8 llr = 116 E-value = 2.7e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A :35::5a9::41:::: pos.-specific C 91::a::::a5:346: probability G :3:a:5:14:1:5::a matrix T 145:::::6::9364: bits 2.2 ** * * 2.0 ** * * * 1.7 ** * * * 1.5 * ** * * * Relative 1.3 * ** ** * * * Entropy 1.1 * ******* * *** (20.9 bits) 0.9 * ******** * *** 0.7 * ************** 0.4 * ************** 0.2 * ************** 0.0 ---------------- Multilevel CTAGCAAATCCTGTCG consensus AT G G A CCT sequence G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 7107 75 2.65e-09 CTATTCATGT CAAGCGAATCCTGTCG TATGACAAGA 16719 
131 9.94e-09 CTGACGGCTC CTTGCAAATCCTGTTG AGACCGCCGT 4593 131 9.94e-09 CTGACGGCTC CTTGCAAATCCTGTTG AGACCGCCGT 48645 53 1.72e-07 TGATGGACAC CCAGCGAAGCATTTCG ACCAGGAAAA 46579 48 1.72e-07 AGAAACATTC CAAGCAAAGCATCCCG CCTAATGTAA 36311 63 3.18e-07 GTATGGAAAT CTTGCGAGTCATCCCG GTAATTAAAG 45088 249 4.33e-07 TCGATACTAT CGTGCGAAGCGTTCTG CTTCGGATCC 45912 235 7.07e-07 ACCGTGTGGA TGAGCAAATCCAGTCG TAGTGGATAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 7107 2.7e-09 74_[+1]_410 16719 9.9e-09 130_[+1]_354 4593 9.9e-09 130_[+1]_354 48645 1.7e-07 52_[+1]_432 46579 1.7e-07 47_[+1]_437 36311 3.2e-07 62_[+1]_422 45088 4.3e-07 248_[+1]_236 45912 7.1e-07 234_[+1]_250 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=16 seqs=8 7107 ( 75) CAAGCGAATCCTGTCG 1 16719 ( 131) CTTGCAAATCCTGTTG 1 4593 ( 131) CTTGCAAATCCTGTTG 1 48645 ( 53) CCAGCGAAGCATTTCG 1 46579 ( 48) CAAGCAAAGCATCCCG 1 36311 ( 63) CTTGCGAGTCATCCCG 1 45088 ( 249) CGTGCGAAGCGTTCTG 1 45912 ( 235) TGAGCAAATCCAGTCG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 7275 bayes= 9.82714 E= 2.7e-001 -965 189 -965 -111 -12 -92 17 48 88 -965 -965 89 -965 -965 217 -965 -965 208 -965 -965 88 -965 117 -965 188 -965 -965 -965 169 -965 -83 
-965 -965 -965 75 121 -965 208 -965 -965 47 108 -83 -965 -112 -965 -965 170 -965 8 117 -11 -965 66 -965 121 -965 140 -965 48 -965 -965 217 -965 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 16 nsites= 8 E= 2.7e-001 0.000000 0.875000 0.000000 0.125000 0.250000 0.125000 0.250000 0.375000 0.500000 0.000000 0.000000 0.500000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.500000 0.000000 0.500000 0.000000 1.000000 0.000000 0.000000 0.000000 0.875000 0.000000 0.125000 0.000000 0.000000 0.000000 0.375000 0.625000 0.000000 1.000000 0.000000 0.000000 0.375000 0.500000 0.125000 0.000000 0.125000 0.000000 0.000000 0.875000 0.000000 0.250000 0.500000 0.250000 0.000000 0.375000 0.000000 0.625000 0.000000 0.625000 0.000000 0.375000 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- C[TAG][AT]GC[AG]AA[TG]C[CA]T[GCT][TC][CT]G -------------------------------------------------------------------------------- Time 1.81 secs. 
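The "Relative Entropy (20.9 bits)" figure in the Motif 1 description can be cross-checked against the position-specific probability matrix. The sketch below assumes that the figure is the sum, over all 16 columns, of p * log2(p / background), using the background letter frequencies printed in the command line summary. The function and variable names are illustrative and are not part of MEME.

```python
import math

# Background letter frequencies copied from the COMMAND LINE SUMMARY.
BACKGROUND = {"A": 0.271, "C": 0.237, "G": 0.223, "T": 0.269}

# Motif 1 letter-probability matrix: one row per position, columns A C G T.
MOTIF_1_PPM = [
    [0.000, 0.875, 0.000, 0.125],
    [0.250, 0.125, 0.250, 0.375],
    [0.500, 0.000, 0.000, 0.500],
    [0.000, 0.000, 1.000, 0.000],
    [0.000, 1.000, 0.000, 0.000],
    [0.500, 0.000, 0.500, 0.000],
    [1.000, 0.000, 0.000, 0.000],
    [0.875, 0.000, 0.125, 0.000],
    [0.000, 0.000, 0.375, 0.625],
    [0.000, 1.000, 0.000, 0.000],
    [0.375, 0.500, 0.125, 0.000],
    [0.125, 0.000, 0.000, 0.875],
    [0.000, 0.250, 0.500, 0.250],
    [0.000, 0.375, 0.000, 0.625],
    [0.000, 0.625, 0.000, 0.375],
    [0.000, 0.000, 1.000, 0.000],
]


def relative_entropy_bits(ppm, background, alphabet="ACGT"):
    """Sum p * log2(p / background) over every column and letter.

    Zero probabilities contribute nothing (the limit of p*log p as p -> 0).
    """
    total = 0.0
    for row in ppm:
        for letter, p in zip(alphabet, row):
            if p > 0.0:
                total += p * math.log2(p / background[letter])
    return total


motif_1_bits = relative_entropy_bits(MOTIF_1_PPM, BACKGROUND)
print(f"Motif 1 relative entropy: {motif_1_bits:.1f} bits")
```

Under these assumptions the result agrees with the reported value to one decimal place. The same function can be applied to the matrices of Motifs 2 and 3.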
******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 20 sites = 7 llr = 119 E-value = 5.0e-001 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :::a:6:::31:7149a999 pos.-specific C :1a:4:1:a:17:431:::: probability G :3::6:93::43:33:::11 matrix T a6:::4:7:73:31:::1:: bits 2.2 * * 2.0 * ** * * 1.7 * ** * * 1.5 * ** * * * Relative 1.3 * ** * * * ***** Entropy 1.1 * *** **** ** ***** (24.6 bits) 0.9 * ******** ** ***** 0.7 ********** ** ***** 0.4 ********** ** ****** 0.2 ******************** 0.0 -------------------- Multilevel TTCAGAGTCTGCACAAAAAA consensus G CT G ATGTGC sequence G -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 16719 85 1.80e-10 CTCCAGGATC TGCAGTGGCTGCACAAAAAA GTGTCAGTAT 4593 85 1.80e-10 CTCCAGGATC TGCAGTGGCTGCACAAAAAA GTATCAGTAT 48792 23 2.40e-09 AGAAGCCAAA TTCACAGTCAACAGAAAAAA TAACATATTG 48645 389 4.60e-09 CCTTCGTTCA TTCACAGTCATCATCAAAAA ACCCAGCGTA 48660 96 1.84e-08 ACGGCAATTG TCCAGAGTCTTCTCGAATAA TTGAGATTTG 21608 426 4.51e-08 TTGCATAACG TTCACTGTCTCGAACAAAAG TCGATCTTAC 36311 415 1.11e-07 ACCCGGCACG TTCAGACTCTGGTGGCAAGA TGTGGATAAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams 
-------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 16719 1.8e-10 84_[+2]_396 4593 1.8e-10 84_[+2]_396 48792 2.4e-09 22_[+2]_458 48645 4.6e-09 388_[+2]_92 48660 1.8e-08 95_[+2]_385 21608 4.5e-08 425_[+2]_55 36311 1.1e-07 414_[+2]_66 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=20 seqs=7 16719 ( 85) TGCAGTGGCTGCACAAAAAA 1 4593 ( 85) TGCAGTGGCTGCACAAAAAA 1 48792 ( 23) TTCACAGTCAACAGAAAAAA 1 48645 ( 389) TTCACAGTCATCATCAAAAA 1 48660 ( 96) TCCAGAGTCTTCTCGAATAA 1 21608 ( 426) TTCACTGTCTCGAACAAAAG 1 36311 ( 415) TTCAGACTCTGGTGGCAAGA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 7215 bayes= 9.85175 E= 5.0e-001 -945 -945 -945 189 -945 -73 36 108 -945 208 -945 -945 188 -945 -945 -945 -945 86 136 -945 107 -945 -945 67 -945 -73 194 -945 -945 -945 36 141 -945 208 -945 -945 7 -945 -945 141 -92 -73 94 8 -945 159 36 -945 140 -945 -945 8 -92 86 36 -91 66 27 36 -945 166 -73 -945 -945 188 -945 -945 -945 166 -945 -945 -91 166 -945 -64 -945 166 -945 -64 -945 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 5.0e-001 0.000000 0.000000 0.000000 1.000000 0.000000 0.142857 
0.285714 0.571429 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.428571 0.571429 0.000000 0.571429 0.000000 0.000000 0.428571 0.000000 0.142857 0.857143 0.000000 0.000000 0.000000 0.285714 0.714286 0.000000 1.000000 0.000000 0.000000 0.285714 0.000000 0.000000 0.714286 0.142857 0.142857 0.428571 0.285714 0.000000 0.714286 0.285714 0.000000 0.714286 0.000000 0.000000 0.285714 0.142857 0.428571 0.285714 0.142857 0.428571 0.285714 0.285714 0.000000 0.857143 0.142857 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.857143 0.000000 0.000000 0.142857 0.857143 0.000000 0.142857 0.000000 0.857143 0.000000 0.142857 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- T[TG]CA[GC][AT]G[TG]C[TA][GT][CG][AT][CG][ACG]AAAAA -------------------------------------------------------------------------------- Time 3.72 secs. 
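The Motif 2 regular expression is a simplification of the probability matrix, so it need not match every site that MEME assigned to the motif. The sketch below tests the expression against the seven sites listed in the Motif 2 BLOCKS section using Python's standard `re` module. The dictionary and variable names are illustrative.

```python
import re

# Regular expression exactly as printed in the Motif 2 section.
MOTIF_2_REGEX = re.compile(
    r"T[TG]CA[GC][AT]G[TG]C[TA][GT][CG][AT][CG][ACG]AAAAA"
)

# Sites copied from "Motif 2 in BLOCKS format" (sequence name -> site).
MOTIF_2_SITES = {
    "16719": "TGCAGTGGCTGCACAAAAAA",
    "4593": "TGCAGTGGCTGCACAAAAAA",
    "48792": "TTCACAGTCAACAGAAAAAA",
    "48645": "TTCACAGTCATCATCAAAAA",
    "48660": "TCCAGAGTCTTCTCGAATAA",
    "21608": "TTCACTGTCTCGAACAAAAG",
    "36311": "TTCAGACTCTGGTGGCAAGA",
}

# Names of the sequences whose site matches the expression in full.
matching = [
    name for name, site in MOTIF_2_SITES.items()
    if MOTIF_2_REGEX.fullmatch(site)
]
print(f"{len(matching)} of {len(MOTIF_2_SITES)} sites match: {matching}")
```

Only the two identical sites in sequences 16719 and 4593 match in full. The other five each contain at least one letter that the expression leaves out at that position, so the expression is better suited to a quick scan than to recovering every reported site.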
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 21 sites = 5 llr = 101 E-value = 1.0e+000 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A 4::a24:8:222:862a::6: pos.-specific C 628:::::::62:248:::2: probability G :8::8:a:a8:6a::::88:a matrix T ::2::6:2::2::::::222: bits 2.2 * * * * 2.0 * * * * * * 1.7 * * * * * * 1.5 * * * * * * * Relative 1.3 **** * ** ** **** * Entropy 1.1 ***** **** ******* * (29.3 bits) 0.9 ********** ******* * 0.7 ********************* 0.4 ********************* 0.2 ********************* 0.0 --------------------- Multilevel CGCAGTGAGGCGGAACAGGAG consensus ACT AA T AAA CCA TTC sequence TC T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 16719 198 1.02e-13 AGAAAACATT CGCAGTGAGGCGGAACAGGAG TTGCGGACTT 4593 198 3.49e-12 AGAAAACATT CGCAATGAGGCGGAACAGGAG TTGCGGACTT 48645 246 1.10e-09 TCAAAACTAG AGTAGAGAGGTGGCACAGGCG AGGTGTTCAA 11781 105 3.60e-09 GAATAAGTAG CCCAGAGTGAAAGACCAGGAG GAGCCACAGG 5527 20 3.78e-09 ATGGTCGGTG AGCAGTGAGGCCGACAATTTG CCTGTGAGTA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- 
---------------- ------------- 16719 1e-13 197_[+3]_282 4593 3.5e-12 197_[+3]_282 48645 1.1e-09 245_[+3]_234 11781 3.6e-09 104_[+3]_375 5527 3.8e-09 19_[+3]_460 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=21 seqs=5 16719 ( 198) CGCAGTGAGGCGGAACAGGAG 1 4593 ( 198) CGCAATGAGGCGGAACAGGAG 1 48645 ( 246) AGTAGAGAGGTGGCACAGGCG 1 11781 ( 105) CCCAGAGTGAAAGACCAGGAG 1 5527 ( 20) AGCAGTGAGGCCGACAATTTG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 7200 bayes= 10.7426 E= 1.0e+000 56 134 -897 -897 -897 -24 184 -897 -897 176 -897 -43 188 -897 -897 -897 -44 -897 184 -897 56 -897 -897 115 -897 -897 216 -897 156 -897 -897 -43 -897 -897 216 -897 -44 -897 184 -897 -44 134 -897 -43 -44 -24 143 -897 -897 -897 216 -897 156 -24 -897 -897 114 76 -897 -897 -44 176 -897 -897 188 -897 -897 -897 -897 -897 184 -43 -897 -897 184 -43 114 -24 -897 -43 -897 -897 216 -897 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 1.0e+000 0.400000 0.600000 0.000000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 0.800000 0.000000 0.200000 1.000000 0.000000 0.000000 0.000000 0.200000 0.000000 0.800000 0.000000 0.400000 0.000000 0.000000 0.600000 0.000000 0.000000 1.000000 0.000000 0.800000 0.000000 
0.000000 0.200000 0.000000 0.000000 1.000000 0.000000 0.200000 0.000000 0.800000 0.000000 0.200000 0.600000 0.000000 0.200000 0.200000 0.200000 0.600000 0.000000 0.000000 0.000000 1.000000 0.000000 0.800000 0.200000 0.000000 0.000000 0.600000 0.400000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.800000 0.200000 0.000000 0.000000 0.800000 0.200000 0.600000 0.200000 0.000000 0.200000 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- [CA][GC][CT]A[GA][TA]G[AT]G[GA][CAT][GAC]G[AC][AC][CA]A[GT][GT][ACT]G -------------------------------------------------------------------------------- Time 5.33 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 36312 2.07e-01 500 46579 4.17e-03 47_[+1(1.72e-07)]_437 21608 5.06e-04 425_[+2(4.51e-08)]_55 5527 6.12e-05 19_[+3(3.78e-09)]_460 48645 7.01e-14 52_[+1(1.72e-07)]_177_\ [+3(1.10e-09)]_122_[+2(4.60e-09)]_92 48660 3.97e-04 95_[+2(1.84e-08)]_385 4593 8.63e-19 84_[+2(1.80e-10)]_26_[+1(9.94e-09)]_\ 51_[+3(3.49e-12)]_282 16421 4.39e-01 500 16719 2.90e-20 84_[+2(1.80e-10)]_26_[+1(9.94e-09)]_\ 51_[+3(1.02e-13)]_282 45088 1.83e-03 248_[+1(4.33e-07)]_236 11781 2.21e-05 104_[+3(3.60e-09)]_375 45912 
9.72e-04  234_[+1(7.07e-07)]_250
7107                             3.69e-05  74_[+1(2.65e-09)]_410
48792                            9.64e-05  22_[+2(2.40e-09)]_458
36311                            9.00e-07  62_[+1(3.18e-07)]_336_\
    [+2(1.11e-07)]_66
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
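
Each entry in the combined block diagrams alternates gap lengths with motif occurrences of the form [+motif(p-value)]. The sketch below converts the diagram for sequence 16719 into 1-based start positions, assuming the motif widths reported above (16, 20 and 21) and that the trailing backslash continuation has already been joined into one string. The parser and its names are illustrative and are not part of MEME.

```python
import re

# Motif widths taken from the MOTIF 1-3 headers in this file.
MOTIF_WIDTHS = {1: 16, 2: 20, 3: 21}

# A token is either a motif occurrence "[+2(1.80e-10)]" or a gap length "84".
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]|(\d+)")


def parse_diagram(diagram, widths):
    """Return ([(motif, strand, start, p_value), ...], total_length)."""
    position = 0  # number of sequence letters consumed so far
    sites = []
    for match in TOKEN.finditer(diagram):
        if match.group(4) is not None:
            position += int(match.group(4))  # a gap between sites
        else:
            motif = int(match.group(2))
            strand = match.group(1)
            p_value = float(match.group(3))
            sites.append((motif, strand, position + 1, p_value))
            position += widths[motif]
    return sites, position


# Diagram for sequence 16719, with the line continuation removed.
diagram_16719 = "84_[+2(1.80e-10)]_26_[+1(9.94e-09)]_51_[+3(1.02e-13)]_282"
sites, length = parse_diagram(diagram_16719, MOTIF_WIDTHS)
for motif, strand, start, p_value in sites:
    print(f"motif {motif} strand {strand} start {start} p={p_value:.2e}")
print(f"sequence length: {length}")
```

The start positions recovered this way (85, 131 and 198) agree with the per-motif site tables, and the gaps plus motif widths sum to the 500-letter sequence length listed in the training set.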