********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/236/236.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
43034                    1.0000    500  46436                    1.0000    500
48492                    1.0000    500  48495                    1.0000    500
15563                    1.0000    500  45503                    1.0000    500
45564                    1.0000    500  38937                    1.0000    500
43146                    1.0000    500  37305                    1.0000    500
46783                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/236/236.seqs.fa -oc motifs/236 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.283 C 0.246 G 0.217 T 0.253
Background letter frequencies (from dataset with add-one prior applied):
A 0.283 C 0.246 G 0.217 T 0.253
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 12 sites = 9 llr = 95 E-value = 2.9e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :a:21a:9::83
pos.-specific     C  1:28::1:::22
probability       G  2:8:6:9:a6:2
matrix            T  7:::3::1:4:2

         bits    2.2         *
                 2.0         *
                 1.8  *   ** *
                 1.5  *   ** *
Relative         1.3  *** ****
Entropy          1.1  *** ******
(15.2 bits)      0.9 **** ******
                 0.7 ***********
                 0.4 ***********
                 0.2 ***********
                 0.0 ------------

Multilevel           TAGCGAGAGGAA
consensus            G CAT    TCC
sequence                        G
                                T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
45564                       152  3.67e-07 CCTACCTACC TAGCTAGAGGAA TATCCGGAAC
48492                       456  4.79e-07 TCAACAAAAT TAGCGAGAGTAT CTGAAAATTC
46783                       316  7.96e-07 AACGCTACAG TAGCTAGAGTAA GCCATTTAAA
38937                       398  3.25e-06 GTGTCTGAGA CAGCGAGAGGAT GGTCGAACAA
43146                       214  4.59e-06 CTTTTGAAGA GAGAGAGAGGAA AACACTCTTC
48495                       237  7.10e-06 TTATTCGAAC TAGCTAGTGGAG CGTTCCCATC
46436                       246  1.84e-05 CAAAAATCGA TAGCAAGAGTCC GTTCTATGTA
15563                       209  2.76e-05 GACGATCATC TACCGACAGTAC CGTCAACGCG
43034                       358  4.05e-05 ATTGGAATTC GACAGAGAGGCG TTAATCATGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45564                             3.7e-07  151_[+1]_337
48492                             4.8e-07  455_[+1]_33
46783                               8e-07  315_[+1]_173
38937                             3.2e-06  397_[+1]_91
43146                             4.6e-06  213_[+1]_275
48495                             7.1e-06  236_[+1]_252
46436                             1.8e-05  245_[+1]_243
15563                             2.8e-05  208_[+1]_280
43034                               4e-05  357_[+1]_131
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
45564                    (  152) TAGCTAGAGGAA  1
48492                    (  456) TAGCGAGAGTAT  1
46783                    (  316) TAGCTAGAGTAA  1
38937                    (  398) CAGCGAGAGGAT  1
43146                    (  214) GAGAGAGAGGAA  1
48495                    (  237) TAGCTAGTGGAG  1
46436                    (  246) TAGCAAGAGTCC  1
15563                    (  209) TACCGACAGTAC  1
43034                    (  358) GACAGAGAGGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 5379 bayes= 9.3553 E= 2.9e+002
  -982   -115      3    140
   182   -982   -982   -982
  -982    -15    184   -982
   -35    166   -982   -982
  -135   -982    135     40
   182   -982   -982   -982
  -982   -115    203   -982
   165   -982   -982   -118
  -982   -982    220   -982
  -982   -982    135     81
   146    -15   -982   -982
    23    -15      3    -19
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 2.9e+002
 0.000000  0.111111  0.222222  0.666667
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.222222  0.777778  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.111111  0.000000  0.555556  0.333333
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.111111  0.888889  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.555556  0.444444
 0.777778  0.222222  0.000000  0.000000
 0.333333  0.222222  0.222222  0.222222
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[TG]A[GC][CA][GT]AGAG[GT][AC][ACGT]
--------------------------------------------------------------------------------

Time  1.14 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 14 sites = 5 llr = 71 E-value = 5.6e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :28:::22::2:::
pos.-specific     C  ::2:::24a:::a8
probability       G  a8:8::44:26a::
matrix            T  :::2aa2::82::2

         bits    2.2 *          *
                 2.0 *   **  *  **
                 1.8 *   **  *  **
                 1.5 *  ***  *  **
Relative         1.3 ** ***  ** ***
Entropy          1.1 ******  ** ***
(20.3 bits)      0.9 ******  ** ***
                 0.7 ******  ******
                 0.4 ****** *******
                 0.2 **************
                 0.0 --------------

Multilevel           GGAGTTGCCTGGCC
consensus             ACT  AG GA  T
sequence                   CA  T
                           T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            --------------
45503                       218  3.30e-09 AAGAATTTTA GGAGTTGCCTGGCC AACTCCGTCT
45564                       370  5.49e-08 TACTCGCATC GGCGTTCGCTGGCC TGACTTATTC
46783                       239  2.40e-07 TGAATACGAA GGATTTGGCTGGCT CACAAGCTTT
37305                       342  3.61e-07 AAGCTTGAAA GGAGTTTCCGAGCC ACACTCGAAA
43034                       408  9.16e-07 TTTCGTCATC GAAGTTAACTTGCC ACAACAGAGA
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45503                             3.3e-09  217_[+2]_269
45564                             5.5e-08  369_[+2]_117
46783                             2.4e-07  238_[+2]_248
37305                             3.6e-07  341_[+2]_145
43034                             9.2e-07  407_[+2]_79
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=5
45503                    (  218) GGAGTTGCCTGGCC  1
45564                    (  370) GGCGTTCGCTGGCC  1
46783                    (  239) GGATTTGGCTGGCT  1
37305                    (  342) GGAGTTTCCGAGCC  1
43034                    (  408) GAAGTTAACTTGCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 5357 bayes= 10.3157 E= 5.6e+002
  -897   -897    220   -897
   -50   -897    188   -897
   150    -30   -897   -897
  -897   -897    188    -34
  -897   -897   -897    198
  -897   -897   -897    198
   -50    -30     88    -34
   -50     70     88   -897
  -897    202   -897   -897
  -897   -897    -12    166
   -50   -897    146    -34
  -897   -897    220   -897
  -897    202   -897   -897
  -897    170   -897    -34
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 5 E= 5.6e+002
 0.000000  0.000000  1.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.200000  0.400000  0.200000
 0.200000  0.400000  0.400000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.000000  0.600000  0.200000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
G[GA][AC][GT]TT[GACT][CGA]C[TG][GAT]GC[CT]
--------------------------------------------------------------------------------

Time  2.34 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 20 sites = 7 llr = 109 E-value = 8.3e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  31::14:4::7633::71::
pos.-specific     C  ::1:91a:4::37:9a::4:
probability       G  7:36::::641::31::14a
matrix            T  :964:4:6:611:4::371:

         bits    2.2                    *
                 2.0       *        *   *
                 1.8       *        *   *
                 1.5       *       **   *
Relative         1.3 **  * *       **   *
Entropy          1.1 ** ** * **  * ***  *
(22.4 bits)      0.9 ** ** ****  * **** *
                 0.7 ***** ***** * ******
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           GTTGCACTGTAACTCCATCG
consensus            A GT T ACG CAA  T G
sequence                          G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            --------------------
38937                       360  1.16e-09 GGCTCTTTGT GTTTCTCTCGATCTCCATGG TCGTTGCCGT
46436                       338  2.73e-09 TCCAGGGACG GTTGCACTGGAAAACCTTGG TTGGGAACCA
15563                       476  5.77e-09 ACGATGAAGA GTCTCTCTGTGACTCCATGG TGGCG
48495                       459  5.02e-08 GGCACACAAA ATTGCTCACTACAGCCTTCG AGCGAAACGC
45503                        28  5.88e-08 GGCCAACAAT ATTGCACAGTAACGCCAATG CCTTGTTTTT
37305                       469  2.22e-07 CACGGCAGAC GAGGAACAGGTACTCCATCG ATTGTGAGAA
43146                        31  2.61e-07 TACTGTAAAG GTGTCCCTCTACCAGCAGCG TATATACCTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
38937                             1.2e-09  359_[+3]_121
46436                             2.7e-09  337_[+3]_143
15563                             5.8e-09  475_[+3]_5
48495                               5e-08  458_[+3]_22
45503                             5.9e-08  27_[+3]_453
37305                             2.2e-07  468_[+3]_12
43146                             2.6e-07  30_[+3]_450
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=7
38937                    (  360) GTTTCTCTCGATCTCCATGG  1
46436                    (  338) GTTGCACTGGAAAACCTTGG  1
15563                    (  476) GTCTCTCTGTGACTCCATGG  1
48495                    (  459) ATTGCTCACTACAGCCTTCG  1
45503                    (   28) ATTGCACAGTAACGCCAATG  1
37305                    (  469) GAGGAACAGGTACTCCATCG  1
43146                    (   31) GTGTCCCTCTACCAGCAGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 5291 bayes= 10.1664 E= 8.3e+002
     1   -945    171   -945
   -99   -945   -945    176
  -945    -78     39    117
  -945   -945    139     76
   -99    180   -945   -945
    60    -78   -945     76
  -945    202   -945   -945
    60   -945   -945    117
  -945     80    139   -945
  -945   -945     98    117
   133   -945    -61    -82
   101     21   -945    -82
     1    153   -945   -945
     1   -945     39     76
  -945    180    -61   -945
  -945    202   -945   -945
   133   -945   -945     18
   -99   -945    -61    150
  -945     80     98    -82
  -945   -945    220   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 7 E= 8.3e+002
 0.285714  0.000000  0.714286  0.000000
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.142857  0.285714  0.571429
 0.000000  0.000000  0.571429  0.428571
 0.142857  0.857143  0.000000  0.000000
 0.428571  0.142857  0.000000  0.428571
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  0.428571  0.571429  0.000000
 0.000000  0.000000  0.428571  0.571429
 0.714286  0.000000  0.142857  0.142857
 0.571429  0.285714  0.000000  0.142857
 0.285714  0.714286  0.000000  0.000000
 0.285714  0.000000  0.285714  0.428571
 0.000000  0.857143  0.142857  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.714286  0.000000  0.000000  0.285714
 0.142857  0.000000  0.142857  0.714286
 0.000000  0.428571  0.428571  0.142857
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[GA]T[TG][GT]C[AT]C[TA][GC][TG]A[AC][CA][TAG]CC[AT]T[CG]G
--------------------------------------------------------------------------------

Time  3.38 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
43034                            2.64e-04  357_[+1(4.05e-05)]_38_\
    [+2(9.16e-07)]_79
46436                            7.37e-07  245_[+1(1.84e-05)]_80_\
    [+3(2.73e-09)]_143
48492                            4.13e-04  455_[+1(4.79e-07)]_33
48495                            5.62e-06  236_[+1(7.10e-06)]_210_\
    [+3(5.02e-08)]_22
15563                            4.61e-06  208_[+1(2.76e-05)]_255_\
    [+3(5.77e-09)]_5
45503                            9.05e-09  27_[+3(5.88e-08)]_170_\
    [+2(3.30e-09)]_269
45564                            1.91e-07  151_[+1(3.67e-07)]_206_\
    [+2(5.49e-08)]_117
38937                            3.07e-08  359_[+3(1.16e-09)]_18_\
    [+1(3.25e-06)]_91
43146                            2.63e-05  30_[+3(2.61e-07)]_163_\
    [+1(4.59e-06)]_275
37305                            3.24e-06  341_[+2(3.61e-07)]_113_\
    [+3(2.22e-07)]_12
46783                            6.78e-06  238_[+2(2.40e-07)]_63_\
    [+1(7.96e-07)]_173
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************