********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/159/159.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
13700                    1.0000    500  48982                    1.0000    500
44005                    1.0000    500  11722                    1.0000    500
34418                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/159/159.seqs.fa -oc motifs/159 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        5    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            2500    N=               5

strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.260 C 0.244 G 0.234 T 0.261
Background letter frequencies (from dataset with add-one prior applied):
A 0.260 C 0.244 G 0.234 T 0.261
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  16  sites =   3  llr = 57  E-value = 9.8e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a:::::a:::::::::
pos.-specific     C  :7::37:::::a::a:
probability       G  :3a:73:3:a7:a:::
matrix            T  :::a:::7a:3::a:a

         bits    2.1   *      * ** *
                 1.9 * **  * ** *****
                 1.7 * **  * ** *****
                 1.5 * **  * ** *****
Relative         1.3 * *** * ** *****
Entropy          1.0 ****************
(27.6 bits)      0.8 ****************
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           ACGTGCATTGGCGTCT
consensus             G  CG G  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
11722                       162  2.04e-10 TGGTTCGCCA ACGTGCATTGGCGTCT GCTATTGTCT
34418                       315  2.55e-09 CAAACTATTA ACGTGCAGTGTCGTCT GTCACTGTCA
44005                        36  4.35e-09 CATTGTCCGT AGGTCGATTGGCGTCT TCCTTATCCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11722                               2e-10  161_[+1]_323
34418                             2.6e-09  314_[+1]_170
44005                             4.3e-09  35_[+1]_449
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=3
11722                    (  162) ACGTGCATTGGCGTCT  1
34418                    (  315) ACGTGCAGTGTCGTCT  1
44005                    (   36) AGGTCGATTGGCGTCT  1
//

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 2425 bayes= 9.31551 E= 9.8e+001
   194  -823  -823  -823
  -823   144    51  -823
  -823  -823   209  -823
  -823  -823  -823   193
  -823    45   151  -823
  -823   144    51  -823
   194  -823  -823  -823
  -823  -823    51   135
  -823  -823  -823   193
  -823  -823   209  -823
  -823  -823   151    35
  -823   203  -823  -823
  -823  -823   209  -823
  -823  -823  -823   193
  -823   203  -823  -823
  -823  -823  -823   193
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 3 E= 9.8e+001
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.666667  0.000000
 0.000000  0.666667  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.666667  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
A[CG]GT[GC][CG]A[TG]TG[GT]CGTCT
--------------------------------------------------------------------------------

Time  0.28 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  16  sites =   5  llr = 72  E-value = 8.5e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  8:::2:::2:::4:::
pos.-specific     C  2822:2a::6:6:8:2
probability       G  :28:26:::::26::8
matrix            T  :::862:a84a2:2a:

         bits    2.1       *
                 1.9       **  *   *
                 1.7       **  *   *
                 1.5       **  *   *
Relative         1.3 ****  *** *  ***
Entropy          1.0 ****  ***** ****
(20.9 bits)      0.8 ****  ***** ****
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           ACGTTGCTTCTCGCTG
consensus            CGCCAC  AT GAT C
sequence                 GT     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
11722                       332  1.12e-09 TATGAACGAA ACGTTGCTTTTCACTG CACTTCCAAT
44005                       127  1.18e-08 TGATGACGGA ACGTGCCTTCTCGCTG CAGGCGAAGG
34418                       473  1.06e-07 CACGTAGTTG ACGTATCTTCTCGCTC GCTTCCACTC
13700                       301  2.93e-07 CCCATAAAAT AGGCTGCTTTTGACTG TGTGCGTCTG
48982                       337  1.12e-06 GACCAAAGCA CCCTTGCTACTTGTTG GCGACACCCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11722                             1.1e-09  331_[+2]_153
44005                             1.2e-08  126_[+2]_358
34418                             1.1e-07  472_[+2]_12
13700                             2.9e-07  300_[+2]_184
48982                             1.1e-06  336_[+2]_148
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=5
11722                    (  332) ACGTTGCTTTTCACTG  1
44005                    (  127) ACGTGCCTTCTCGCTG  1
34418                    (  473) ACGTATCTTCTCGCTC  1
13700                    (  301) AGGCTGCTTTTGACTG  1
48982                    (  337) CCCTTGCTACTTGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 2425 bayes= 9.17088 E= 8.5e+002
   162   -29  -897  -897
  -897   171   -23  -897
  -897   -29   177  -897
  -897   -29  -897   161
   -38  -897   -23   120
  -897   -29   136   -38
  -897   203  -897  -897
  -897  -897  -897   193
   -38  -897  -897   161
  -897   129  -897    61
  -897  -897  -897   193
  -897   129   -23   -38
    62  -897   136  -897
  -897   171  -897   -38
  -897  -897  -897   193
  -897   -29   177  -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 8.5e+002
 0.800000  0.200000  0.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.200000  0.000000  0.200000  0.600000
 0.000000  0.200000  0.600000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.600000  0.000000  0.400000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.200000  0.200000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.800000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[AC][CG][GC][TC][TAG][GCT]CT[TA][CT]T[CGT][GA][CT]T[GC]
--------------------------------------------------------------------------------

Time  0.61 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  12  sites =   5  llr = 60  E-value = 9.7e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::8::6a4:4:
pos.-specific     C  28:2a:::::42
probability       G  :2a::a4:6a22
matrix            T  8::::::::::6

         bits    2.1   * **   *
                 1.9   * ** * *
                 1.7   * ** * *
                 1.5   * ** * *
Relative         1.3 ****** * *
Entropy          1.0 **********
(17.3 bits)      0.8 **********
                 0.6 ********** *
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TCGACGAAGGAT
consensus            CG C  G A CC
sequence                       GG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
48982                        50  1.55e-07 ATGGTCAGTC TCGACGGAGGCT CGGTCGGTTG
13700                       218  7.80e-07 TGACGAGGTA TCGACGAAAGGT GTTCTTGTCA
44005                       100  2.21e-06 TGGACTGTGG TGGACGAAAGCT GCATTTGATG
11722                       254  3.65e-06 GTTTGGGAAG TCGCCGAAGGAG CCAAGCTAGC
34418                       404  5.08e-06 TCGCCGCCCA CCGACGGAGGAC ACGTGAGAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48982                             1.5e-07  49_[+3]_439
13700                             7.8e-07  217_[+3]_271
44005                             2.2e-06  99_[+3]_389
11722                             3.6e-06  253_[+3]_235
34418                             5.1e-06  403_[+3]_85
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=5
48982                    (   50) TCGACGGAGGCT  1
13700                    (  218) TCGACGAAAGGT  1
44005                    (  100) TGGACGAAAGCT  1
11722                    (  254) TCGCCGAAGGAG  1
34418                    (  404) CCGACGGAGGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 2445 bayes= 8.93074 E= 9.7e+002
  -897   -29  -897   161
  -897   171   -23  -897
  -897  -897   209  -897
   162   -29  -897  -897
  -897   203  -897  -897
  -897  -897   209  -897
   120  -897    77  -897
   194  -897  -897  -897
    62  -897   136  -897
  -897  -897   209  -897
    62    71   -23  -897
  -897   -29   -23   120
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 5 E= 9.7e+002
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.600000  0.000000  0.400000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.400000  0.200000  0.000000
 0.000000  0.200000  0.200000  0.600000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[TC][CG]G[AC]CG[AG]A[GA]G[ACG][TCG]
--------------------------------------------------------------------------------

Time  0.92 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
13700                            6.83e-06  217_[+3(7.80e-07)]_71_\
    [+2(2.93e-07)]_184
48982                            2.17e-06  49_[+3(1.55e-07)]_275_\
    [+2(1.12e-06)]_58_[+3(7.57e-05)]_78
44005                            7.10e-12  35_[+1(4.35e-09)]_48_[+3(2.21e-06)]_\
    15_[+2(1.18e-08)]_358
11722                            6.88e-14  161_[+1(2.04e-10)]_76_\
    [+3(3.65e-06)]_66_[+2(1.12e-09)]_153
34418                            7.34e-11  314_[+1(2.55e-09)]_73_\
    [+3(5.08e-06)]_57_[+2(1.06e-07)]_12
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************