********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/1/1.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42567                    1.0000    500  48310                    1.0000    500
45345                    1.0000    500  44744                    1.0000    500
45210                    1.0000    500  49962                    1.0000    500
39150                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/1/1.seqs.fa -oc motifs/1 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.248 G 0.229 T 0.253
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.248 G 0.229 T 0.253
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =   7  llr = 80  E-value = 5.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::7:::1::79
pos.-specific     C  :4a::a::4:::
probability       G  4:::1:9:3a:1
matrix            T  66:39:193:3:

         bits    2.1          *
                 1.9   *  *   *
                 1.7   *  *   *
                 1.5   * ***  *
Relative         1.3   * **** * *
Entropy          1.1 ******** ***
(16.4 bits)      0.9 ******** ***
                 0.6 ******** ***
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTCATCGTCGAA
consensus            GC T    G T
sequence                     T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
45345                       362  6.12e-07 CACACTCCAG TCCATCGTTGAA GGCCCCCACT
49962                       309  7.87e-07 GACAGTCGAC GCCATCGTGGAA CGGCGCGTGC
42567                        85  1.74e-06 AGTATTATGA GTCTTCGTTGAA ACTCCGGCGT
45210                        52  3.54e-06 AAGGCATCGA TTCATCGAGGAA AGTCTCTAAG
48310                        40  4.55e-06 TGATTCCTAT GCCATCTTCGAA GTGGGTCATT
39150                        31  5.02e-06 CACAGAGCTC TTCATCGTCGTG TTCCACAATA
44744                        54  1.06e-05 CTTGCCTTTG TTCTGCGTCGTA TCGGACAGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45345                             6.1e-07  361_[+1]_127
49962                             7.9e-07  308_[+1]_180
42567                             1.7e-06  84_[+1]_404
45210                             3.5e-06  51_[+1]_437
48310                             4.6e-06  39_[+1]_449
39150                               5e-06  30_[+1]_458
44744                             1.1e-05  53_[+1]_435
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=7
45345                    (  362) TCCATCGTTGAA  1
49962                    (  309) GCCATCGTGGAA  1
42567                    (   85) GTCTTCGTTGAA  1
45210                    (   52) TTCATCGAGGAA  1
48310                    (   40) GCCATCTTCGAA  1
39150                    (   31) TTCATCGTCGTG  1
44744                    (   54) TTCTGCGTCGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 8.93074 E= 5.2e+001
  -945   -945     90    118
  -945     79   -945    118
  -945    201   -945   -945
   140   -945   -945     18
  -945   -945    -68    176
  -945    201   -945   -945
  -945   -945    190    -82
   -92   -945   -945    176
  -945     79     32     18
  -945   -945    213   -945
   140   -945   -945     18
   166   -945    -68   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 5.2e+001
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.428571  0.000000  0.571429
 0.000000  1.000000  0.000000  0.000000
 0.714286  0.000000  0.000000  0.285714
 0.000000  0.000000  0.142857  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.857143  0.142857
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.428571  0.285714  0.285714
 0.000000  0.000000  1.000000  0.000000
 0.714286  0.000000  0.000000  0.285714
 0.857143  0.000000  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TG][TC]C[AT]TCGT[CGT]G[AT]A
--------------------------------------------------------------------------------




Time  0.46 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  15  sites =   6  llr = 82  E-value = 1.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::8:::2255::2:
pos.-specific     C  :3:::2:2:3:a:::
probability       G  ::a::5a:322:27:
matrix            T  a7:2a3:75:3:82a

         bits    2.1   *   *
                 1.9 * * * *    *  *
                 1.7 * * * *    *  *
                 1.5 * * * *    *  *
Relative         1.3 * *** *    ** *
Entropy          1.1 ***** *    ** *
(19.6 bits)      0.9 ***** *    ****
                 0.6 *********  ****
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           TTGATGGTTAACTGT
consensus             C   T  GCT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ---------------
39150                       415  3.38e-09 TCCATTTCTG TTGATGGTTCACTGT AAGCACGGAA
49962                        76  1.92e-07 AATCCTATAC TTGATTGATCTCTGT ACCGAAATGT
45210                        11  3.41e-07 CGGAAACACG TCGTTTGTGAACTGT GGAAAGGCAA
44744                       315  4.74e-07 TACTGATTTC TTGATGGTTGGCGGT CGACCATGGA
45345                       165  6.78e-07 CGGAGCACGA TCGATCGTGAACTAT ACTAGTTCAC
48310                       268  1.12e-06 CCTTGTGCTC TTGATGGCAATCTTT TGTACATGCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39150                             3.4e-09  414_[+2]_71
49962                             1.9e-07  75_[+2]_410
45210                             3.4e-07  10_[+2]_475
44744                             4.7e-07  314_[+2]_171
45345                             6.8e-07  164_[+2]_321
48310                             1.1e-06  267_[+2]_218
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=6
39150                    (  415) TTGATGGTTCACTGT  1
49962                    (   76) TTGATTGATCTCTGT  1
45210                    (   11) TCGTTTGTGAACTGT  1
44744                    (  315) TTGATGGTTGGCGGT  1
45345                    (  165) TCGATCGTGAACTAT  1
48310                    (  268) TTGATGGCAATCTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 3402 bayes= 9.5928 E= 1.8e+002
  -923   -923   -923    198
  -923     43   -923    140
  -923   -923    213   -923
   162   -923   -923    -60
  -923   -923   -923    198
  -923    -57    113     40
  -923   -923    213   -923
   -70    -57   -923    140
   -70   -923     54     98
    88     43    -46   -923
    88   -923    -46     40
  -923    201   -923   -923
  -923   -923    -46    172
   -70   -923    154    -60
  -923   -923   -923    198
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 1.8e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.000000  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.500000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.166667  0.166667  0.000000  0.666667
 0.166667  0.000000  0.333333  0.500000
 0.500000  0.333333  0.166667  0.000000
 0.500000  0.000000  0.166667  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.166667  0.000000  0.666667  0.166667
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[TC]GAT[GT]GT[TG][AC][AT]CTGT
--------------------------------------------------------------------------------




Time  0.91 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   5  llr = 75  E-value = 3.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::2:::::8::::a6:
pos.-specific     C  :42:28:::28:a:44
probability       G  8:::228:28:8:::6
matrix            T  266a6:2a::22::::

         bits    2.1
                 1.9    *   *    **
                 1.7    *   *    **
                 1.5    *   *    **
Relative         1.3 *  * *********
Entropy          1.1 ** * ***********
(21.5 bits)      0.9 ** * ***********
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTTTTCGTAGCGCAAG
consensus            TCA CGT GCTT  CC
sequence               C G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
42567                       455  2.86e-09 GAAACATCGA GTTTTGGTAGCGCAAG TTTTCCACTT
45345                        51  1.90e-08 CACCCAAAGT GCATTCGTAGCGCACC GCAACGGCAC
48310                       346  3.09e-08 GGATGGATTC GTTTTCTTGGCGCAAG AGACGCAAAA
45210                       257  1.94e-07 CATTTACGAA GCCTCCGTACCGCACG CTCGGCTACC
39150                       185  5.58e-07 GCTTTTATTG TTTTGCGTAGTTCAAC TTGAGCTATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42567                             2.9e-09  454_[+3]_30
45345                             1.9e-08  50_[+3]_434
48310                             3.1e-08  345_[+3]_139
45210                             1.9e-07  256_[+3]_228
39150                             5.6e-07  184_[+3]_300
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=5
42567                    (  455) GTTTTGGTAGCGCAAG  1
45345                    (   51) GCATTCGTAGCGCACC  1
48310                    (  346) GTTTTCTTGGCGCAAG  1
45210                    (  257) GCCTCCGTACCGCACG  1
39150                    (  185) TTTTGCGTAGTTCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 3395 bayes= 10.3496 E= 3.7e+002
  -897   -897    180    -34
  -897     69   -897    124
   -43    -31   -897    124
  -897   -897   -897    198
  -897    -31    -19    124
  -897    169    -19   -897
  -897   -897    180    -34
  -897   -897   -897    198
   156   -897    -19   -897
  -897    -31    180   -897
  -897    169   -897    -34
  -897   -897    180    -34
  -897    201   -897   -897
   188   -897   -897   -897
   115     69   -897   -897
  -897     69    139   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 5 E= 3.7e+002
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.400000  0.000000  0.600000
 0.200000  0.200000  0.000000  0.600000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.200000  0.200000  0.600000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.000000  0.000000  1.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.000000  0.000000  0.800000  0.200000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.000000  0.400000  0.600000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GT][TC][TAC]T[TCG][CG][GT]T[AG][GC][CT][GT]CA[AC][GC]
--------------------------------------------------------------------------------




Time  1.33 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42567                            2.35e-07  84_[+1(1.74e-06)]_358_\
                                           [+3(2.86e-09)]_30
48310                            6.01e-09  39_[+1(4.55e-06)]_216_\
                                           [+2(1.12e-06)]_63_[+3(3.09e-08)]_139
45345                            3.76e-10  50_[+3(1.90e-08)]_98_[+2(6.78e-07)]_\
                                           182_[+1(6.12e-07)]_127
44744                            9.99e-05  3_[+2(8.69e-05)]_35_[+1(1.06e-05)]_\
                                           249_[+2(4.74e-07)]_171
45210                            8.66e-09  10_[+2(3.41e-07)]_26_[+1(3.54e-06)]_\
                                           193_[+3(1.94e-07)]_228
49962                            4.79e-06  75_[+2(1.92e-07)]_218_\
                                           [+1(7.87e-07)]_180
39150                            4.45e-10  30_[+1(5.02e-06)]_142_\
                                           [+3(5.58e-07)]_214_[+2(3.38e-09)]_71
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************