********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/316/316.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
32087                    1.0000    500  55080                    1.0000    500
49938                    1.0000    500  52685                    1.0000    500
45392                    1.0000    500  49200                    1.0000    500
48626                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/316/316.seqs.fa -oc motifs/316 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.291 C 0.220 G 0.206 T 0.283
Background letter frequencies (from dataset with add-one prior applied):
A 0.291 C 0.220 G 0.206 T 0.283
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 7 llr = 81 E-value = 1.1e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a9999:14:a::
pos.-specific     C  :::::a1:1:a3
probability       G  ::111:66::::
matrix            T  :1::::1:9::7

         bits    2.3      *    *
                 2.1      *    *
                 1.8 *    *   **
                 1.6 *    *   **
Relative         1.4 * ****  ***
Entropy          1.1 ****** *****
(16.8 bits)      0.9 ****** *****
                 0.7 ****** *****
                 0.5 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AAAAACGGTACT
consensus                   A   C
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
55080                        42  9.99e-08 GCTCTACCTA AAAAACGGTACT TGTAGCTCGC
49200                       274  9.62e-07 AGGTCTGCAG AAAAACTGTACT TTCTTAGACT
32087                       364  1.65e-06 GGCGGTCCAT AAAAACCATACT TACGAAACTC
45392                       489  2.07e-06 TATCTTAGCA AAAAACGGCACC
52685                       221  2.71e-06 CTCTTAACAA AAAAACAATACT CAGCAATTTA
48626                        38  6.36e-06 GAAGCGCAGA AAGGACGATACT AAGTCTCATG
49938                       347  9.66e-06 TGGGATCCAG ATAAGCGGTACC TTGTTACAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
55080                               1e-07  41_[+1]_447
49200                             9.6e-07  273_[+1]_215
32087                             1.7e-06  363_[+1]_125
45392                             2.1e-06  488_[+1]
52685                             2.7e-06  220_[+1]_268
48626                             6.4e-06  37_[+1]_451
49938                             9.7e-06  346_[+1]_142
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL MOTIF 1 width=12 seqs=7
55080                    (   42) AAAAACGGTACT  1
49200                    (  274) AAAAACTGTACT  1
32087                    (  364) AAAAACCATACT  1
45392                    (  489) AAAAACGGCACC  1
52685                    (  221) AAAAACAATACT  1
48626                    (   38) AAGGACGATACT  1
49938                    (  347) ATAAGCGGTACC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 8.93074 E= 1.1e+001
   178   -945   -945   -945
   156   -945   -945    -98
   156   -945    -53   -945
   156   -945    -53   -945
   156   -945    -53   -945
  -945    218   -945   -945
  -102    -62    147    -98
    56   -945    147   -945
  -945    -62   -945    160
   178   -945   -945   -945
  -945    218   -945   -945
  -945     37   -945    134
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 1.1e+001
 1.000000  0.000000  0.000000  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.857143  0.000000  0.142857  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.142857  0.571429  0.142857
 0.428571  0.000000  0.571429  0.000000
 0.000000  0.142857  0.000000  0.857143
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.285714  0.000000  0.714286
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
AAAAACG[GA]TAC[TC]
--------------------------------------------------------------------------------

Time 0.45 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 14 sites = 7 llr = 83 E-value = 1.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  1::4:::::1::3:
pos.-specific     C  91311:::a:::1:
probability       G  :361::1::::731
matrix            T  :6139a9a:9a339

         bits    2.3         *
                 2.1         *
                 1.8      * ** *
                 1.6 *    * ** *
Relative         1.4 *   ***** ** *
Entropy          1.1 *   ******** *
(17.1 bits)      0.9 * * ******** *
                 0.7 *** ******** *
                 0.5 *** ******** *
                 0.2 *** ******** *
                 0.0 --------------

Multilevel           CTGATTTTCTTGAT
consensus             GCT       TG
sequence                         T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            --------------
45392                        69  5.07e-09 GTTTCTACTT CTGATTTTCTTGGT TTTGAATTGG
32087                       387  6.08e-08 ACGAAACTCA CGGATTTTCTTGAT TCTTCCGAAT
49938                        23  5.94e-07 CAATTCCAAG CTGTTTTTCTTGTG AACAATTATA
55080                       392  1.37e-06 TTTCTTCTCT CTCTTTTTCTTTAT GTATGTGTTA
49200                       114  3.24e-06 TATCCCCTTC CGGACTGTCTTGTT CTCAATATAA
52685                        55  1.09e-05 CACATCATTA CTCGTTTTCATTCT TACAGTTAGC
48626                       460  1.29e-05 TTTTCTTATC ACTCTTTTCTTGGT CACTTGCTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45392                             5.1e-09  68_[+2]_418
32087                             6.1e-08  386_[+2]_100
49938                             5.9e-07  22_[+2]_464
55080                             1.4e-06  391_[+2]_95
49200                             3.2e-06  113_[+2]_373
52685                             1.1e-05  54_[+2]_432
48626                             1.3e-05  459_[+2]_27
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL MOTIF 2 width=14 seqs=7
45392                    (   69) CTGATTTTCTTGGT  1
32087                    (  387) CGGATTTTCTTGAT  1
49938                    (   23) CTGTTTTTCTTGTG  1
55080                    (  392) CTCTTTTTCTTTAT  1
49200                    (  114) CGGACTGTCTTGTT  1
52685                    (   55) CTCGTTTTCATTCT  1
48626                    (  460) ACTCTTTTCTTGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 3409 bayes= 8.92481 E= 1.2e+001
  -102    196   -945   -945
  -945    -62     47    101
  -945     37    147    -98
    56    -62    -53      1
  -945    -62   -945    160
  -945   -945   -945    182
  -945   -945    -53    160
  -945   -945   -945    182
  -945    218   -945   -945
  -102   -945   -945    160
  -945   -945   -945    182
  -945   -945    179      1
    -3    -62     47      1
  -945   -945    -53    160
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 7 E= 1.2e+001
 0.142857  0.857143  0.000000  0.000000
 0.000000  0.142857  0.285714  0.571429
 0.000000  0.285714  0.571429  0.142857
 0.428571  0.142857  0.142857  0.285714
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.714286  0.285714
 0.285714  0.142857  0.285714  0.285714
 0.000000  0.000000  0.142857  0.857143
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[TG][GC][AT]TTTTCTT[GT][AGT]T
--------------------------------------------------------------------------------

Time 0.93 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 21 sites = 6 llr = 96 E-value = 4.9e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a2::2a:85282::5:3:2:2
pos.-specific     C  :2:a3:3:2:2::2:322:28
probability       G  ::7:2:5228:3::55:85::
matrix            T  :73:3:2:2::5a8:25:38:

         bits    2.3    *
                 2.1    *
                 1.8 *  * *      *
                 1.6 *  * *   *  *    *
Relative         1.4 *  * *   *  *    *  *
Entropy          1.1 * ** * * ** ***  * **
(23.1 bits)      0.9 * ** * * ** ***  * **
                 0.7 **** *** ** **** ****
                 0.5 **** *** ************
                 0.2 **** ****************
                 0.0 ---------------------

Multilevel           ATGCCAGAAGATTTAGTGGTC
consensus              T T C    G  GCA T
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
45392                         5  2.68e-10       TCGT ATGCAAGAAGATTTGTTGGTC TTCTTTTGGA
32087                       274  4.05e-09 TCTGGTTAAG ATTCTACACGATTTAGAGTTC ATCGGTTGAC
49938                       292  1.21e-08 CTTCGGGGTA ATTCGATGAGAGTTGCTGGTC GAGGCATGAC
49200                        21  5.98e-08 ATTCACATCT ACGCTACAAGATTCAGACTTC TCTAATTAAC
52685                       117  9.46e-08 GGGATCGAAC ATGCCAGAGAAATTAGTGACC GCGAGCGATC
48626                       308  1.07e-07 GAATGTAGGA AAGCCAGATGCGTTGCCGGTA AATTCCTTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45392                             2.7e-10  4_[+3]_475
32087                               4e-09  273_[+3]_206
49938                             1.2e-08  291_[+3]_188
49200                               6e-08  20_[+3]_459
52685                             9.5e-08  116_[+3]_363
48626                             1.1e-07  307_[+3]_172
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL MOTIF 3 width=21 seqs=6
45392                    (    5) ATGCAAGAAGATTTGTTGGTC  1
32087                    (  274) ATTCTACACGATTTAGAGTTC  1
49938                    (  292) ATTCGATGAGAGTTGCTGGTC  1
49200                    (   21) ACGCTACAAGATTCAGACTTC  1
52685                    (  117) ATGCCAGAGAAATTAGTGACC  1
48626                    (  308) AAGCCAGATGCGTTGCCGGTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 3360 bayes= 8.94579 E= 4.9e+002
   178   -923   -923   -923
   -80    -40   -923    124
  -923   -923    169     24
  -923    218   -923   -923
   -80     60    -31     24
   178   -923   -923   -923
  -923     60    128    -76
   152   -923    -31   -923
    78    -40    -31    -76
   -80   -923    201   -923
   152    -40   -923   -923
   -80   -923     69     82
  -923   -923   -923    182
  -923    -40   -923    156
    78   -923    128   -923
  -923     60    128    -76
    20    -40   -923     82
  -923    -40    201   -923
   -80   -923    128     24
  -923    -40   -923    156
   -80    192   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 4.9e+002
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.166667  0.000000  0.666667
 0.000000  0.000000  0.666667  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.333333  0.166667  0.333333
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.333333  0.500000  0.166667
 0.833333  0.000000  0.166667  0.000000
 0.500000  0.166667  0.166667  0.166667
 0.166667  0.000000  0.833333  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.166667  0.000000  0.333333  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.000000  0.833333
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.333333  0.500000  0.166667
 0.333333  0.166667  0.000000  0.500000
 0.000000  0.166667  0.833333  0.000000
 0.166667  0.000000  0.500000  0.333333
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.833333  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
AT[GT]C[CT]A[GC]AAGA[TG]TT[AG][GC][TA]G[GT]TC
--------------------------------------------------------------------------------

Time 1.41 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32087                            2.33e-11  273_[+3(4.05e-09)]_69_\
    [+1(1.65e-06)]_11_[+2(6.08e-08)]_47_[+2(7.02e-05)]_39
55080                            3.24e-06  41_[+1(9.99e-08)]_338_\
    [+2(1.37e-06)]_95
49938                            2.80e-09  22_[+2(5.94e-07)]_255_\
    [+3(1.21e-08)]_34_[+1(9.66e-06)]_142
52685                            8.32e-08  54_[+2(1.09e-05)]_48_[+3(9.46e-08)]_\
    83_[+1(2.71e-06)]_268
45392                            2.16e-13  4_[+3(2.68e-10)]_43_[+2(5.07e-09)]_\
    377_[+2(7.01e-06)]_15_[+1(2.07e-06)]
49200                            6.96e-09  20_[+3(5.98e-08)]_72_[+2(3.24e-06)]_\
    146_[+1(9.62e-07)]_215
48626                            2.36e-07  37_[+1(6.36e-06)]_258_\
    [+3(1.07e-07)]_131_[+2(1.29e-05)]_27
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************