********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/17/17.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
21296                    1.0000    500  28652                    1.0000    500
32866                    1.0000    500  54998                    1.0000    500
25867                    1.0000    500  45990                    1.0000    500
47530                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/17/17.seqs.fa -oc motifs/17 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        7    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            3500    N=               7
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.246 C 0.269 G 0.223 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.246 C 0.269 G 0.223 T 0.262
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 7 llr = 78 E-value = 1.0e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1::::::4197:
pos.-specific     C  1::1:::341:a
probability       G  ::1::3a:4:3:
matrix            T  7a99a7:3::::

         bits    2.2       *
                 1.9  *  * *    *
                 1.7  *  * *    *
                 1.5  *  * *  * *
Relative         1.3  **** *  ***
Entropy          1.1  ******  ***
(16.2 bits)      0.9 *******  ***
                 0.6 ******* ****
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTTTTTGACAAC
consensus                 G CG G
sequence                    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
47530                       269  1.42e-07 GCGTGTACGT TTTTTTGACAAC CCCCCGAGTC
32866                       450  2.81e-07 ACAGGCCTCA TTTTTTGCGAAC ATATTGCCAA
28652                       210  1.77e-06 GCTGACATCG TTTTTGGAGAGC TGGGCGTGGA
45990                       355  2.79e-06 TCGACACGCC CTTTTTGCGAAC GGTTGACTGG
54998                       441  4.31e-06 GATCCTGTGC TTTCTTGTCAAC ACCGTACACT
25867                        91  1.72e-05 GCTCTTTTGT TTTTTGGTCCGC CTTCACCATC
21296                       149  1.81e-05 ACGGAATGTT ATGTTTGAAAAC ACTACCTACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47530                             1.4e-07  268_[+1]_220
32866                             2.8e-07  449_[+1]_39
28652                             1.8e-06  209_[+1]_279
45990                             2.8e-06  354_[+1]_134
54998                             4.3e-06  440_[+1]_48
25867                             1.7e-05  90_[+1]_398
21296                             1.8e-05  148_[+1]_340
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=7
47530                    (  269) TTTTTTGACAAC  1
32866                    (  450) TTTTTTGCGAAC  1
28652                    (  210) TTTTTGGAGAGC  1
45990                    (  355) CTTTTTGCGAAC  1
54998                    (  441) TTTCTTGTCAAC  1
25867                    (   91) TTTTTGGTCCGC  1
21296                    (  149) ATGTTTGAAAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 8.93074 E= 1.0e+002
   -78    -91   -945    145
  -945   -945   -945    193
  -945   -945    -64    171
  -945    -91   -945    171
  -945   -945   -945    193
  -945   -945     36    145
  -945   -945    216   -945
    80      9   -945     12
   -78     67     94   -945
   180    -91   -945   -945
   154   -945     36   -945
  -945    189   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 1.0e+002
 0.142857  0.142857  0.000000  0.714286
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.285714  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.285714  0.000000  0.285714
 0.142857  0.428571  0.428571  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.714286  0.000000  0.285714  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
TTTTT[TG]G[ACT][CG]A[AG]C
--------------------------------------------------------------------------------

Time 0.44 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 5 llr = 64 E-value = 3.1e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :8::42:::::8
pos.-specific     C  ::::642::6::
probability       G  a2:a:::::4a:
matrix            T  ::a::48aa::2

         bits    2.2 *  *      *
                 1.9 * **   ** *
                 1.7 * **   ** *
                 1.5 * **   ** *
Relative         1.3 ****  *** **
Entropy          1.1 ***** ******
(18.5 bits)      0.9 ***** ******
                 0.6 ***** ******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GATGCCTTTCGA
consensus             G  ATC  G T
sequence                  A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
25867                       215  2.22e-07 CAATGAGTTG GATGCCTTTGGA CCCGCCCCGA
54998                        55  2.22e-07 CCTTACGAAA GATGCTTTTGGA TTTCCAGTTG
47530                       205  3.33e-07 CGTTGAAACG GATGACTTTCGA CCGTGCTCGT
32866                       404  1.50e-06 GTTCTATGGA GATGATCTTCGA GTTTCTTGGA
21296                        49  3.87e-06 TCGTGTCGTA GGTGCATTTCGT GTCATCTGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25867                             2.2e-07  214_[+2]_274
54998                             2.2e-07  54_[+2]_434
47530                             3.3e-07  204_[+2]_284
32866                             1.5e-06  403_[+2]_85
21296                             3.9e-06  48_[+2]_440
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=5
25867                    (  215) GATGCCTTTGGA  1
54998                    (   55) GATGCTTTTGGA  1
47530                    (  205) GATGACTTTCGA  1
32866                    (  404) GATGATCTTCGA  1
21296                    (   49) GGTGCATTTCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 9.66888 E= 3.1e+002
  -897   -897    216   -897
   170   -897    -16   -897
  -897   -897   -897    193
  -897   -897    216   -897
    70    116   -897   -897
   -30     57   -897     61
  -897    -43   -897    161
  -897   -897   -897    193
  -897   -897   -897    193
  -897    116     84   -897
  -897   -897    216   -897
   170   -897   -897    -39
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 5 E= 3.1e+002
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.600000  0.000000  0.000000
 0.200000  0.400000  0.000000  0.400000
 0.000000  0.200000  0.000000  0.800000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.800000  0.000000  0.000000  0.200000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
G[AG]TG[CA][CTA][TC]TT[CG]G[AT]
--------------------------------------------------------------------------------

Time 0.93 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 7 llr = 76 E-value = 1.1e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a3494:94119a
pos.-specific     C  :66:3::61:1:
probability       G  :1::1a1:79::
matrix            T  :::11:::::::

         bits    2.2      *
                 1.9 *    *     *
                 1.7 *    *     *
                 1.5 *  * **  ***
Relative         1.3 *  * **  ***
Entropy          1.1 *  * **  ***
(15.7 bits)      0.9 * ** *******
                 0.6 **** *******
                 0.4 **** *******
                 0.2 ************
                 0.0 ------------

Multilevel           ACCAAGACGGAA
consensus             AA C  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
25867                       464  2.28e-07 ATCACCAATT ACCACGACGGAA CGCCGACAAC
54998                       275  2.28e-07 ACGAGTGTCA ACCACGACGGAA GATACCACCG
32866                        84  5.69e-06 TTGTTCCGTC AAAAAGGAGGAA TGGCGGTGGA
47530                       332  6.09e-06 TTCCGCGAAA AAAAAGAACGAA GCGGATATTC
21296                        97  8.58e-06 GTCGCTCCAG ACATTGACGGAA CCAATCCACA
45990                        14  1.34e-05 ATTCGGCCGA ACCAAGAAAGCA GGCATCCCTC
28652                         5  1.60e-05       ACAC AGCAGGACGAAA GTGGTCACGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25867                             2.3e-07  463_[+3]_25
54998                             2.3e-07  274_[+3]_214
32866                             5.7e-06  83_[+3]_405
47530                             6.1e-06  331_[+3]_157
21296                             8.6e-06  96_[+3]_392
45990                             1.3e-05  13_[+3]_475
28652                             1.6e-05  4_[+3]_484
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=7
25867                    (  464) ACCACGACGGAA  1
54998                    (  275) ACCACGACGGAA  1
32866                    (   84) AAAAAGGAGGAA  1
47530                    (  332) AAAAAGAACGAA  1
21296                    (   97) ACATTGACGGAA  1
45990                    (   14) ACCAAGAAAGCA  1
28652                    (    5) AGCAGGACGAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 3423 bayes= 9.53747 E= 1.1e+002
   202   -945   -945   -945
    22    109    -64   -945
    80    109   -945   -945
   180   -945   -945    -87
    80      9    -64    -87
  -945   -945    216   -945
   180   -945    -64   -945
    80    109   -945   -945
   -78    -91    168   -945
   -78   -945    194   -945
   180    -91   -945   -945
   202   -945   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 7 E= 1.1e+002
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.571429  0.142857  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.857143  0.000000  0.000000  0.142857
 0.428571  0.285714  0.142857  0.142857
 0.000000  0.000000  1.000000  0.000000
 0.857143  0.000000  0.142857  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.142857  0.142857  0.714286  0.000000
 0.142857  0.000000  0.857143  0.000000
 0.857143  0.142857  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
A[CA][CA]A[AC]GA[CA]GGAA
--------------------------------------------------------------------------------

Time 1.33 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21296                            1.07e-05  48_[+2(3.87e-06)]_36_[+3(8.58e-06)]_\
    40_[+1(1.81e-05)]_340
28652                            4.01e-04  4_[+3(1.60e-05)]_193_[+1(1.77e-06)]_\
    279
32866                            7.44e-08  83_[+3(5.69e-06)]_308_\
    [+2(1.50e-06)]_34_[+1(2.81e-07)]_39
54998                            8.22e-09  54_[+2(2.22e-07)]_208_\
    [+3(2.28e-07)]_154_[+1(4.31e-06)]_48
25867                            2.92e-08  90_[+1(1.72e-05)]_112_\
    [+2(2.22e-07)]_237_[+3(2.28e-07)]_25
45990                            6.23e-04  13_[+3(1.34e-05)]_329_\
    [+1(2.79e-06)]_134
47530                            1.06e-08  204_[+2(3.33e-07)]_52_\
    [+1(1.42e-07)]_51_[+3(6.09e-06)]_157
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************