********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/417/417.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
4950                     1.0000    500  48891                    1.0000    500
50055                    1.0000    500  44920                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/417/417.seqs.fa -oc motifs/417 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        4    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            2000    N=               4
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.266 C 0.245 G 0.221 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.266 C 0.246 G 0.221 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 18 sites = 4 llr = 69 E-value = 5.6e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::aa:3:::3::8385::
pos.-specific     C  a:::55a8::3835353:
probability       G  :a::3::3a883::::5a
matrix            T  ::::33:::::::3::3:

         bits    2.2  *      *        *
                 2.0 ****  * *        *
                 1.7 ****  * *        *
                 1.5 ****  * *        *
Relative         1.3 ****  ******     *
Entropy          1.1 ****  ******* *  *
(24.8 bits)      0.9 ****  ******* ** *
                 0.7 ****  ******* ****
                 0.4 ******************
                 0.2 ******************
                 0.0 ------------------

Multilevel           CGAACCCCGGGCACAAGG
consensus                GA G ACGCACCC
sequence                 TT       T  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
4950                         34  2.31e-09 TTTGCTTTGA CGAAGCCCGGGCAACCGG AGAATATAAT
50055                        24  6.06e-09 ATCCGGGAAT CGAACCCGGGGCCCAACG ATGGCAACGT
48891                       362  8.40e-09 GAGAAGCATT CGAATACCGGGGATACGG GCCGAGTCAA
44920                       339  2.31e-08 GCCTCAAACT CGAACTCCGACCACAATG CGAATTTTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4950                              2.3e-09  33_[+1]_449
50055                             6.1e-09  23_[+1]_459
48891                             8.4e-09  361_[+1]_121
44920                             2.3e-08  338_[+1]_144
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=4
4950                     (   34) CGAAGCCCGGGCAACCGG  1
50055                    (   24) CGAACCCGGGGCCCAACG  1
48891                    (  362) CGAATACCGGGGATACGG  1
44920                    (  339) CGAACTCCGACCACAATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 1932 bayes= 8.91289 E= 5.6e+001
  -865    202   -865   -865
  -865   -865    217   -865
   191   -865   -865   -865
   191   -865   -865   -865
  -865    102     18    -10
    -9    102   -865    -10
  -865    202   -865   -865
  -865    161     18   -865
  -865   -865    217   -865
    -9   -865    176   -865
  -865      3    176   -865
  -865    161     18   -865
   149      3   -865   -865
    -9    102   -865    -10
   149      3   -865   -865
    91    102   -865   -865
  -865      3    118    -10
  -865   -865    217   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 4 E= 5.6e+001
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.250000  0.250000
 0.250000  0.500000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.750000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.250000  0.500000  0.000000  0.250000
 0.750000  0.250000  0.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
CGAA[CGT][CAT]C[CG]G[GA][GC][CG][AC][CAT][AC][AC][GCT]G
--------------------------------------------------------------------------------

Time 0.19 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 16 sites = 4 llr = 63 E-value = 5.0e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::8:338:::::::aa
pos.-specific     C  :3:8:3:::333:a::
probability       G  :53::5:::858a:::
matrix            T  a3:38:3aa:3:::::

         bits    2.2             *
                 2.0 *      **   ****
                 1.7 *      **   ****
                 1.5 *      **   ****
Relative         1.3 *      *** *****
Entropy          1.1 * *** **** *****
(22.6 bits)      0.9 * *** **** *****
                 0.7 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           TGACTGATTGGGGCAA
consensus             CGTAAT  CCC
sequence              T   C    T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
50055                       127  2.53e-09 TATATACCAA TGATTGATTGGGGCAA AGCGTGAACG
44920                        14  3.71e-08 TGCGCACGCT TCACTGATTGTCGCAA ACCGTATTTC
4950                        332  5.43e-08 TTCATCGCGC TGACTATTTCGGGCAA GAGCACCACA
48891                        35  1.49e-07 GCGTCTGTCC TTGCACATTGCGGCAA AAGAAATCCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50055                             2.5e-09  126_[+2]_358
44920                             3.7e-08  13_[+2]_471
4950                              5.4e-08  331_[+2]_153
48891                             1.5e-07  34_[+2]_450
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=4
50055                    (  127) TGATTGATTGGGGCAA  1
44920                    (   14) TCACTGATTGTCGCAA  1
4950                     (  332) TGACTATTTCGGGCAA  1
48891                    (   35) TTGCACATTGCGGCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 1940 bayes= 8.91886 E= 5.0e+002
  -865   -865   -865    190
  -865      3    118    -10
   149   -865     18   -865
  -865    161   -865    -10
    -9   -865   -865    149
    -9      3    118   -865
   149   -865   -865    -10
  -865   -865   -865    190
  -865   -865   -865    190
  -865      3    176   -865
  -865      3    118    -10
  -865      3    176   -865
  -865   -865    217   -865
  -865    202   -865   -865
   191   -865   -865   -865
   191   -865   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 4 E= 5.0e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.500000  0.250000
 0.750000  0.000000  0.250000  0.000000
 0.000000  0.750000  0.000000  0.250000
 0.250000  0.000000  0.000000  0.750000
 0.250000  0.250000  0.500000  0.000000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
T[GCT][AG][CT][TA][GAC][AT]TT[GC][GCT][GC]GCAA
--------------------------------------------------------------------------------

Time 0.33 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 19 sites = 4 llr = 69 E-value = 5.6e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::3a:38:3a:a3::3::
pos.-specific     C  5:88::53:3:a::3:858
probability       G  5a3::83:83:::5:a::3
matrix            T  :::::3::33:::38::5:

         bits    2.2  *             *
                 2.0  *  *     ***  *
                 1.7  *  *     ***  *
                 1.5  *  *     ***  *
Relative         1.3  ** **  * ***  *  *
Entropy          1.1 ****** ** *** *** *
(25.0 bits)      0.9 ****** ** *** *****
                 0.7 ****** ** *** *****
                 0.4 ********* *********
                 0.2 ********* *********
                 0.0 -------------------

Multilevel           CGCCAGCAGAACAGTGCCC
consensus            G GA TACTC   AC ATG
sequence                   G  G   T
                              T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            -------------------
44920                       253  3.31e-10 CATCATCTCT GGCCAGCAGCACAGTGACC GGACTGATTC
4950                        207  1.21e-09 GCATTCTGAT CGCCAGAAGTACAATGCTC AAAAGGACAA
48891                       110  1.87e-08 ACCAATGAAT GGGCAGCATGACATCGCTC TAGCTTGAAA
50055                       405  4.20e-08 ATCGTTTGTA CGCAATGCGAACAGTGCCG ATGGTAGGCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
44920                             3.3e-10  252_[+3]_229
4950                              1.2e-09  206_[+3]_275
48891                             1.9e-08  109_[+3]_372
50055                             4.2e-08  404_[+3]_77
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=19 seqs=4
44920                    (  253) GGCCAGCAGCACAGTGACC  1
4950                     (  207) CGCCAGAAGTACAATGCTC  1
48891                    (  110) GGGCAGCATGACATCGCTC  1
50055                    (  405) CGCAATGCGAACAGTGCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 1928 bayes= 8.90989 E= 5.6e+002
  -865    102    118   -865
  -865   -865    217   -865
  -865    161     18   -865
    -9    161   -865   -865
   191   -865   -865   -865
  -865   -865    176    -10
    -9    102     18   -865
   149      3   -865   -865
  -865   -865    176    -10
    -9      3     18    -10
   191   -865   -865   -865
  -865    202   -865   -865
   191   -865   -865   -865
    -9   -865    118    -10
  -865      3   -865    149
  -865   -865    217   -865
    -9    161   -865   -865
  -865    102   -865     90
  -865    161     18   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 4 E= 5.6e+002
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.250000  0.000000
 0.250000  0.750000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.500000  0.250000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.000000  0.750000  0.250000
 0.250000  0.250000  0.250000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.250000  0.000000  0.750000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.750000  0.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.750000  0.250000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[CG]G[CG][CA]A[GT][CAG][AC][GT][ACGT]ACA[GAT][TC]G[CA][CT][CG]
--------------------------------------------------------------------------------

Time 0.47 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4950                             1.34e-14  33_[+1(2.31e-09)]_155_\
    [+3(1.21e-09)]_106_[+2(5.43e-08)]_153
48891                            1.59e-12  34_[+2(1.49e-07)]_59_[+3(1.87e-08)]_\
    233_[+1(8.40e-09)]_121
50055                            5.29e-14  23_[+1(6.06e-09)]_85_[+2(2.53e-09)]_\
    262_[+3(4.20e-08)]_77
44920                            2.44e-14  13_[+2(3.71e-08)]_223_\
    [+3(3.31e-10)]_67_[+1(2.31e-08)]_144
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************