********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/291/291.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
3062                     1.0000    500  47119                    1.0000    500
29317                    1.0000    500  43678                    1.0000    500
49540                    1.0000    500  27639                    1.0000    500
45316                    1.0000    500  43852                    1.0000    500
49478                    1.0000    500  44896                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/291/291.seqs.fa -oc motifs/291 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.284 C 0.226 G 0.229 T 0.261
Background letter frequencies (from dataset with add-one prior applied):
A 0.284 C 0.226 G 0.229 T 0.261
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  21  sites =   6  llr = 101  E-value = 8.2e+001
********************************************************************************
-------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 2::2:2223:7:2:2:a8::8 pos.-specific C :7:535:3:22a:22::28:: probability G 8::3:253:8:::837::2a: matrix T :3a:72327:2:8:33::::2 bits 2.1 * * 1.9 * * * 1.7 * * * * 1.5 * * * * * * ** Relative 1.3 * * * *** ***** Entropy 1.1 *** * ** *** ****** (24.2 bits) 0.9 *** * ** *** ****** 0.6 ***** * ****** ****** 0.4 ***** * ****** ****** 0.2 ************** ****** 0.0 --------------------- Multilevel GCTCTCGCTGACTGGGAACGA consensus T GC TGA TT sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- --------------------- 45316 392 7.13e-10 CGATTTGCTT GCTCTGATTGACTGTGAACGA CCAAGTTTGT 49540 477 2.44e-09 TCGCAGAATC GCTCCATCTGACTGATAACGA ATC 29317 118 4.84e-09 TACAATGCTA GTTGCTGGTGACTGTGAAGGA AACAAGAGTC 43852 348 7.59e-09 TGTCTCACGA ACTGTCTGTCACTGCGAACGA GCGAGCGAGC 49478 100 3.66e-08 TGAGCGGACA GCTCTCGCAGCCACGGAACGT ACGGCGATAC 3062 345 5.94e-08 ACGCGGGGCC GTTATCGAAGTCTGGTACCGA GCCAAAAAGG -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 45316 7.1e-10 391_[+1]_88 49540 2.4e-09 476_[+1]_3 29317 4.8e-09 117_[+1]_362 43852 7.6e-09 347_[+1]_132 49478 3.7e-08 99_[+1]_380 3062 5.9e-08 344_[+1]_135 -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=21 seqs=6 45316 ( 392) GCTCTGATTGACTGTGAACGA 1 49540 ( 477) GCTCCATCTGACTGATAACGA 1 29317 ( 118) GTTGCTGGTGACTGTGAAGGA 1 43852 ( 348) ACTGTCTGTCACTGCGAACGA 1 49478 ( 100) GCTCTCGCAGCCACGGAACGT 1 3062 ( 345) GTTATCGAAGTCTGGTACCGA 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 10.09 E= 8.2e+001 -77 -923 186 -923 -923 156 -923 35 -923 -923 -923 194 -77 114 54 -923 -923 56 -923 135 -77 114 -46 -64 -77 -923 112 35 -77 56 54 -64 23 -923 -923 135 -923 -44 186 -923 123 -44 -923 -64 -923 214 -923 -923 -77 -923 -923 167 -923 -44 186 -923 -77 -44 54 35 -923 -923 154 35 182 -923 -923 -923 155 -44 -923 -923 -923 188 -46 -923 -923 -923 212 -923 155 -923 -923 -64 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 8.2e+001 0.166667 0.000000 0.833333 0.000000 0.000000 0.666667 0.000000 0.333333 0.000000 0.000000 0.000000 1.000000 0.166667 0.500000 0.333333 0.000000 0.000000 0.333333 0.000000 0.666667 0.166667 0.500000 0.166667 0.166667 0.166667 0.000000 0.500000 0.333333 0.166667 0.333333 0.333333 0.166667 0.333333 0.000000 0.000000 0.666667 0.000000 0.166667 0.833333 0.000000 0.666667 0.166667 0.000000 0.166667 0.000000 1.000000 0.000000 0.000000 0.166667 0.000000 0.000000 0.833333 0.000000 0.166667 
0.833333 0.000000 0.166667 0.166667 0.333333 0.333333 0.000000 0.000000 0.666667 0.333333 1.000000 0.000000 0.000000 0.000000 0.833333 0.166667 0.000000 0.000000 0.000000 0.833333 0.166667 0.000000 0.000000 0.000000 1.000000 0.000000 0.833333 0.000000 0.000000 0.166667 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- G[CT]T[CG][TC]C[GT][CG][TA]GACTG[GT][GT]AACGA -------------------------------------------------------------------------------- Time 0.95 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 21 sites = 6 llr = 100 E-value = 2.2e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A ::2a3:5522:3:32:::2:: pos.-specific C 253:2a3:::a2a:2:a::55 probability G :55:3:2::::5:358:8225 matrix T 8:::2::588:::322:273: bits 2.1 * * * * 1.9 * * * * 1.7 * * * * * 1.5 * * * * *** Relative 1.3 * * * *** * *** Entropy 1.1 ** * * *** * *** * (24.0 bits) 0.9 ** * * **** * *** * 0.6 **** * ****** ****** 0.4 **** ********* ****** 0.2 **** **************** 0.0 --------------------- Multilevel TCGAACAATTCGCAGGCGTCC consensus GC G CT A G TG sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- 
--------------------- 29317 147 1.78e-10 GAAACAAGAG TCGAGCAATTCGCTAGCGTCC GATTGCGTGT 45316 356 2.46e-10 TTCGAAACAT TGCAACCATTCACAGGCGTCC AGTGTCGATT 49478 181 1.31e-08 TGTTTGTGAA TCGAGCATATCACGGGCGATG ACCCGAGGGT 27639 279 2.28e-08 CAAGACAGAT TCGATCATTTCCCTTGCTTCC TCAGCTTCTG 43852 14 6.80e-08 GCGAGAGCAC CGAAACGATTCGCGGTCGTTG TGTTACATGG 3062 322 8.83e-08 ACTCAAGATC TGCACCCTTACGCACGCGGGG CCGTTATCGA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 29317 1.8e-10 146_[+2]_333 45316 2.5e-10 355_[+2]_124 49478 1.3e-08 180_[+2]_299 27639 2.3e-08 278_[+2]_201 43852 6.8e-08 13_[+2]_466 3062 8.8e-08 321_[+2]_158 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=21 seqs=6 29317 ( 147) TCGAGCAATTCGCTAGCGTCC 1 45316 ( 356) TGCAACCATTCACAGGCGTCC 1 49478 ( 181) TCGAGCATATCACGGGCGATG 1 27639 ( 279) TCGATCATTTCCCTTGCTTCC 1 43852 ( 14) CGAAACGATTCGCGGTCGTTG 1 3062 ( 322) TGCACCCTTACGCACGCGGGG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 10.7426 E= 2.2e+002 -923 -44 -923 167 -923 114 112 -923 -77 56 112 -923 182 -923 -923 -923 23 -44 54 -64 -923 214 -923 -923 82 56 -46 -923 82 -923 -923 94 -77 -923 -923 167 -77 -923 -923 167 -923 214 -923 -923 23 -44 112 -923 -923 214 
-923 -923 23 -923 54 35 -77 -44 112 -64 -923 -923 186 -64 -923 214 -923 -923 -923 -923 186 -64 -77 -923 -46 135 -923 114 -46 35 -923 114 112 -923 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 21 nsites= 6 E= 2.2e+002 0.000000 0.166667 0.000000 0.833333 0.000000 0.500000 0.500000 0.000000 0.166667 0.333333 0.500000 0.000000 1.000000 0.000000 0.000000 0.000000 0.333333 0.166667 0.333333 0.166667 0.000000 1.000000 0.000000 0.000000 0.500000 0.333333 0.166667 0.000000 0.500000 0.000000 0.000000 0.500000 0.166667 0.000000 0.000000 0.833333 0.166667 0.000000 0.000000 0.833333 0.000000 1.000000 0.000000 0.000000 0.333333 0.166667 0.500000 0.000000 0.000000 1.000000 0.000000 0.000000 0.333333 0.000000 0.333333 0.333333 0.166667 0.166667 0.500000 0.166667 0.000000 0.000000 0.833333 0.166667 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.833333 0.166667 0.166667 0.000000 0.166667 0.666667 0.000000 0.500000 0.166667 0.333333 0.000000 0.500000 0.500000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 regular expression -------------------------------------------------------------------------------- T[CG][GC]A[AG]C[AC][AT]TTC[GA]C[AGT]GGCGT[CT][CG] -------------------------------------------------------------------------------- Time 2.01 secs. 
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 18 sites = 5 llr = 83 E-value = 2.1e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::::2:a68:4a8446:: pos.-specific C a4:a48::2a2:22648: probability G :6::22:2::4::4::2a matrix T ::a:2::2:::::::::: bits 2.1 * * * * 1.9 * ** * * 1.7 * ** * * * * 1.5 * ** ** * * ** Relative 1.3 * ** ** * * ** Entropy 1.1 **** ** ** ** **** (24.1 bits) 0.9 **** ** ** ** **** 0.6 **** ** ** ** **** 0.4 **** ************* 0.2 ****************** 0.0 ------------------ Multilevel CGTCCCAAACAAAACACG consensus C AG GC G CGACG sequence G T C C T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------------ 45316 45 2.95e-09 CGGCATAGGT CGTCGCAGACAAAGCACG ATGAGCACTT 43852 403 5.57e-09 CCACCATGAA CCTCCCAAACAACACCCG ACCTCGGGGA 49540 221 2.26e-08 AGGCCATCGT CCTCCCATCCGAACCACG GCCGTTTCGG 29317 219 2.46e-08 CTTTTGCTGA CGTCACAAACGAAAACGG CAGCATACCG 3062 169 3.07e-08 AAGGGCAAAG CGTCTGAAACCAAGAACG ATGGTCAAAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 45316 2.9e-09 44_[+3]_438 43852 
5.6e-09 402_[+3]_80 49540 2.3e-08 220_[+3]_262 29317 2.5e-08 218_[+3]_264 3062 3.1e-08 168_[+3]_314 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=18 seqs=5 45316 ( 45) CGTCGCAGACAAAGCACG 1 43852 ( 403) CCTCCCAAACAACACCCG 1 49540 ( 221) CCTCCCATCCGAACCACG 1 29317 ( 219) CGTCACAAACGAAAACGG 1 3062 ( 169) CGTCTGAAACCAAGAACG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 18 n= 4830 bayes= 10.1662 E= 2.1e+002 -897 214 -897 -897 -897 82 139 -897 -897 -897 -897 194 -897 214 -897 -897 -50 82 -20 -38 -897 182 -20 -897 182 -897 -897 -897 108 -897 -20 -38 149 -18 -897 -897 -897 214 -897 -897 50 -18 80 -897 182 -897 -897 -897 149 -18 -897 -897 50 -18 80 -897 50 140 -897 -897 108 82 -897 -897 -897 182 -20 -897 -897 -897 212 -897 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 18 nsites= 5 E= 2.1e+002 0.000000 1.000000 0.000000 0.000000 0.000000 0.400000 0.600000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.200000 0.400000 0.200000 0.200000 0.000000 0.800000 0.200000 0.000000 1.000000 0.000000 0.000000 0.000000 0.600000 0.000000 0.200000 0.200000 0.800000 0.200000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.400000 0.200000 0.400000 0.000000 1.000000 
0.000000 0.000000 0.000000 0.800000 0.200000 0.000000 0.000000 0.400000 0.200000 0.400000 0.000000 0.400000 0.600000 0.000000 0.000000 0.600000 0.400000 0.000000 0.000000 0.000000 0.800000 0.200000 0.000000 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- C[GC]TC[CAGT][CG]A[AGT][AC]C[AGC]A[AC][AGC][CA][AC][CG]G -------------------------------------------------------------------------------- Time 3.09 secs. ******************************************************************************** ******************************************************************************** SUMMARY OF MOTIFS ******************************************************************************** -------------------------------------------------------------------------------- Combined block diagrams: non-overlapping sites with p-value < 0.0001 -------------------------------------------------------------------------------- SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 3062 9.55e-12 85_[+3(3.92e-05)]_65_[+3(3.07e-08)]_\ 135_[+2(8.83e-08)]_2_[+1(5.94e-08)]_135 47119 7.99e-01 500 29317 2.04e-15 117_[+1(4.84e-09)]_8_[+2(1.78e-10)]_\ 51_[+3(2.46e-08)]_264 43678 4.40e-01 500 49540 2.54e-09 220_[+3(2.26e-08)]_238_\ [+1(2.44e-09)]_3 27639 1.07e-04 278_[+2(2.28e-08)]_201 45316 5.90e-17 44_[+3(2.95e-09)]_293_\ [+2(2.46e-10)]_15_[+1(7.13e-10)]_88 43852 2.16e-13 13_[+2(6.80e-08)]_313_\ [+1(7.59e-09)]_34_[+3(5.57e-09)]_80 49478 4.40e-09 99_[+1(3.66e-08)]_60_[+2(1.31e-08)]_\ 299 44896 5.27e-01 500 -------------------------------------------------------------------------------- ******************************************************************************** 
********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************