********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/376/376.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1945                     1.0000    500  21968                    1.0000    500
25125                    1.0000    500  263244                   1.0000    500
264395                   1.0000    500  268895                   1.0000    500
3863                     1.0000    500  41392                    1.0000    500
5763                     1.0000    500  9619                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/376/376.seqs.fa -oc motifs/376 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.281 C 0.235 G 0.225 T 0.259
Background letter frequencies (from dataset with add-one prior applied):
A 0.281 C 0.235 G 0.225 T 0.259
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 13 sites = 8 llr = 98 E-value = 4.2e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  3419:369:95:a
pos.-specific     C  8:::::4::::1:
probability       G  :59:98:1a159:
matrix            T  :1:11::::::::

         bits    2.2         *
                 1.9         *
                 1.7         *   *
                 1.5   * *   *  **
Relative         1.3 * **** *** **
Entropy          1.1 * ***********
(17.7 bits)      0.9 * ***********
                 0.6 *************
                 0.4 *************
                 0.2 *************
                 0.0 -------------

Multilevel           CGGAGGAAGAAGA
consensus            AA   AC   G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name              Start   P-value                Site
-------------              ----- ---------            -------------
9619                          35  5.21e-08 TGACAATCAT CAGAGGAAGAGGA GGACAAAGAT
21968                        147  5.21e-08 ACAGCATGTG CAGAGGAAGAGGA CCCATGTTGG
3863                          69  4.60e-07 TGGCTTCGGA CGGTGGAAGAAGA GGTGCTGGTT
5763                         455  6.70e-07 TGAGTAGCAT CGGATGCAGAGGA TGAGACGGAT
41392                         61  1.51e-06 GGTTTATAGA AGGAGACAGAAGA CGCCTTTCCT
1945                          16  2.12e-06 AATATAAGGG AGGAGGCAGGAGA AAGGTTCATG
25125                        129  3.80e-06 GCCTGAATTG CAAAGAAAGAAGA AGGCCCGAGA
263244                       183  6.57e-06 GTTGTTTCAA CTGAGGAGGAGCA AACGATACAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9619                              5.2e-08  34_[+1]_453
21968                             5.2e-08  146_[+1]_341
3863                              4.6e-07  68_[+1]_419
5763                              6.7e-07  454_[+1]_33
41392                             1.5e-06  60_[+1]_427
1945                              2.1e-06  15_[+1]_472
25125                             3.8e-06  128_[+1]_359
263244                            6.6e-06  182_[+1]_305
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=13 seqs=8
9619                     (   35) CAGAGGAAGAGGA  1
21968                    (  147) CAGAGGAAGAGGA  1
3863                     (   69) CGGTGGAAGAAGA  1
5763                     (  455) CGGATGCAGAGGA  1
41392                    (   61) AGGAGACAGAAGA  1
1945                     (   16) AGGAGGCAGGAGA  1
25125                    (  129) CAAAGAAAGAAGA  1
263244                   (  183) CTGAGGAGGAGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 4880 bayes= 9.2503 E= 4.2e-001
   -17    167   -965   -965
    41   -965    115   -105
  -117   -965    196   -965
   164   -965   -965   -105
  -965   -965    196   -105
   -17   -965    174   -965
   115     67   -965   -965
   164   -965    -84   -965
  -965   -965    215   -965
   164   -965    -84   -965
    83   -965    115   -965
  -965    -91    196   -965
   183   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 8 E= 4.2e-001
 0.250000  0.750000  0.000000  0.000000
 0.375000  0.000000  0.500000  0.125000
 0.125000  0.000000  0.875000  0.000000
 0.875000  0.000000  0.000000  0.125000
 0.000000  0.000000  0.875000  0.125000
 0.250000  0.000000  0.750000  0.000000
 0.625000  0.375000  0.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.875000  0.000000  0.125000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.125000  0.875000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[CA][GA]GAG[GA][AC]AGA[AG]GA
--------------------------------------------------------------------------------

Time 1.14 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 15 sites = 6 llr = 88 E-value = 3.3e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :a:52:2::::8:2:
pos.-specific     C  ::::2::::28:2:3
probability       G  a:a::8:2a522::7
matrix            T  :::57288:3::88:

         bits    2.2 * *     *
                 1.9 * *     *
                 1.7 ***     *
                 1.5 ***  *  * *
Relative         1.3 ***  **** *****
Entropy          1.1 ***  **** *****
(21.1 bits)      0.9 **** **** *****
                 0.6 ***************
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GAGATGTTGGCATTG
consensus               T     T    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name              Start   P-value                 Site
-------------              ----- ---------            ---------------
263244                       216  1.38e-08 GCCAATGAAA GAGAAGTTGGCATTG ATTTCGGGAG
3863                         110  2.07e-08 TCTTTGGGTG GAGATGATGGCATTG TCTTCATCGA
268895                        73  7.20e-08 TGAAATCTAG GAGTTGTTGTCACTC ATGGTAAAAG
5763                         241  1.27e-07 ACTGTGCTTG GAGTCGTTGTCGTTG GTCTTCACAA
264395                        79  2.53e-07 TGTCTCATGA GAGTTGTTGGGATAC TCTGTACAGG
1945                         476  3.64e-07 GTGAGGCCGT GAGATTTGGCCATTG GCCATTATCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
263244                            1.4e-08  215_[+2]_270
3863                              2.1e-08  109_[+2]_376
268895                            7.2e-08  72_[+2]_413
5763                              1.3e-07  240_[+2]_245
264395                            2.5e-07  78_[+2]_407
1945                              3.6e-07  475_[+2]_10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=15 seqs=6
263244                   (  216) GAGAAGTTGGCATTG  1
3863                     (  110) GAGATGATGGCATTG  1
268895                   (   73) GAGTTGTTGTCACTC  1
5763                     (  241) GAGTCGTTGTCGTTG  1
264395                   (   79) GAGTTGTTGGGATAC  1
1945                     (  476) GAGATTTGGCCATTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 4860 bayes= 10.1079 E= 3.3e+000
  -923   -923    215   -923
   183   -923   -923   -923
  -923   -923    215   -923
    83   -923   -923     95
   -75    -50   -923    136
  -923   -923    189    -63
   -75   -923   -923    168
  -923   -923    -43    168
  -923   -923    215   -923
  -923    -50    115     36
  -923    182    -43   -923
   157   -923    -43   -923
  -923    -50   -923    168
   -75   -923   -923    168
  -923     50    157   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 6 E= 3.3e+000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.166667  0.166667  0.000000  0.666667
 0.000000  0.000000  0.833333  0.166667
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.166667  0.500000  0.333333
 0.000000  0.833333  0.166667  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.333333  0.666667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
GAG[AT]TGTTG[GT]CATT[GC]
--------------------------------------------------------------------------------

Time 2.30 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 20 sites = 6 llr = 107 E-value = 2.4e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  a8:28::7::5278322::a
pos.-specific     C  ::35:a::8335327:87::
probability       G  ::53::a::72::::8::a:
matrix            T  :22:2::32::3:::::3::

         bits    2.2      **           *
                 1.9      **           *
                 1.7 *    **           **
                 1.5 *    ** *      *  **
Relative         1.3 **  *** **   * ** **
Entropy          1.1 **  *** **  ********
(25.6 bits)      0.9 **  ******  ********
                 0.6 ********** *********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           AAGCACGACGACAACGCCGA
consensus              CG   T CCTC A  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name              Start   P-value                    Site
-------------              ----- ---------            --------------------
268895                       431  2.45e-11 CGCCCCCGTC AAGCACGACGATCACGCCGA CAGTCACGAT
41392                        433  4.57e-11 CACCCCCGTC AAGCACGACGACCACGCTGA CAGTCACGAT
264395                       456  5.30e-09 TGTGTCTAGA AACAACGATGCCAAAGCCGA GGTTGCTGAC
5763                         334  1.38e-08 TGTATGTAAC AAGGACGTCGACAACAATGA GTAACAAGAC
21968                         11  2.11e-08 CCTGCACCAA AACGACGACCGAACAGCCGA CTGTGGTTGA
1945                         180  2.63e-08 GGGTGAAAAA ATTCTCGTCCCTAACGCCGA TCCCTAACAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
268895                            2.4e-11  430_[+3]_50
41392                             4.6e-11  432_[+3]_48
264395                            5.3e-09  455_[+3]_25
5763                              1.4e-08  333_[+3]_147
21968                             2.1e-08  10_[+3]_470
1945                              2.6e-08  179_[+3]_301
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=20 seqs=6
268895                   (  431) AAGCACGACGATCACGCCGA  1
41392                    (  433) AAGCACGACGACCACGCTGA  1
264395                   (  456) AACAACGATGCCAAAGCCGA  1
5763                     (  334) AAGGACGTCGACAACAATGA  1
21968                    (   11) AACGACGACCGAACAGCCGA  1
1945                     (  180) ATTCTCGTCCCTAACGCCGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 4810 bayes= 9.30354 E= 2.4e+000
   183   -923   -923   -923
   157   -923   -923    -63
  -923     50    115    -63
   -75    109     57   -923
   157   -923   -923    -63
  -923    209   -923   -923
  -923   -923    215   -923
   124   -923   -923     36
  -923    182   -923    -63
  -923     50    157   -923
    83     50    -43   -923
   -75    109   -923     36
   124     50   -923   -923
   157    -50   -923   -923
    25    150   -923   -923
   -75   -923    189   -923
   -75    182   -923   -923
  -923    150   -923     36
  -923   -923    215   -923
   183   -923   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 6 E= 2.4e+000
 1.000000  0.000000  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.333333  0.500000  0.166667
 0.166667  0.500000  0.333333  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  0.833333  0.000000  0.166667
 0.000000  0.333333  0.666667  0.000000
 0.500000  0.333333  0.166667  0.000000
 0.166667  0.500000  0.000000  0.333333
 0.666667  0.333333  0.000000  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
AA[GC][CG]ACG[AT]C[GC][AC][CT][AC]A[CA]GC[CT]GA
--------------------------------------------------------------------------------

Time 3.29 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1945                             8.96e-10  15_[+1(2.12e-06)]_151_\
    [+3(2.63e-08)]_276_[+2(3.64e-07)]_10
21968                            4.34e-08  10_[+3(2.11e-08)]_116_\
    [+1(5.21e-08)]_341
25125                            3.15e-02  128_[+1(3.80e-06)]_359
263244                           2.28e-06  182_[+1(6.57e-06)]_20_\
    [+2(1.38e-08)]_270
264395                           1.16e-08  78_[+2(2.53e-07)]_362_\
    [+3(5.30e-09)]_25
268895                           5.75e-11  72_[+2(7.20e-08)]_343_\
    [+3(2.45e-11)]_50
3863                             1.97e-07  68_[+1(4.60e-07)]_28_[+2(2.07e-08)]_\
    376
41392                            3.66e-09  60_[+1(1.51e-06)]_359_\
    [+3(4.57e-11)]_48
5763                             6.28e-11  240_[+2(1.27e-07)]_78_\
    [+3(1.38e-08)]_101_[+1(6.70e-07)]_33
9619                             1.27e-04  34_[+1(5.21e-08)]_453
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************