********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/25/25.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
19058                    1.0000    500  24146                    1.0000    500
25834                    1.0000    500  260936                   1.0000    500
268526                   1.0000    500  269927                   1.0000    500
270240                   1.0000    500  33466                    1.0000    500
3524                     1.0000    500  40293                    1.0000    500
5205                     1.0000    500  7719                     1.0000    500
8537                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/25/25.seqs.fa -oc motifs/25 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.237 G 0.242 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.237 G 0.242 T 0.260
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  16  sites =  10  llr = 130  E-value = 1.4e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :128::8:1231:3:1
pos.-specific     C  :::::11::::1:16:
probability       G  a881a21a1362a3:9
matrix            T  :1:1:7::8516:34:

         bits    2.1 *   *  *    *
                 1.9 *   *  *    *
                 1.7 *   *  *    *
                 1.5 *   *  *    *  *
Relative         1.2 * * *  *    *  *
Entropy          1.0 ***** ***   * **
(18.8 bits)      0.8 *********   * **
                 0.6 ********* * * **
                 0.4 ************* **
                 0.2 ************* **
                 0.0 ----------------

Multilevel           GGGAGTAGTTGTGACG
consensus              A  G   GAG GT
sequence                      A   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ----------------
25834                       425  2.10e-10 AGGAGGTGAC GGGAGTAGTTGTGGCG TGAATGAGGT
270240                      357  6.41e-09 AAACTGAAAG GGGAGTAGTGATGACG ACGTGATTTT
260936                      327  6.41e-09 AAACTGAAAG GGGAGTAGTGATGACG ACGTGATTTT
269927                      410  1.38e-07 AGCCAAGTGA GGGAGGAGGTGTGTCG ACGACATCCG
268526                      270  4.71e-07 GGATTATTTT GGGAGTCGTAGTGCTG TTACAAGGGA
5205                         55  9.67e-07 TCAGAAGAGT GGAAGTAGAGATGGTG GAGGCGAAAT
24146                        73  1.83e-06 AGCTTGAAGG GAGGGTAGTTGAGTCG GATGAGAGAG
8537                        104  3.48e-06 TACATAAATT GGGTGGGGTTGGGATG AAGGCAGCTT
33466                       222  3.48e-06 GACGCTTCGA GTAAGTAGTTGGGGCA ATTGTTGTGT
40293                       401  3.70e-06 TACTACTATT GGGAGCAGTATCGTTG CTTGCAACTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25834                             2.1e-10  424_[+1]_60
270240                            6.4e-09  356_[+1]_128
260936                            6.4e-09  326_[+1]_158
269927                            1.4e-07  409_[+1]_75
268526                            4.7e-07  269_[+1]_215
5205                              9.7e-07  54_[+1]_430
24146                             1.8e-06  72_[+1]_412
8537                              3.5e-06  103_[+1]_381
33466                             3.5e-06  221_[+1]_263
40293                             3.7e-06  400_[+1]_84
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=10
25834                    (  425) GGGAGTAGTTGTGGCG  1
270240                   (  357) GGGAGTAGTGATGACG  1
260936                   (  327) GGGAGTAGTGATGACG  1
269927                   (  410) GGGAGGAGGTGTGTCG  1
268526                   (  270) GGGAGTCGTAGTGCTG  1
5205                     (   55) GGAAGTAGAGATGGTG  1
24146                    (   73) GAGGGTAGTTGAGTCG  1
8537                     (  104) GGGTGGGGTTGGGATG  1
33466                    (  222) GTAAGTAGTTGGGGCA  1
40293                    (  401) GGGAGCAGTATCGTTG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.54997 E= 1.4e-001
  -997   -997    204   -997
  -138   -997    172   -138
   -38   -997    172   -997
   161   -997   -127   -138
  -997   -997    204   -997
  -997   -124    -28    143
   161   -124   -127   -997
  -997   -997    204   -997
  -138   -997   -127    162
   -38   -997     31     94
    20   -997    131   -138
  -138   -124    -28    121
  -997   -997    204   -997
    20   -124     31     21
  -997    134   -997     62
  -138   -997    189   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 10 E= 1.4e-001
 0.000000  0.000000  1.000000  0.000000
 0.100000  0.000000  0.800000  0.100000
 0.200000  0.000000  0.800000  0.000000
 0.800000  0.000000  0.100000  0.100000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.100000  0.200000  0.700000
 0.800000  0.100000  0.100000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.100000  0.000000  0.100000  0.800000
 0.200000  0.000000  0.300000  0.500000
 0.300000  0.000000  0.600000  0.100000
 0.100000  0.100000  0.200000  0.600000
 0.000000  0.000000  1.000000  0.000000
 0.300000  0.100000  0.300000  0.300000
 0.000000  0.600000  0.000000  0.400000
 0.100000  0.000000  0.900000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
GG[GA]AG[TG]AGT[TGA][GA][TG]G[AGT][CT]G
--------------------------------------------------------------------------------




Time  1.53 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  20  sites =   8  llr = 124  E-value = 4.5e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :34:531:683::9::6::9
pos.-specific     C  95364:9a1:599199:4a1
probability       G  1::111::1:31::::31::
matrix            T  :343:6::13::1:1115::

         bits    2.1        *          *
                 1.9        *          *
                 1.7        *          *
                 1.5 *     **   *****  **
Relative         1.2 *     **   *****  **
Entropy          1.0 *     ** * *****  **
(22.4 bits)      0.8 *  *  ** * *****  **
                 0.6 *  ***** ***********
                 0.4 ********************
                 0.2 ********************
                 0.0 --------------------

Multilevel           CCACATCCAACCCACCATCA
consensus             ATTCA   TA     GC
sequence              TC       G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                      Site
-------------             ----- ---------            --------------------
269927                      370  7.44e-11 CAAATGAATA CTTCATCCAAGCCACCATCA CCAGCTTCAA
25834                       191  6.03e-09 TATCAATACT CCATCACCAACGCACCATCA ACATCCAACC
8537                        335  8.07e-09 ACCTCTGCAG CCACAAACAACCCACCAGCA CCCTGATGCA
40293                       259  2.12e-08 CTTTTCATCT GCACATCCCAACCACCGTCA CTCGGCACCT
3524                        100  3.28e-08 ACCTTATCAT CACCCTCCGACCCACCGTCC TTCTCATCCA
7719                         46  1.58e-07 GCCGTTGCCC CACTGTCCAACCCATCTCCA CTAGCATGTT
5205                        449  1.93e-07 CTTCATTCTG CCTCATCCTTGCTCCCACCA CGACAACGCC
260936                      452  3.13e-07 GCACCGATCT CTTGCGCCATACCACTACCA CGTTACACAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
269927                            7.4e-11  369_[+2]_111
25834                               6e-09  190_[+2]_290
8537                              8.1e-09  334_[+2]_146
40293                             2.1e-08  258_[+2]_222
3524                              3.3e-08  99_[+2]_381
7719                              1.6e-07  45_[+2]_435
5205                              1.9e-07  448_[+2]_32
260936                            3.1e-07  451_[+2]_29
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=20 seqs=8
269927                   (  370) CTTCATCCAAGCCACCATCA  1
25834                    (  191) CCATCACCAACGCACCATCA  1
8537                     (  335) CCACAAACAACCCACCAGCA  1
40293                    (  259) GCACATCCCAACCACCGTCA  1
3524                     (  100) CACCCTCCGACCCACCGTCC  1
7719                     (   46) CACTGTCCAACCCATCTCCA  1
5205                     (  449) CCTCATCCTTGCTCCCACCA  1
260936                   (  452) CTTGCGCCATACCACTACCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 20 n= 6253 bayes= 9.60849 E= 4.5e-001
  -965    189    -95   -965
    -6    108   -965     -6
    52      8   -965     53
  -965    140    -95     -6
    94     66    -95   -965
    -6   -965    -95    126
  -106    189   -965   -965
  -965    208   -965   -965
   126    -92    -95   -105
   152   -965   -965     -6
    -6    108      5   -965
  -965    189    -95   -965
  -965    189   -965   -105
   174    -92   -965   -965
  -965    189   -965   -105
  -965    189   -965   -105
   126   -965      5   -105
  -965     66    -95     94
  -965    208   -965   -965
   174    -92   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 20 nsites= 8 E= 4.5e-001
 0.000000  0.875000  0.125000  0.000000
 0.250000  0.500000  0.000000  0.250000
 0.375000  0.250000  0.000000  0.375000
 0.000000  0.625000  0.125000  0.250000
 0.500000  0.375000  0.125000  0.000000
 0.250000  0.000000  0.125000  0.625000
 0.125000  0.875000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.625000  0.125000  0.125000  0.125000
 0.750000  0.000000  0.000000  0.250000
 0.250000  0.500000  0.250000  0.000000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.875000  0.000000  0.125000
 0.875000  0.125000  0.000000  0.000000
 0.000000  0.875000  0.000000  0.125000
 0.000000  0.875000  0.000000  0.125000
 0.625000  0.000000  0.250000  0.125000
 0.000000  0.375000  0.125000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
C[CAT][ATC][CT][AC][TA]CCA[AT][CAG]CCACC[AG][TC]CA
--------------------------------------------------------------------------------




Time  3.03 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   9  llr = 116  E-value = 4.6e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :41:31:7142::3::
pos.-specific     C  8::a::92:41:74:a
probability       G  232:191:912a::9:
matrix            T  :27:6::1::4:321:

         bits    2.1    *       *   *
                 1.9    *       *   *
                 1.7    *  *    *   *
                 1.5    * ** *  *  **
Relative         1.2 *  * ** *  *  **
Entropy          1.0 *  * ** *  ** **
(18.6 bits)      0.8 * ** ****  ** **
                 0.6 * ******** ** **
                 0.4 ********** *****
                 0.2 ****************
                 0.0 ----------------

Multilevel           CATCTGCAGATGCCGC
consensus            GGG A  C CA TA
sequence              T        G  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ----------------
270240                      252  1.54e-08 CGCCACGAGA CGTCTGCAGCTGTAGC GATGGGCGCG
260936                      222  1.54e-08 CGCCACGAGA CGTCTGCAGCTGTAGC GATGGGCGCG
268526                      419  4.97e-08 TTACTTGATA CTTCTGCAGAGGCTGC AAAGTTGGGG
8537                        170  3.33e-07 GATTATCTTT CATCGGCTGATGCCGC AAGTCAGCGA
7719                        349  6.74e-07 TTACGCAAGT GATCAGCCGCCGCCGC CGACGACGAC
269927                      317  7.37e-07 CTCGAAAATT GAGCAGCAGAGGCTGC ACGAATGGCA
33466                       397  1.44e-06 CCCGCCTACG CTGCTGCAAAAGCAGC TGCCGATGCG
40293                       360  3.84e-06 GCGAAGACGG CAACAACCGCTGTCGC TGTCTACAAC
3524                        208  4.72e-06 TCATCAACGT CGTCTGGAGGAGCCTC CCCCTTCAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
270240                            1.5e-08  251_[+3]_233
260936                            1.5e-08  221_[+3]_263
268526                              5e-08  418_[+3]_66
8537                              3.3e-07  169_[+3]_315
7719                              6.7e-07  348_[+3]_136
269927                            7.4e-07  316_[+3]_168
33466                             1.4e-06  396_[+3]_88
40293                             3.8e-06  359_[+3]_125
3524                              4.7e-06  207_[+3]_277
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=9
270240                   (  252) CGTCTGCAGCTGTAGC  1
260936                   (  222) CGTCTGCAGCTGTAGC  1
268526                   (  419) CTTCTGCAGAGGCTGC  1
8537                     (  170) CATCGGCTGATGCCGC  1
7719                     (  349) GATCAGCCGCCGCCGC  1
269927                   (  317) GAGCAGCAGAGGCTGC  1
33466                    (  397) CTGCTGCAAAAGCAGC  1
40293                    (  360) CAACAACCGCTGTCGC  1
3524                     (  208) CGTCTGGAGGAGCCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 9.58478 E= 4.6e+000
  -982    172    -12   -982
    77   -982     46    -23
  -123   -982    -12    136
  -982    208   -982   -982
    35   -982   -112    109
  -123   -982    187   -982
  -982    191   -112   -982
   135     -9   -982   -122
  -123   -982    187   -982
    77     91   -112   -982
   -23   -109    -12     77
  -982   -982    204   -982
  -982    149   -982     36
    35     91   -982    -23
  -982   -982    187   -122
  -982    208   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 4.6e+000
 0.000000  0.777778  0.222222  0.000000
 0.444444  0.000000  0.333333  0.222222
 0.111111  0.000000  0.222222  0.666667
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.111111  0.555556
 0.111111  0.000000  0.888889  0.000000
 0.000000  0.888889  0.111111  0.000000
 0.666667  0.222222  0.000000  0.111111
 0.111111  0.000000  0.888889  0.000000
 0.444444  0.444444  0.111111  0.000000
 0.222222  0.111111  0.222222  0.444444
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.000000  0.333333
 0.333333  0.444444  0.000000  0.222222
 0.000000  0.000000  0.888889  0.111111
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][AGT][TG]C[TA]GC[AC]G[AC][TAG]G[CT][CAT]GC
--------------------------------------------------------------------------------




Time  4.73 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
19058                            7.21e-01  500
24146                            1.16e-02  72_[+1(1.83e-06)]_412
25834                            2.49e-11  190_[+2(6.03e-09)]_66_\
    [+2(2.60e-05)]_128_[+1(2.10e-10)]_60
260936                           2.06e-12  221_[+3(1.54e-08)]_89_\
    [+1(6.41e-09)]_109_[+2(3.13e-07)]_29
268526                           7.97e-07  269_[+1(4.71e-07)]_133_\
    [+3(4.97e-08)]_66
269927                           5.48e-13  226_[+3(9.65e-05)]_11_\
    [+3(4.65e-05)]_47_[+3(7.37e-07)]_37_[+2(7.44e-11)]_20_[+1(1.38e-07)]_75
270240                           7.17e-09  251_[+3(1.54e-08)]_89_\
    [+1(6.41e-09)]_128
33466                            2.15e-05  221_[+1(3.48e-06)]_159_\
    [+3(1.44e-06)]_88
3524                             4.35e-06  99_[+2(3.28e-08)]_88_[+3(4.72e-06)]_\
    244_[+2(7.48e-05)]_13
40293                            1.07e-08  258_[+2(2.12e-08)]_81_\
    [+3(3.84e-06)]_25_[+1(3.70e-06)]_84
5205                             6.30e-06  54_[+1(9.67e-07)]_378_\
    [+2(1.93e-07)]_32
7719                             3.66e-06  45_[+2(1.58e-07)]_283_\
    [+3(6.74e-07)]_136
8537                             4.32e-10  103_[+1(3.48e-06)]_50_\
    [+3(3.33e-07)]_149_[+2(8.07e-09)]_146
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
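
As a rough illustration of how the reported regular expressions relate to the listed sites, the following Python sketch scans a sequence with the Motif 1 expression. The helper name `find_motif_sites` is hypothetical and not part of MEME; the test sequence is the Motif 1 site for sequence 25834 with its flanks, copied from the sites table in this file. A regular expression is only a coarse stand-in for the position-specific scoring matrix, so a scored search (for example with MAST) will generally rank and select sites differently.

```python
import re

# Regular expression reported in this file for Motif 1.
MOTIF_1_REGEX = "GG[GA]AG[TG]AGT[TGA][GA][TG]G[AGT][CT]G"


def find_motif_sites(sequence, pattern=MOTIF_1_REGEX):
    """Return (1-based start, matched site) for each non-overlapping
    match of the pattern on the given (+) strand."""
    compiled = re.compile(pattern)
    return [(m.start() + 1, m.group())
            for m in compiled.finditer(sequence.upper())]


# Left flank + site + right flank for sequence 25834 (Motif 1 sites table).
example = "AGGAGGTGAC" + "GGGAGTAGTTGTGGCG" + "TGAATGAGGT"
hits = find_motif_sites(example)
print(hits)
```

The start is 1-based relative to the fragment passed in, not to the full 500-base sequence, so it will not equal the `Start` column unless the whole sequence is scanned.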