********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/262/262.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
10079                    1.0000    500  11941                    1.0000    500
22348                    1.0000    500  30533                    1.0000    500
3539                     1.0000    500  37819                    1.0000    500
7870                     1.0000    500  8360                     1.0000    500
9476                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/262/262.seqs.fa -oc motifs/262 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            4500    N=               9
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.264 C 0.238 G 0.236 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.264 C 0.238 G 0.236 T 0.262
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 9 llr = 96 E-value = 1.2e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1:4::1:::29
pos.-specific     C  1:::3:2:::::
probability       G  6:114a:2:a8:
matrix            T  39942:78a::1

         bits    2.1      *   *
                 1.9      *  **
                 1.7      *  **
                 1.5  **  *  ** *
Relative         1.2  **  * *****
Entropy          1.0  **  * *****
(15.4 bits)      0.8  **  *******
                 0.6 **** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GTTAGGTTTGGA
consensus            T  TC CG  A
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
30533                       151  2.52e-07 AGTTCGTATT GTTTCGTTTGGA CCTCAAACTG
3539                        382  6.71e-07 TCTTCACCGC TTTACGTTTGGA GAAGAGAGCT
7870                        133  1.48e-06 TTGGGTCAAT GTTAGGTTTGAA GTTTGTGTGA
9476                        152  1.93e-06 ATGATTTTTT CTTTGGTTTGGA TGGAGAATAG
11941                       333  5.00e-06 GATTCTAGTT GTTTGGCTTGAA CTCTACTTTC
10079                       167  5.65e-06 ATCTCGTCCA TTTATGTGTGGA TGTCTACTGC
37819                       429  1.72e-05 TGCCTAGTTT GATTTGCTTGGA GGTATCTCAC
22348                        10  2.08e-05  CAGTGCTGG TTTGGGTTTGGT ATTTTTGAGG
8360                        348  4.17e-05 CATGTGACGC GTGACGAGTGGA GTGAGCGTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
30533                             2.5e-07  150_[+1]_338
3539                              6.7e-07  381_[+1]_107
7870                              1.5e-06  132_[+1]_356
9476                              1.9e-06  151_[+1]_337
11941                               5e-06  332_[+1]_156
10079                             5.6e-06  166_[+1]_322
37819                             1.7e-05  428_[+1]_60
22348                             2.1e-05  9_[+1]_479
8360                              4.2e-05  347_[+1]_141
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
30533                    (  151) GTTTCGTTTGGA  1
3539                     (  382) TTTACGTTTGGA  1
7870                     (  133) GTTAGGTTTGAA  1
9476                     (  152) CTTTGGTTTGGA  1
11941                    (  333) GTTTGGCTTGAA  1
10079                    (  167) TTTATGTGTGGA  1
37819                    (  429) GATTTGCTTGGA  1
22348                    (   10) TTTGGGTTTGGT  1
8360                     (  348) GTGACGAGTGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 8.93074 E= 1.2e+001
  -982   -110    123     35
  -125   -982   -982    176
  -982   -982   -109    176
    75   -982   -109     76
  -982     49     91    -24
  -982   -982    208   -982
  -125    -10   -982    135
  -982   -982     -9    157
  -982   -982   -982    193
  -982   -982    208   -982
   -25   -982    172   -982
   175   -982   -982   -124
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 1.2e+001
 0.000000  0.111111  0.555556  0.333333
 0.111111  0.000000  0.000000  0.888889
 0.000000  0.000000  0.111111  0.888889
 0.444444  0.000000  0.111111  0.444444
 0.000000  0.333333  0.444444  0.222222
 0.000000  0.000000  1.000000  0.000000
 0.111111  0.222222  0.000000  0.666667
 0.000000  0.000000  0.222222  0.777778
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.222222  0.000000  0.777778  0.000000
 0.888889  0.000000  0.000000  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[GT]TT[AT][GCT]G[TC][TG]TG[GA]A
--------------------------------------------------------------------------------

Time 0.84 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 18 sites = 9 llr = 114 E-value = 1.3e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  6::3116:47313::673
pos.-specific     C  182296:a42394:8427
probability       G  328:::3:1:1::a2:1:
matrix            T  :::4:31::12:2:::::

         bits    2.1        *     *
                 1.9        *     *
                 1.7        *     *
                 1.5     *  *   * *
Relative         1.2  ** *  *   * **
Entropy          1.0  ** *  *   * *** *
(18.3 bits)      0.8  ** *  * * * *****
                 0.6 *** ****** * *****
                 0.4 ********** *******
                 0.2 ********** *******
                 0.0 ------------------

Multilevel           ACGTCCACAAACCGCAAC
consensus            GGCA TG CCC A GCCA
sequence                C      T T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ------------------
11941                       480  9.81e-10 CAACCGCTGT GCGTCCACAACCAGCAAC ACC
30533                       469  1.62e-09 TCCAGCCTTC ACGTCCGCCATCCGCCAC ACCTCAACGC
10079                       252  1.94e-07 GACAACTCAC ACGCACACCAACTGCAAC TACGTGCTCA
9476                        344  8.11e-07 GAAGAAGGAC ACCACCGCCAACCGCCGA CGACGACCAC
3539                        238  9.56e-07 TGCAGGAACG AGGACTACAATCCGGCAA AGCTTCAAGT
37819                       211  1.04e-06 TCCTTTCTTC ACGCCAACACCCTGCACC GACCCCCCAC
8360                         16  3.34e-06 TGCTCTCTCC GGCTCCACGCCCCGGAAC GATGATATCC
7870                        458  3.34e-06 TGATATTATG CCGACTGCCTGCAGCCAC CATTGCAACA
22348                       463  5.76e-06 TATTGGTACA GCGTCTTCAAAAAGCACA GCAAATCCTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11941                             9.8e-10  479_[+2]_3
30533                             1.6e-09  468_[+2]_14
10079                             1.9e-07  251_[+2]_231
9476                              8.1e-07  343_[+2]_139
3539                              9.6e-07  237_[+2]_245
37819                               1e-06  210_[+2]_272
8360                              3.3e-06  15_[+2]_467
7870                              3.3e-06  457_[+2]_25
22348                             5.8e-06  462_[+2]_20
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=18 seqs=9
11941                    (  480) GCGTCCACAACCAGCAAC  1
30533                    (  469) ACGTCCGCCATCCGCCAC  1
10079                    (  252) ACGCACACCAACTGCAAC  1
9476                     (  344) ACCACCGCCAACCGCCGA  1
3539                     (  238) AGGACTACAATCCGGCAA  1
37819                    (  211) ACGCCAACACCCTGCACC  1
8360                     (   16) GGCTCCACGCCCCGGAAC  1
7870                     (  458) CCGACTGCCTGCAGCCAC  1
22348                    (  463) GCGTCTTCAAAAAGCACA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 4347 bayes= 9.04746 E= 1.3e+002
   107   -110     50   -982
  -982    171     -9   -982
  -982    -10    172   -982
    34    -10   -982     76
  -125    190   -982   -982
  -125    122   -982     35
   107   -982     50   -124
  -982    207   -982   -982
    75     90   -109   -982
   134    -10   -982   -124
    34     49   -109    -24
  -125    190   -982   -982
    34     90   -982    -24
  -982   -982    208   -982
  -982    171     -9   -982
   107     90   -982   -982
   134    -10   -109   -982
    34    149   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 9 E= 1.3e+002
 0.555556  0.111111  0.333333  0.000000
 0.000000  0.777778  0.222222  0.000000
 0.000000  0.222222  0.777778  0.000000
 0.333333  0.222222  0.000000  0.444444
 0.111111  0.888889  0.000000  0.000000
 0.111111  0.555556  0.000000  0.333333
 0.555556  0.000000  0.333333  0.111111
 0.000000  1.000000  0.000000  0.000000
 0.444444  0.444444  0.111111  0.000000
 0.666667  0.222222  0.000000  0.111111
 0.333333  0.333333  0.111111  0.222222
 0.111111  0.888889  0.000000  0.000000
 0.333333  0.444444  0.000000  0.222222
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.777778  0.222222  0.000000
 0.555556  0.444444  0.000000  0.000000
 0.666667  0.222222  0.111111  0.000000
 0.333333  0.666667  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[AG][CG][GC][TAC]C[CT][AG]C[AC][AC][ACT]C[CAT]G[CG][AC][AC][CA]
--------------------------------------------------------------------------------

Time 1.76 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 9 llr = 92 E-value = 1.3e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :4::1:14::3:
pos.-specific     C  11:a::3118:a
probability       G  ::::2:21::::
matrix            T  94a:7a33927:

         bits    2.1    *       *
                 1.9   ** *     *
                 1.7   ** *     *
                 1.5 * ** *  *  *
Relative         1.2 * ** *  ** *
Entropy          1.0 * ** *  ****
(14.8 bits)      0.8 * ****  ****
                 0.6 ******  ****
                 0.4 ******  ****
                 0.2 ************
                 0.0 ------------

Multilevel           TATCTTCATCTC
consensus             T  G TT TA
sequence                   G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
10079                        89  7.46e-07 TCTGCTGCTC TATCTTGATCTC CAGCTCTCTC
8360                        313  2.83e-06 CGGTCACGGC TATCTTCCTCTC CCTCTCTCTC
30533                        31  3.63e-06 CTTTCAAACC TTTCGTGATCTC GTCTTCGTCC
11941                       299  7.27e-06 TTGTCACTCT TTTCTTCGTCAC AATGGAGCCG
3539                        353  8.10e-06 CGATCAAGAT TATCATTTTCTC AGAGCATTCT
22348                       257  1.09e-05 TTCTCAATAC TCTCTTTTTCAC GTGGGAATAA
37819                        95  1.25e-05 CGTACGGGTA TTTCTTAATTTC ACCTCGTATA
9476                        450  2.33e-05 GATCATATTC CATCTTCATTTC GCTCTCATTG
7870                        208  3.90e-05 TATCGTTGCC TTTCGTTTCCAC AGAAGGTACT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10079                             7.5e-07  88_[+3]_400
8360                              2.8e-06  312_[+3]_176
30533                             3.6e-06  30_[+3]_458
11941                             7.3e-06  298_[+3]_190
3539                              8.1e-06  352_[+3]_136
22348                             1.1e-05  256_[+3]_232
37819                             1.3e-05  94_[+3]_394
9476                              2.3e-05  449_[+3]_39
7870                              3.9e-05  207_[+3]_281
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
10079                    (   89) TATCTTGATCTC  1
8360                     (  313) TATCTTCCTCTC  1
30533                    (   31) TTTCGTGATCTC  1
11941                    (  299) TTTCTTCGTCAC  1
3539                     (  353) TATCATTTTCTC  1
22348                    (  257) TCTCTTTTTCAC  1
37819                    (   95) TTTCTTAATTTC  1
9476                     (  450) CATCTTCATTTC  1
7870                     (  208) TTTCGTTTCCAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4401 bayes= 8.93074 E= 1.3e+002
  -982   -110   -982    176
    75   -110   -982     76
  -982   -982   -982    193
  -982    207   -982   -982
  -125   -982     -9    135
  -982   -982   -982    193
  -125     49     -9     35
    75   -110   -109     35
  -982   -110   -982    176
  -982    171   -982    -24
    34   -982   -982    135
  -982    207   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 1.3e+002
 0.000000  0.111111  0.000000  0.888889
 0.444444  0.111111  0.000000  0.444444
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.111111  0.000000  0.222222  0.666667
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.333333  0.222222  0.333333
 0.444444  0.111111  0.111111  0.333333
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.777778  0.000000  0.222222
 0.333333  0.000000  0.000000  0.666667
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
T[AT]TC[TG]T[CTG][AT]T[CT][TA]C
--------------------------------------------------------------------------------

Time 2.52 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
10079                            2.74e-08  16_[+3(5.20e-05)]_60_[+3(7.46e-07)]_\
    66_[+1(5.65e-06)]_73_[+2(1.94e-07)]_231
11941                            1.52e-09  298_[+3(7.27e-06)]_22_\
    [+1(5.00e-06)]_135_[+2(9.81e-10)]_3
22348                            2.09e-05  9_[+1(2.08e-05)]_235_[+3(1.09e-05)]_\
    194_[+2(5.76e-06)]_20
30533                            7.90e-11  30_[+3(3.63e-06)]_108_\
    [+1(2.52e-07)]_306_[+2(1.62e-09)]_14
3539                             1.48e-07  237_[+2(9.56e-07)]_97_\
    [+3(8.10e-06)]_17_[+1(6.71e-07)]_107
37819                            4.37e-06  94_[+3(1.25e-05)]_104_\
    [+2(1.04e-06)]_200_[+1(1.72e-05)]_60
7870                             3.83e-06  132_[+1(1.48e-06)]_63_\
    [+3(3.90e-05)]_238_[+2(3.34e-06)]_25
8360                             7.25e-06  15_[+2(3.34e-06)]_279_\
    [+3(2.83e-06)]_23_[+1(4.17e-05)]_141
9476                             8.66e-07  151_[+1(1.93e-06)]_94_\
    [+1(1.93e-06)]_74_[+2(8.11e-07)]_88_[+3(2.33e-05)]_39
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
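
The "Motif 1 regular expression" reported above can be used directly as a pattern to scan a sequence for candidate sites. The snippet below is a minimal sketch, not part of MEME or MAST: the function name find_motif_sites and the test fragment (the 30533 site with its two reported flanks joined together) are illustrative assumptions. In this report the expression leaves out the rarer letters at each position, so it is stricter than the probability matrix; for example, the 9476 site CTTTGGTTTGGA starts with C and is not matched.

```python
import re

# Motif 1 regular expression, copied verbatim from the report.
MOTIF1_REGEX = re.compile(r"[GT]TT[AT][GCT]G[TC][TG]TG[GA]A")


def find_motif_sites(sequence, pattern=MOTIF1_REGEX):
    """Return the 1-based start of every match, including overlapping ones.

    Hypothetical helper for illustration only; it scans the given strand
    and does not compute p-values the way MEME or MAST do.
    """
    starts = []
    pos = 0
    while True:
        match = pattern.search(sequence, pos)
        if match is None:
            break
        starts.append(match.start() + 1)  # report 1-based positions
        pos = match.start() + 1           # step by one to allow overlaps
    return starts


# Left flank + site + right flank of sequence 30533, as listed in the
# "Motif 1 sites sorted by position p-value" table.
fragment = "AGTTCGTATT" + "GTTTCGTTTGGA" + "CCTCAAACTG"

# Positions are relative to the fragment, not to the full 500 bp sequence.
print(find_motif_sites(fragment))

# The 9476 site is a reported Motif 1 site that the expression rejects.
print(find_motif_sites("CTTTGGTTTGGA"))
```

A plain regular expression gives a quick yes/no filter; ranking sites by score still needs the log-odds matrix or a tool such as MAST.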