********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/364/364.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
11838                    1.0000    500  2075                     1.0000    500
22328                    1.0000    500  22951                    1.0000    500
22989                    1.0000    500  2525                     1.0000    500
25801                    1.0000    500  264296                   1.0000    500
3383                     1.0000    500  35323                    1.0000    500
3850                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/364/364.seqs.fa -oc motifs/364 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5500    N=              11
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.257 C 0.235 G 0.245 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.257 C 0.235 G 0.245 T 0.262
********************************************************************************


********************************************************************************
MOTIF  1 MEME  width =  16  sites =  11  llr = 126  E-value = 4.5e+000
********************************************************************************
--------------------------------------------------------------------------------
 Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  19241:8a2:822521
pos.-specific     C  41::3a1:3:25::3:
probability       G  5:225:1::a:27:29
matrix            T  ::651:::5::2154:

         bits    2.1      *   *
                 1.9      * * *
                 1.7      * * *     *
                 1.5  *   * * *     *
Relative         1.3  *   * * **    *
Entropy          1.0  *   *** **  * *
(16.5 bits)      0.8  *   *** ** ** *
                 0.6 ***  ****** ** *
                 0.4 *********** ** *
                 0.2 ************** *
                 0.0 ----------------

Multilevel           GATTGCAATGACGTTG
consensus            C  AC   C    AC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ----------------
2075                        383  9.95e-08 TGGCTCAACT CATTGCAATGAAGTGG GTGGAAGAGG
22328                       432  2.93e-07 TGAGGCCGAG GATGGCAATGAGGAGG AGAGAACGCA
11838                       178  4.27e-07 TAGTGAAGGC GAAAGCAATGACAATG AAATCGATGA
3850                         79  1.14e-06 GCAGCCAATC CAGTCCAATGACAATG ATTGTTCTTT
25801                       474  1.14e-06 CAACGATATA GATTGCAACGCTGTAG ATTGCAACGC
35323                       283  1.38e-06 CGCCGCTGAC GATTGCAACGATGTTA TTGAACATGG
22951                       473  3.36e-06 CGTCAACCAT CATATCAATGACTTCG ATCCACCTCC
3383                        338  4.61e-06 GCTGCATAAG GCTAGCAAAGCCGTCG CCAGTGGAGG
2525                        455  6.22e-06 ATGGCGGCAC CAGACCGACGACGACG GCACGGTAGA
264296                      290  7.16e-06 CGAAGGTGAC GAATCCCATGAGGTAG CCGACCCACA
22989                       342  1.62e-05 AAGATTGATA AATGACAAAGAAGATG AGACAAACAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
2075                                1e-07  382_[+1]_102
22328                             2.9e-07  431_[+1]_53
11838                             4.3e-07  177_[+1]_307
3850                              1.1e-06  78_[+1]_406
25801                             1.1e-06  473_[+1]_11
35323                             1.4e-06  282_[+1]_202
22951                             3.4e-06  472_[+1]_12
3383                              4.6e-06  337_[+1]_147
2525                              6.2e-06  454_[+1]_30
264296                            7.2e-06  289_[+1]_195
22989                             1.6e-05  341_[+1]_143
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=11
2075                     (  383) CATTGCAATGAAGTGG  1
22328                    (  432) GATGGCAATGAGGAGG  1
11838                    (  178) GAAAGCAATGACAATG  1
3850                     (   79) CAGTCCAATGACAATG  1
25801                    (  474) GATTGCAACGCTGTAG  1
35323                    (  283) GATTGCAACGATGTTA  1
22951                    (  473) CATATCAATGACTTCG  1
3383                     (  338) GCTAGCAAAGCCGTCG  1
2525                     (  455) CAGACCGACGACGACG  1
264296                   (  290) GAATCCCATGAGGTAG  1
22989                    (  342) AATGACAAAGAAGATG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 5335 bayes= 8.91886 E= 4.5e+000
  -150     63    115  -1010
   182   -137  -1010  -1010
   -50  -1010    -43    128
    50  -1010    -43     79
  -150     21    115   -153
 -1010    209  -1010  -1010
   167   -137   -143  -1010
   196  -1010  -1010  -1010
   -50     21  -1010    106
 -1010  -1010    203  -1010
   167    -37  -1010  -1010
   -50     95    -43    -53
   -50  -1010    157   -153
    82  -1010  -1010    106
   -50     21    -43     47
  -150  -1010    189  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 4.5e+000
 0.090909  0.363636  0.545455  0.000000
 0.909091  0.090909  0.000000  0.000000
 0.181818  0.000000  0.181818  0.636364
 0.363636  0.000000  0.181818  0.454545
 0.090909  0.272727  0.545455  0.090909
 0.000000  1.000000  0.000000  0.000000
 0.818182  0.090909  0.090909  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.181818  0.272727  0.000000  0.545455
 0.000000  0.000000  1.000000  0.000000
 0.818182  0.181818  0.000000  0.000000
 0.181818  0.454545  0.181818  0.181818
 0.181818  0.000000  0.727273  0.090909
 0.454545  0.000000  0.000000  0.545455
 0.181818  0.272727  0.181818  0.363636
 0.090909  0.000000  0.909091  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 1 regular expression
--------------------------------------------------------------------------------
[GC]AT[TA][GC]CAA[TC]GACG[TA][TC]G
--------------------------------------------------------------------------------




Time  1.33 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME  width =  19  sites =   9  llr = 124  E-value = 2.8e+001
********************************************************************************
--------------------------------------------------------------------------------
 Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::1711::::43:42::21
pos.-specific     C  :46:21112:11::2:2::
probability       G  8:22::9::9::a:391:9
matrix            T  261178:98146:62178:

         bits    2.1             *
                 1.9             *
                 1.7             *
                 1.5       ** *  *  *  *
Relative         1.3 *     ****  *  * **
Entropy          1.0 **   *****  ** * **
(19.9 bits)      0.8 ** *******  ** ****
                 0.6 ** *********** ****
                 0.4 ************** ****
                 0.2 ************** ****
                 0.0 -------------------

Multilevel           GTCATTGTTGATGTGGTTG
consensus            TCGGC   C TA AA CA
sequence                           C
                                   T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            -------------------
11838                       152  1.43e-09 CAGCTGCTGT GCCATTGTTGATGTAGTAG TGAAGGCGAA
3850                        158  4.67e-09 GAAGTTGTGG TTGATTGTTGTTGAGGTTG TCAACAAGAT
22951                         3  1.60e-08         TT GTTGTTGTTGTTGTTGTTG TGCGATGAAC
3383                        371  7.10e-08 AGGAACGAAC GTCATCGTCGAAGTGGCTG TTGCGAGGGG
2075                        151  1.24e-07 GTTTGGAATT GCCACTGTCGAAGTTGGTG GGATGTCTTC
22328                       186  1.92e-07 TGCGAGCAGG GTGGATGTTGCTGTCGTTG ACGACTAGAG
22989                       299  1.18e-06 TCTTGAACTT GCATTTGTTGTTGACTCTG GCGGGAACTA
25801                       255  2.25e-06 GACACCTGGT GTCACAGCTTTAGAAGTTG TATTGAGATA
35323                       223  2.50e-06 GAACTCTGGC TCCATTCTTGACGAGGTAA CTTGCTGTAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11838                             1.4e-09  151_[+2]_330
3850                              4.7e-09  157_[+2]_324
22951                             1.6e-08  2_[+2]_479
3383                              7.1e-08  370_[+2]_111
2075                              1.2e-07  150_[+2]_331
22328                             1.9e-07  185_[+2]_296
22989                             1.2e-06  298_[+2]_183
25801                             2.2e-06  254_[+2]_227
35323                             2.5e-06  222_[+2]_259
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=9
11838                    (  152) GCCATTGTTGATGTAGTAG  1
3850                     (  158) TTGATTGTTGTTGAGGTTG  1
22951                    (    3) GTTGTTGTTGTTGTTGTTG  1
3383                     (  371) GTCATCGTCGAAGTGGCTG  1
2075                     (  151) GCCACTGTCGAAGTTGGTG  1
22328                    (  186) GTGGATGTTGCTGTCGTTG  1
22989                    (  299) GCATTTGTTGTTGACTCTG  1
25801                    (  255) GTCACAGCTTTAGAAGTTG  1
35323                    (  223) TCCATTCTTGACGAGGTAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 5302 bayes= 8.96344 E= 2.8e+001
  -982   -982    166    -24
  -982     92   -982    108
  -121    124    -14   -124
   137   -982    -14   -124
  -121     -8   -982    135
  -121   -108   -982    157
  -982   -108    186   -982
  -982   -108   -982    176
  -982     -8   -982    157
  -982   -982    186   -124
    79   -108   -982     76
    37   -108   -982    108
  -982   -982    203   -982
    79   -982   -982    108
   -21     -8     44    -24
  -982   -982    186   -124
  -982     -8   -114    135
   -21   -982   -982    157
  -121   -982    186   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 9 E= 2.8e+001
 0.000000  0.000000  0.777778  0.222222
 0.000000  0.444444  0.000000  0.555556
 0.111111  0.555556  0.222222  0.111111
 0.666667  0.000000  0.222222  0.111111
 0.111111  0.222222  0.000000  0.666667
 0.111111  0.111111  0.000000  0.777778
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.000000  0.888889  0.111111
 0.444444  0.111111  0.000000  0.444444
 0.333333  0.111111  0.000000  0.555556
 0.000000  0.000000  1.000000  0.000000
 0.444444  0.000000  0.000000  0.555556
 0.222222  0.222222  0.333333  0.222222
 0.000000  0.000000  0.888889  0.111111
 0.000000  0.222222  0.111111  0.666667
 0.222222  0.000000  0.000000  0.777778
 0.111111  0.000000  0.888889  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 2 regular expression
--------------------------------------------------------------------------------
[GT][TC][CG][AG][TC]TGT[TC]G[AT][TA]G[TA][GACT]G[TC][TA]G
--------------------------------------------------------------------------------




Time  2.49 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME  width =  15  sites =   5  llr = 77  E-value = 5.9e+002
********************************************************************************
--------------------------------------------------------------------------------
 Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :2:2:44::::::::
pos.-specific     C  ::a::2::8::a44a
probability       G  28::a::82:a::::
matrix            T  8::8:462:a::66:

         bits    2.1   * *     **  *
                 1.9   * *    ***  *
                 1.7   * *    ***  *
                 1.5   * *    ***  *
Relative         1.3 *****  *****  *
Entropy          1.0 ***** *********
(22.1 bits)      0.8 ***** *********
                 0.6 ***** *********
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           TGCTGATGCTGCTTC
consensus            GA A TATG   CC
sequence                  C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                   Site
-------------             ----- ---------            ---------------
264296                      126  1.84e-09 ATCCATGAGC TGCTGATGCTGCTTC TGATGTTGCT
2075                        283  1.40e-08 GTTGCCAGGC TGCTGCTGCTGCTCC CAGGTTGTGG
35323                       244  1.63e-08 CGAGGTAACT TGCTGTAGCTGCCCC TAGACCAGAG
3383                        202  1.80e-07 CGTAACGTTG TACTGAATCTGCTTC ACCGCCAGTG
3850                        244  3.74e-07 ACCGGCAGAG GGCAGTTGGTGCCTC GGGAAACTAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
264296                            1.8e-09  125_[+3]_360
2075                              1.4e-08  282_[+3]_203
35323                             1.6e-08  243_[+3]_242
3383                              1.8e-07  201_[+3]_284
3850                              3.7e-07  243_[+3]_242
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=5
264296                   (  126) TGCTGATGCTGCTTC  1
2075                     (  283) TGCTGCTGCTGCTCC  1
35323                    (  244) TGCTGTAGCTGCCCC  1
3383                     (  202) TACTGAATCTGCTTC  1
3850                     (  244) GGCAGTTGGTGCCTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 5346 bayes= 10.3127 E= 5.9e+002
  -897   -897    -29    161
   -36   -897    170   -897
  -897    209   -897   -897
   -36   -897   -897    161
  -897   -897    203   -897
    63    -23   -897     61
    63   -897   -897    119
  -897   -897    170    -39
  -897    176    -29   -897
  -897   -897   -897    193
  -897   -897    203   -897
  -897    209   -897   -897
  -897     76   -897    119
  -897     76   -897    119
  -897    209   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 5.9e+002
 0.000000  0.000000  0.200000  0.800000
 0.200000  0.000000  0.800000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.200000  0.000000  0.000000  0.800000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.200000  0.000000  0.400000
 0.400000  0.000000  0.000000  0.600000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.000000  0.400000  0.000000  0.600000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
 Motif 3 regular expression
--------------------------------------------------------------------------------
[TG][GA]C[TA]G[ATC][TA][GT][CG]TGC[TC][TC]C
--------------------------------------------------------------------------------




Time  3.59 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
 Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
11838                            1.52e-08  151_[+2(1.43e-09)]_7_[+1(4.27e-07)]_\
    307
2075                             1.05e-11  150_[+2(1.24e-07)]_113_\
    [+3(1.40e-08)]_85_[+1(9.95e-08)]_102
22328                            1.39e-06  185_[+2(1.92e-07)]_227_\
    [+1(2.93e-07)]_53
22951                            1.43e-06  2_[+2(1.60e-08)]_451_[+1(3.36e-06)]_\
    12
22989                            3.48e-04  298_[+2(1.18e-06)]_24_\
    [+1(1.62e-05)]_143
2525                             1.29e-02  454_[+1(6.22e-06)]_30
25801                            3.37e-05  254_[+2(2.25e-06)]_200_\
    [+1(1.14e-06)]_11
264296                           5.19e-08  125_[+3(1.84e-09)]_116_\
    [+3(2.07e-05)]_18_[+1(7.16e-06)]_195
3383                             2.40e-09  201_[+3(1.80e-07)]_121_\
    [+1(4.61e-06)]_17_[+2(7.10e-08)]_111
35323                            2.29e-09  222_[+2(2.50e-06)]_2_[+3(1.63e-08)]_\
    24_[+1(1.38e-06)]_202
3850                             1.02e-10  78_[+1(1.14e-06)]_63_[+2(4.67e-09)]_\
    67_[+3(3.74e-07)]_242
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************