********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/490/490.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8975                     1.0000    500  46550                    1.0000    500
48329                    1.0000    500  40135                    1.0000    500
49525                    1.0000    500  49981                    1.0000    500
55176                    1.0000    500  44045                    1.0000    500
45887                    1.0000    500  45918                    1.0000    500
46237                    1.0000    500  43154                    1.0000    500
33928                    1.0000    500  35089                    1.0000    500
36770                    1.0000    500  54979                    1.0000    500
49401                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/490/490.seqs.fa -oc motifs/490 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.235 G 0.221 T 0.283
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.235 G 0.221 T 0.283
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 17 llr = 165 E-value = 8.2e-004
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  191::8:4:119
pos.-specific     C  2:212:8::1:1
probability       G  61::8:11a18:
matrix            T  1:69:215:81:

         bits    2.2         *
                 2.0         *
                 1.7         *
                 1.5  * *    *  *
Relative         1.3  * **   * **
Entropy          1.1  * **** * **
(14.0 bits)      0.9  * **** * **
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GATTGACTGTGA
consensus            C C CT A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
45918                       241  2.93e-07 AAGCGATCCT GACTGACAGTGA AATTCTCGAT
55176                       214  6.20e-07 ACACTATAGT GATTCACAGTGA CGTGACTGTG
43154                        71  1.61e-06 AGTCAGAGTC GATTGATTGTGA GAGTTATGGC
44045                       403  2.06e-06 CACCGCCACT GACTCACAGTGA AAACTGTGGC
40135                       294  3.18e-06 CTGCTCATTA TATTGACAGTGA AGTGACAAAG
8975                        191  3.18e-06 TTTGTGTTTA GGCTGACTGTGA ATAGTATTCG
49401                        75  3.97e-06 TTTGAGTTCT CGTTGACTGTGA TGTAATGTGA
35089                       439  8.18e-06 CGGGTGCACA CATTCTCTGTGA ATGCCGTTGT
36770                       215  1.08e-05 GTCAACAGTC AATTGAGAGTGA TACTCTGCTG
54979                       406  1.21e-05 TTGCAAATTA CACTGATTGTGA TCGGCTTTAT
49981                       219  1.33e-05 CGTGGGACTG AATTGACTGTAA ACCGAAGATG
45887                       361  1.46e-05 GAAATTCGTA GATTGACAGCAA AGAAAGCTGC
46237                       401  3.04e-05 TAAAGAAGCA CATCCACTGTGA TTGTATCGAT
48329                       271  3.04e-05 GAGATTCTCC GAATGTCTGTGC GAAGACAATG
46550                       140  5.84e-05 TGATCGCTTC GATTGACGGCGC AACGGTATCA
33928                        71  1.19e-04 GTGACCCGAG GATTGTCAGGTA ATGCTTCATC
49525                       304  1.67e-04 GCCAGAGTTC GAATGTGTGAGA AAAAGAGAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45918                             2.9e-07  240_[+1]_248
55176                             6.2e-07  213_[+1]_275
43154                             1.6e-06  70_[+1]_418
44045                             2.1e-06  402_[+1]_86
40135                             3.2e-06  293_[+1]_195
8975                              3.2e-06  190_[+1]_298
49401                               4e-06  74_[+1]_414
35089                             8.2e-06  438_[+1]_50
36770                             1.1e-05  214_[+1]_274
54979                             1.2e-05  405_[+1]_83
49981                             1.3e-05  218_[+1]_270
45887                             1.5e-05  360_[+1]_128
46237                               3e-05  400_[+1]_88
48329                               3e-05  270_[+1]_218
46550                             5.8e-05  139_[+1]_349
33928                             0.00012  70_[+1]_418
49525                             0.00017  303_[+1]_185
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=17
45918                    (  241) GACTGACAGTGA  1
55176                    (  214) GATTCACAGTGA  1
43154                    (   71) GATTGATTGTGA  1
44045                    (  403) GACTCACAGTGA  1
40135                    (  294) TATTGACAGTGA  1
8975                     (  191) GGCTGACTGTGA  1
49401                    (   75) CGTTGACTGTGA  1
35089                    (  439) CATTCTCTGTGA  1
36770                    (  215) AATTGAGAGTGA  1
54979                    (  406) CACTGATTGTGA  1
49981                    (  219) AATTGACTGTAA  1
45887                    (  361) GATTGACAGCAA  1
46237                    (  401) CATCCACTGTGA  1
48329                    (  271) GAATGTCTGTGC  1
46550                    (  140) GATTGACGGCGC  1
33928                    (   71) GATTGTCAGGTA  1
49525                    (  304) GAATGTGTGAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.00042 E= 8.2e-004
  -115      0    141   -226
   176  -1073    -91  -1073
  -115      0  -1073    119
 -1073   -199  -1073    173
 -1073      0    179  -1073
   155  -1073  -1073    -27
 -1073    170    -91   -126
    66  -1073   -191     90
 -1073  -1073    218  -1073
  -215   -100   -191    143
  -115  -1073    190   -226
   176   -100  -1073  -1073
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 17 E= 8.2e-004
 0.117647  0.235294  0.588235  0.058824
 0.882353  0.000000  0.117647  0.000000
 0.117647  0.235294  0.000000  0.647059
 0.000000  0.058824  0.000000  0.941176
 0.000000  0.235294  0.764706  0.000000
 0.764706  0.000000  0.000000  0.235294
 0.000000  0.764706  0.117647  0.117647
 0.411765  0.000000  0.058824  0.529412
 0.000000  0.000000  1.000000  0.000000
 0.058824  0.117647  0.058824  0.764706
 0.117647  0.000000  0.823529  0.058824
 0.882353  0.117647  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
[GC]A[TC]T[GC][AT]C[TA]GTGA
--------------------------------------------------------------------------------


Time  2.62 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 16 llr = 149 E-value = 4.6e+000
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  31721:a33:13
pos.-specific     C  :93:1a:461:4
probability       G  4::6:::3:9:1
matrix            T  4::39:::1:93

         bits    2.2      *
                 2.0      **
                 1.7      **
                 1.5  *   **  **
Relative         1.3  *   **  **
Entropy          1.1  ** ***  **
(13.4 bits)      0.9  ** *** ***
                 0.7  ****** ***
                 0.4 ***********
                 0.2 ***********
                 0.0 ------------

Multilevel           GCAGTCACCGTC
consensus            T CT   AA  A
sequence             A      G   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
36770                       431  5.86e-07 ATCGTTTTTC GCAGTCACAGTC AGGCAGTTTT
44045                       177  1.06e-06 CCGGATTAAT TCAGTCACAGTC ACGTCAAATA
54979                       479  1.56e-06 CGTCGTTTCT ACAGTCACCGTT ACTTACAAAC
43154                        59  5.34e-06 GCTATTACTG ACAGTCAGAGTC GATTGATTGT
49981                       433  6.94e-06 AGCTTTGATT ACCGTCACAGTC GGTCATCGCT
49525                       115  6.94e-06 AAACTCCCAC GCATTCAGCGTA TTGCGAACGA
40135                       208  6.94e-06 CGTCACAAAC GCATTCAACGTT GTTGTGGTCT
8975                        298  1.06e-05 GGTATCAATA GCAATCAAAGTC CCGCATCAAC
49401                       264  2.60e-05 TCCACTCACG TCAGTCACTGTA GAACAGTAAA
35089                       366  2.79e-05 AAAGGTAATG TCAGTCAGCCTT CCCTGCGAGG
33928                       355  3.51e-05 GCACCCGAAA TCCATCAACGTG CGTGTTGTCG
48329                        24  5.43e-05 TCGCTCCATT GCATACACCGTT ACTTAACTTC
46237                       303  7.20e-05 TTCGAGGGAA GAAGTCAACCTC CCGACGCTGT
45887                       275  7.71e-05 TCCAAGCACA TACGTCAGCGTG TCAACTAAGA
46550                       459  1.24e-04 TCAATAAGGT TCCTCCAACGTA CGCTGTTATT
55176                        65  1.53e-04 TCTAGAGAGA ACCATCACCGAA TCGTTGAAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36770                             5.9e-07  430_[+2]_58
44045                             1.1e-06  176_[+2]_312
54979                             1.6e-06  478_[+2]_10
43154                             5.3e-06  58_[+2]_430
49981                             6.9e-06  432_[+2]_56
49525                             6.9e-06  114_[+2]_374
40135                             6.9e-06  207_[+2]_281
8975                              1.1e-05  297_[+2]_191
49401                             2.6e-05  263_[+2]_225
35089                             2.8e-05  365_[+2]_123
33928                             3.5e-05  354_[+2]_134
48329                             5.4e-05  23_[+2]_465
46237                             7.2e-05  302_[+2]_186
45887                             7.7e-05  274_[+2]_214
46550                             0.00012  458_[+2]_30
55176                             0.00015  64_[+2]_424
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=16
36770                    (  431) GCAGTCACAGTC  1
44045                    (  177) TCAGTCACAGTC  1
54979                    (  479) ACAGTCACCGTT  1
43154                    (   59) ACAGTCAGAGTC  1
49981                    (  433) ACCGTCACAGTC  1
49525                    (  115) GCATTCAGCGTA  1
40135                    (  208) GCATTCAACGTT  1
8975                     (  298) GCAATCAAAGTC  1
49401                    (  264) TCAGTCACTGTA  1
35089                    (  366) TCAGTCAGCCTT  1
33928                    (  355) TCCATCAACGTG  1
48329                    (   24) GCATACACCGTT  1
46237                    (  303) GAAGTCAACCTC  1
45887                    (  275) TACGTCAGCGTG  1
46550                    (  459) TCCTCCAACGTA  1
55176                    (   65) ACCATCACCGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.75645 E= 4.6e+000
    -6  -1064     76     41
  -106    190  -1064  -1064
   140     41  -1064  -1064
   -48  -1064    135    -18
  -206   -191  -1064    163
 -1064    209  -1064  -1064
   194  -1064  -1064  -1064
    26     90     18  -1064
    26    141  -1064   -218
 -1064    -91    198  -1064
  -206  -1064  -1064    173
    -6     67    -82    -18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 4.6e+000
 0.250000  0.000000  0.375000  0.375000
 0.125000  0.875000  0.000000  0.000000
 0.687500  0.312500  0.000000  0.000000
 0.187500  0.000000  0.562500  0.250000
 0.062500  0.062500  0.000000  0.875000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.312500  0.437500  0.250000  0.000000
 0.312500  0.625000  0.000000  0.062500
 0.000000  0.125000  0.875000  0.000000
 0.062500  0.000000  0.000000  0.937500
 0.250000  0.375000  0.125000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[GTA]C[AC][GT]TCA[CAG][CA]GT[CAT]
--------------------------------------------------------------------------------


Time  5.14 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 6 llr = 80 E-value = 8.9e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :2a:28aa3:::
pos.-specific     C  a7:8::::77::
probability       G  :2:282:::3a:
matrix            T  :::::::::::a

         bits    2.2 *         *
                 2.0 * *   **  *
                 1.7 * *   **  **
                 1.5 * *** **  **
Relative         1.3 * ****** ***
Entropy          1.1 * **********
(19.3 bits)      0.9 ************
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           CCACGAAACCGT
consensus                    AG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
46550                        99  4.60e-08 ACTAGGAAAA CCACGAAACCGT GCGGATGAAG
45887                       424  3.65e-07 AGCATGCGCA CCAGGAAACCGT TCCATCGGAT
8975                        255  4.19e-07 CCATGGAGGG CCACAAAACCGT TACACAAGGT
43154                       374  4.60e-07 TCCTCGAAAT CGACGAAACGGT GTATTCATTC
40135                       153  7.33e-07 GTGAACGACG CAACGAAAACGT GGAAAATTGG
49981                       484  1.11e-06 CTCATTCCAT CCACGGAAAGGT AAAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46550                             4.6e-08  98_[+3]_390
45887                             3.6e-07  423_[+3]_65
8975                              4.2e-07  254_[+3]_234
43154                             4.6e-07  373_[+3]_115
40135                             7.3e-07  152_[+3]_336
49981                             1.1e-06  483_[+3]_5
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=6
46550                    (   99) CCACGAAACCGT  1
45887                    (  424) CCAGGAAACCGT  1
8975                     (  255) CCACAAAACCGT  1
43154                    (  374) CGACGAAACGGT  1
40135                    (  153) CAACGAAAACGT  1
49981                    (  484) CCACGGAAAGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 10.8829 E= 8.9e+001
  -923    209   -923   -923
   -65    150    -41   -923
   194   -923   -923   -923
  -923    183    -41   -923
   -65   -923    191   -923
   167   -923    -41   -923
   194   -923   -923   -923
   194   -923   -923   -923
    35    150   -923   -923
  -923    150     59   -923
  -923   -923    217   -923
  -923   -923   -923    182
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 8.9e+001
 0.000000  1.000000  0.000000  0.000000
 0.166667  0.666667  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.833333  0.000000  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.666667  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
CCACGAAA[CA][CG]GT
--------------------------------------------------------------------------------


Time  7.71 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8975                             3.71e-07  190_[+1(3.18e-06)]_52_\
    [+3(4.19e-07)]_31_[+2(1.06e-05)]_191
46550                            6.14e-06  98_[+3(4.60e-08)]_29_[+1(5.84e-05)]_\
    349
48329                            1.18e-02  23_[+2(5.43e-05)]_235_\
    [+1(3.04e-05)]_218
40135                            4.21e-07  152_[+3(7.33e-07)]_43_\
    [+2(6.94e-06)]_74_[+1(3.18e-06)]_195
49525                            8.59e-03  114_[+2(6.94e-06)]_374
49981                            2.21e-06  218_[+1(1.33e-05)]_202_\
    [+2(6.94e-06)]_39_[+3(1.11e-06)]_5
55176                            7.66e-04  213_[+1(6.20e-07)]_275
44045                            2.08e-05  176_[+2(1.06e-06)]_214_\
    [+1(2.06e-06)]_86
45887                            7.53e-06  77_[+3(6.54e-05)]_185_\
    [+2(7.71e-05)]_74_[+1(1.46e-05)]_51_[+3(3.65e-07)]_65
45918                            3.63e-04  240_[+1(2.93e-07)]_69_\
    [+3(8.21e-05)]_167
46237                            1.57e-02  302_[+2(7.20e-05)]_86_\
    [+1(3.04e-05)]_88
43154                            1.17e-07  58_[+2(5.34e-06)]_[+1(1.61e-06)]_\
    160_[+2(6.29e-05)]_119_[+3(4.60e-07)]_115
33928                            4.83e-03  354_[+2(3.51e-05)]_134
35089                            9.12e-04  365_[+2(2.79e-05)]_61_\
    [+1(8.18e-06)]_50
36770                            3.35e-05  214_[+1(1.08e-05)]_204_\
    [+2(5.86e-07)]_58
54979                            3.40e-04  405_[+1(1.21e-05)]_61_\
    [+2(1.56e-06)]_10
49401                            9.13e-04  74_[+1(3.97e-06)]_177_\
    [+2(2.60e-05)]_225
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************