********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/334/334.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42429                    1.0000    500  42959                    1.0000    500
43033                    1.0000    500  37751                    1.0000    500
48032                    1.0000    500  48393                    1.0000    500
44026                    1.0000    500  44404                    1.0000    500
35971                    1.0000    500  40279                    1.0000    500
38361                    1.0000    500  46637                    1.0000    500
44728                    1.0000    500  47717                    1.0000    500
37403                    1.0000    500  48454                    1.0000    500
44532                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/334/334.seqs.fa -oc motifs/334 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8500    N=              17
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.230 G 0.233 T 0.269
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.230 G 0.233 T 0.269
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 13 llr = 130 E-value = 1.6e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::aa88:824:2
pos.-specific     C  aa::::5:2:32
probability       G  ::::22423475
matrix            T  ::::::1:22:2

         bits    2.1 **
                 1.9 ****
                 1.7 ****
                 1.5 ****
Relative         1.3 *****  *  *
Entropy          1.1 ****** *  *
(14.5 bits)      0.8 ********  *
                 0.6 ********  *
                 0.4 ******** **
                 0.2 ******** ***
                 0.0 ------------

Multilevel           CCAAAACAGAGG
consensus                 GG AGCA
sequence                     CT
                             T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
48454                        49  1.36e-06 TGAAGATCAT CCAAAAGAGTGG CCCTCTGTAT
44026                       478  1.36e-06 GACCTTACGC CCAAAACAGAGA CTAGACCGAC
44404                       480  1.86e-06 GTATTTCCTA CCAAAGCAGAGG ACAGTCGCC
38361                       357  2.38e-06 ACGACGCGAA CCAAAACACACG AGACACATTC
42429                       218  5.16e-06 TACAAGGAAA CCAAAACAAAGC GCGATTGCAA
44728                       247  9.54e-06 TAGTGGCATA CCAAAACAAGCA ACAATTTACC
37403                       196  1.17e-05 GTTGACCCAA CCAAAAGGTGGG CTATTCGTAC
43033                        25  1.41e-05 AAAATCATTC CCAAAACATGCT TATGTGAAAG
47717                       198  1.70e-05 GAGATCCATT CCAAAACGCGGA TCCCCAAGGT
44532                       307  2.46e-05 ACATCCCCAG CCAAAGGATAGC ACGTTGTAAC
35971                       176  2.46e-05 ATGAAATTTA CCAAAGTAGGGG AAATTAAGTG
42959                       212  4.39e-05 GTTAATTCGT CCAAGAGAATCG TGAACCAAAT
46637                       459  4.85e-05 CTGGCAGCGA CCAAGAGACTGT CGTTCACACA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48454                             1.4e-06  48_[+1]_440
44026                             1.4e-06  477_[+1]_11
44404                             1.9e-06  479_[+1]_9
38361                             2.4e-06  356_[+1]_132
42429                             5.2e-06  217_[+1]_271
44728                             9.5e-06  246_[+1]_242
37403                             1.2e-05  195_[+1]_293
43033                             1.4e-05  24_[+1]_464
47717                             1.7e-05  197_[+1]_291
44532                             2.5e-05  306_[+1]_182
35971                             2.5e-05  175_[+1]_313
42959                             4.4e-05  211_[+1]_277
46637                             4.8e-05  458_[+1]_30
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=13
48454                    (   49) CCAAAAGAGTGG  1
44026                    (  478) CCAAAACAGAGA  1
44404                    (  480) CCAAAGCAGAGG  1
38361                    (  357) CCAAAACACACG  1
42429                    (  218) CCAAAACAAAGC  1
44728                    (  247) CCAAAACAAGCA  1
37403                    (  196) CCAAAAGGTGGG  1
43033                    (   25) CCAAAACATGCT  1
47717                    (  198) CCAAAACGCGGA  1
44532                    (  307) CCAAAGGATAGC  1
35971                    (  176) CCAAAGTAGGGG  1
42959                    (  212) CCAAGAGAATCG  1
46637                    (  459) CCAAGAGACTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8313 bayes= 9.84967 E= 1.6e+001
 -1035    212  -1035  -1035
 -1035    212  -1035  -1035
   190  -1035  -1035  -1035
   190  -1035  -1035  -1035
   166  -1035    -60  -1035
   152  -1035     -1  -1035
 -1035    123     72   -180
   166  -1035    -60  -1035
   -22      1     40    -22
    52  -1035     72    -22
 -1035     42    157  -1035
   -22    -58     99    -80
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 1.6e+001
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.846154  0.000000  0.153846  0.000000
 0.769231  0.000000  0.230769  0.000000
 0.000000  0.538462  0.384615  0.076923
 0.846154  0.000000  0.153846  0.000000
 0.230769  0.230769  0.307692  0.230769
 0.384615  0.000000  0.384615  0.230769
 0.000000  0.307692  0.692308  0.000000
 0.230769  0.153846  0.461538  0.153846
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
CCAAA[AG][CG]A[GACT][AGT][GC][GA]
--------------------------------------------------------------------------------

Time 2.51 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 14 sites = 7 llr = 93 E-value = 6.6e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :1:7::::aa:441
pos.-specific     C  9313a:a4::a169
probability       G  :44::1:6:::4::
matrix            T  114::9::::::::

         bits    2.1     * *   *
                 1.9     * * ***
                 1.7     * * ***
                 1.5 *   * * ***  *
Relative         1.3 *   *** ***  *
Entropy          1.1 *  ******** **
(19.1 bits)      0.8 *  ******** **
                 0.6 * ************
                 0.4 * ************
                 0.2 **************
                 0.0 --------------

Multilevel           CGGACTCGAACACC
consensus             CTC   C   GA
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
37403                       213  3.69e-08 GTGGGCTATT CGTACTCCAACACC GGTCGAGGTG
48454                       374  2.26e-07 CTATTAATGT CTGACTCGAACGAC AAGAATCTCC
35971                        41  4.01e-07 TTTAAAGCCA CATCCTCGAACGCC ACATCAAAAT
48393                        77  6.17e-07 CCGGATAGCA CGGACTCGAACAAA AATAGCACTT
38361                        35  6.63e-07 GACCTCGACC TCGACTCCAACGCC TGGCCAACCG
44026                       329  7.21e-07 CTGACTTTGC CCTCCTCCAACCCC TCCAACATAA
44404                       352  1.18e-06 AGGGATTATT CGCACGCGAACAAC CAACCATGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
37403                             3.7e-08  212_[+2]_274
48454                             2.3e-07  373_[+2]_113
35971                               4e-07  40_[+2]_446
48393                             6.2e-07  76_[+2]_410
38361                             6.6e-07  34_[+2]_452
44026                             7.2e-07  328_[+2]_158
44404                             1.2e-06  351_[+2]_135
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=14 seqs=7
37403                    (  213) CGTACTCCAACACC  1
48454                    (  374) CTGACTCGAACGAC  1
35971                    (   41) CATCCTCGAACGCC  1
48393                    (   77) CGGACTCGAACAAA  1
38361                    (   35) TCGACTCCAACGCC  1
44026                    (  329) CCTCCTCCAACCCC  1
44404                    (  352) CGCACGCGAACAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 8279 bayes= 10.0504 E= 6.6e+002
  -945    190   -945    -91
   -91     31     88    -91
  -945    -69     88     67
   141     31   -945   -945
  -945    212   -945   -945
  -945   -945    -70    167
  -945    212   -945   -945
  -945     90    129   -945
   190   -945   -945   -945
   190   -945   -945   -945
  -945    212   -945   -945
    67    -69     88   -945
    67    131   -945   -945
   -91    190   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 7 E= 6.6e+002
 0.000000  0.857143  0.000000  0.142857
 0.142857  0.285714  0.428571  0.142857
 0.000000  0.142857  0.428571  0.428571
 0.714286  0.285714  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.142857  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.428571  0.571429  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.428571  0.142857  0.428571  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.142857  0.857143  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
C[GC][GT][AC]CTC[GC]AAC[AG][CA]C
--------------------------------------------------------------------------------

Time 5.17 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 16 sites = 13 llr = 150 E-value = 3.2e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::821:8:52a5554:
pos.-specific     C  2:16:8:157::5515
probability       G  84::7228:2:5::54
matrix            T  :6222::2:::::::2

         bits    2.1
                 1.9           *
                 1.7           *
                 1.5           *
Relative         1.3 *    *    *
Entropy          1.1 **   **** ****
(16.6 bits)      0.8 *** **********
                 0.6 ****************
                 0.4 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           GTACGCAGCCAAAAGC
consensus            CG ATGG A  GCCAG
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
48393                       293  4.94e-09 AACATAAACG GTACGCAGACAGACGC AACGGAAACA
44728                       464  1.10e-07 GCGTGCGGTT GTACGGAGACAGCAGG CACAATCGGA
44026                       355  1.43e-07 CAACATAAGT GGAAGCAGACAGACGG GGAATCCGAA
37403                       317  2.34e-07 GCGGCATTGG CTACTCAGCCAACCGC GTTACCAAAT
46637                        21  2.56e-07 ATTTTGCGAC GTACGCATCCAGCAAG AACAACGTAA
43033                       192  8.50e-07 TCTGTCTTGT GTACGGAGACAAAAAT TGTTTGACCC
40279                       133  5.13e-06 ACAGATCTTG GTTCGCGCCCAGCAGC GGTTTTTAAA
35971                       247  5.52e-06 GCATGCGTCG GGAAACGGCCAAACGG TTTCATTCTT
42429                       108  5.94e-06 TCGGGGTGTT GGCAGCAGCAAGCAGC AGTCTCGTGT
37751                       458  6.82e-06 GAAGTCTGCT CGTCGCAGAGAAACAC TATATACTTG
47717                        32  7.32e-06 CAGCAGGGTC GGATGCAGCGAAAACG ACACGCCTGC
38361                       452  1.02e-05 CAAACGTATT GTATTCGGCAAACCAC AGATACTCTA
42959                       263  2.86e-05 AACAGTTTTC CTACTGATACAAAAAT TTGCCATCGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48393                             4.9e-09  292_[+3]_192
44728                             1.1e-07  463_[+3]_21
44026                             1.4e-07  354_[+3]_130
37403                             2.3e-07  316_[+3]_168
46637                             2.6e-07  20_[+3]_464
43033                             8.5e-07  191_[+3]_293
40279                             5.1e-06  132_[+3]_352
35971                             5.5e-06  246_[+3]_238
42429                             5.9e-06  107_[+3]_377
37751                             6.8e-06  457_[+3]_27
47717                             7.3e-06  31_[+3]_453
38361                               1e-05  451_[+3]_33
42959                             2.9e-05  262_[+3]_222
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=13
48393                    (  293) GTACGCAGACAGACGC  1
44728                    (  464) GTACGGAGACAGCAGG  1
44026                    (  355) GGAAGCAGACAGACGG  1
37403                    (  317) CTACTCAGCCAACCGC  1
46637                    (   21) GTACGCATCCAGCAAG  1
43033                    (  192) GTACGGAGACAAAAAT  1
40279                    (  133) GTTCGCGCCCAGCAGC  1
35971                    (  247) GGAAACGGCCAAACGG  1
42429                    (  108) GGCAGCAGCAAGCAGC  1
37751                    (  458) CGTCGCAGAGAAACAC  1
47717                    (   32) GGATGCAGCGAAAACG  1
38361                    (  452) GTATTCGGCAAACCAC  1
42959                    (  263) CTACTGATACAAAAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 8245 bayes= 9.8378 E= 3.2e+002
 -1035      1    172  -1035
 -1035  -1035     72    119
   152   -158  -1035    -80
   -22    142  -1035    -80
  -180  -1035    157    -22
 -1035    174     -1  -1035
   152  -1035     -1  -1035
 -1035   -158    172    -80
    78    123  -1035  -1035
   -80    159    -60  -1035
   190  -1035  -1035  -1035
   100  -1035     99  -1035
   100    100  -1035  -1035
   100    100  -1035  -1035
    52   -158    121  -1035
 -1035    100     72    -80
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 3.2e+002
 0.000000  0.230769  0.769231  0.000000
 0.000000  0.000000  0.384615  0.615385
 0.769231  0.076923  0.000000  0.153846
 0.230769  0.615385  0.000000  0.153846
 0.076923  0.000000  0.692308  0.230769
 0.000000  0.769231  0.230769  0.000000
 0.769231  0.000000  0.230769  0.000000
 0.000000  0.076923  0.769231  0.153846
 0.461538  0.538462  0.000000  0.000000
 0.153846  0.692308  0.153846  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.538462  0.000000  0.461538  0.000000
 0.538462  0.461538  0.000000  0.000000
 0.538462  0.461538  0.000000  0.000000
 0.384615  0.076923  0.538462  0.000000
 0.000000  0.461538  0.384615  0.153846
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[GC][TG]A[CA][GT][CG][AG]G[CA]CA[AG][AC][AC][GA][CG]
--------------------------------------------------------------------------------

Time 7.70 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42429                            5.07e-04  107_[+3(5.94e-06)]_94_\
    [+1(5.16e-06)]_271
42959                            5.46e-03  211_[+1(4.39e-05)]_39_\
    [+3(2.86e-05)]_222
43033                            1.23e-04  24_[+1(1.41e-05)]_155_\
    [+3(8.50e-07)]_293
37751                            2.64e-02  457_[+3(6.82e-06)]_27
48032                            6.58e-01  500
48393                            1.18e-07  76_[+2(6.17e-07)]_202_\
    [+3(4.94e-09)]_192
44026                            5.44e-09  328_[+2(7.21e-07)]_12_\
    [+3(1.43e-07)]_107_[+1(1.36e-06)]_11
44404                            4.75e-05  290_[+1(3.10e-05)]_49_\
    [+2(1.18e-06)]_114_[+1(1.86e-06)]_9
35971                            1.24e-06  40_[+2(4.01e-07)]_121_\
    [+1(2.46e-05)]_59_[+3(5.52e-06)]_238
40279                            3.13e-02  132_[+3(5.13e-06)]_352
38361                            4.15e-07  34_[+2(6.63e-07)]_308_\
    [+1(2.38e-06)]_83_[+3(1.02e-05)]_33
46637                            2.46e-04  20_[+3(2.56e-07)]_422_\
    [+1(4.85e-05)]_30
44728                            2.84e-05  246_[+1(9.54e-06)]_205_\
    [+3(1.10e-07)]_21
47717                            6.16e-04  31_[+3(7.32e-06)]_150_\
    [+1(1.70e-05)]_291
37403                            4.00e-09  195_[+1(1.17e-05)]_5_[+2(3.69e-08)]_\
    90_[+3(2.34e-07)]_168
48454                            8.16e-06  48_[+1(1.36e-06)]_313_\
    [+2(2.26e-07)]_113
44532                            8.12e-02  306_[+1(2.46e-05)]_182
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************