********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/65/65.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
9280                     1.0000    500  43245                    1.0000    500
48258                    1.0000    500  32803                    1.0000    500
43785                    1.0000    500  49243                    1.0000    500
16195                    1.0000    500  41243                    1.0000    500
50419                    1.0000    500  7544                     1.0000    500
33720                    1.0000    500  51952                    1.0000    500
50530                    1.0000    500  34936                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/65/65.seqs.fa -oc motifs/65 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.262 C 0.248 G 0.221 T 0.269
Background letter frequencies (from dataset with add-one prior applied):
A 0.262 C 0.248 G 0.221 T 0.269
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 16 sites = 11 llr = 136 E-value = 2.3e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1::7::5::7:45129
pos.-specific     C  ::634:1:8:a::27:
probability       G  ::::413::2::421:
matrix            T  9a4:392a21:625:1

         bits    2.2
                 2.0  *     *  *
                 1.7  *     *  *
                 1.5 **   * *  *    *
Relative         1.3 **   * ** *    *
Entropy          1.1 **** * ** *    *
(17.9 bits)      0.9 **** * *****  **
                 0.7 **** * *****  **
                 0.4 ****** ****** **
                 0.2 ****************
                 0.0 ----------------

Multilevel           TTCACTATCACTATCA
consensus              TCG G    AG
sequence                 T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
48258                       376  3.70e-09 TAGGTCTCGA TTCACTGTCACTGTCA TTCCGAAAAC
49243                       366  8.77e-08 ATTTCGAGCA TTCAGTATCACAATAA TTACAACAGT
9280                        454  1.79e-07 TTGGCCTCCT TTCAGTCTCACTACCA TGACAGCGAG
50419                       355  2.52e-07 ATGCAACGAT ATCACTGTCACTGTCA AAATAATGTG
32803                       350  3.17e-07 TCCACGTGGC TTCATTGTCGCTTTCA AAGTCAGTTC
50530                       416  3.88e-07 GTTATTTGTT TTTCGTTTCACAATCA ACGATCCAAA
33720                       218  1.51e-06 TTGTCAACAA TTTACTATTACAATAA TTTAAGAAGT
43785                       409  1.51e-06 GCACCATTTT TTCATGTTCACTAGCA GTTCACTTTC
43245                       216  4.52e-06 AGTACCACCC TTTCGTATCACTGACT CCCCGACGAG
34936                       483  4.80e-06 CATTGCCTTT TTCATTATTTCAGCCA GC
51952                        15  8.59e-06 GTACCGGTAT TTTCCTATCGCTTGGA GGTGAGTCAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48258                             3.7e-09  375_[+1]_109
49243                             8.8e-08  365_[+1]_119
9280                              1.8e-07  453_[+1]_31
50419                             2.5e-07  354_[+1]_130
32803                             3.2e-07  349_[+1]_135
50530                             3.9e-07  415_[+1]_69
33720                             1.5e-06  217_[+1]_267
43785                             1.5e-06  408_[+1]_76
43245                             4.5e-06  215_[+1]_269
34936                             4.8e-06  482_[+1]_2
51952                             8.6e-06  14_[+1]_470
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=11
48258                    (  376) TTCACTGTCACTGTCA  1
49243                    (  366) TTCAGTATCACAATAA  1
9280                     (  454) TTCAGTCTCACTACCA  1
50419                    (  355) ATCACTGTCACTGTCA  1
32803                    (  350) TTCATTGTCGCTTTCA  1
50530                    (  416) TTTCGTTTCACAATCA  1
33720                    (  218) TTTACTATTACAATAA  1
43785                    (  409) TTCATGTTCACTAGCA  1
43245                    (  216) TTTCGTATCACTGACT  1
34936                    (  483) TTCATTATTTCAGCCA  1
51952                    (   15) TTTCCTATCGCTTGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 9.62303 E= 2.3e-001
  -152  -1010  -1010    175
 -1010  -1010  -1010    189
 -1010    136  -1010     43
   147     14  -1010  -1010
 -1010     55     72      2
 -1010  -1010   -128    175
    80   -144     30    -57
 -1010  -1010  -1010    189
 -1010    172  -1010    -57
   147  -1010    -28   -156
 -1010    201  -1010  -1010
    47  -1010  -1010    124
    80  -1010     72    -57
  -152    -45    -28    102
   -53    155   -128  -1010
   179  -1010  -1010   -156
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 11 E= 2.3e-001
 0.090909  0.000000  0.000000  0.909091
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.636364  0.000000  0.363636
 0.727273  0.272727  0.000000  0.000000
 0.000000  0.363636  0.363636  0.272727
 0.000000  0.000000  0.090909  0.909091
 0.454545  0.090909  0.272727  0.181818
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.818182  0.000000  0.181818
 0.727273  0.000000  0.181818  0.090909
 0.000000  1.000000  0.000000  0.000000
 0.363636  0.000000  0.000000  0.636364
 0.454545  0.000000  0.363636  0.181818
 0.090909  0.181818  0.181818  0.545455
 0.181818  0.727273  0.090909  0.000000
 0.909091  0.000000  0.000000  0.090909
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
TT[CT][AC][CGT]T[AG]TCAC[TA][AG]TCA
--------------------------------------------------------------------------------




Time  1.57 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 6 llr = 78 E-value = 1.9e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3::8:57::3::
pos.-specific     C  3a::a::a:::a
probability       G  3:a2:2::a7::
matrix            T  :::::33:::a:

         bits    2.2   *     *
                 2.0  ** *  ** **
                 1.7  ** *  ** **
                 1.5  ** *  ** **
Relative         1.3  ****  ** **
Entropy          1.1  **** ******
(18.8 bits)      0.9  **** ******
                 0.7  **** ******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ACGACAACGGTC
consensus            C    TT  A
sequence             G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
50530                       459  4.37e-08 CCTCCAACGA GCGACAACGGTC TCTCCAAGGA
7544                        115  6.90e-07 ATAGGCGAAC CCGACGACGGTC CTTCACTACC
43245                       194  6.90e-07 TACAACAATA ACGACAACGATC AGTACCACCC
32803                       387  8.32e-07 TTCTACTAGT CCGACTTCGGTC GAACAAGCTG
50419                       476  1.33e-06 CCCACAATCC ACGGCAACGGTC CGAATCTCCG
51952                        86  1.76e-06 CATCAGCTTC GCGACTTCGATC GATGTTTCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50530                             4.4e-08  458_[+2]_30
7544                              6.9e-07  114_[+2]_374
43245                             6.9e-07  193_[+2]_295
32803                             8.3e-07  386_[+2]_102
50419                             1.3e-06  475_[+2]_13
51952                             1.8e-06  85_[+2]_403
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=6
50530                    (  459) GCGACAACGGTC  1
7544                     (  115) CCGACGACGGTC  1
43245                    (  194) ACGACAACGATC  1
32803                    (  387) CCGACTTCGGTC  1
50419                    (  476) ACGGCAACGGTC  1
51952                    (   86) GCGACTTCGATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 10.6026 E= 1.9e+002
    35     43     59   -923
  -923    201   -923   -923
  -923   -923    218   -923
   167   -923    -41   -923
  -923    201   -923   -923
    93   -923    -41     31
   135   -923   -923     31
  -923    201   -923   -923
  -923   -923    218   -923
    35   -923    159   -923
  -923   -923   -923    189
  -923    201   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 6 E= 1.9e+002
 0.333333  0.333333  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.833333  0.000000  0.166667  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.166667  0.333333
 0.666667  0.000000  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
[ACG]CGAC[AT][AT]CG[GA]TC
--------------------------------------------------------------------------------




Time  3.25 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 15 sites = 13 llr = 138 E-value = 2.9e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1:4322:184a7226
pos.-specific     C  9::4::85:3:168:
probability       G  :a:2371223:2::4
matrix            T  ::615222::::2::

         bits    2.2  *
                 2.0  *        *
                 1.7  *        *
                 1.5 **        *
Relative         1.3 **      * *  *
Entropy          1.1 **    * * *  **
(15.3 bits)      0.9 ***  ** * *  **
                 0.7 ***  ** * *****
                 0.4 *** ***********
                 0.2 ***************
                 0.0 ---------------

Multilevel           CGTCTGCCAAAACCA
consensus              AAG  T C GT G
sequence                GA    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ---------------
50530                       124  1.40e-07 CGAATCTTTC CGTCGGCCAGAGCCG TGGGGACGGT
9280                        372  4.16e-07 AGTCCATGAT CGTGTACCAAAACCA TCAGCTTCTG
50419                       419  4.78e-07 TGAGGGACTA CGACTGCCGAAACCG ACTTGGAACG
33720                       145  7.07e-07 ATTCACCGAG CGAAAGCCAAAATCA GAAGTTGCAG
51952                       406  9.00e-07 CGGAACTTTG CGTCTTCCAAAGCCA TCGATAGACC
41243                        54  5.75e-06 GGTTGCGCAT CGTAGGTCGGAACCG TAGGCTGACG
34936                        37  6.27e-06 CTCTTACAAA CGTGTGTTACAATCA ATTGTCGCCG
43785                        41  8.76e-06 GCCATAGGCG CGACGGCAAAAGTCA TTCTTTCTCA
43245                       277  9.48e-06 TCCCTTAGCA AGTATTCCACAACCA CTCCACCTCC
7544                         49  1.20e-05 TCCAGTATGT CGAAAGGTAGAACCG AGGGCCCTTG
16195                       245  1.39e-05 CGAAGTGTCT CGTGTGCTACACACA CAATTCTAAC
48258                       253  2.39e-05 TCTCCGTCAA CGATGGCGACAACAG GATTTGTTGC
49243                       244  5.00e-05 GGCGGCGCCG CGTCAACGAGAAAAA TATCCCGCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50530                             1.4e-07  123_[+3]_362
9280                              4.2e-07  371_[+3]_114
50419                             4.8e-07  418_[+3]_67
33720                             7.1e-07  144_[+3]_341
51952                               9e-07  405_[+3]_80
41243                             5.8e-06  53_[+3]_432
34936                             6.3e-06  36_[+3]_449
43785                             8.8e-06  40_[+3]_445
43245                             9.5e-06  276_[+3]_209
7544                              1.2e-05  48_[+3]_437
16195                             1.4e-05  244_[+3]_241
48258                             2.4e-05  252_[+3]_233
49243                               5e-05  243_[+3]_242
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=13
50530                    (  124) CGTCGGCCAGAGCCG  1
9280                     (  372) CGTGTACCAAAACCA  1
50419                    (  419) CGACTGCCGAAACCG  1
33720                    (  145) CGAAAGCCAAAATCA  1
51952                    (  406) CGTCTTCCAAAGCCA  1
41243                    (   54) CGTAGGTCGGAACCG  1
34936                    (   37) CGTGTGTTACAATCA  1
43785                    (   41) CGACGGCAAAAGTCA  1
43245                    (  277) AGTATTCCACAACCA  1
7544                     (   49) CGAAAGGTAGAACCG  1
16195                    (  245) CGTGTGCTACACACA  1
48258                    (  253) CGATGGCGACAACAG  1
49243                    (  244) CGTCAACGAGAAAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 6804 bayes= 8.94264 E= 2.9e+001
  -176    190  -1035  -1035
 -1035  -1035    218  -1035
    55  -1035  -1035    119
    23     63      6   -181
   -18  -1035     48     78
   -77  -1035    165    -81
 -1035    163   -152    -81
  -176    112    -52    -22
   169  -1035    -52  -1035
    55     31     48  -1035
   193  -1035  -1035  -1035
   140   -169      6  -1035
   -77    131  -1035    -22
   -77    177  -1035  -1035
   123  -1035     80  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 13 E= 2.9e+001
 0.076923  0.923077  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.384615  0.000000  0.000000  0.615385
 0.307692  0.384615  0.230769  0.076923
 0.230769  0.000000  0.307692  0.461538
 0.153846  0.000000  0.692308  0.153846
 0.000000  0.769231  0.076923  0.153846
 0.076923  0.538462  0.153846  0.230769
 0.846154  0.000000  0.153846  0.000000
 0.384615  0.307692  0.307692  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.692308  0.076923  0.230769  0.000000
 0.153846  0.615385  0.000000  0.230769
 0.153846  0.846154  0.000000  0.000000
 0.615385  0.000000  0.384615  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
CG[TA][CAG][TGA]GC[CT]A[ACG]A[AG][CT]C[AG]
--------------------------------------------------------------------------------




Time  5.05 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
9280                             5.59e-07  371_[+3(4.16e-07)]_67_\
    [+1(1.79e-07)]_31
43245                            7.16e-07  193_[+2(6.90e-07)]_10_\
    [+1(4.52e-06)]_45_[+3(9.48e-06)]_209
48258                            5.60e-07  252_[+3(2.39e-05)]_108_\
    [+1(3.70e-09)]_109
32803                            7.26e-06  349_[+1(3.17e-07)]_21_\
    [+2(8.32e-07)]_102
43785                            7.20e-05  40_[+3(8.76e-06)]_353_\
    [+1(1.51e-06)]_76
49243                            5.43e-05  243_[+3(5.00e-05)]_107_\
    [+1(8.77e-08)]_119
16195                            2.92e-02  244_[+3(1.39e-05)]_241
41243                            1.94e-02  53_[+3(5.75e-06)]_432
50419                            6.12e-09  354_[+1(2.52e-07)]_48_\
    [+3(4.78e-07)]_42_[+2(1.33e-06)]_13
7544                             1.26e-04  48_[+3(1.20e-05)]_51_[+2(6.90e-07)]_\
    374
33720                            2.76e-05  144_[+3(7.07e-07)]_58_\
    [+1(1.51e-06)]_267
51952                            3.56e-07  14_[+1(8.59e-06)]_55_[+2(1.76e-06)]_\
    308_[+3(9.00e-07)]_80
50530                            1.22e-10  123_[+3(1.40e-07)]_277_\
    [+1(3.88e-07)]_27_[+2(4.37e-08)]_30
34936                            4.42e-04  36_[+3(6.27e-06)]_431_\
    [+1(4.80e-06)]_2
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************