********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/39/39.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
8918                     1.0000    500  46548                    1.0000    500
25360                    1.0000    500  55111                    1.0000    500
50132                    1.0000    500  44151                    1.0000    500
45236                    1.0000    500  35692                    1.0000    500
45921                    1.0000    500  46080                    1.0000    500
33421                    1.0000    500  48909                    1.0000    500
46708                    1.0000    500  49832                    1.0000    500
49264                    1.0000    500  44263                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/39/39.seqs.fa -oc motifs/39 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       16    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            8000    N=              16
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.278 C 0.218 G 0.230 T 0.274
Background letter frequencies (from dataset with add-one prior applied):
A 0.278 C 0.218 G 0.230 T 0.274
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  16  llr = 145  E-value = 1.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :1::a:4:1:74
pos.-specific     C  6116:921:41:
probability       G  17:2:1191236
matrix            T  3192::3:94::

         bits    2.2
                 2.0
                 1.8     ** *
                 1.5   * ** *
Relative         1.3   * ** **
Entropy          1.1   * ** **  *
(13.1 bits)      0.9   **** ** **
                 0.7 ****** *****
                 0.4 ****** *****
                 0.2 ************
                 0.0 ------------

Multilevel           CGTCACAGTCAG
consensus            T     T  TGA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
25360                       354  1.78e-07 GGGAACTGTA CGTCACCGTCAG ACTTCGATAC
35692                       384  8.90e-07 GCACGAATTG CGTCACAGTTGG CGGAAACCAT
50132                       380  3.46e-06 TCTGTCTTTC CGTTACTGTTAG CAGGCGCTCT
45236                       206  6.46e-06 TCTTATTTGC TGTTACAGTTAG TGTGTAATCT
46548                       124  6.46e-06 CATAGAAATT GGTCACTGTCAA CTAATTGCCT
44151                        37  1.22e-05 AAAGTAGATT CCTTACAGTCAG ATACAGGAGA
49264                       187  1.47e-05 AACTTAAGAG CTTCACTGTTAA TTGTTTCCTT
49832                       380  1.80e-05 CTTGCGGGAC TGTCACGGTGAG GTTTGGAACG
46708                       292  2.15e-05 CGTAGGAAAT TGTGACAGTGAA CGCAACGGAG
48909                       426  2.58e-05 CTACGACCAA CGTCAGTGTCAA CACTGGGACC
33421                       463  3.31e-05 TTCCGGGAGA TATCACAGTCAA CGCTGGTCAC
55111                        47  4.27e-05 AATTCGAGTT CTTGACTGTGAG AACCGTAAAG
45921                       268  6.25e-05 CGTTGGTGCA CGTGACAGACGG AAATTATGGA
8918                        361  1.24e-04 TCGGAAGATA TGTCACCCTCGA CCGGCGACGT
44263                       324  1.32e-04 CAACAGCAAG GCTCACAGTTCG TTTAAAAATA
46080                       393  2.61e-04 TCCGAATCTT CGCCACCGGTGG GGGCAATTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
25360                             1.8e-07  353_[+1]_135
35692                             8.9e-07  383_[+1]_105
50132                             3.5e-06  379_[+1]_109
45236                             6.5e-06  205_[+1]_283
46548                             6.5e-06  123_[+1]_365
44151                             1.2e-05  36_[+1]_452
49264                             1.5e-05  186_[+1]_302
49832                             1.8e-05  379_[+1]_109
46708                             2.2e-05  291_[+1]_197
48909                             2.6e-05  425_[+1]_63
33421                             3.3e-05  462_[+1]_26
55111                             4.3e-05  46_[+1]_442
45921                             6.3e-05  267_[+1]_221
8918                              0.00012  360_[+1]_128
44263                             0.00013  323_[+1]_165
46080                             0.00026  392_[+1]_96
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=16
25360                    (  354) CGTCACCGTCAG  1
35692                    (  384) CGTCACAGTTGG  1
50132                    (  380) CGTTACTGTTAG  1
45236                    (  206) TGTTACAGTTAG  1
46548                    (  124) GGTCACTGTCAA  1
44151                    (   37) CCTTACAGTCAG  1
49264                    (  187) CTTCACTGTTAA  1
49832                    (  380) TGTCACGGTGAG  1
46708                    (  292) TGTGACAGTGAA  1
48909                    (  426) CGTCAGTGTCAA  1
33421                    (  463) TATCACAGTCAA  1
55111                    (   47) CTTGACTGTGAG  1
45921                    (  268) CGTGACAGACGG  1
8918                     (  361) TGTCACCCTCGA  1
44263                    (  324) GCTCACAGTTCG  1
46080                    (  393) CGCCACCGGTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 8.93074 E= 1.2e+002
 -1064    136    -88     19
  -215    -80    158   -113
 -1064   -180  -1064    178
 -1064    152    -29    -54
   185  -1064  -1064  -1064
 -1064    210   -188  -1064
    65    -22   -188     19
 -1064   -180    203  -1064
  -215  -1064   -188    168
 -1064    100    -29     45
   131   -180     12  -1064
    43  -1064    144  -1064
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 16 E= 1.2e+002
 0.000000  0.562500  0.125000  0.312500
 0.062500  0.125000  0.687500  0.125000
 0.000000  0.062500  0.000000  0.937500
 0.000000  0.625000  0.187500  0.187500
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.937500  0.062500  0.000000
 0.437500  0.187500  0.062500  0.312500
 0.000000  0.062500  0.937500  0.000000
 0.062500  0.000000  0.062500  0.875000
 0.000000  0.437500  0.187500  0.375000
 0.687500  0.062500  0.250000  0.000000
 0.375000  0.000000  0.625000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CT]GTCAC[AT]GT[CT][AG][GA]
--------------------------------------------------------------------------------

Time  2.11 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   7  llr = 114  E-value = 2.1e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :34::::13::1::a34441a
pos.-specific     C  :3::a946616::a:61:6::
probability       G  13:3:141:::6:::131:9:
matrix            T  9167::111943a:::14:::

         bits    2.2     *        *
                 2.0     *       **
                 1.8     *       ***     *
                 1.5     **      ***    **
Relative         1.3 *   **   *  ***    **
Entropy          1.1 *  ***   ** ***   ***
(23.5 bits)      0.9 * ****   ** ***   ***
                 0.7 * ***** ********  ***
                 0.4 * ************** ****
                 0.2 * *******************
                 0.0 ---------------------

Multilevel           TATTCCCCCTCGTCACAACGA
consensus             CAG  G A TT   AGTA
sequence              G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                     Site
-------------             ----- ---------            ---------------------
48909                       347  4.78e-10 GACCTGAGCG TCTTCCGCCTCATCACGAAGA AAACTCGCCG
25360                       385  2.40e-09 CTGTCCTATC TTTTCCCCATTGTCACAGCGA GGCGATGTGA
44263                       135  7.24e-09 AGGGCTACGA TGTGCCGCCTTTTCAGATAGA CTGCCCTGAC
45236                       283  8.85e-09 GCGACTGGTA TCATCGCGCTTGTCACGACGA TCCAACACAG
46708                       464  4.40e-08 CCTGCTCCAG TAATCCCCACCGTCAAATCAA ACTTTGACTC
45921                       365  6.62e-08 CACGATCCCC GAATCCGATTCGTCACCTCGA GATCTTGGAT
55111                       369  1.20e-07 GTACACGCTT TGTGCCTTCTCTTCAATAAGA AAGAATAGGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48909                             4.8e-10  346_[+2]_133
25360                             2.4e-09  384_[+2]_95
44263                             7.2e-09  134_[+2]_345
45236                             8.8e-09  282_[+2]_197
46708                             4.4e-08  463_[+2]_16
45921                             6.6e-08  364_[+2]_115
55111                             1.2e-07  368_[+2]_111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=7
48909                    (  347) TCTTCCGCCTCATCACGAAGA  1
25360                    (  385) TTTTCCCCATTGTCACAGCGA  1
44263                    (  135) TGTGCCGCCTTTTCAGATAGA  1
45236                    (  283) TCATCGCGCTTGTCACGACGA  1
46708                    (  464) TAATCCCCACCGTCAAATCAA  1
45921                    (  365) GAATCCGATTCGTCACCTCGA  1
55111                    (  369) TGTGCCTTCTCTTCAATAAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 7680 bayes= 9.94195 E= 2.1e+002
  -945   -945    -69    165
     4     39     31    -94
    62   -945   -945    106
  -945   -945     31    138
  -945    219   -945   -945
  -945    197    -69   -945
  -945     97     90    -94
   -96    139    -69    -94
     4    139   -945    -94
  -945    -61   -945    165
  -945    139   -945     65
   -96   -945    131      6
  -945   -945   -945    187
  -945    219   -945   -945
   185   -945   -945   -945
     4    139    -69   -945
    62    -61     31    -94
    62   -945    -69     65
    62    139   -945   -945
   -96   -945    190   -945
   185   -945   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 7 E= 2.1e+002
 0.000000  0.000000  0.142857  0.857143
 0.285714  0.285714  0.285714  0.142857
 0.428571  0.000000  0.000000  0.571429
 0.000000  0.000000  0.285714  0.714286
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.857143  0.142857  0.000000
 0.000000  0.428571  0.428571  0.142857
 0.142857  0.571429  0.142857  0.142857
 0.285714  0.571429  0.000000  0.142857
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.571429  0.000000  0.428571
 0.142857  0.000000  0.571429  0.285714
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.571429  0.142857  0.000000
 0.428571  0.142857  0.285714  0.142857
 0.428571  0.000000  0.142857  0.428571
 0.428571  0.571429  0.000000  0.000000
 0.142857  0.000000  0.857143  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[ACG][TA][TG]CC[CG]C[CA]T[CT][GT]TCA[CA][AG][AT][CA]GA
--------------------------------------------------------------------------------

Time  4.29 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   9  llr = 102  E-value = 4.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::1::::92:7:
pos.-specific     C  ::7612:18::6
probability       G  ::24::a::132
matrix            T  aa::98:::9:2

         bits    2.2       *
                 2.0 **    *
                 1.8 **    *
                 1.5 **    *
Relative         1.3 **  * ****
Entropy          1.1 ** ********
(16.3 bits)      0.9 ***********
                 0.7 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTCCTTGACTAC
consensus              GG C  A GG
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            ------------
46548                       471  6.20e-08 TCTTATTGCA TTCCTTGACTAC TCAAACGAGA
45236                       129  4.91e-07 CCTTTAGATA TTCCTTGACTAT GGATTCTCTG
44263                        41  9.37e-07 TTACCGAGAT TTCCTCGACTGC TCGTAAATCG
49264                       385  1.29e-06 ACAACCCCCC TTCGTTGAATAC AAGGTGAGTA
35692                       219  3.33e-06 TTAGCTTAAA TTGCTTGAATAC TTATCATGCT
49832                       259  4.04e-06 CTAAACAACC TTAGTTGACTGC ACAAGATCTT
45921                       434  9.15e-06 AAACAAGAAG TTCGTTGCCTGT TGTTTCTACA
46708                        26  1.10e-05 GCTTGAGGAA TTGCTTGACGAG ACCGAGGCCG
8918                        464  1.22e-05 CCCACGAGGG TTCGCCGACTAG CTTACTGCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46548                             6.2e-08  470_[+3]_18
45236                             4.9e-07  128_[+3]_360
44263                             9.4e-07  40_[+3]_448
49264                             1.3e-06  384_[+3]_104
35692                             3.3e-06  218_[+3]_270
49832                               4e-06  258_[+3]_230
45921                             9.1e-06  433_[+3]_55
46708                             1.1e-05  25_[+3]_463
8918                              1.2e-05  463_[+3]_25
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
46548                    (  471) TTCCTTGACTAC  1
45236                    (  129) TTCCTTGACTAT  1
44263                    (   41) TTCCTCGACTGC  1
49264                    (  385) TTCGTTGAATAC  1
35692                    (  219) TTGCTTGAATAC  1
49832                    (  259) TTAGTTGACTGC  1
45921                    (  434) TTCGTTGCCTGT  1
46708                    (   26) TTGCTTGACGAG  1
8918                     (  464) TTCGCCGACTAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 9.89655 E= 4.8e+002
  -982   -982   -982    187
  -982   -982   -982    187
  -132    161     -5   -982
  -982    135     95   -982
  -982    -97   -982    170
  -982      3   -982    151
  -982   -982    212   -982
   168    -97   -982   -982
   -32    183   -982   -982
  -982   -982   -105    170
   126   -982     53   -982
  -982    135     -5    -30
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 4.8e+002
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.111111  0.666667  0.222222  0.000000
 0.000000  0.555556  0.444444  0.000000
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.222222  0.000000  0.777778
 0.000000  0.000000  1.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.000000  0.000000  0.111111  0.888889
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.555556  0.222222  0.222222
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TT[CG][CG]T[TC]GA[CA]T[AG][CGT]
--------------------------------------------------------------------------------

Time  6.34 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
8918                             1.21e-02  463_[+3(1.22e-05)]_25
46548                            8.47e-06  123_[+1(6.46e-06)]_335_\
                                           [+3(6.20e-08)]_18
25360                            1.87e-08  353_[+1(1.78e-07)]_19_\
                                           [+2(2.40e-09)]_95
55111                            6.72e-05  46_[+1(4.27e-05)]_310_\
                                           [+2(1.20e-07)]_111
50132                            1.93e-02  379_[+1(3.46e-06)]_109
44151                            1.20e-02  36_[+1(1.22e-05)]_452
45236                            1.21e-09  128_[+3(4.91e-07)]_65_\
                                           [+1(6.46e-06)]_65_[+2(8.85e-09)]_197
35692                            1.40e-05  138_[+1(2.15e-05)]_68_\
                                           [+3(3.33e-06)]_153_[+1(8.90e-07)]_105
45921                            8.80e-07  267_[+1(6.25e-05)]_85_\
                                           [+2(6.62e-08)]_48_[+3(9.15e-06)]_55
46080                            1.90e-01  500
33421                            2.46e-02  462_[+1(3.31e-05)]_26
48909                            4.42e-07  346_[+2(4.78e-10)]_58_\
                                           [+1(2.58e-05)]_63
46708                            2.76e-07  25_[+3(1.10e-05)]_254_\
                                           [+1(2.15e-05)]_160_[+2(4.40e-08)]_16
49832                            1.15e-04  258_[+3(4.04e-06)]_109_\
                                           [+1(1.80e-05)]_109
49264                            1.60e-04  186_[+1(1.47e-05)]_186_\
                                           [+3(1.29e-06)]_92_[+3(7.68e-05)]
44263                            2.89e-08  40_[+3(9.37e-07)]_82_[+2(7.24e-09)]_\
                                           345
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************