********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/337/337.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
21078                    1.0000    500  260785                   1.0000    500
262179                   1.0000    500  4084                     1.0000    500
4431                     1.0000    500  4744                     1.0000    500
4752                     1.0000    500  6531                     1.0000    500
7600                     1.0000    500  8174                     1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/337/337.seqs.fa -oc motifs/337 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       10    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            5000    N=              10
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.279 C 0.209 G 0.230 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.279 C 0.209 G 0.230 T 0.281
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 16 sites = 9 llr = 115 E-value = 3.5e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  169:84:a9234:a32
pos.-specific     C  6::9::9::8214::6
probability       G  3::1:1::1:44::72
matrix            T  :41:241:::::6:::

         bits    2.3
                 2.0
                 1.8    *  **     *
                 1.6    *  **     *
Relative         1.4   **  ****   *
Entropy          1.1   *** ****  ***
(18.4 bits)      0.9 ***** ****  ***
                 0.7 ***** **** *****
                 0.5 ****************
                 0.2 ****************
                 0.0 ----------------

Multilevel           CAACAACAACGATAGC
consensus            GT  TT   AAGC AA
sequence                       C    G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ----------------
4752                        285  1.05e-07 CATATATTAG GTACAACAACGCTAGC TCTTTTGTAA
4084                        214  2.27e-07 GAGAGGGGCC CAACTACAACAACAAC TTGGAAGGCA
7600                         83  3.09e-07 TTGTAGATCG CAACAGCAGCGGCAGC TAGAACCAAA
260785                      254  3.74e-07 AGTCCCAGTT CAACTTCAACGGCAAA TGGATCACAG
21078                        62  3.74e-07 TTAACTCATT GTACATCAAACACAGC TAAGCTGCTG
4431                        484  4.93e-07 GATTTCACTG CTACAATAACAATAGC A
8174                        166  7.58e-07 AAGCAACCAT CAAGATCAACCGTAGG ATCTAGTACA
262179                      321  1.85e-06 ATTGTTCGAT GATCATCAACGATAAG AATGAGTAAC
4744                        466  2.70e-06 AGCGAGACAA ATACAACAAAAGTAGA CAACAGCTAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4752                                1e-07  284_[+1]_200
4084                              2.3e-07  213_[+1]_271
7600                              3.1e-07  82_[+1]_402
260785                            3.7e-07  253_[+1]_231
21078                             3.7e-07  61_[+1]_423
4431                              4.9e-07  483_[+1]_1
8174                              7.6e-07  165_[+1]_319
262179                            1.8e-06  320_[+1]_164
4744                              2.7e-06  465_[+1]_19
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=9
4752                     (  285) GTACAACAACGCTAGC  1
4084                     (  214) CAACTACAACAACAAC  1
7600                     (   83) CAACAGCAGCGGCAGC  1
260785                   (  254) CAACTTCAACGGCAAA  1
21078                    (   62) GTACATCAAACACAGC  1
4431                     (  484) CTACAATAACAATAGC  1
8174                     (  166) CAAGATCAACCGTAGG  1
262179                   (  321) GATCATCAACGATAAG  1
4744                     (  466) ATACAACAAAAGTAGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 4850 bayes= 9.92035 E= 3.5e+000
  -133    141     53   -982
    99   -982   -982     66
   167   -982   -982   -134
  -982    208   -105   -982
   148   -982   -982    -34
    67   -982   -105     66
  -982    208   -982   -134
   184   -982   -982   -982
   167   -982   -105   -982
   -33    189   -982   -982
    25      9     95   -982
    67    -91     95   -982
  -982    108   -982     98
   184   -982   -982   -982
    25   -982    153   -982
   -33    141     -5   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 9 E= 3.5e+000
 0.111111  0.555556  0.333333  0.000000
 0.555556  0.000000  0.000000  0.444444
 0.888889  0.000000  0.000000  0.111111
 0.000000  0.888889  0.111111  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.444444  0.000000  0.111111  0.444444
 0.000000  0.888889  0.000000  0.111111
 1.000000  0.000000  0.000000  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.222222  0.777778  0.000000  0.000000
 0.333333  0.222222  0.444444  0.000000
 0.444444  0.111111  0.444444  0.000000
 0.000000  0.444444  0.000000  0.555556
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.222222  0.555556  0.222222  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[CG][AT]AC[AT][AT]CAA[CA][GAC][AG][TC]A[GA][CAG]
--------------------------------------------------------------------------------

Time 1.26 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 21 sites = 10 llr = 138 E-value = 3.0e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :1411:1::1:21472:58:a
pos.-specific     C  ::2:7:12123:5:1:1312:
probability       G  17:8:185256646239218:
matrix            T  924129:37212:::5:::::

         bits    2.3
                 2.0
                 1.8                     *
                 1.6                 *   *
Relative         1.4 *    *          *  **
Entropy          1.1 *  * **      *  *  **
(20.0 bits)      0.9 ** ****   * **  * ***
                 0.7 ** ****** ***** * ***
                 0.5 ********* ***********
                 0.2 *********************
                 0.0 ---------------------

Multilevel           TGAGCTGGTGGGCGATGAAGA
consensus             TT T  TGCCAGAGG C C
sequence               C    C T T   A G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---------------------
4084                         40  5.87e-10 AGGTTGTGAA TGTGTTGTTGGGGGAGGAAGA CTTAGTGGGT
7600                        417  6.91e-09 CTTAAAACTG TTAGCTGGGGGGGGGGGCAGA CTTTTTTCAA
6531                        297  3.79e-08 CAAAGGATTT TGTACTGGTCGGCAATGGACA AGGCGGCCAT
4752                        241  5.07e-08 TCAAAGGATG TGTGCTCGCCGACGATGCAGA CACATCTGTG
260785                       90  5.07e-08 GTGATATAGA TGAGCTGCTGCGCAATCCCGA GAGCCACCCT
262179                      470  6.12e-08 TCTCTCAGGA TGTGCTGTTTCTCGCTGGAGA TGCTTGTCTT
4744                         84  1.62e-07 AACGAGTGCA TGCTCGGGTGGGGGGAGAAGA GAGAGTGGTT
21078                        78  5.64e-07 CAAACACAGC TAAGCTGCTGGAAGAAGAACA GCCGGAGGAT
4431                        426  4.36e-06 CAACAGACGT GTCGTTATTTCGCAAGGAAGA TAAATTGAGT
8174                        319  4.59e-06 TGATATCGTC TGAGATGGGATTGAATGAGGA GAACACTGCT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
4084                              5.9e-10  39_[+2]_440
7600                              6.9e-09  416_[+2]_63
6531                              3.8e-08  296_[+2]_183
4752                              5.1e-08  240_[+2]_239
260785                            5.1e-08  89_[+2]_390
262179                            6.1e-08  469_[+2]_10
4744                              1.6e-07  83_[+2]_396
21078                             5.6e-07  77_[+2]_402
4431                              4.4e-06  425_[+2]_54
8174                              4.6e-06  318_[+2]_161
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=10
4084                     (   40) TGTGTTGTTGGGGGAGGAAGA  1
7600                     (  417) TTAGCTGGGGGGGGGGGCAGA  1
6531                     (  297) TGTACTGGTCGGCAATGGACA  1
4752                     (  241) TGTGCTCGCCGACGATGCAGA  1
260785                   (   90) TGAGCTGCTGCGCAATCCCGA  1
262179                   (  470) TGTGCTGTTTCTCGCTGGAGA  1
4744                     (   84) TGCTCGGGTGGGGGGAGAAGA  1
21078                    (   78) TAAGCTGCTGGAAGAAGAACA  1
4431                     (  426) GTCGTTATTTCGCAAGGAAGA  1
8174                     (  319) TGAGATGGGATTGAATGAGGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 4800 bayes= 8.90388 E= 3.0e+000
  -997   -997   -120    168
  -148   -997    160    -49
    52     -7   -997     51
  -148   -997    180   -149
  -148    174   -997    -49
  -997   -997   -120    168
  -148   -106    180   -997
  -997     -7    112      9
  -997   -106    -20    132
  -148     -7    112    -49
  -997     52    138   -149
   -48   -997    138    -49
  -148    125     80   -997
    52   -997    138   -997
   132   -106    -20   -997
   -48   -997     38     83
  -997   -106    197   -997
    84     52    -20   -997
   152   -106   -120   -997
  -997     -7    180   -997
   184   -997   -997   -997
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 10 E= 3.0e+000
 0.000000  0.000000  0.100000  0.900000
 0.100000  0.000000  0.700000  0.200000
 0.400000  0.200000  0.000000  0.400000
 0.100000  0.000000  0.800000  0.100000
 0.100000  0.700000  0.000000  0.200000
 0.000000  0.000000  0.100000  0.900000
 0.100000  0.100000  0.800000  0.000000
 0.000000  0.200000  0.500000  0.300000
 0.000000  0.100000  0.200000  0.700000
 0.100000  0.200000  0.500000  0.200000
 0.000000  0.300000  0.600000  0.100000
 0.200000  0.000000  0.600000  0.200000
 0.100000  0.500000  0.400000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.700000  0.100000  0.200000  0.000000
 0.200000  0.000000  0.300000  0.500000
 0.000000  0.100000  0.900000  0.000000
 0.500000  0.300000  0.200000  0.000000
 0.800000  0.100000  0.100000  0.000000
 0.000000  0.200000  0.800000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[GT][ATC]G[CT]TG[GTC][TG][GCT][GC][GAT][CG][GA][AG][TGA]G[ACG]A[GC]A
--------------------------------------------------------------------------------

Time 2.19 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 12 sites = 9 llr = 96 E-value = 2.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  6:::8::8:a::
pos.-specific     C  1:621:::1::4
probability       G  ::21:a2:9:::
matrix            T  3a271:82::a6

         bits    2.3
                 2.0      *
                 1.8  *   *   **
                 1.6  *   *  ***
Relative         1.4  *   *  ***
Entropy          1.1  *   *******
(15.3 bits)      0.9  *  ********
                 0.7  ***********
                 0.5 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ATCTAGTAGATT
consensus            T GC  GT   C
sequence               T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ------------
260785                      374  4.76e-07 GAGCAGGGAG ATCCAGTAGATC ACCATCCTTC
4752                         14  8.60e-07 CTCTGTTGAC ATGTAGTAGATT CAAGACGAAG
6531                        191  1.66e-06 CGTAAGACCA ATCTAGTTGATT GAAACGTCGT
21078                         1  3.05e-06          . ATGCAGTAGATT TAGTCACTTG
8174                        182  3.79e-06 CAACCGTAGG ATCTAGTACATT GACATCGCTA
4084                        479  4.42e-06 CCAATGAAGA TTCTCGTAGATC TTATTGCAAA
7600                         70  1.94e-05 ACGATGACGG TTTTTGTAGATC GCAACAGCAG
4744                        133  2.42e-05 AAGCTTACTT CTCGAGGAGATT CTCTCTTCCT
262179                       72  2.52e-05 CAGCGAATCG TTTTAGGTGATC AACTTTTTTC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
260785                            4.8e-07  373_[+3]_115
4752                              8.6e-07  13_[+3]_475
6531                              1.7e-06  190_[+3]_298
21078                               3e-06  [+3]_488
8174                              3.8e-06  181_[+3]_307
4084                              4.4e-06  478_[+3]_10
7600                              1.9e-05  69_[+3]_419
4744                              2.4e-05  132_[+3]_356
262179                            2.5e-05  71_[+3]_417
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
260785                   (  374) ATCCAGTAGATC  1
4752                     (   14) ATGTAGTAGATT  1
6531                     (  191) ATCTAGTTGATT  1
21078                    (    1) ATGCAGTAGATT  1
8174                     (  182) ATCTAGTACATT  1
4084                     (  479) TTCTCGTAGATC  1
7600                     (   70) TTTTTGTAGATC  1
4744                     (  133) CTCGAGGAGATT  1
262179                   (   72) TTTTAGGTGATC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 4890 bayes= 9.21757 E= 2.8e+002
    99    -91   -982     25
  -982   -982   -982    183
  -982    141     -5    -34
  -982      9   -105    125
   148    -91   -982   -134
  -982   -982    212   -982
  -982   -982     -5    147
   148   -982   -982    -34
  -982    -91    195   -982
   184   -982   -982   -982
  -982   -982   -982    183
  -982    108   -982     98
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 2.8e+002
 0.555556  0.111111  0.000000  0.333333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.555556  0.222222  0.222222
 0.000000  0.222222  0.111111  0.666667
 0.777778  0.111111  0.000000  0.111111
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.222222  0.777778
 0.777778  0.000000  0.000000  0.222222
 0.000000  0.111111  0.888889  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.444444  0.000000  0.555556
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[AT]T[CGT][TC]AG[TG][AT]GAT[TC]
--------------------------------------------------------------------------------

Time 3.03 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21078                            2.17e-08  [+3(3.05e-06)]_49_[+1(3.74e-07)]_\
    [+2(5.64e-07)]_402
260785                           4.20e-10  89_[+2(5.07e-08)]_104_\
    [+1(3.78e-05)]_23_[+1(3.74e-07)]_104_[+3(4.76e-07)]_115
262179                           8.44e-08  71_[+3(2.52e-05)]_132_\
    [+2(5.09e-06)]_84_[+1(1.85e-06)]_133_[+2(6.12e-08)]_10
4084                             3.30e-11  39_[+2(5.87e-10)]_46_[+1(7.58e-05)]_\
    91_[+1(2.27e-07)]_249_[+3(4.42e-06)]_10
4431                             3.20e-05  425_[+2(4.36e-06)]_37_\
    [+1(4.93e-07)]_1
4744                             2.78e-07  31_[+2(6.70e-05)]_31_[+2(1.62e-07)]_\
    28_[+3(2.42e-05)]_321_[+1(2.70e-06)]_19
4752                             2.23e-10  13_[+3(8.60e-07)]_215_\
    [+2(5.07e-08)]_23_[+1(1.05e-07)]_200
6531                             1.88e-06  190_[+3(1.66e-06)]_94_\
    [+2(3.79e-08)]_183
7600                             1.72e-09  69_[+3(1.94e-05)]_1_[+1(3.09e-07)]_\
    318_[+2(6.91e-09)]_63
8174                             3.42e-07  165_[+1(7.58e-07)]_[+3(3.79e-06)]_\
    125_[+2(4.59e-06)]_116_[+1(2.57e-05)]_29
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
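
Note on reading the matrices above: the non-floor entries of each log-odds
matrix appear consistent with 100 * log2(p / background), rounded to an
integer, where p is the matching entry of the letter-probability matrix and
the background frequencies are those listed in the COMMAND LINE SUMMARY.
The sketch below checks this for the first three rows of Motif 3. It is an
illustration only and not part of the MEME output: the names log_odds_score
and max_deviation are hypothetical helpers, and zero probabilities (printed
as -982 above) are skipped because the pseudocount handling behind that floor
value is not reproduced here.

```python
import math

# Background letter frequencies as printed in the COMMAND LINE SUMMARY.
BACKGROUND = {"A": 0.279, "C": 0.209, "G": 0.230, "T": 0.281}
LETTERS = "ACGT"


def log_odds_score(prob, background):
    """Hypothetical helper: 100 * log2(prob / background), rounded.

    Returns None when prob is zero, because the floor value printed for
    unobserved letters is not modeled in this sketch.
    """
    if prob <= 0.0:
        return None
    return round(100 * math.log2(prob / background))


# First three rows of Motif 3, copied from the matrices in this file.
PROB_ROWS = [
    [0.555556, 0.111111, 0.000000, 0.333333],
    [0.000000, 0.000000, 0.000000, 1.000000],
    [0.000000, 0.555556, 0.222222, 0.222222],
]
PSSM_ROWS = [
    [99, -91, -982, 25],
    [-982, -982, -982, 183],
    [-982, 141, -5, -34],
]


def max_deviation(prob_rows, pssm_rows):
    """Largest absolute gap between recomputed and reported scores."""
    worst = 0
    for probs, scores in zip(prob_rows, pssm_rows):
        for letter, p, reported in zip(LETTERS, probs, scores):
            computed = log_odds_score(p, BACKGROUND[letter])
            if computed is not None:
                worst = max(worst, abs(computed - reported))
    return worst


if __name__ == "__main__":
    print("max deviation:", max_deviation(PROB_ROWS, PSSM_ROWS))
```

Because the printed background frequencies are rounded to three decimals, a
recomputed score can differ from the reported one by a unit in other rows, so
a small tolerance is advisable when comparing.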