********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/487/487.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
46336                    1.0000    500  12985                    1.0000    500
3730                     1.0000    500  32559                    1.0000    500
26635                    1.0000    500  11336                    1.0000    500
54342                    1.0000    500  26780                    1.0000    500
34964                    1.0000    500  32219                    1.0000    500
37602                    1.0000    500  40694                    1.0000    500
47833                    1.0000    500
********************************************************************************


********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/487/487.seqs.fa -oc motifs/487 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.259 C 0.247 G 0.234 T 0.260
Background letter frequencies (from dataset with add-one prior applied):
A 0.259 C 0.247 G 0.234 T 0.260
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  11  llr = 122  E-value = 1.9e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::1:22:529
pos.-specific     C  :::1:a1:a38:
probability       G  2::25:7::1:1
matrix            T  8aa75::8:2::

         bits    2.1      *  *
                 1.9  **  *  *
                 1.7  **  *  *
                 1.5  **  *  *  *
Relative         1.3 ***  * ** **
Entropy          1.0 ***  **** **
(16.0 bits)      0.8 **** **** **
                 0.6 ********* **
                 0.4 ********* **
                 0.2 ************
                 0.0 ------------

Multilevel           TTTTGCGTCACA
consensus                T    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
12985                       455  2.02e-07 CGACGCGACT TTTTGCGTCCCA CAACGTCGTG
40694                       479  2.71e-07 GTTTTCTCTT TTTTTCGTCCCA AATTTGTGCC
26635                       456  1.15e-06 CCCCAACAGC TTTTTCGACACA CAAAGCTCGA
46336                       266  1.30e-06 TCTACGATGG TTTTACGTCACA GACTCAGGAC
34964                       393  2.28e-06 GTTTTTCGAC TTTTTCGTCCAA CTTCGTACGG
32219                       123  3.63e-06 AGCGTTTTTC GTTGGCGTCACA AACACATCAA
26780                         9  6.31e-06   TCGTACAG TTTTTCCTCTCA CCGTCCACGC
32559                       357  6.31e-06 CGTTCTCTTC TTTTTCGTCTCG CCGTGAAACA
47833                       392  8.32e-06 AGGACCAGCT TTTCGCATCACA ATGTACAAAA
54342                       286  1.57e-05 CGCCTCCTTT GTTTGCATCGCA AATACAATGT
3730                        321  1.57e-05 AGGCGACTAC TTTGGCGACAAA TCTGTCCGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
12985                               2e-07  454_[+1]_34
40694                             2.7e-07  478_[+1]_10
26635                             1.1e-06  455_[+1]_33
46336                             1.3e-06  265_[+1]_223
34964                             2.3e-06  392_[+1]_96
32219                             3.6e-06  122_[+1]_366
26780                             6.3e-06  8_[+1]_480
32559                             6.3e-06  356_[+1]_132
47833                             8.3e-06  391_[+1]_97
54342                             1.6e-05  285_[+1]_203
3730                              1.6e-05  320_[+1]_168
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=11
12985                    (  455) TTTTGCGTCCCA  1
40694                    (  479) TTTTTCGTCCCA  1
26635                    (  456) TTTTTCGACACA  1
46336                    (  266) TTTTACGTCACA  1
34964                    (  393) TTTTTCGTCCAA  1
32219                    (  123) GTTGGCGTCACA  1
26780                    (    9) TTTTTCCTCTCA  1
32559                    (  357) TTTTTCGTCTCG  1
47833                    (  392) TTTCGCATCACA  1
54342                    (  286) GTTTGCATCGCA  1
3730                     (  321) TTTGGCGACAAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.52784 E= 1.9e-002
 -1010  -1010    -36    165
 -1010  -1010  -1010    194
 -1010  -1010  -1010    194
 -1010   -144    -36    148
  -151  -1010     96     81
 -1010    202  -1010  -1010
   -51   -144    164  -1010
   -51  -1010  -1010    165
 -1010    202  -1010  -1010
    81     14   -136    -52
   -51    173  -1010  -1010
   181  -1010   -136  -1010
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 1.9e-002
 0.000000  0.000000  0.181818  0.818182
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.090909  0.181818  0.727273
 0.090909  0.000000  0.454545  0.454545
 0.000000  1.000000  0.000000  0.000000
 0.181818  0.090909  0.727273  0.000000
 0.181818  0.000000  0.000000  0.818182
 0.000000  1.000000  0.000000  0.000000
 0.454545  0.272727  0.090909  0.181818
 0.181818  0.818182  0.000000  0.000000
 0.909091  0.000000  0.090909  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
TTTT[GT]CGTC[AC]CA
--------------------------------------------------------------------------------

Time 1.84 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  21  sites =   5  llr = 96  E-value = 6.3e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  a28:4:a:a:62:6a:2622:
pos.-specific     C  :::::8:::24:4::8:442a
probability       G  :82a62:8:8:86:::8:22:
matrix            T  :::::::2:::::4:2::24:

         bits    2.1    *                *
                 1.9 *  *  * *     *     *
                 1.7 *  *  * *     *     *
                 1.5 *  *  * *     *     *
Relative         1.3 **** ***** *  ***   *
Entropy          1.0 ******************  *
(27.8 bits)      0.8 ******************  *
                 0.6 ******************  *
                 0.4 ******************  *
                 0.2 ******************  *
                 0.0 ---------------------

Multilevel           AGAGGCAGAGAGGAACGACTC
consensus             AG AG T CCACT TACAA
sequence                               GC
                                       TG

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                    Site
-------------             ----- ---------            ---------------------
32219                       381  9.53e-11 GACAGTTAGT AGAGACAGAGAGCAACGAACC TTGCTGGATA
26780                       178  3.89e-10 GCCAGGATGC AGAGGCAGAGCGGTATGCTTC GGTGGTCCCC
26635                       414  7.59e-10 CTCCTACTCG AGAGGGAGAGAAGTACGACGC TCGGATCATC
11336                       342  9.99e-10 TCTTGACGGA AGAGGCATACCGGAACGACAC TACCAAATCT
40694                       165  6.17e-09 AGACCTATTT AAGGACAGAGAGCAACACGTC ACTGTTGTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
32219                             9.5e-11  380_[+2]_99
26780                             3.9e-10  177_[+2]_302
26635                             7.6e-10  413_[+2]_66
11336                               1e-09  341_[+2]_138
40694                             6.2e-09  164_[+2]_315
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=21 seqs=5
32219                    (  381) AGAGACAGAGAGCAACGAACC  1
26780                    (  178) AGAGGCAGAGCGGTATGCTTC  1
26635                    (  414) AGAGGGAGAGAAGTACGACGC  1
11336                    (  342) AGAGGCATACCGGAACGACAC  1
40694                    (  165) AAGGACAGAGAGCAACACGTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 21 n= 6240 bayes= 10.536 E= 6.3e+001
   195   -897   -897   -897
   -37   -897    177   -897
   163   -897    -23   -897
  -897   -897    209   -897
    63   -897    136   -897
  -897    169    -23   -897
   195   -897   -897   -897
  -897   -897    177    -38
   195   -897   -897   -897
  -897    -31    177   -897
   121     69   -897   -897
   -37   -897    177   -897
  -897     69    136   -897
   121   -897   -897     62
   195   -897   -897   -897
  -897    169   -897    -38
   -37   -897    177   -897
   121     69   -897   -897
   -37     69    -23    -38
   -37    -31    -23     62
  -897    201   -897   -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 21 nsites= 5 E= 6.3e+001
 1.000000  0.000000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.800000  0.000000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 0.000000  0.800000  0.200000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.800000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.200000  0.800000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.400000  0.600000  0.000000
 0.600000  0.000000  0.000000  0.400000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.000000  0.800000  0.000000
 0.600000  0.400000  0.000000  0.000000
 0.200000  0.400000  0.200000  0.200000
 0.200000  0.200000  0.200000  0.400000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
A[GA][AG]G[GA][CG]A[GT]A[GC][AC][GA][GC][AT]A[CT][GA][AC][CAGT][TACG]C
--------------------------------------------------------------------------------

Time 3.51 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  16  sites =   7  llr = 98  E-value = 2.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1:6::9:96141:3::
pos.-specific     C  :::711:1:4:1:69a
probability       G  :a419:9:41::a1::
matrix            T  9::1::1::367::1:

         bits    2.1  *          *  *
                 1.9  *          *  *
                 1.7  *          *  *
                 1.5  *  ****    * **
Relative         1.3 **  ****    * **
Entropy          1.0 *** ***** * * **
(20.2 bits)      0.8 ********* *** **
                 0.6 ********* ******
                 0.4 ********* ******
                 0.2 ****************
                 0.0 ----------------

Multilevel           TGACGAGAACTTGCCC
consensus              G     GTA  A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
3730                        380  2.71e-08 GCGCTGTTTG TGAGGAGAACTTGACC ACATGGGTAC
46336                       138  2.71e-08 AGCCGATAGC TGACGAGCATTTGCCC AGGTTTTCGT
26635                       103  1.71e-07 CTGTTTGCTC TGATGAGAGCTTGCTC TTTTTCTTAT
32219                       174  1.87e-07 TCACATTTCG TGACCAGAGGTTGACC GTACACAGGC
11336                       105  2.40e-07 GGCAGTTGCG AGGCGAGAACAAGCCC AGTGACGATG
37602                       294  4.61e-07 GTGTAGAATG TGGCGCGAAAATGGCC TCGGTGACTG
54342                       404  4.61e-07 AGGACGAAAG TGGCGATAGTACGCCC AAAGACAATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
3730                              2.7e-08  379_[+3]_105
46336                             2.7e-08  137_[+3]_347
26635                             1.7e-07  102_[+3]_382
32219                             1.9e-07  173_[+3]_311
11336                             2.4e-07  104_[+3]_380
37602                             4.6e-07  293_[+3]_191
54342                             4.6e-07  403_[+3]_81
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=7
3730                     (  380) TGAGGAGAACTTGACC  1
46336                    (  138) TGACGAGCATTTGCCC  1
26635                    (  103) TGATGAGAGCTTGCTC  1
32219                    (  174) TGACCAGAGGTTGACC  1
11336                    (  105) AGGCGAGAACAAGCCC  1
37602                    (  294) TGGCGCGAAAATGGCC  1
54342                    (  404) TGGCGATAGTACGCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 10.4196 E= 2.7e+002
   -86   -945   -945    172
  -945   -945    209   -945
   114   -945     87   -945
  -945    153    -71    -86
  -945    -79    187   -945
   173    -79   -945   -945
  -945   -945    187    -86
   173    -79   -945   -945
   114   -945     87   -945
   -86     79    -71     14
    73   -945   -945    113
   -86    -79   -945    146
  -945   -945    209   -945
    14    121    -71   -945
  -945    179   -945    -86
  -945    201   -945   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 2.7e+002
 0.142857  0.000000  0.000000  0.857143
 0.000000  0.000000  1.000000  0.000000
 0.571429  0.000000  0.428571  0.000000
 0.000000  0.714286  0.142857  0.142857
 0.000000  0.142857  0.857143  0.000000
 0.857143  0.142857  0.000000  0.000000
 0.000000  0.000000  0.857143  0.142857
 0.857143  0.142857  0.000000  0.000000
 0.571429  0.000000  0.428571  0.000000
 0.142857  0.428571  0.142857  0.285714
 0.428571  0.000000  0.000000  0.571429
 0.142857  0.142857  0.000000  0.714286
 0.000000  0.000000  1.000000  0.000000
 0.285714  0.571429  0.142857  0.000000
 0.000000  0.857143  0.000000  0.142857
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TG[AG]CGAGA[AG][CT][TA]TG[CA]CC
--------------------------------------------------------------------------------

Time 4.97 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46336                            1.40e-06  137_[+3(2.71e-08)]_112_\
    [+1(1.30e-06)]_223
12985                            1.99e-03  454_[+1(2.02e-07)]_34
3730                             1.20e-05  320_[+1(1.57e-05)]_47_\
    [+3(2.71e-08)]_105
32559                            4.79e-02  356_[+1(6.31e-06)]_132
26635                            9.06e-12  102_[+3(1.71e-07)]_295_\
    [+2(7.59e-10)]_21_[+1(1.15e-06)]_33
11336                            1.37e-08  104_[+3(2.40e-07)]_221_\
    [+2(9.99e-10)]_138
54342                            7.06e-05  285_[+1(1.57e-05)]_106_\
    [+3(4.61e-07)]_81
26780                            1.26e-07  8_[+1(6.31e-06)]_157_[+2(3.89e-10)]_\
    302
34964                            8.09e-03  392_[+1(2.28e-06)]_96
32219                            4.15e-12  122_[+1(3.63e-06)]_39_\
    [+3(1.87e-07)]_119_[+1(3.11e-05)]_60_[+2(9.53e-11)]_99
37602                            7.76e-04  96_[+2(9.83e-05)]_176_\
    [+3(4.61e-07)]_191
40694                            4.13e-08  164_[+2(6.17e-09)]_293_\
    [+1(2.71e-07)]_10
47833                            4.55e-02  391_[+1(8.32e-06)]_97
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************