********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/262/262.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42593                    1.0000    500  9147                     1.0000    500
9020                     1.0000    500  36973                    1.0000    500
47984                    1.0000    500  43557                    1.0000    500
49491                    1.0000    500  50496                    1.0000    500
10872                    1.0000    500  35595                    1.0000    500
44180                    1.0000    500  32960                    1.0000    500
45843                    1.0000    500  50269                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/262/262.seqs.fa -oc motifs/262 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            7000    N=              14
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.271 C 0.228 G 0.229 T 0.272
Background letter frequencies (from dataset with add-one prior applied):
A 0.271 C 0.228 G 0.229 T 0.272
********************************************************************************

********************************************************************************
MOTIF  1 MEME    width =  12  sites =  14  llr = 129  E-value = 9.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  :::6:514:79:
pos.-specific     C  11a3a1119:14
probability       G  3::1::7:11:4
matrix            T  69:::415:1:3

         bits    2.1   * *
                 1.9   * *
                 1.7   * *   *
                 1.5   * *   * *
Relative         1.3  ** *   * *
Entropy          1.1  ** *   * *
(13.3 bits)      0.9 *** * * ***
                 0.6 ***** * ***
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TTCACAGTCAAC
consensus            G  C T A   G
sequence                        T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
35595                       300  4.84e-07 GCCGCATAAT TTCACTGTCAAG GTCGCATCGA
42593                       408  4.84e-07 CTGTCGACGC TTCACTGTCAAC GAGCTTCTGC
44180                       405  3.08e-06 GACGATGGAA TTCACAGTCGAC CGCAACTTTA
10872                        79  1.11e-05 TGATTTTAAA GTCACTGCCAAT CTACGGAAAA
36973                       379  1.11e-05 GTGACGATTG CTCACAGTCAAT CAGGATTTTA
9147                        251  1.57e-05 GACCTTTCTC TTCGCAGTCGAC AAAATATCGC
50269                        62  1.89e-05 CACACTGCGT TCCCCTGTCAAT CTTGTATGAC
9020                         83  2.75e-05 GTTTTTTAAA TTCGCTGTCTAG CTCTCTACGT
49491                         3  4.16e-05         TT GTCCCATACAAT TCAGCACGTC
32960                        41  4.52e-05 CTGATAGTTG TTCACCCACAAG GTGCATCACA
50496                        60  4.85e-05 CCCGCGGTGT TTCACCGACACC AATACATGTG
45843                        19  6.08e-05 GAAAGTCTCG GTCCCAGAGAAG ACATGGTGCT
43557                       299  6.08e-05 TCATTTTGCT TTCCCAACCAAG AAAACCGGTA
47984                       286  2.14e-04 GACACGAAAG GCCACATACTAC CTATCAAAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
35595                             4.8e-07  299_[+1]_189
42593                             4.8e-07  407_[+1]_81
44180                             3.1e-06  404_[+1]_84
10872                             1.1e-05  78_[+1]_410
36973                             1.1e-05  378_[+1]_110
9147                              1.6e-05  250_[+1]_238
50269                             1.9e-05  61_[+1]_427
9020                              2.7e-05  82_[+1]_406
49491                             4.2e-05  2_[+1]_486
32960                             4.5e-05  40_[+1]_448
50496                             4.9e-05  59_[+1]_429
45843                             6.1e-05  18_[+1]_470
43557                             6.1e-05  298_[+1]_190
47984                             0.00021  285_[+1]_203
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=14
35595                    (  300) TTCACTGTCAAG  1
42593                    (  408) TTCACTGTCAAC  1
44180                    (  405) TTCACAGTCGAC  1
10872                    (   79) GTCACTGCCAAT  1
36973                    (  379) CTCACAGTCAAT  1
9147                     (  251) TTCGCAGTCGAC  1
50269                    (   62) TCCCCTGTCAAT  1
9020                     (   83) TTCGCTGTCTAG  1
49491                    (    3) GTCCCATACAAT  1
32960                    (   41) TTCACCCACAAG  1
50496                    (   60) TTCACCGACACC  1
45843                    (   19) GTCCCAGAGAAG  1
43557                    (  299) TTCCCAACCAAG  1
47984                    (  286) GCCACATACTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 8.93074 E= 9.2e+002
 -1045   -167     32    124
 -1045    -67  -1045    166
 -1045    213  -1045  -1045
   107     33    -68  -1045
 -1045    213  -1045  -1045
    88    -67  -1045     39
  -192   -167    164    -93
    40    -67  -1045     88
 -1045    203   -168  -1045
   140  -1045    -68    -93
   177   -167  -1045  -1045
 -1045     65     64      7
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 9.2e+002
 0.000000  0.071429  0.285714  0.642857
 0.000000  0.142857  0.000000  0.857143
 0.000000  1.000000  0.000000  0.000000
 0.571429  0.285714  0.142857  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.142857  0.000000  0.357143
 0.071429  0.071429  0.714286  0.142857
 0.357143  0.142857  0.000000  0.500000
 0.000000  0.928571  0.071429  0.000000
 0.714286  0.000000  0.142857  0.142857
 0.928571  0.071429  0.000000  0.000000
 0.000000  0.357143  0.357143  0.285714
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TG]TC[AC]C[AT]G[TA]CAA[CGT]
--------------------------------------------------------------------------------

Time 1.69 secs.

********************************************************************************

********************************************************************************
MOTIF  2 MEME    width =  12  sites =  13  llr = 122  E-value = 8.1e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  34:2::22:::2
pos.-specific     C  :28116:19:::
probability       G  2::2:1821:a1
matrix            T  542693:5:a:8

         bits    2.1           *
                 1.9          **
                 1.7         ***
                 1.5   * * * ***
Relative         1.3   * * * ***
Entropy          1.1   * * * ***
(13.5 bits)      0.9   * *** ****
                 0.6   * *** ****
                 0.4 ******* ****
                 0.2 ************
                 0.0 ------------

Multilevel           TACTTCGTCTGT
consensus            AT   T G
sequence              C
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
50269                       469  1.37e-07 CAGCCTCTCT TTCTTCGTCTGT CCATTGCGGT
43557                       237  4.45e-07 TCGTGCTTTG TACTTCGGCTGT GAAGTTCGTA
36973                       359  3.22e-06 TCTAGTTAGG TTCCTCGTCTGT GACGATTGCT
45843                       349  7.14e-06 TCCCCATCAT TTCTTCGTGTGT CTGAAAAAAA
44180                        73  1.14e-05 AACCGATTCG TTCATTGGCTGT TCCATCTCTG
35595                       449  1.56e-05 ATGGAGCGAG AATTTCGGCTGT ACTTCAGCGA
9020                        288  1.68e-05 ATCATATACA TCCTTCGACTGA GAGTGAATGA
49491                       271  1.86e-05 CTCCACAACA GCCTTCATCTGT TTTCCCTCTC
10872                       298  3.29e-05 TGGTCGAAAA TATGTTGTCTGT CTACTGGTAT
42593                       293  5.14e-05 AAGGTGCTTT GACGTCGTCTGG ACGATGACGC
50496                        11  5.55e-05 TCTCGAAGGA ATCTTGATCTGT AGAGGCGAAA
9147                        376  9.36e-05 AGCCCGTTTT ACCTTTGCCTGA ACCAGTCAGT
47984                       129  1.98e-04 GTCCAGATGG AACACTGACTGT GAGCCGCAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
50269                             1.4e-07  468_[+2]_20
43557                             4.5e-07  236_[+2]_252
36973                             3.2e-06  358_[+2]_130
45843                             7.1e-06  348_[+2]_140
44180                             1.1e-05  72_[+2]_416
35595                             1.6e-05  448_[+2]_40
9020                              1.7e-05  287_[+2]_201
49491                             1.9e-05  270_[+2]_218
10872                             3.3e-05  297_[+2]_191
42593                             5.1e-05  292_[+2]_196
50496                             5.6e-05  10_[+2]_478
9147                              9.4e-05  375_[+2]_113
47984                              0.0002  128_[+2]_360
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=13
50269                    (  469) TTCTTCGTCTGT  1
43557                    (  237) TACTTCGGCTGT  1
36973                    (  359) TTCCTCGTCTGT  1
45843                    (  349) TTCTTCGTGTGT  1
44180                    (   73) TTCATTGGCTGT  1
35595                    (  449) AATTTCGGCTGT  1
9020                     (  288) TCCTTCGACTGA  1
49491                    (  271) GCCTTCATCTGT  1
10872                    (  298) TATGTTGTCTGT  1
42593                    (  293) GACGTCGTCTGG  1
50496                    (   11) ATCTTGATCTGT  1
9147                     (  376) ACCTTTGCCTGA  1
47984                    (  129) AACACTGACTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6846 bayes= 9.56922 E= 8.1e+003
    18  -1035    -57     99
    50      2  -1035     50
 -1035    189  -1035    -82
   -82   -156    -57    118
 -1035   -156  -1035    176
 -1035    143   -157     18
   -82  -1035    189  -1035
   -82   -156      1     99
 -1035    202   -157  -1035
 -1035  -1035  -1035    188
 -1035  -1035    213  -1035
   -82  -1035   -157    150
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 13 E= 8.1e+003
 0.307692  0.000000  0.153846  0.538462
 0.384615  0.230769  0.000000  0.384615
 0.000000  0.846154  0.000000  0.153846
 0.153846  0.076923  0.153846  0.615385
 0.000000  0.076923  0.000000  0.923077
 0.000000  0.615385  0.076923  0.307692
 0.153846  0.000000  0.846154  0.000000
 0.153846  0.076923  0.230769  0.538462
 0.000000  0.923077  0.076923  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.153846  0.000000  0.076923  0.769231
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TA][ATC]CTT[CT]G[TG]CTGT
--------------------------------------------------------------------------------

Time 3.47 secs.

********************************************************************************

********************************************************************************
MOTIF  3 MEME    width =  16  sites =  13  llr = 141  E-value = 1.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :2::232:212:2:6a
pos.-specific     C  1:635:8:42::264:
probability       G  ::25:1::2:2824::
matrix            T  982236:a28525:::

         bits    2.1
                 1.9        *       *
                 1.7        *       *
                 1.5 *     **   *   *
Relative         1.3 *     **   *   *
Entropy          1.1 **    **   * ***
(15.7 bits)      0.9 **    ** * * ***
                 0.6 **** *** * * ***
                 0.4 ******** *** ***
                 0.2 ******** *******
                 0.0 ----------------

Multilevel           TTCGCTCTCTTGTCAA
consensus             ATCTA  A A GGC
sequence                 A   T G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
49491                       284  1.54e-09 TTCATCTGTT TTCCCTCTCTTGTCAA TTTCGCATTT
9020                        454  1.03e-07 TCGCCTCCGG TTCGCACTATTGGCCA CGGTGAAGAA
44180                       229  4.70e-07 ACAGTACCAG TATGCTCTCTGGTCAA ATGGATGCCA
10872                       453  1.00e-06 CCTTCGTAAG TTCGAACTCCTGGCAA ACTGGGCTAT
43557                       365  1.95e-06 CCTCCGAGCC TTCTCTCTTTTTTCAA TAAAGCCGAG
35595                       183  4.66e-06 GTCTCGGTCG TTGGTTCTGCTGTGAA GGATCTTCCG
47984                       452  4.66e-06 ATCTTGTCAA TTTTCTCTATAGTGCA ACAACTCATT
32960                        14  7.70e-06 GCATCTCTAC TACGCTCTCAAGGGAA TCTGATAGTT
9147                         25  1.05e-05 GAAAACAGCA TTCCAACTTTTTCCAA TATCGACAGT
50269                       280  1.12e-05 GAAAAATCTG CTCGATCTTTGGCCCA AACCCCTTTA
45843                        63  1.12e-05 TTTTCAGACT TTCGTGATCTGGTGCA GCATCCGAGA
36973                        67  1.70e-05 CCATGGAATG TAGCTACTATTGACAA GTGACACTTG
50496                       383  4.56e-05 ACTTCCAGAA TTTCTTATGTAGAGCA GGGAAGTTTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
49491                             1.5e-09  283_[+3]_201
9020                                1e-07  453_[+3]_31
44180                             4.7e-07  228_[+3]_256
10872                               1e-06  452_[+3]_32
43557                             1.9e-06  364_[+3]_120
35595                             4.7e-06  182_[+3]_302
47984                             4.7e-06  451_[+3]_33
32960                             7.7e-06  13_[+3]_471
9147                                1e-05  24_[+3]_460
50269                             1.1e-05  279_[+3]_205
45843                             1.1e-05  62_[+3]_422
36973                             1.7e-05  66_[+3]_418
50496                             4.6e-05  382_[+3]_102
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=13
49491                    (  284) TTCCCTCTCTTGTCAA  1
9020                     (  454) TTCGCACTATTGGCCA  1
44180                    (  229) TATGCTCTCTGGTCAA  1
10872                    (  453) TTCGAACTCCTGGCAA  1
43557                    (  365) TTCTCTCTTTTTTCAA  1
35595                    (  183) TTGGTTCTGCTGTGAA  1
47984                    (  452) TTTTCTCTATAGTGCA  1
32960                    (   14) TACGCTCTCAAGGGAA  1
9147                     (   25) TTCCAACTTTTTCCAA  1
50269                    (  280) CTCGATCTTTGGCCCA  1
45843                    (   63) TTCGTGATCTGGTGCA  1
36973                    (   67) TAGCTACTATTGACAA  1
50496                    (  383) TTTCTTATGTAGAGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6790 bayes= 9.55736 E= 1.7e+002
 -1035   -156  -1035    176
   -23  -1035  -1035    150
 -1035    143    -57    -24
 -1035     43    123    -82
   -23    102  -1035     18
    18  -1035   -157    118
   -82    189  -1035  -1035
 -1035  -1035  -1035    188
   -23     75    -57    -24
  -182    -57  -1035    150
   -23  -1035      1     99
 -1035  -1035    189    -82
   -82    -57      1     76
 -1035    143     75  -1035
   118     75  -1035  -1035
   188  -1035  -1035  -1035
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 13 E= 1.7e+002
 0.000000  0.076923  0.000000  0.923077
 0.230769  0.000000  0.000000  0.769231
 0.000000  0.615385  0.153846  0.230769
 0.000000  0.307692  0.538462  0.153846
 0.230769  0.461538  0.000000  0.307692
 0.307692  0.000000  0.076923  0.615385
 0.153846  0.846154  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.230769  0.384615  0.153846  0.230769
 0.076923  0.153846  0.000000  0.769231
 0.230769  0.000000  0.230769  0.538462
 0.000000  0.000000  0.846154  0.153846
 0.153846  0.153846  0.230769  0.461538
 0.000000  0.615385  0.384615  0.000000
 0.615385  0.384615  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
T[TA][CT][GC][CTA][TA]CT[CAT]T[TAG]G[TG][CG][AC]A
--------------------------------------------------------------------------------

Time 5.35 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42593                            2.76e-04  292_[+2(5.14e-05)]_103_\
    [+1(4.84e-07)]_81
9147                             1.77e-04  24_[+3(1.05e-05)]_210_\
    [+1(1.57e-05)]_113_[+2(9.36e-05)]_113
9020                             1.10e-06  82_[+1(2.75e-05)]_193_\
    [+2(1.68e-05)]_154_[+3(1.03e-07)]_31
36973                            1.07e-05  66_[+3(1.70e-05)]_276_\
    [+2(3.22e-06)]_8_[+1(1.11e-05)]_110
47984                            1.45e-03  451_[+3(4.66e-06)]_33
43557                            1.20e-06  236_[+2(4.45e-07)]_50_\
    [+1(6.08e-05)]_54_[+3(1.95e-06)]_120
49491                            3.83e-08  2_[+1(4.16e-05)]_256_[+2(1.86e-05)]_\
    1_[+3(1.54e-09)]_201
50496                            1.03e-03  10_[+2(5.55e-05)]_37_[+1(4.85e-05)]_\
    311_[+3(4.56e-05)]_102
10872                            6.79e-06  78_[+1(1.11e-05)]_207_\
    [+2(3.29e-05)]_143_[+3(1.00e-06)]_32
35595                            8.40e-07  182_[+3(4.66e-06)]_101_\
    [+1(4.84e-07)]_137_[+2(1.56e-05)]_40
44180                            4.23e-07  72_[+2(1.14e-05)]_144_\
    [+3(4.70e-07)]_160_[+1(3.08e-06)]_84
32960                            3.58e-03  13_[+3(7.70e-06)]_11_[+1(4.52e-05)]_\
    448
45843                            6.62e-05  18_[+1(6.08e-05)]_32_[+3(1.12e-05)]_\
    270_[+2(7.14e-06)]_140
50269                            7.06e-07  61_[+1(1.89e-05)]_206_\
    [+3(1.12e-05)]_173_[+2(1.37e-07)]_20
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************