********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/239/239.seqs.fa
ALPHABET= ACGT
Sequence name  Weight Length  Sequence name  Weight Length
-------------  ------ ------  -------------  ------ ------
9427           1.0000    500  231            1.0000    500
13396          1.0000    500  13417          1.0000    500
14192          1.0000    500  48664          1.0000    500
32784          1.0000    500  49943          1.0000    500
44892          1.0000    500  43300          1.0000    500
49472          1.0000    500  45942          1.0000    500
47243          1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/239/239.seqs.fa -oc motifs/239 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       13    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            6500    N=              13
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.244 C 0.269 G 0.243 T 0.244
Background letter frequencies (from dataset with add-one prior applied):
A 0.244 C 0.269 G 0.243 T 0.244
********************************************************************************

********************************************************************************
MOTIF 1 MEME width = 12 sites = 9 llr = 103 E-value = 4.3e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a71:6699::4:
pos.-specific     C  :3724:::a::a
probability       G  ::27:41::a::
matrix            T  :::1:::1::6:

         bits    2.0 *        *
                 1.8 *       ** *
                 1.6 *     **** *
                 1.4 *     **** *
Relative         1.2 *     **** *
Entropy          1.0 **  ********
(16.6 bits)      0.8 ************
                 0.6 ************
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           AACGAAAACGTC
consensus             CGCCG    A
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value                Site
-------------  ----- ---------            ------------
13396            471  5.92e-08 ACAAGTCTAC AACGAAAACGTC GACCTCAGCA
231              463  4.32e-07 AAGGAAGCCG AACGCGAACGTC GACGGATCCA
49472            252  5.62e-07 GTGAATGGGG AACGCGAACGAC CGTTCGTGGG
48664            462  7.46e-07 GCAACAAACA ACCGAAAACGAC ACCAACGACC
49943            475  4.01e-06 TTGTGCGATT ACGGCGAACGTC GACCGAACGA
14192            394  4.01e-06 CTACAGTCCG AACTCGAACGTC GCCTCGACCA
47243            461  8.44e-06 GTTGTTCCGG AAGGAAATCGAC CGCTATTGAC
9427             483  8.44e-06 AAAAGAGTTC AACCAAGACGTC TTTCAC
43300             33  1.21e-05 GTATTGAAGA ACACAAAACGAC GATGAAGATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
13396                   5.9e-08  470_[+1]_18
231                     4.3e-07  462_[+1]_26
49472                   5.6e-07  251_[+1]_237
48664                   7.5e-07  461_[+1]_27
49943                     4e-06  474_[+1]_14
14192                     4e-06  393_[+1]_95
47243                   8.4e-06  460_[+1]_28
9427                    8.4e-06  482_[+1]_6
43300                   1.2e-05  32_[+1]_456
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
13396          ( 471) AACGAAAACGTC  1
231            ( 463) AACGCGAACGTC  1
49472          ( 252) AACGCGAACGAC  1
48664          ( 462) ACCGAAAACGAC  1
49943          ( 475) ACGGCGAACGTC  1
14192          ( 394) AACTCGAACGTC  1
47243          ( 461) AAGGAAATCGAC  1
9427           ( 483) AACCAAGACGTC  1
43300          (  33) ACACAAAACGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.59664 E= 4.3e+001
   204  -982  -982  -982
   145    31  -982  -982
  -113   131   -13  -982
  -982   -28   145  -113
   119    72  -982  -982
   119  -982    87  -982
   187  -982  -113  -982
   187  -982  -982  -113
  -982   189  -982  -982
  -982  -982   204  -982
    87  -982  -982   119
  -982   189  -982  -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 4.3e+001
 1.000000  0.000000  0.000000  0.000000
 0.666667  0.333333  0.000000  0.000000
 0.111111  0.666667  0.222222  0.000000
 0.000000  0.222222  0.666667  0.111111
 0.555556  0.444444  0.000000  0.000000
 0.555556  0.000000  0.444444  0.000000
 0.888889  0.000000  0.111111  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.444444  0.000000  0.000000  0.555556
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
A[AC][CG][GC][AC][AG]AACG[TA]C
--------------------------------------------------------------------------------

Time 1.45 secs.

********************************************************************************

********************************************************************************
MOTIF 2 MEME width = 12 sites = 12 llr = 120 E-value = 6.8e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  9298313822::
pos.-specific     C  16114::1:8:8
probability       G  :3:1:9716:a1
matrix            T  ::::3:::3::1

         bits    2.0           *
                 1.8           *
                 1.6 * *  *    *
                 1.4 * *  *    *
Relative         1.2 * ** * * **
Entropy          1.0 * ** *** ***
(14.4 bits)      0.8 * ** *** ***
                 0.6 **** *******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           ACAACGGAGCGC
consensus             G  T A T
sequence                 A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value                Site
-------------  ----- ---------            ------------
13396             40  4.82e-07 TCCGACAATG ACAATGAAGCGC TTGCGAATCG
9427              26  4.82e-07 GAGGTGGATC AGAATGGAGCGC CAGTCGATAA
13417            436  3.42e-06 AGCACCGACT ACAACAGAGCGC TGTCTATCAG
32784            437  4.54e-06 TACCAATCGT ACCATGGAGCGC ATCAATATGC
44892            339  5.17e-06 CAACCATCAC ACAACGGAAAGC GCCCCGTACA
14192            159  5.17e-06 AAAATGTGCC AAAATGAATCGC AAAGTAGAAG
47243            404  8.47e-06 CACCTCCATT ACAAAGGAAAGC ATATCATTCG
231              319  1.44e-05 GAGCGCTTCG ACAACGAATCGG CCGAGGCGGA
43300            413  1.53e-05 ATTCCGGCAG CGAACGAAGCGC CGACGCTTCC
45942            257  1.70e-05 GGGTTGCGCG AAAAAGGCGCGC TTACGGCCGG
49943            283  5.47e-05 TTTTCGTCTC ACACCGGATCGT CCCAATCCGC
49472            109  6.64e-05 TTACCTAGAG AGAGAGGGGCGC CCGGGGACAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
13396                   4.8e-07  39_[+2]_449
9427                    4.8e-07  25_[+2]_463
13417                   3.4e-06  435_[+2]_53
32784                   4.5e-06  436_[+2]_52
44892                   5.2e-06  338_[+2]_150
14192                   5.2e-06  158_[+2]_330
47243                   8.5e-06  403_[+2]_85
231                     1.4e-05  318_[+2]_170
43300                   1.5e-05  412_[+2]_76
45942                   1.7e-05  256_[+2]_232
49943                   5.5e-05  282_[+2]_206
49472                   6.6e-05  108_[+2]_380
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=12
13396          (  40) ACAATGAAGCGC  1
9427           (  26) AGAATGGAGCGC  1
13417          ( 436) ACAACAGAGCGC  1
32784          ( 437) ACCATGGAGCGC  1
44892          ( 339) ACAACGGAAAGC  1
14192          ( 159) AAAATGAATCGC  1
47243          ( 404) ACAAAGGAAAGC  1
231            ( 319) ACAACGAATCGG  1
43300          ( 413) CGAACGAAGCGC  1
45942          ( 257) AAAAAGGCGCGC  1
49943          ( 283) ACACCGGATCGT  1
49472          ( 109) AGAGAGGGGCGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 6357 bayes= 9.49463 E= 6.8e+001
   191  -169 -1023 -1023
   -55   111     4 -1023
   191  -169 -1023 -1023
   177  -169  -154 -1023
     4    63 -1023    45
  -155 -1023   191 -1023
    45 -1023   145 -1023
   177  -169  -154 -1023
   -55 -1023   126     4
   -55   163 -1023 -1023
 -1023 -1023   204 -1023
 -1023   163  -154  -155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 6.8e+001
 0.916667  0.083333  0.000000  0.000000
 0.166667  0.583333  0.250000  0.000000
 0.916667  0.083333  0.000000  0.000000
 0.833333  0.083333  0.083333  0.000000
 0.250000  0.416667  0.000000  0.333333
 0.083333  0.000000  0.916667  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.833333  0.083333  0.083333  0.000000
 0.166667  0.000000  0.583333  0.250000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.833333  0.083333  0.083333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
A[CG]AA[CTA]G[GA]A[GT]CGC
--------------------------------------------------------------------------------

Time 2.97 secs.

********************************************************************************

********************************************************************************
MOTIF 3 MEME width = 16 sites = 12 llr = 135 E-value = 1.1e+001
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  1:14:11:::32::::
pos.-specific     C  163::2121:2322::
probability       G  1:42:32:8:333:a:
matrix            T  8424a4781a2368:a

         bits    2.0     *    *    **
                 1.8     *    *    **
                 1.6     *    *    **
                 1.4     *  * *   ***
Relative         1.2     *  ***   ***
Entropy          1.0  *  *  ***   ***
(16.2 bits)      0.8 **  *  ***   ***
                 0.6 ** ** ****  ****
                 0.4 ** ** ****  ****
                 0.2 **********  ****
                 0.0 ----------------

Multilevel           TCGATTTTGTATTTGT
consensus             TCT G    GCG
sequence                        G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name  Start   P-value                  Site
-------------  ----- ---------            ----------------
231               61  5.10e-08 TGAGTAGATA TTGTTGTTGTTGTTGT TGTTGTTGTT
48664            405  3.06e-07 CGGCGCTTTC TCCTTTTTGTCTCTGT CAATCGTCCT
49943            321  3.50e-07 ACGAGACTCT TCCATTGTGTGTGTGT GTGTGTGTGT
44892            377  5.61e-07 CATTGTTCGC TTCATTTCGTGATTGT GTATCGGATT
43300            461  6.31e-07 CCCGTGGCCC TCGTTGTTTTACTTGT AATCGAATTG
49472            159  8.70e-07 ACGTTTCGTT GCCATTTTGTTTTTGT CCGACTCCGG
47243             56  1.94e-06 GGAGAAGTTC TTTATGCTGTACTTGT AACTTCCGTA
45942            378  2.12e-06 TGCGTTTTGT TTGGTATTGTACGTGT ACGGGTGGAT
13417             22  5.00e-06 GTCAGAGCCG TTGTTGTCGTCGTCGT CGGAATGGCC
32784            235  1.36e-05 ATCATCGATT TCTGTCATGTAGGTGT TGTATAAGAC
13396            107  2.75e-05 GGCACTCATC CCGTTCGTCTGTTTGT TCCAGCGCGC
9427             313  3.72e-05 CAGGGGAAAT ACAATTTTGTGACCGT TTGTAACAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME  POSITION P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
231                     5.1e-08  60_[+3]_424
48664                   3.1e-07  404_[+3]_80
49943                   3.5e-07  320_[+3]_164
44892                   5.6e-07  376_[+3]_108
43300                   6.3e-07  460_[+3]_24
49472                   8.7e-07  158_[+3]_326
47243                   1.9e-06  55_[+3]_429
45942                   2.1e-06  377_[+3]_107
13417                     5e-06  21_[+3]_463
32784                   1.4e-05  234_[+3]_250
13396                   2.8e-05  106_[+3]_378
9427                    3.7e-05  312_[+3]_172
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=16 seqs=12
231            (  61) TTGTTGTTGTTGTTGT  1
48664          ( 405) TCCTTTTTGTCTCTGT  1
49943          ( 321) TCCATTGTGTGTGTGT  1
44892          ( 377) TTCATTTCGTGATTGT  1
43300          ( 461) TCGTTGTTTTACTTGT  1
49472          ( 159) GCCATTTTGTTTTTGT  1
47243          (  56) TTTATGCTGTACTTGT  1
45942          ( 378) TTGGTATTGTACGTGT  1
13417          (  22) TTGTTGTCGTCGTCGT  1
32784          ( 235) TCTGTCATGTAGGTGT  1
13396          ( 107) CCGTTCGTCTGTTTGT  1
9427           ( 313) ACAATTTTGTGACCGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 6305 bayes= 10.1356 E= 1.1e+001
  -155  -169  -154   162
 -1023   111 -1023    77
  -155    31    78   -55
    77 -1023   -54    77
 -1023 -1023 -1023   204
  -155   -69    46    77
  -155  -169   -54   145
 -1023   -69 -1023   177
 -1023  -169   178  -155
 -1023 -1023 -1023   204
    45   -69    46   -55
   -55   -11     4    45
 -1023   -69     4   126
 -1023   -69 -1023   177
 -1023 -1023   204 -1023
 -1023 -1023 -1023   204
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 12 E= 1.1e+001
 0.083333  0.083333  0.083333  0.750000
 0.000000  0.583333  0.000000  0.416667
 0.083333  0.333333  0.416667  0.166667
 0.416667  0.000000  0.166667  0.416667
 0.000000  0.000000  0.000000  1.000000
 0.083333  0.166667  0.333333  0.416667
 0.083333  0.083333  0.166667  0.666667
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.083333  0.833333  0.083333
 0.000000  0.000000  0.000000  1.000000
 0.333333  0.166667  0.333333  0.166667
 0.166667  0.250000  0.250000  0.333333
 0.000000  0.166667  0.250000  0.583333
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
T[CT][GC][AT]T[TG]TTGT[AG][TCG][TG]TGT
--------------------------------------------------------------------------------

Time 4.50 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME  COMBINED P-VALUE  MOTIF DIAGRAM
-------------  ----------------  -------------
9427                   3.10e-06  25_[+2(4.82e-07)]_275_\
    [+3(3.72e-05)]_154_[+1(8.44e-06)]_6
231                    1.15e-08  60_[+3(5.10e-08)]_2_[+3(5.10e-08)]_\
    224_[+2(1.44e-05)]_132_[+1(4.32e-07)]_26
13396                  2.64e-08  39_[+2(4.82e-07)]_55_[+3(2.75e-05)]_\
    348_[+1(5.92e-08)]_18
13417                  3.55e-04  21_[+3(5.00e-06)]_398_\
    [+2(3.42e-06)]_53
14192                  2.71e-04  158_[+2(5.17e-06)]_223_\
    [+1(4.01e-06)]_95
48664                  6.62e-06  404_[+3(3.06e-07)]_41_\
    [+1(7.46e-07)]_27
32784                  5.35e-04  234_[+3(1.36e-05)]_186_\
    [+2(4.54e-06)]_52
49943                  1.68e-06  282_[+2(5.47e-05)]_26_\
    [+3(3.50e-07)]_138_[+1(4.01e-06)]_14
44892                  3.59e-05  338_[+2(5.17e-06)]_26_\
    [+3(5.61e-07)]_108
43300                  2.47e-06  32_[+1(1.21e-05)]_368_\
    [+2(1.53e-05)]_36_[+3(6.31e-07)]_24
49472                  7.74e-07  108_[+2(6.64e-05)]_38_\
    [+3(8.70e-07)]_77_[+1(5.62e-07)]_237
45942                  5.96e-04  256_[+2(1.70e-05)]_109_\
    [+3(2.12e-06)]_107
47243                  2.88e-06  55_[+3(1.94e-06)]_332_\
    [+2(8.47e-06)]_45_[+1(8.44e-06)]_28
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************