********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/304/304.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
36828                    1.0000    500  47001                    1.0000    500
676                      1.0000    500  48607                    1.0000    500
49759                    1.0000    500  16850                    1.0000    500
44901                    1.0000    500  34710                    1.0000    500
51797                    1.0000    500  51931                    1.0000    500
33088                    1.0000    500  48608                    1.0000    500
49202                    1.0000    500  43361                    1.0000    500
49650                    1.0000    500  34261                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/304/304.seqs.fa -oc motifs/304 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000 model: mod= zoops nmotifs= 3 evt= inf object function= E-value of product of p-values width: minw= 12 maxw= 21 minic= 0.00 width: wg= 11 ws= 1 endgaps= yes nsites: minsites= 2 maxsites= 16 wnsites= 0.8 theta: prob= 1 spmap= uni spfuzz= 0.5 global: substring= yes branching= no wbranch= no em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 data: n= 8000 N= 16 strands: + sample: seed= 0 seqfrac= 1 Letter frequencies in dataset: A 0.277 C 0.240 G 0.227 T 0.255 Background letter frequencies (from dataset with add-one prior applied): A 0.277 C 0.240 G 0.227 T 0.255 ******************************************************************************** ******************************************************************************** MOTIF 1 MEME width = 20 sites = 14 llr = 174 E-value = 2.4e-002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 1 Description -------------------------------------------------------------------------------- Simplified A 1842:42314::4:11:11: pos.-specific C 4:128:21:3::19:::4:7 probability G 6:5:2611112921692:11 matrix T :2:6::5583812:218581 bits 2.1 1.9 1.7 * * 1.5 * * Relative 1.3 * ** * ** Entropy 1.1 * ** * ** * ** * (17.9 bits) 0.9 ** ** * ** **** ** 0.6 ****** * ** ******* 0.4 ****** * ** ******* 0.2 ************ ******* 0.0 -------------------- Multilevel GAGTCGTTTATGACGGTTTC consensus CTAAGAAA CG G T GC sequence C C T T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------------------- 51797 39 1.24e-09 AGAACGAAGC 
GAATCGTTTAGGGCGGTCTC GATTTCTGTA 51931 96 2.69e-08 GATTGGAGAG GAATCGTATTGGACGGTTAC TTGATCACAA 676 72 3.89e-08 CCCGTTCCAA AAGTCGAATCTGGCGGTTTC GACTGCCTCT 49759 33 4.93e-08 TCCTTTCCAC CAATCGATTCTGTCGGTCAC TATTGACGGG 44901 336 1.84e-07 CCATTATCCG CAGACATTTTTGTCGGTATG AATGTTCACT 47001 98 2.25e-07 TCCTCGGTCC GAGCGATTTCTGGCGGTCGC TTCACCTGCG 49650 177 4.01e-07 TGATGCAGCT GAATCGCCAATGCCTGTTTC ATATGGTGAT 16850 427 1.06e-06 ACCTTCGACG GTACCATGTATGCCGGTTTG CTCTCGACGA 34261 169 1.46e-06 CAAAAGTATC CAGTCAAAAATGACAGGCTC GCGCCGCTTC 49202 87 3.86e-06 ATGTACGACC GAGTCGCTGGTGAGGGTTTT TTGTCGTTGG 43361 339 4.43e-06 CCAGTTACCC GTGCGGTTTTTGACTTTTTT GACACGCCAT 48608 229 4.74e-06 ACGAACATCG GACTGGCCTAGGACGGGATC GCGAGATGCT 33088 404 5.79e-06 TTGGATACAA CTGACATTTTTGACAAGCTC TAATAGGTAG 48607 239 6.59e-06 TTCCCGGAAG CAAACAGATCTTTCTGTTTC AGAAAATCTA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 51797 1.2e-09 38_[+1]_442 51931 2.7e-08 95_[+1]_385 676 3.9e-08 71_[+1]_409 49759 4.9e-08 32_[+1]_448 44901 1.8e-07 335_[+1]_145 47001 2.2e-07 97_[+1]_383 49650 4e-07 176_[+1]_304 16850 1.1e-06 426_[+1]_54 34261 1.5e-06 168_[+1]_312 49202 3.9e-06 86_[+1]_394 43361 4.4e-06 338_[+1]_142 48608 4.7e-06 228_[+1]_252 33088 5.8e-06 403_[+1]_77 48607 6.6e-06 238_[+1]_242 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 1 width=20 seqs=14 51797 ( 39) GAATCGTTTAGGGCGGTCTC 1 51931 ( 96) GAATCGTATTGGACGGTTAC 1 676 ( 72) AAGTCGAATCTGGCGGTTTC 1 49759 ( 33) CAATCGATTCTGTCGGTCAC 1 44901 ( 336) 
CAGACATTTTTGTCGGTATG 1 47001 ( 98) GAGCGATTTCTGGCGGTCGC 1 49650 ( 177) GAATCGCCAATGCCTGTTTC 1 16850 ( 427) GTACCATGTATGCCGGTTTG 1 34261 ( 169) CAGTCAAAAATGACAGGCTC 1 49202 ( 87) GAGTCGCTGGTGAGGGTTTT 1 43361 ( 339) GTGCGGTTTTTGACTTTTTT 1 48608 ( 229) GACTGGCCTAGGACGGGATC 1 33088 ( 404) CTGACATTTTTGACAAGCTC 1 48607 ( 239) CAAACAGATCTTTCTGTTTC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 20 n= 7696 bayes= 9.70653 E= 2.4e-002 -195 57 133 -1045 150 -1045 -1045 -25 63 -175 114 -1045 -37 -16 -1045 116 -1045 171 -9 -1045 63 -1045 133 -1045 -37 -16 -167 97 4 -75 -167 97 -95 -1045 -167 162 37 25 -167 16 -1045 -1045 -9 162 -1045 -1045 203 -184 63 -75 -9 -25 -1045 195 -167 -1045 -95 -1045 150 -25 -195 -1045 191 -184 -1045 -1045 -9 162 -95 57 -1045 97 -95 -1045 -167 162 -1045 157 -67 -84 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 20 nsites= 14 E= 2.4e-002 0.071429 0.357143 0.571429 0.000000 0.785714 0.000000 0.000000 0.214286 0.428571 0.071429 0.500000 0.000000 0.214286 0.214286 0.000000 0.571429 0.000000 0.785714 0.214286 0.000000 0.428571 0.000000 0.571429 0.000000 0.214286 0.214286 0.071429 0.500000 0.285714 0.142857 0.071429 0.500000 0.142857 0.000000 0.071429 0.785714 0.357143 0.285714 0.071429 0.285714 0.000000 0.000000 0.214286 0.785714 0.000000 0.000000 0.928571 0.071429 0.428571 0.142857 0.214286 0.214286 0.000000 0.928571 0.071429 0.000000 0.142857 0.000000 0.642857 0.214286 0.071429 0.000000 0.857143 0.071429 
0.000000 0.000000 0.214286 0.785714 0.142857 0.357143 0.000000 0.500000 0.142857 0.000000 0.071429 0.785714 0.000000 0.714286 0.142857 0.142857 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 1 regular expression -------------------------------------------------------------------------------- [GC][AT][GA][TAC][CG][GA][TAC][TA]T[ACT][TG]G[AGT]C[GT]G[TG][TC]TC -------------------------------------------------------------------------------- Time 2.20 secs. ******************************************************************************** ******************************************************************************** MOTIF 2 MEME width = 12 sites = 14 llr = 135 E-value = 2.0e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 2 Description -------------------------------------------------------------------------------- Simplified A :::8:2:4:843 pos.-specific C 1:6:a5::6::1 probability G :241:1a:424: matrix T 98:1:1:6::16 bits 2.1 * * 1.9 * * 1.7 * * * 1.5 * * * Relative 1.3 ** * * Entropy 1.1 *** * * ** (13.9 bits) 0.9 ***** **** 0.6 ***** ****** 0.4 ***** ****** 0.2 ************ 0.0 ------------ Multilevel TTCACCGTCAAT consensus GG A AGGGA sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ------------ 49650 466 5.55e-08 AGCAAAGAAA TTCACCGTCAGT AACGCAACAT 33088 41 1.28e-06 TAACACATGC TTCACGGTCAGT CAGTGATGTT 44901 30 1.28e-06 CCAGCGTCCA TTCACAGTCAAT GCAACCATTG 48608 198 3.21e-06 AAGTCCACGG TGCACCGACAGT CTCATCCTCA 43361 292 6.12e-06 
CTTCCATCGA TGGACCGTGAGT GACTATGATT 36828 75 8.38e-06 AATTCCACGG TTCACAGACAGA TGTAACAAAT 51931 39 1.30e-05 GAAATGAGTG TTCACGGACAAA CATGTCGAAA 16850 93 2.48e-05 CCAGATCCTG TTGACTGTGAAA CGAGGGTGTC 49759 418 2.86e-05 GCCTTTTTGA TTGGCAGTGAAT CTGATCGATG 34261 101 3.08e-05 ATTTAAAGTA TTGACTGACGAT TGTTGTTTAC 51797 100 5.11e-05 GAAATCACTT TTGACCGTGGTA GTATCTGTGA 49202 131 5.47e-05 CTCGAGCGAA TTCTCCGAGATT GTACAGAAAG 676 175 6.14e-05 CCGCATCCGA CTCGCCGACAGT TTCACACGAA 48607 423 8.76e-05 AGACTGTCGG TGCACCGTGGAC TTTGGCGGAC -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 49650 5.5e-08 465_[+2]_23 33088 1.3e-06 40_[+2]_448 44901 1.3e-06 29_[+2]_459 48608 3.2e-06 197_[+2]_291 43361 6.1e-06 291_[+2]_197 36828 8.4e-06 74_[+2]_414 51931 1.3e-05 38_[+2]_450 16850 2.5e-05 92_[+2]_396 49759 2.9e-05 417_[+2]_71 34261 3.1e-05 100_[+2]_388 51797 5.1e-05 99_[+2]_389 49202 5.5e-05 130_[+2]_358 676 6.1e-05 174_[+2]_314 48607 8.8e-05 422_[+2]_66 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 2 width=12 seqs=14 49650 ( 466) TTCACCGTCAGT 1 33088 ( 41) TTCACGGTCAGT 1 44901 ( 30) TTCACAGTCAAT 1 48608 ( 198) TGCACCGACAGT 1 43361 ( 292) TGGACCGTGAGT 1 36828 ( 75) TTCACAGACAGA 1 51931 ( 39) TTCACGGACAAA 1 16850 ( 93) TTGACTGTGAAA 1 49759 ( 418) TTGGCAGTGAAT 1 34261 ( 101) TTGACTGACGAT 1 51797 ( 100) TTGACCGTGGTA 1 49202 ( 131) TTCTCCGAGATT 1 676 ( 175) CTCGCCGACAGT 1 48607 ( 423) TGCACCGTGGAC 1 // 
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 7824 bayes= 9.73036 E= 2.0e+002
 -1045   -175  -1045    186
 -1045  -1045     -9    162
 -1045    142     65  -1045
   150  -1045    -67   -184
 -1045    206  -1045  -1045
   -37    106    -67    -84
 -1045  -1045    214  -1045
    63  -1045  -1045    116
 -1045    125     91  -1045
   150  -1045     -9  -1045
    63  -1045     91    -84
     4   -175  -1045    133
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 14 E= 2.0e+002
 0.000000  0.071429  0.000000  0.928571
 0.000000  0.000000  0.214286  0.785714
 0.000000  0.642857  0.357143  0.000000
 0.785714  0.000000  0.142857  0.071429
 0.000000  1.000000  0.000000  0.000000
 0.214286  0.500000  0.142857  0.142857
 0.000000  0.000000  1.000000  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.000000  0.571429  0.428571  0.000000
 0.785714  0.000000  0.214286  0.000000
 0.428571  0.000000  0.428571  0.142857
 0.285714  0.071429  0.000000  0.642857
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
T[TG][CG]AC[CA]G[TA][CG][AG][AG][TA]
--------------------------------------------------------------------------------

Time  4.04 secs.

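The nonzero entries of the Motif 2 log-odds matrix above appear to be 100 times the base-2 log of each letter probability divided by its background frequency, rounded to an integer. The sketch below checks that reading against the printed numbers. It is an observation about this file, not a statement of how MEME computes the matrix internally: the function name `log_odds_row` is hypothetical, the background values are the three-decimal figures printed in the COMMAND LINE SUMMARY, and zero probabilities (printed as -1045 in the file) are returned as `None` because the smoothing behind that value is not shown here.

```python
import math

# Background letter frequencies as printed in this file (rounded to 3 decimals).
BACKGROUND = {"A": 0.277, "C": 0.240, "G": 0.227, "T": 0.255}


def log_odds_row(probs, background=BACKGROUND, scale=100):
    """Convert one row of letter probabilities (A, C, G, T order) into
    scaled log2 odds against the background.

    Zero probabilities yield None: the file prints -1045 for them, and this
    sketch does not try to reproduce whatever smoothing produces that value.
    """
    row = []
    for letter, p in zip("ACGT", probs):
        if p == 0.0:
            row.append(None)
        else:
            row.append(round(scale * math.log2(p / background[letter])))
    return row


# Rows 1 and 4 of the Motif 2 letter-probability matrix.
print(log_odds_row([0.000000, 0.071429, 0.000000, 0.928571]))
print(log_odds_row([0.785714, 0.000000, 0.142857, 0.071429]))
```

For these two rows the nonzero results agree with the printed log-odds entries (-175 and 186; 150, -67 and -184).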
******************************************************************************** ******************************************************************************** MOTIF 3 MEME width = 16 sites = 10 llr = 123 E-value = 3.8e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif 3 Description -------------------------------------------------------------------------------- Simplified A ::6487a9a1176a5: pos.-specific C 17312::1:473:::9 probability G :::2:3:::22:2::: matrix T 9313:::::3::2:51 bits 2.1 1.9 * * * 1.7 * * * 1.5 * *** * * Relative 1.3 * *** * * Entropy 1.1 ** ***** * * * (17.8 bits) 0.9 ** ***** ** *** 0.6 *** ***** ****** 0.4 *** ***** ****** 0.2 **************** 0.0 ---------------- Multilevel TCAAAAAAACCAAAAC consensus TCTCG TGCG T sequence G G T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- ---------------- 49202 432 2.30e-09 ACTCTCTCTC TCATAAAAACCAAAAC GCCGCAAATT 36828 283 1.25e-07 AATGACCAAT TTCTAAAAATCAAATC AGCATGAGCG 44901 5 5.10e-07 CTTG TCAAAGAAACGAGAAC CAGCGTCCAT 34261 197 5.65e-07 TCGCGCCGCT TCAGAAAAAACATATC AGTCACGGGA 43361 371 1.39e-06 CACGCCATTC TCCAAAAAATCCAAAT TGCTCTCACA 34710 386 1.62e-06 CACGCTAACG TCACAAAAACACAATC GACAATCGTC 676 126 1.75e-06 CCTCGTAGAA TCCACGAAAGCAGAAC CGGAGAACCA 49650 372 2.36e-06 CCCAATGGAA TTATCAACACCAAATC AGGAATATCC 33088 265 2.53e-06 GCCTACCAAT CTAAAAAAATCATAAC GCAGGTCTGT 51797 382 5.02e-06 TTGCCCGGAG TCTGAGAAAGGCAATC TACGTATGGT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 block 
diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- 49202 2.3e-09 431_[+3]_53 36828 1.2e-07 282_[+3]_202 44901 5.1e-07 4_[+3]_480 34261 5.6e-07 196_[+3]_288 43361 1.4e-06 370_[+3]_114 34710 1.6e-06 385_[+3]_99 676 1.8e-06 125_[+3]_359 49650 2.4e-06 371_[+3]_113 33088 2.5e-06 264_[+3]_220 51797 5e-06 381_[+3]_103 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF 3 width=16 seqs=10 49202 ( 432) TCATAAAAACCAAAAC 1 36828 ( 283) TTCTAAAAATCAAATC 1 44901 ( 5) TCAAAGAAACGAGAAC 1 34261 ( 197) TCAGAAAAAACATATC 1 43361 ( 371) TCCAAAAAATCCAAAT 1 34710 ( 386) TCACAAAAACACAATC 1 676 ( 126) TCCACGAAAGCAGAAC 1 49650 ( 372) TTATCAACACCAAATC 1 33088 ( 265) CTAAAAAAATCATAAC 1 51797 ( 382) TCTGAGAAAGGCAATC 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 16 n= 7760 bayes= 9.84989 E= 3.8e+002 -997 -126 -997 182 -997 154 -997 23 111 32 -997 -135 53 -126 -18 23 153 -26 -997 -997 134 -997 40 -997 185 -997 -997 -997 170 -126 -997 -997 185 -997 -997 -997 -147 73 -18 23 -147 154 -18 -997 134 32 -997 -997 111 -997 -18 -35 185 -997 -997 -997 85 -997 -997 97 -997 190 -997 -135 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability 
matrix: alength= 4 w= 16 nsites= 10 E= 3.8e+002 0.000000 0.100000 0.000000 0.900000 0.000000 0.700000 0.000000 0.300000 0.600000 0.300000 0.000000 0.100000 0.400000 0.100000 0.200000 0.300000 0.800000 0.200000 0.000000 0.000000 0.700000 0.000000 0.300000 0.000000 1.000000 0.000000 0.000000 0.000000 0.900000 0.100000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.100000 0.400000 0.200000 0.300000 0.100000 0.700000 0.200000 0.000000 0.700000 0.300000 0.000000 0.000000 0.600000 0.000000 0.200000 0.200000 1.000000 0.000000 0.000000 0.000000 0.500000 0.000000 0.000000 0.500000 0.000000 0.900000 0.000000 0.100000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif 3 regular expression -------------------------------------------------------------------------------- T[CT][AC][ATG][AC][AG]AAA[CTG][CG][AC][AGT]A[AT]C -------------------------------------------------------------------------------- Time 5.88 secs. 
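The Motif 3 regular expression above is a coarse summary of the probability matrix: judging from the matrix, it seems to list only letters with probability of at least 0.2 at each position, so a reported site can fail to match it. The sketch below tests the ten Motif 3 sites from the BLOCKS section against the printed expression. The helper name `matching_sites` is hypothetical, and the 0.2 cutoff is an inference from this file rather than a documented rule.

```python
import re

# Regular expression exactly as printed for Motif 3.
MOTIF3_REGEX = re.compile(r"T[CT][AC][ATG][AC][AG]AAA[CTG][CG][AC][AGT]A[AT]C")

# Sites copied from the "Motif 3 in BLOCKS format" section (sequence name -> site).
MOTIF3_SITES = {
    "49202": "TCATAAAAACCAAAAC",
    "36828": "TTCTAAAAATCAAATC",
    "44901": "TCAAAGAAACGAGAAC",
    "34261": "TCAGAAAAAACATATC",
    "43361": "TCCAAAAAATCCAAAT",
    "34710": "TCACAAAAACACAATC",
    "676": "TCCACGAAAGCAGAAC",
    "49650": "TTATCAACACCAAATC",
    "33088": "CTAAAAAAATCATAAC",
    "51797": "TCTGAGAAAGGCAATC",
}


def matching_sites(pattern, sites):
    """Return the names of sequences whose site matches the whole pattern."""
    return [name for name, site in sites.items() if pattern.fullmatch(site)]


matched = matching_sites(MOTIF3_REGEX, MOTIF3_SITES)
print(f"{len(matched)} of {len(MOTIF3_SITES)} sites match:", matched)
```

Only four of the ten sites match, so the regular expression is better treated as a quick consensus than as a replacement for scanning with the log-odds matrix.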
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
36828                            2.06e-05  74_[+2(8.38e-06)]_196_\
    [+3(1.25e-07)]_202
47001                            2.81e-03  97_[+1(2.25e-07)]_383
676                              1.19e-07  71_[+1(3.89e-08)]_34_[+3(1.75e-06)]_\
    33_[+2(6.14e-05)]_314
48607                            2.10e-03  238_[+1(6.59e-06)]_164_\
    [+2(8.76e-05)]_66
49759                            1.98e-05  32_[+1(4.93e-08)]_365_\
    [+2(2.86e-05)]_71
16850                            2.65e-04  92_[+2(2.48e-05)]_322_\
    [+1(1.06e-06)]_54
44901                            4.64e-09  4_[+3(5.10e-07)]_9_[+2(1.28e-06)]_\
    155_[+2(7.20e-05)]_127_[+1(1.84e-07)]_145
34710                            1.52e-02  385_[+3(1.62e-06)]_99
51797                            1.13e-08  38_[+1(1.24e-09)]_41_[+2(5.11e-05)]_\
    270_[+3(5.02e-06)]_103
51931                            1.12e-05  38_[+2(1.30e-05)]_45_[+1(2.69e-08)]_\
    385
33088                            4.70e-07  40_[+2(1.28e-06)]_212_\
    [+3(2.53e-06)]_123_[+1(5.79e-06)]_77
48608                            9.32e-05  197_[+2(3.21e-06)]_19_\
    [+1(4.74e-06)]_252
49202                            1.66e-08  86_[+1(3.86e-06)]_24_[+2(5.47e-05)]_\
    289_[+3(2.30e-09)]_53
43361                            8.82e-07  291_[+2(6.12e-06)]_35_\
    [+1(4.43e-06)]_12_[+3(1.39e-06)]_114
49650                            2.16e-09  124_[+1(6.13e-05)]_32_\
    [+1(4.01e-07)]_175_[+3(2.36e-06)]_78_[+2(5.55e-08)]_23
34261                            6.17e-07  100_[+2(3.08e-05)]_56_\
    [+1(1.46e-06)]_8_[+3(5.65e-07)]_288
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
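The motif diagrams in this file read as alternating spacer lengths and bracketed motif occurrences, for example `38_[+1]_442` for a 20-wide Motif 1 site starting at position 39 of a 500-base sequence. A minimal sketch, assuming that reading, sums a diagram back to the sequence length as a consistency check. The function name `diagram_length` is hypothetical; the motif widths are taken from the MOTIF headers above (20, 12 and 16).

```python
import re

# Motif widths from the "MOTIF n MEME width = ..." headers in this file.
MOTIF_WIDTHS = {1: 20, 2: 12, 3: 16}

# A bracketed occurrence such as [+1] or [+2(8.38e-06)].
OCCURRENCE = re.compile(r"\[[+-](\d+)(?:\([^)]*\))?\]")


def diagram_length(diagram, widths=MOTIF_WIDTHS):
    """Total sequence length implied by a motif diagram string.

    Line-continuation backslashes and whitespace are removed first, then each
    underscore-separated part is either a spacer length or a motif occurrence.
    """
    cleaned = "".join(diagram.replace("\\", "").split())
    total = 0
    for part in cleaned.split("_"):
        if not part:
            continue
        occurrence = OCCURRENCE.fullmatch(part)
        if occurrence:
            total += widths[int(occurrence.group(1))]
        else:
            total += int(part)
    return total


# One per-motif diagram and two combined diagrams copied from this file.
print(diagram_length("38_[+1]_442"))
print(diagram_length("74_[+2(8.38e-06)]_196_\\ [+3(1.25e-07)]_202"))
print(diagram_length("4_[+3(5.10e-07)]_9_[+2(1.28e-06)]_\\ 155_[+2(7.20e-05)]_127_[+1(1.84e-07)]_145"))
```

Each of these sums to 500, the sequence length listed in the TRAINING SET table.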