********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/220/220.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
47468                    1.0000    500  29004                    1.0000    500
48202                    1.0000    500  39048                    1.0000    500
15160                    1.0000    500  15329                    1.0000    500
48981                    1.0000    500  39797                    1.0000    500
49146                    1.0000    500  15922                    1.0000    500
30667                    1.0000    500  49948                    1.0000    500
40785                    1.0000    500  33231                    1.0000    500
48352                    1.0000    500  48700                    1.0000    500
39744                    1.0000    500  39552                    1.0000    500
39894                    1.0000    500  38110                    1.0000    500
47462                    1.0000    500  43814                    1.0000    500
48383                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/220/220.seqs.fa -oc motifs/220 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       23    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           11500    N=              23
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.264 C 0.251 G 0.223 T 0.262
Background letter frequencies (from dataset with add-one prior applied):
A 0.264 C 0.251 G 0.223 T 0.262
********************************************************************************


********************************************************************************
MOTIF  1 MEME	width =  14  sites =  18  llr = 181  E-value = 1.6e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:163a4:2:::78
pos.-specific     C  17::5:2:3:2:11
probability       G  22822:4:2:5a2:
matrix            T  6112::1a2a3::1

         bits    2.2            *
                 1.9      * * * *
                 1.7      * * * *
                 1.5      * * * *
Relative         1.3   *  * * * *
Entropy          1.1   *  * * * * *
(14.5 bits)      0.9  **  * * * * *
                 0.6  *** * * *****
                 0.4 ****** * *****
                 0.2 ******** *****
                 0.0 --------------

Multilevel           TCGACAATCTGGAA
consensus            G  GA G A T G
sequence                     G
                             T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            --------------
48700                       427  8.76e-08 CGTTCATCCT TCGACACTCTGGAA GAACCATTCG
39048                       244  3.74e-07 AAAGGAAGAA TCGAGAATATGGAA TATCTGTTTC
40785                       236  4.27e-07 CGACGGCGGT GCGAAAGTTTGGAA TCAAGTCGAA
43814                       137  1.21e-06 CCCCTCGGAG TCGGCAGTCTTGGA AACTCCTCCC
47468                       321  1.84e-06 AAGTTCCTAA ACGACAGTTTTGAA ATATTTTCAA
39552                       158  2.36e-06 TGTCCTGTCC TGGAAAATGTTGAA GGGTGCGCGA
49146                       417  4.57e-06 TCATACTCAC TTGTCAATCTGGAA TTGCGTTCGG
15922                       421  5.14e-06 TAGTCTTTGG CCGAGAGTGTTGAA CCTATGCCGA
33231                       249  7.65e-06 ATTTGTCATC TTGAAACTTTGGAA AACTCAGCGC
39797                        67  8.38e-06 TTACAGCTAA TGGGCAATTTGGGA CTTCTTAATT
39894                       418  2.01e-05 TCCGACACCT CCGTCAATATCGAA ATCAATCCAT
48352                       451  2.01e-05 CATACATTCT GCAAAAATATGGAA CATCGGGAAG
30667                       282  2.52e-05 TAGAGTAGAC GCGACAGTCTTGCT ATGTCCGGTC
48383                       257  3.89e-05 GAGAATTCGG TCGGAACTGTGGGT GCTCGTTGTA
15329                        94  3.89e-05 TTAAGTCTGC TGTACAGTGTGGCA TGGATGATGG
39744                       285  4.44e-05 AACGGGGTCG TCTAGATTCTTGAA GCGCGCTGAG
48981                        52  4.44e-05 GCGACTCCAG ACGTCAATATCGGA GACCAAGTTG
15160                       477  4.71e-05 GACACCGCCA GCGGAAGTCTCGAC GCAGGTTGGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
48700                             8.8e-08  426_[+1]_60
39048                             3.7e-07  243_[+1]_243
40785                             4.3e-07  235_[+1]_251
43814                             1.2e-06  136_[+1]_350
47468                             1.8e-06  320_[+1]_166
39552                             2.4e-06  157_[+1]_329
49146                             4.6e-06  416_[+1]_70
15922                             5.1e-06  420_[+1]_66
33231                             7.7e-06  248_[+1]_238
39797                             8.4e-06  66_[+1]_420
39894                               2e-05  417_[+1]_69
48352                               2e-05  450_[+1]_36
30667                             2.5e-05  281_[+1]_205
48383                             3.9e-05  256_[+1]_230
15329                             3.9e-05  93_[+1]_393
39744                             4.4e-05  284_[+1]_202
48981                             4.4e-05  51_[+1]_435
15160                             4.7e-05  476_[+1]_10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=14 seqs=18
48700                    (  427) TCGACACTCTGGAA  1
39048                    (  244) TCGAGAATATGGAA  1
40785                    (  236) GCGAAAGTTTGGAA  1
43814                    (  137) TCGGCAGTCTTGGA  1
47468                    (  321) ACGACAGTTTTGAA  1
39552                    (  158) TGGAAAATGTTGAA  1
49146                    (  417) TTGTCAATCTGGAA  1
15922                    (  421) CCGAGAGTGTTGAA  1
33231                    (  249) TTGAAACTTTGGAA  1
39797                    (   67) TGGGCAATTTGGGA  1
39894                    (  418) CCGTCAATATCGAA  1
48352                    (  451) GCAAAAATATGGAA  1
30667                    (  282) GCGACAGTCTTGCT  1
48383                    (  257) TCGGAACTGTGGGT  1
15329                    (   94) TGTACAGTGTGGCA  1
39744                    (  285) TCTAGATTCTTGAA  1
48981                    (   52) ACGTCAATATCGGA  1
15160                    (  477) GCGGAAGTCTCGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 14 n= 11201 bayes= 9.4136 E= 1.6e+000
  -125   -117      0    109
 -1081    152    -42   -123
  -225  -1081    190   -123
   121  -1081      0    -65
    33     99    -42  -1081
   192  -1081  -1081  -1081
    56    -59     80   -223
 -1081  -1081  -1081    193
   -25     41      0    -24
 -1081  -1081  -1081    193
 -1081    -59    116     35
 -1081  -1081    216  -1081
   133   -117      0  -1081
   166   -217  -1081   -123
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 14 nsites= 18 E= 1.6e+000
 0.111111  0.111111  0.222222  0.555556
 0.000000  0.722222  0.166667  0.111111
 0.055556  0.000000  0.833333  0.111111
 0.611111  0.000000  0.222222  0.166667
 0.333333  0.500000  0.166667  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.388889  0.166667  0.388889  0.055556
 0.000000  0.000000  0.000000  1.000000
 0.222222  0.333333  0.222222  0.222222
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.166667  0.500000  0.333333
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.111111  0.222222  0.000000
 0.833333  0.055556  0.000000  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 regular expression
--------------------------------------------------------------------------------
[TG]CG[AG][CA]A[AG]T[CAGT]T[GT]G[AG]A
--------------------------------------------------------------------------------




Time  4.43 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME	width =  12  sites =  12  llr = 130  E-value = 2.8e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :3:::89::a54
pos.-specific     C  811:92::8::2
probability       G  :3921:1a::::
matrix            T  24:8::::3:54

         bits    2.2        *
                 1.9        * *
                 1.7   *    * *
                 1.5   * * ** *
Relative         1.3 * ****** *
Entropy          1.1 * ********
(15.6 bits)      0.9 * *********
                 0.6 * *********
                 0.4 * **********
                 0.2 ************
                 0.0 ------------

Multilevel           CTGTCAAGCAAA
consensus             A      T TT
sequence              G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
39048                       291  2.75e-07 TGTTCGCCTC CTGTCAAGCAAA GTTGGACCAA
38110                       379  7.86e-07 CACTAAATCA CAGTCAAGCAAA AAGTAAATCA
33231                       201  7.86e-07 AATGACTCCA CAGTCAAGCAAT GGATAGACTG
15160                       109  1.93e-06 GGATTAAGGT CGGTCAAGTAAA GCTCCTTATC
47468                       132  2.74e-06 AGCAACCTTA CAGTCAAGTATT GTTCGTGTTT
47462                       121  3.03e-06 CGGCGTGTAT TTGTCAAGCAAT CTATGGCGAC
49948                        94  7.45e-06 CTAAAAGAAG CGGGCAAGCATC CCTTCCAAGG
43814                       443  9.16e-06 CCACACTTGT TGGTCAAGCATC CATCCATCCA
39744                       403  1.01e-05 TGCTTTCTAC CTGGCCAGCATT CTCTTCTTCT
40785                       299  1.09e-05 CAGATAGGAA CCGTCCAGCATT ACTTGTAGCT
48981                       371  1.51e-05 TCGATGCACC CTCTCAAGTAAA AGAGAATACC
48383                        34  2.92e-05 TAATAACATT CTGTGAGGCATA TTTCACAGAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
39048                             2.7e-07  290_[+2]_198
38110                             7.9e-07  378_[+2]_110
33231                             7.9e-07  200_[+2]_288
15160                             1.9e-06  108_[+2]_380
47468                             2.7e-06  131_[+2]_357
47462                               3e-06  120_[+2]_368
49948                             7.4e-06  93_[+2]_395
43814                             9.2e-06  442_[+2]_46
39744                               1e-05  402_[+2]_86
40785                             1.1e-05  298_[+2]_190
48981                             1.5e-05  370_[+2]_118
48383                             2.9e-05  33_[+2]_455
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=12
39048                    (  291) CTGTCAAGCAAA  1
38110                    (  379) CAGTCAAGCAAA  1
33231                    (  201) CAGTCAAGCAAT  1
15160                    (  109) CGGTCAAGTAAA  1
47468                    (  132) CAGTCAAGTATT  1
47462                    (  121) TTGTCAAGCAAT  1
49948                    (   94) CGGGCAAGCATC  1
43814                    (  443) TGGTCAAGCATC  1
39744                    (  403) CTGGCCAGCATT  1
40785                    (  299) CCGTCCAGCATT  1
48981                    (  371) CTCTCAAGTAAA  1
48383                    (   34) CTGTGAGGCATA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 11247 bayes= 9.5293 E= 2.8e+001
 -1023    173  -1023    -65
    -8   -159     16     67
 -1023   -159    204  -1023
 -1023  -1023    -42    167
 -1023    187   -142  -1023
   166    -59  -1023  -1023
   179  -1023   -142  -1023
 -1023  -1023    216  -1023
 -1023    158  -1023     -7
   192  -1023  -1023  -1023
    92  -1023  -1023     93
    66    -59  -1023     67
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 12 E= 2.8e+001
 0.000000  0.833333  0.000000  0.166667
 0.250000  0.083333  0.250000  0.416667
 0.000000  0.083333  0.916667  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.916667  0.083333  0.000000
 0.833333  0.166667  0.000000  0.000000
 0.916667  0.000000  0.083333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.750000  0.000000  0.250000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.416667  0.166667  0.000000  0.416667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 2 regular expression
--------------------------------------------------------------------------------
C[TAG]GTCAAG[CT]A[AT][AT]
--------------------------------------------------------------------------------




Time  8.51 secs.

********************************************************************************


********************************************************************************
MOTIF  3 MEME	width =  12  sites =  11  llr = 122  E-value = 2.1e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :41a:1::3992
pos.-specific     C  313:9::::114
probability       G  ::6:1:a:7::2
matrix            T  75:::9:a:::3

         bits    2.2       *
                 1.9    *  **
                 1.7    *  **
                 1.5    ***** **
Relative         1.3    ********
Entropy          1.1 *  ********
(15.9 bits)      0.9 * *********
                 0.6 ***********
                 0.4 ***********
                 0.2 ***********
                 0.0 ------------

Multilevel           TTGACTGTGAAC
consensus            CAC     A  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
15160                       395  2.39e-07 CAGTCGCATA TTGACTGTGAAG ACAAAGCAAC
33231                       216  3.66e-07 AAGCAATGGA TAGACTGTGAAT AGGTCCGGTC
39048                       186  6.18e-07 CGTTAAGAAA TTGACTGTAAAC AGAACTGATG
38110                        44  1.39e-06 TGTAACCTAG CAGACTGTGAAT CCATTACTTT
47468                       474  1.39e-06 AGTGTTTTTG CAGACTGTGAAT GAGAGTGTAT
30667                       174  2.77e-06 TCAATCATTC TTGACAGTGAAC CACAAGATAA
29004                       349  4.32e-06 TCTCCGTTGC TTCACTGTAAAG TAGACATTTT
48202                       364  1.02e-05 CACACGGACC TACACTGTGCAC CTTTCCAACC
48352                       253  1.25e-05 TTCACTTTCA TTAACTGTAAAA TCCAGTTGAT
15922                       474  1.25e-05 AGGTGTAAAG CTGACTGTGACA GGAATGCGAT
43814                       417  2.39e-05 GGTGTGTGTT TCCAGTGTGAAC TCTCCCACAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
15160                             2.4e-07  394_[+3]_94
33231                             3.7e-07  215_[+3]_273
39048                             6.2e-07  185_[+3]_303
38110                             1.4e-06  43_[+3]_445
47468                             1.4e-06  473_[+3]_15
30667                             2.8e-06  173_[+3]_315
29004                             4.3e-06  348_[+3]_140
48202                               1e-05  363_[+3]_125
48352                             1.2e-05  252_[+3]_236
15922                             1.2e-05  473_[+3]_15
43814                             2.4e-05  416_[+3]_72
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=11
15160                    (  395) TTGACTGTGAAG  1
33231                    (  216) TAGACTGTGAAT  1
39048                    (  186) TTGACTGTAAAC  1
38110                    (   44) CAGACTGTGAAT  1
47468                    (  474) CAGACTGTGAAT  1
30667                    (  174) TTGACAGTGAAC  1
29004                    (  349) TTCACTGTAAAG  1
48202                    (  364) TACACTGTGCAC  1
48352                    (  253) TTAACTGTAAAA  1
15922                    (  474) CTGACTGTGACA  1
43814                    (  417) TCCAGTGTGAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 11247 bayes= 10.3518 E= 2.1e+002
 -1010     12  -1010    147
    46   -146  -1010    106
  -154     12    151  -1010
   192  -1010  -1010  -1010
 -1010    186   -129  -1010
  -154  -1010  -1010    180
 -1010  -1010    216  -1010
 -1010  -1010  -1010    193
     4  -1010    170  -1010
   178   -146  -1010  -1010
   178   -146  -1010  -1010
   -54     53    -29      6
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 11 E= 2.1e+002
 0.000000  0.272727  0.000000  0.727273
 0.363636  0.090909  0.000000  0.545455
 0.090909  0.272727  0.636364  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.909091  0.090909  0.000000
 0.090909  0.000000  0.000000  0.909091
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.272727  0.000000  0.727273  0.000000
 0.909091  0.090909  0.000000  0.000000
 0.909091  0.090909  0.000000  0.000000
 0.181818  0.363636  0.181818  0.272727
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 3 regular expression
--------------------------------------------------------------------------------
[TC][TA][GC]ACTGT[GA]AA[CT]
--------------------------------------------------------------------------------




Time 12.97 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47468                            1.96e-07  131_[+2(2.74e-06)]_177_\
    [+1(1.84e-06)]_139_[+3(1.39e-06)]_15
29004                            1.44e-02  348_[+3(4.32e-06)]_140
48202                            6.50e-02  363_[+3(1.02e-05)]_125
39048                            2.63e-09  185_[+3(6.18e-07)]_46_\
    [+1(3.74e-07)]_33_[+2(2.75e-07)]_198
15160                            5.40e-07  108_[+2(1.93e-06)]_274_\
    [+3(2.39e-07)]_70_[+1(4.71e-05)]_10
15329                            7.09e-02  93_[+1(3.89e-05)]_393
48981                            5.19e-03  51_[+1(4.44e-05)]_305_\
    [+2(1.51e-05)]_118
39797                            3.64e-02  66_[+1(8.38e-06)]_420
49146                            2.48e-02  416_[+1(4.57e-06)]_70
15922                            1.04e-03  420_[+1(5.14e-06)]_39_\
    [+3(1.25e-05)]_15
30667                            1.02e-03  173_[+3(2.77e-06)]_96_\
    [+1(2.52e-05)]_205
49948                            9.91e-03  93_[+2(7.45e-06)]_395
40785                            1.04e-04  235_[+1(4.27e-07)]_49_\
    [+2(1.09e-05)]_190
33231                            6.83e-08  200_[+2(7.86e-07)]_3_[+3(3.66e-07)]_\
    21_[+1(7.65e-06)]_238
48352                            1.42e-03  252_[+3(1.25e-05)]_186_\
    [+1(2.01e-05)]_36
48700                            9.19e-04  426_[+1(8.76e-08)]_60
39744                            4.19e-03  284_[+1(4.44e-05)]_104_\
    [+2(1.01e-05)]_86
39552                            2.65e-03  157_[+1(2.36e-06)]_31_\
    [+1(7.64e-05)]_284
39894                            4.94e-02  417_[+1(2.01e-05)]_69
38110                            2.88e-05  43_[+3(1.39e-06)]_323_\
    [+2(7.86e-07)]_110
47462                            2.13e-02  120_[+2(3.03e-06)]_368
43814                            5.16e-06  136_[+1(1.21e-06)]_266_\
    [+3(2.39e-05)]_14_[+2(9.16e-06)]_46
48383                            3.50e-03  33_[+2(2.92e-05)]_211_\
    [+1(3.89e-05)]_230
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************