********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/247/247.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
53964                    1.0000    500  52135                    1.0000    500
37043                    1.0000    500  21912                    1.0000    500
48386                    1.0000    500  48538                    1.0000    500
40162                    1.0000    500  11438                    1.0000    500
45174                    1.0000    500  45234                    1.0000    500
45252                    1.0000    500  45404                    1.0000    500
46090                    1.0000    500  43013                    1.0000    500
43040                    1.0000    500  41606                    1.0000    500
34337                    1.0000    500  49021                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motifs/247/247.seqs.fa -oc motifs/247 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            9000    N=              18
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.261 C 0.246 G 0.227 T 0.267
Background letter frequencies (from dataset with add-one prior applied):
A 0.261 C 0.246 G 0.227 T 0.267
********************************************************************************


********************************************************************************
MOTIF 1 MEME width = 12 sites = 9 llr = 110 E-value = 3.9e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  281::a:7:::9
pos.-specific     C  3:32::a::::1
probability       G  4::2a::3a:a:
matrix            T  :266:::::a::

         bits    2.1     *   * *
                 1.9     *** ***
                 1.7     *** ***
                 1.5     *** ****
Relative         1.3     *** ****
Entropy          1.1  *  ********
(17.7 bits)      0.9  *  ********
                 0.6  ***********
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           GATTGACAGTGA
consensus            CTCC   G
sequence             A  G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
45234                       417  1.20e-07 CAAAAGACAA CATTGACAGTGA AGGGCATTTC
37043                       194  1.73e-07 GAAGTAATTT GACTGACAGTGA TCCGGACGAC
43040                       426  5.04e-07 ACGAATTTTA GATCGACAGTGA TCATTGAAAG
41606                       163  5.50e-07 ACCTGATAGG GACTGACGGTGA CTAGGAATGG
48386                         1  6.03e-07          . CATGGACAGTGA TCGATACGAT
34337                       163  2.07e-06 GAGCAATCCT CACGGACGGTGA CGACATGGGG
40162                       423  4.47e-06 GAAATCGCCG ATTCGACAGTGA ATACATTCCG
53964                       242  4.47e-06 AAATTGAGCT AAATGACGGTGA AAACGTATGC
49021                       334  5.33e-06 GTAGCATCTT GTTTGACAGTGC TCTATCCATC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
45234                             1.2e-07  416_[+1]_72
37043                             1.7e-07  193_[+1]_295
43040                               5e-07  425_[+1]_63
41606                             5.5e-07  162_[+1]_326
48386                               6e-07  [+1]_488
34337                             2.1e-06  162_[+1]_326
40162                             4.5e-06  422_[+1]_66
53964                             4.5e-06  241_[+1]_247
49021                             5.3e-06  333_[+1]_155
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=9
45234                    (  417) CATTGACAGTGA  1
37043                    (  194) GACTGACAGTGA  1
43040                    (  426) GATCGACAGTGA  1
41606                    (  163) GACTGACGGTGA  1
48386                    (    1) CATGGACAGTGA  1
34337                    (  163) CACGGACGGTGA  1
40162                    (  423) ATTCGACAGTGA  1
53964                    (  242) AAATGACGGTGA  1
49021                    (  334) GTTTGACAGTGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 10.0666 E= 3.9e-001
   -23     44     97   -982
   158   -982   -982    -27
  -123     44   -982    106
  -982    -14     -3    106
  -982   -982    214   -982
   194   -982   -982   -982
  -982    202   -982   -982
   135   -982     56   -982
  -982   -982    214   -982
  -982   -982   -982    190
  -982   -982    214   -982
   177   -114   -982   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 3.9e-001
 0.222222  0.333333  0.444444  0.000000
 0.777778  0.000000  0.000000  0.222222
 0.111111  0.333333  0.000000  0.555556
 0.000000  0.222222  0.222222  0.555556
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.666667  0.000000  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[GCA][AT][TC][TCG]GAC[AG]GTGA
--------------------------------------------------------------------------------

Time 3.23 secs.

********************************************************************************


********************************************************************************
MOTIF 2 MEME width = 12 sites = 18 llr = 159 E-value = 2.6e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  3:1::2:::3:1
pos.-specific     C  231:812:668:
probability       G  :4:126::422:
matrix            T  4389:28a:::9

         bits    2.1
                 1.9        *
                 1.7        *
                 1.5    *   *   *
Relative         1.3    ** **  **
Entropy          1.1    ** *** **
(12.7 bits)      0.9   *** *** **
                 0.6   *** ******
                 0.4 ************
                 0.2 ************
                 0.0 ------------

Multilevel           TGTTCGTTCCCT
consensus            AC  GA  GAG
sequence             CT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
21912                        33  2.72e-07 TCGTCGCGGA TGTTCGTTGCCT GGCAAGAACG
45404                        11  7.69e-07 CGCGGCTTCT TGTTCGTTCACT GTTGTCATCA
34337                       222  2.61e-06 TTTCTGTCTG TCTTCGTTCCGT TTCCTCGAGA
41606                       385  4.32e-06 ATTTCTTCCG ACTTGGTTCCCT CTTTCTCCTG
45174                       419  1.10e-05 TGTTGGTGAG ATTTCGCTCCCT TTCCCGTCAG
49021                       377  1.53e-05 GATAGGCGCT TGTTGTTTCCCT GTAAGTTCCT
48538                       246  2.38e-05 GCGACTCCGA CGTTCGCTCACT GATTGTCAAC
37043                       381  2.38e-05 CGGCCCTTCC ATTTCATTCGCT CCCCCTATCG
48386                       459  3.52e-05 GCGTACAGAC TCCTCGTTCCGT TGTGGTTGCC
53964                       281  4.26e-05 CGAATCGTGC CCTTGGTTGGCT CTACGAGTCG
52135                       235  4.66e-05 ATACTTTTAC ACTGCGTTGACT GTTCCAGATA
46090                       313  6.01e-05 CACTCTGTCC ATATCATTCCCT CGTTCCCAGT
45252                       262  6.51e-05 CCGATCCGAA ACATCATTGCCT ACATCATCAC
40162                         7  7.00e-05     CCAGTT CGCTCATTGCCT ATTAGAATGC
43013                       324  7.55e-05 AAACGAGGCT TTTTCTTTGAGT CATCTTAATC
45234                       209  9.45e-05 TTCCAGCTTG TGTTGGTTGCCA CTGTTTCTGG
43040                       190  1.25e-04 GCGACGAATG TGTTCCCTCACT ACTTACGAGG
11438                       290  3.90e-04 GGATGATTTA CTTGCTTTCGGT TGGGCCGTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
21912                             2.7e-07  32_[+2]_456
45404                             7.7e-07  10_[+2]_478
34337                             2.6e-06  221_[+2]_267
41606                             4.3e-06  384_[+2]_104
45174                             1.1e-05  418_[+2]_70
49021                             1.5e-05  376_[+2]_112
48538                             2.4e-05  245_[+2]_243
37043                             2.4e-05  380_[+2]_108
48386                             3.5e-05  458_[+2]_30
53964                             4.3e-05  280_[+2]_208
52135                             4.7e-05  234_[+2]_254
46090                               6e-05  312_[+2]_176
45252                             6.5e-05  261_[+2]_227
40162                               7e-05  6_[+2]_482
43013                             7.6e-05  323_[+2]_165
45234                             9.5e-05  208_[+2]_280
43040                             0.00013  189_[+2]_299
11438                             0.00039  289_[+2]_199
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=12 seqs=18
21912                    (   33) TGTTCGTTGCCT  1
45404                    (   11) TGTTCGTTCACT  1
34337                    (  222) TCTTCGTTCCGT  1
41606                    (  385) ACTTGGTTCCCT  1
45174                    (  419) ATTTCGCTCCCT  1
49021                    (  377) TGTTGTTTCCCT  1
48538                    (  246) CGTTCGCTCACT  1
37043                    (  381) ATTTCATTCGCT  1
48386                    (  459) TCCTCGTTCCGT  1
53964                    (  281) CCTTGGTTGGCT  1
52135                    (  235) ACTGCGTTGACT  1
46090                    (  313) ATATCATTCCCT  1
45252                    (  262) ACATCATTGCCT  1
40162                    (    7) CGCTCATTGCCT  1
43013                    (  324) TTTTCTTTGAGT  1
45234                    (  209) TGTTGGTTGCCA  1
43040                    (  190) TGTTCCCTCACT  1
11438                    (  290) CTTGCTTTCGGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 9.0653 E= 2.6e+002
    36    -14  -1081     73
 -1081     44     78      6
  -123   -114  -1081    154
 -1081  -1081   -103    173
 -1081    166     -3  -1081
   -23   -214    129    -68
 -1081    -56  -1081    164
 -1081  -1081  -1081    190
 -1081    131     78  -1081
     9    118    -44  -1081
 -1081    166     -3  -1081
  -223  -1081  -1081    182
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 18 E= 2.6e+002
 0.333333  0.222222  0.000000  0.444444
 0.000000  0.333333  0.388889  0.277778
 0.111111  0.111111  0.000000  0.777778
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.777778  0.222222  0.000000
 0.222222  0.055556  0.555556  0.166667
 0.000000  0.166667  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.611111  0.388889  0.000000
 0.277778  0.555556  0.166667  0.000000
 0.000000  0.777778  0.222222  0.000000
 0.055556  0.000000  0.000000  0.944444
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[TAC][GCT]TT[CG][GA]TT[CG][CA][CG]T
--------------------------------------------------------------------------------

Time 6.51 secs.

********************************************************************************


********************************************************************************
MOTIF 3 MEME width = 12 sites = 17 llr = 155 E-value = 1.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  11916::3a543
pos.-specific     C  :5:8:::5:111
probability       G  :21::a:2:4:4
matrix            T  92:14:a1::52

         bits    2.1      *
                 1.9      ** *
                 1.7      ** *
                 1.5   *  ** *
Relative         1.3 * *  ** *
Entropy          1.1 * ***** *
(13.1 bits)      0.9 * ***** **
                 0.6 * ***** ***
                 0.4 * ***** ***
                 0.2 ************
                 0.0 ------------

Multilevel           TCACAGTCAATG
consensus                T  A GAA
sequence                        T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
46090                       125  6.78e-08 GGATGCTCTC TCACAGTCAATG ATGGGTACCG
45174                         3  4.86e-06         AG TTACAGTCAGTA GTTACAGTGT
49021                        57  5.87e-06 CTCCTCGTCT TCACAGTCAACA ACAAAAAACT
45252                        73  5.87e-06 TTCACGTTGT TCACTGTCAGAT CCAGGATCTT
45234                       368  7.93e-06 CGACAAGCAT TCACAGTCAATC CGTTAATGTA
41606                       225  9.82e-06 GAATCCGGGT TAACAGTAAGTG TCTAAAGAAC
45404                       135  9.82e-06 TAACTAACAG TAACAGTAAGTG TAAAGTATCG
52135                       331  9.82e-06 CATCCCTTTC TCACAGTTAATG CAGACCAGTA
53964                       224  1.53e-05 GACATGTCGA TGACAGTAAAAT TGAGCTAAAT
43013                       461  2.97e-05 CTTTACTGGA TTACTGTGAGAG CTTTCACAGC
11438                        35  3.77e-05 GATCGACTGG TCGCTGTGAATG TCCGGAGTGG
43040                       442  4.44e-05 CAGTGATCAT TGAAAGTAAGTG TACATCGGAG
34337                       343  5.54e-05 ACGACTCCTG TTACAGTGAACA GTTCAACTGT
37043                        86  8.02e-05 AGTCAAATAT TGGCTGTCAAAA TAATCTCTGC
40162                       168  8.63e-05 TCTGTAGGTC TCAATGTAAGAT GCTTCGCTTG
48538                       326  1.73e-04 CAAGTTGCCA ACATAGTCAATT ACGTCCACTC
48386                       413  1.91e-04 TATAACTCGA ACACTGTCACAA TTTCACAATA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
46090                             6.8e-08  124_[+3]_364
45174                             4.9e-06  2_[+3]_486
49021                             5.9e-06  56_[+3]_432
45252                             5.9e-06  72_[+3]_416
45234                             7.9e-06  367_[+3]_121
41606                             9.8e-06  224_[+3]_264
45404                             9.8e-06  134_[+3]_354
52135                             9.8e-06  330_[+3]_158
53964                             1.5e-05  223_[+3]_265
43013                               3e-05  460_[+3]_28
11438                             3.8e-05  34_[+3]_454
43040                             4.4e-05  441_[+3]_47
34337                             5.5e-05  342_[+3]_146
37043                               8e-05  85_[+3]_403
40162                             8.6e-05  167_[+3]_321
48538                             0.00017  325_[+3]_163
48386                             0.00019  412_[+3]_76
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=17
46090                    (  125) TCACAGTCAATG  1
45174                    (    3) TTACAGTCAGTA  1
49021                    (   57) TCACAGTCAACA  1
45252                    (   73) TCACTGTCAGAT  1
45234                    (  368) TCACAGTCAATC  1
41606                    (  225) TAACAGTAAGTG  1
45404                    (  135) TAACAGTAAGTG  1
52135                    (  331) TCACAGTTAATG  1
53964                    (  224) TGACAGTAAAAT  1
43013                    (  461) TTACTGTGAGAG  1
11438                    (   35) TCGCTGTGAATG  1
43040                    (  442) TGAAAGTAAGTG  1
34337                    (  343) TTACAGTGAACA  1
37043                    (   86) TGGCTGTCAAAA  1
40162                    (  168) TCAATGTAAGAT  1
48538                    (  326) ACATAGTCAATT  1
48386                    (  413) ACACTGTCACAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 8802 bayes= 9.08304 E= 1.2e+001
  -115  -1073  -1073    172
  -115    111    -36    -60
   176  -1073    -95  -1073
  -115    175  -1073   -218
   131  -1073  -1073     40
 -1073  -1073    214  -1073
 -1073  -1073  -1073    190
    17     94    -36   -218
   194  -1073  -1073  -1073
   102   -206     86  -1073
    44   -106  -1073     99
    17   -206     86    -18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 17 E= 1.2e+001
 0.117647  0.000000  0.000000  0.882353
 0.117647  0.529412  0.176471  0.176471
 0.882353  0.000000  0.117647  0.000000
 0.117647  0.823529  0.000000  0.058824
 0.647059  0.000000  0.000000  0.352941
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.294118  0.470588  0.176471  0.058824
 1.000000  0.000000  0.000000  0.000000
 0.529412  0.058824  0.411765  0.000000
 0.352941  0.117647  0.000000  0.529412
 0.294118  0.058824  0.411765  0.235294
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
TCAC[AT]GT[CA]A[AG][TA][GAT]
--------------------------------------------------------------------------------

Time 9.18 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
53964                            4.26e-05  223_[+3(1.53e-05)]_6_[+1(4.47e-06)]_\
    27_[+2(4.26e-05)]_208
52135                            2.64e-03  234_[+2(4.66e-05)]_84_\
    [+3(9.82e-06)]_158
37043                            6.19e-06  85_[+3(8.02e-05)]_96_[+1(1.73e-07)]_\
    175_[+2(2.38e-05)]_108
21912                            3.14e-03  32_[+2(2.72e-07)]_456
48386                            5.50e-05  [+1(6.03e-07)]_446_[+2(3.52e-05)]_\
    30
48538                            1.49e-02  245_[+2(2.38e-05)]_243
40162                            2.87e-04  6_[+2(7.00e-05)]_149_[+3(8.63e-05)]_\
    243_[+1(4.47e-06)]_66
11438                            6.21e-02  34_[+3(3.77e-05)]_454
45174                            5.41e-04  2_[+3(4.86e-06)]_264_[+3(6.99e-05)]_\
    128_[+2(1.10e-05)]_70
45234                            1.94e-06  208_[+2(9.45e-05)]_36_\
    [+1(4.88e-05)]_99_[+3(7.93e-06)]_37_[+1(1.20e-07)]_72
45252                            2.96e-03  72_[+3(5.87e-06)]_177_\
    [+2(6.51e-05)]_227
45404                            2.29e-05  10_[+2(7.69e-07)]_112_\
    [+3(9.82e-06)]_354
46090                            4.76e-05  124_[+3(6.78e-08)]_176_\
    [+2(6.01e-05)]_176
43013                            1.30e-02  323_[+2(7.55e-05)]_125_\
    [+3(2.97e-05)]_28
43040                            4.03e-05  425_[+1(5.04e-07)]_4_[+3(4.44e-05)]_\
    47
41606                            5.86e-07  137_[+1(1.24e-05)]_13_\
    [+1(5.50e-07)]_50_[+3(9.82e-06)]_148_[+2(4.32e-06)]_104
34337                            5.71e-06  162_[+1(2.07e-06)]_47_\
    [+2(2.61e-06)]_109_[+3(5.54e-05)]_146
49021                            8.77e-06  56_[+3(5.87e-06)]_265_\
    [+1(5.33e-06)]_31_[+2(1.53e-05)]_112
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************