Publication Type:
Journal Article
Source:
Scientific Data, Volume 2, p. - (2015)
URL:
http://dx.doi.org/10.1038/sdata.2015.10
Abstract:
<p><em>Mycobacterium tuberculosis</em> (MTB) is a pathogenic bacterium responsible for 12 million active cases of tuberculosis (TB) worldwide. The complexity and critical regulatory components of MTB pathogenicity are still poorly understood despite extensive research efforts. In this study, we constructed the first systems-scale map of transcription factor (TF) binding sites and their regulatory target proteins in MTB. We constructed FLAG-tagged overexpression constructs for 206 TFs in MTB, used ChIP-seq to identify genome-wide binding events and surveyed global transcriptomic changes for each overexpressed TF. Here we present data for the most comprehensive map of MTB gene regulation to date. We also define elaborate quality control measures, extensive filtering steps, and the gene-level overlap between ChIP-seq and microarray datasets. Further, we describe the use of TF overexpression datasets to validate a global gene regulatory network model of MTB and describe an online source to explore the datasets.</p>
Attachment | Size |
---|---|
ISA-tab file | 8.9 KB |
Supplementary Table 1 | 157.5 KB |
Supplementary Table 2 | 337.5 KB |
Supplementary Table 3 | 1.33 MB |
To investigate the MTB transcriptional landscape systematically, we developed a high-throughput approach to identify the genes controlled by nearly all predicted MTB TFs. We individually cloned and conditionally overexpressed 206 MTB TFs to induce the regulatory signature of each one. Using this approach, we identified the sets of genes affected by TF overexpression (TFOE) and assembled them into an easily searchable map of transcriptional regulation in MTB.
Large datasets such as the TFOE expression data can be difficult to access when they span thousands of genes and hundreds of regulators. To address this, we have designed a simple Excel spreadsheet for querying the TFOE data to find the regulators of specific genes or sets of genes (Rustad et al., Genome Biol. 2014).
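For readers who prefer scripting to a spreadsheet, the same kind of lookup can be sketched in Python. This is a minimal illustration, not the spreadsheet's actual logic: the record layout (TF, gene, log2 fold change, p-value), the function name `find_regulators`, the cutoffs, and all identifiers and values below are hypothetical placeholders rather than data or thresholds from the study.

```python
from collections import defaultdict

# Hypothetical TFOE records: (overexpressed TF, affected gene, log2 fold change, p-value).
# Identifiers and values are placeholders, not data from the study.
RECORDS = [
    ("TF_A", "gene1", 2.1, 0.001),
    ("TF_A", "gene2", -1.4, 0.020),
    ("TF_B", "gene1", -1.8, 0.004),
    ("TF_B", "gene3", 0.3, 0.600),
    ("TF_C", "gene2", 1.2, 0.030),
]


def find_regulators(records, genes, min_abs_log2fc=1.0, max_pvalue=0.05):
    """Map each query gene to the TFs whose overexpression changed its expression.

    The default cutoffs are illustrative assumptions only.
    """
    wanted = set(genes)
    hits = defaultdict(list)
    for tf, gene, log2fc, pvalue in records:
        if gene in wanted and abs(log2fc) >= min_abs_log2fc and pvalue <= max_pvalue:
            direction = "up" if log2fc > 0 else "down"
            hits[gene].append((tf, direction))
    # Genes with no qualifying regulator map to an empty list.
    return {gene: sorted(hits[gene]) for gene in genes}


if __name__ == "__main__":
    for gene, regulators in find_regulators(RECORDS, ["gene1", "gene3"]).items():
        print(gene, regulators)
```

Passing a list of genes returns one entry per gene, so a single gene and a gene set are queried the same way. In practice the records would be read from an export of the TFOE data, whose column layout may differ from the one assumed here.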