ChromeGoose

ChromeGoose is a Chrome browser plugin that integrates bioinformatics analysis scripts, visualization tools, and desktop and web resources by enabling data exchange among all of these components.

Analysis of most biological data types requires a combination of desktop analysis software (such as Cytoscape, MeV and R), web resources (such as NCBI Entrez and EMBL STRING) and custom scripts for manipulating, analyzing and visualizing the data. ChromeGoose provides a unified framework that connects these components through well-defined data types: NameList, Matrix, Tuple and Network, all previously defined by the Gaggle framework. Desktop analysis packages, web resources and custom scripts that exchange data in these types can be plugged into the ChromeGoose architecture, which avoids data-format incompatibilities between them.
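
As a rough illustration of how shared data types let tools exchange data, the sketch below models three of the four types as Python classes. The field names and the helper rows_as_namelist are assumptions made for this example, not the actual Gaggle class definitions; Tuple is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class NameList:
    """A list of identifiers (e.g. gene names) for one species."""
    species: str
    names: list


@dataclass
class Matrix:
    """Numeric values indexed by row (gene) and column (condition) names."""
    row_names: list
    column_names: list
    values: list  # one inner list of floats per row


@dataclass
class Network:
    """Nodes plus (source, interaction, target) edges."""
    nodes: list
    edges: list = field(default_factory=list)


def rows_as_namelist(matrix, species):
    """Hypothetical helper: expose a Matrix's row names as a NameList."""
    return NameList(species=species, names=list(matrix.row_names))


# Illustrative values only.
expr = Matrix(["Rv0001", "Rv0002"], ["cond1", "cond2"], [[1.0, 2.0], [0.5, 0.1]])
genes = rows_as_namelist(expr, "Mycobacterium tuberculosis")
```

Here a tool that produces a Matrix can hand its gene names to any tool that consumes a NameList without either tool knowing about the other.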

The current implementation of ChromeGoose connects Chrome, Cytoscape, MeV and R. In addition, because it builds on the OpenCPU package, any R script can be plugged into the framework with minimal effort. We have already implemented three R scripts for custom analysis of regulatory network models: data plotting, filtering and enrichment analysis.
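
OpenCPU exposes R functions over HTTP, so calling an R script from another component amounts to a POST request. The sketch below only builds such a request and does not send it. The server address, the package name "mypkg", the function name "plot_expression" and its argument are hypothetical placeholders, not names used by ChromeGoose.

```python
import json
from urllib.request import Request


def build_opencpu_request(base_url, package, function, args):
    """Build a POST request that calls an R function through OpenCPU.

    The trailing /json asks OpenCPU to return the function's result as JSON.
    """
    url = f"{base_url.rstrip('/')}/ocpu/library/{package}/R/{function}/json"
    body = json.dumps(args).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server, package, function and argument names.
req = build_opencpu_request(
    "http://localhost:5656/",
    "mypkg",
    "plot_expression",
    {"genes": ["Rv0001", "Rv0002"]},
)
```

Passing req to urllib.request.urlopen would perform the call against a running OpenCPU server.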

Currently Supported R analysis modules

  • Plot Expression Data:

    The plot data function produces simple plots of expression data for a given set of genes. It accepts a gene list captured in ChromeGoose, an uploaded file, or text input. File input is a single-column, tab-delimited file with a header; each gene is listed on a separate row. You must also select an organism so that the corresponding gene expression matrix can be retrieved. Three plot types are currently supported: line plot, heat map and smoothed plot.
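
The gene list format described above (one header line, then one gene per row) can be read with a few lines of code. This is a minimal sketch, assuming the header is the first non-empty line; the function name read_gene_list and the sample gene identifiers are illustrative.

```python
import io


def read_gene_list(handle):
    """Read a single-column, tab-delimited gene list and skip its header."""
    rows = [line.strip() for line in handle]
    rows = [row for row in rows if row]  # ignore blank lines
    # The first remaining row is the header; keep the first column of the rest.
    return [row.split("\t")[0] for row in rows[1:]]


sample = io.StringIO("gene\nRv0001\nRv0002\n\n")
gene_list = read_gene_list(sample)
```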

  • MTB TF Overexpression Data Filter:

    The MTB transcription factor (TF) overexpression data set includes expression signatures of all genes affected by conditional overexpression of 206 MTB TFs. This function queries the TF overexpression data set with a list of genes and identifies the TFs that affect their expression levels, subject to the parameters below. It accepts a gene list captured in ChromeGoose, an uploaded file, or text input. The gene list is a single-column, tab-delimited file with a header; each gene is listed on a separate row.

    Input parameters:

    Fold change: The lower threshold for the fold change of expression ratios. Results with a fold change equal to or higher than the given value are included. (default: 1)

    P-value: The upper threshold for the significance p-value of the expression change. Results with a p-value smaller than or equal to the given value are included. (default: 0.05)

    Show only up-regulated: Whether to show only up-regulated genes.

    Show only down-regulated: Whether to show only down-regulated genes.

    Output:

    For each query gene, the TFs whose overexpression significantly changes its expression are shown along with the fold change and p-value. All TFs that have an effect on the query gene are also listed. The expression values of each gene in the list across all TF overexpression (TFOE) experiments are also plotted, with fold changes over the threshold marked.
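
The filtering logic described by the parameters above can be sketched as follows. This is an illustration, not the actual R script: the record layout, the function name filter_tfoe, and the sample TF names are hypothetical, and the fold change is assumed to be a signed log ratio whose absolute value is compared with the threshold.

```python
def filter_tfoe(records, genes, fold_change=1.0, p_value=0.05,
                only_up=False, only_down=False):
    """Keep TF/gene records that pass the fold-change and p-value thresholds.

    Assumption: rec["fold_change"] is signed (positive = up-regulated).
    """
    wanted = set(genes)
    hits = []
    for rec in records:
        if rec["gene"] not in wanted:
            continue
        if abs(rec["fold_change"]) < fold_change:
            continue  # below the lower fold-change threshold
        if rec["p_value"] > p_value:
            continue  # above the upper p-value threshold
        if only_up and rec["fold_change"] <= 0:
            continue
        if only_down and rec["fold_change"] >= 0:
            continue
        hits.append(rec)
    return hits


# Made-up records for illustration only.
records = [
    {"tf": "TF_A", "gene": "Rv0001", "fold_change": 2.3, "p_value": 0.001},
    {"tf": "TF_B", "gene": "Rv0001", "fold_change": -1.5, "p_value": 0.02},
    {"tf": "TF_C", "gene": "Rv0001", "fold_change": 0.4, "p_value": 0.01},
    {"tf": "TF_A", "gene": "Rv0002", "fold_change": 3.0, "p_value": 0.20},
]
all_hits = filter_tfoe(records, ["Rv0001", "Rv0002"])
up_hits = filter_tfoe(records, ["Rv0001", "Rv0002"], only_up=True)
```

In this sketch, setting both only_up and only_down returns no records, since no fold change is both positive and negative.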

  • Gene set enrichment:

    Gene set enrichment analysis finds gene regulatory network modules that are enriched for a given set of genes. The function tests the overlap between the genes in each module and the given gene set using the hypergeometric distribution. A p-value is calculated for each comparison and corrected for multiple testing with the Benjamini-Hochberg method. Enriched modules are listed along with the total number of genes in the module, the number of genes in the list, the number of overlapping genes and the p-values. Modules are linked to their module pages in the network portal. Overlapping genes are captured in ChromeGoose for further analysis.

    This function accepts a gene list captured in ChromeGoose, an uploaded file, or text input. The gene list is a single-column, tab-delimited file with a header; each gene is listed on a separate row. You must also select an organism so that the corresponding regulatory network information can be retrieved.
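
The enrichment test described above, a hypergeometric overlap test followed by Benjamini-Hochberg correction, can be sketched in plain Python. This is an illustration under stated assumptions rather than the actual R script: the function names, the modules dictionary layout and the population argument (the total number of genes considered) are hypothetical.

```python
from math import comb


def hypergeom_pvalue(overlap, module_size, list_size, population):
    """P(X >= overlap) when drawing list_size genes from population genes."""
    upper = min(module_size, list_size)
    tail = sum(
        comb(module_size, i) * comb(population - module_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(population, list_size)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(modules, genes, population):
    """Test each module (name -> set of genes) against the gene list."""
    query = set(genes)
    rows = []
    for name, members in modules.items():
        shared = sorted(query & set(members))
        p = hypergeom_pvalue(len(shared), len(members), len(query), population)
        rows.append({"module": name, "module_size": len(members),
                     "list_size": len(query), "overlap": shared, "p_value": p})
    for row, adj in zip(rows, benjamini_hochberg([r["p_value"] for r in rows])):
        row["adjusted_p"] = adj
    return sorted(rows, key=lambda r: r["adjusted_p"])


# Toy modules and gene names for illustration only.
modules = {"m1": {"a", "b", "c"}, "m2": {"x", "y"}}
rows = enrich(modules, ["a", "b", "c"], population=10)
```

Each returned row carries the same fields listed in the output description: module size, list size, overlapping genes and p-values.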