cmonkey.set_enrichment
/home/weiju/Projects/ISB/cmonkey-python/cmonkey/set_enrichment.py

set_enrichment.py - cMonkey set_enrichment scoring.
 
This file is part of cMonkey Python. Please see README and LICENSE for
more information and licensing details.

 
Modules
       
cmonkey.datamatrix
json
logging
math
multiprocessing
numpy
os
cmonkey.scoring
cmonkey.util

 
Classes
       
cmonkey.scoring.ScoringFunctionBase
ScoringFunction
CutoffEnrichmentSet
DiscreteEnrichmentSet
SetType

 
class CutoffEnrichmentSet
    Enrichment set representation constructed with a cutoff
 
  Methods defined here:
__init__(self, cutoff, elems)
instance creation
__repr__(self)
genes(self)
returns all genes
genes_above_cutoff(self)
returns the genes that have a weight above the cutoff
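
As a rough illustration of the cutoff idea, the sketch below assumes that elems is an iterable of (gene, weight) pairs. That layout, the class name CutoffSetSketch, and the strict "greater than" comparison are assumptions made for illustration; they are not taken from the cMonkey source.

```python
class CutoffSetSketch:
    """Hypothetical stand-in for a cutoff-based enrichment set."""

    def __init__(self, cutoff, elems):
        self.cutoff = cutoff
        # assumed layout: iterable of (gene, weight) pairs
        self.elems = list(elems)

    def genes(self):
        """All genes, regardless of weight."""
        return {gene for gene, _ in self.elems}

    def genes_above_cutoff(self):
        """Genes whose weight is strictly above the cutoff."""
        return {gene for gene, weight in self.elems if weight > self.cutoff}


# made-up gene names and weights, for illustration only
example = CutoffSetSketch(0.5, [("geneA", 0.9), ("geneB", 0.2), ("geneC", 0.5)])
```

With this input, genes() yields all three names, while genes_above_cutoff() keeps only the entries whose weight exceeds 0.5.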

 
class DiscreteEnrichmentSet
     Methods defined here:
__init__(self, genes)
instance creation
__repr__(self)
genes(self)
genes_above_cutoff(self)
returns the genes that have a weight above the cutoff

 
class ScoringFunction(cmonkey.scoring.ScoringFunctionBase)
    Set enrichment scoring function
 
  Methods defined here:
__init__(self, organism, membership, ratios, config_params=None)
Create scoring function instance
bonferroni_cutoff(self)
Bonferroni cutoff value
do_compute(self, iteration_result, ref_matrix)
compute method
Note: returns None if nothing has been computed yet, and returns the result
of the previous scoring if the function is not scheduled to run in this iteration
run_logs(self)
return the run logs

Methods inherited from cmonkey.scoring.ScoringFunctionBase:
check_requirements(self)
Give the scoring module an opportunity to check whether the
requirements to run are all met
compute(self, iteration_result, reference_matrix=None)
General compute method.
iteration_result is a dictionary that contains the
results generated by the scoring functions in the
current computation.
reference_matrix is a workaround that allows the scoring
function to normalize its scores to the range of a reference
score matrix. In the normal case, those are the gene expression
row scores
compute_force(self, iteration_result, reference_matrix=None)
enforce computation, regardless of the iteration function
gene_names(self)
returns the gene names
last_cached(self)
num_clusters(self)
returns the number of clusters
pickle_path(self)
returns the function-specific pickle-path
rows_for_cluster(self, cluster)
returns the rows for the specified cluster
run_in_iteration(self, i)
scaling(self, iteration)
returns the quantile normalization scaling for the specified iteration
set_score_means(self, iteration_result, matrix)
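
The behaviour described for compute, compute_force and do_compute (return None before the first run, reuse the previous result when the function is not scheduled) can be pictured with the minimal sketch below. The class CachedScoringSketch, the callable schedule, and the "iteration" key in iteration_result are all hypothetical; the real base class may store and schedule results differently.

```python
class CachedScoringSketch:
    """Hypothetical illustration of schedule-gated scoring with a cached result."""

    def __init__(self, schedule):
        # assumed: schedule is a callable mapping iteration number -> bool
        self.schedule = schedule
        self.cached = None  # stays None until the first real computation

    def run_in_iteration(self, i):
        return self.schedule(i)

    def do_compute(self, iteration_result):
        # placeholder scoring; a real function would build a score matrix
        return {"iteration": iteration_result["iteration"]}

    def compute(self, iteration_result):
        """Recompute only when scheduled, otherwise return the cached result."""
        if self.run_in_iteration(iteration_result["iteration"]):
            self.cached = self.do_compute(iteration_result)
        return self.cached

    def compute_force(self, iteration_result):
        """Recompute regardless of the schedule."""
        self.cached = self.do_compute(iteration_result)
        return self.cached


fn = CachedScoringSketch(lambda i: i % 2 == 0)  # run on even iterations only
first = fn.compute({"iteration": 1})    # not scheduled, nothing cached yet
second = fn.compute({"iteration": 2})   # scheduled: computed and cached
third = fn.compute({"iteration": 3})    # not scheduled: cached result reused
forced = fn.compute_force({"iteration": 5})
```

Keeping the schedule check in compute and the actual work in do_compute lets a subclass override only the scoring step.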

 
class SetType
    Set type representation. This is a grouping from a name to a number of sets
that provides access to all contained genes
 
  Methods defined here:
__init__(self, name, sets)
instance creation
__repr__(self)
string representation
genes(self)
All genes contained in the sets
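
A minimal sketch of such a grouping is shown below. It assumes that sets is a dictionary from set name to an object exposing a genes() method; the names SetTypeSketch and SimpleSet are hypothetical and only serve the illustration.

```python
class SimpleSet:
    """Hypothetical enrichment set that only stores its genes."""

    def __init__(self, genes):
        self._genes = set(genes)

    def genes(self):
        return self._genes


class SetTypeSketch:
    """Hypothetical grouping of named sets under one set type name."""

    def __init__(self, name, sets):
        self.name = name
        # assumed: dict mapping set name -> object with a genes() method
        self.sets = sets

    def genes(self):
        """Union of the genes of all contained sets."""
        all_genes = set()
        for enrichment_set in self.sets.values():
            all_genes.update(enrichment_set.genes())
        return all_genes


# made-up set and gene names
set_type = SetTypeSketch("pathways", {
    "set1": SimpleSet(["geneA", "geneB"]),
    "set2": SimpleSet(["geneB", "geneC"]),
})
```

Genes that appear in more than one set are reported once, because the result is a union.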

 
Functions
       
compute_cluster_score(args)
Computes the cluster score for a given set type
process_sets(input_sets, thesaurus)
Reusable function that maps a dictionary {name: [genes]}
to a set of DiscreteEnrichmentSet objects
read_set_types(config_params, thesaurus)
Reads sets from a JSON file. We also ensure that genes
are stored in canonical form in the set, so that set operations based on
gene names will succeed
read_sets_csv(infile, thesaurus, sep=',')
Reads sets from a CSV file
We support 2-column and 3-column formats:
 
2 column files have the format
 
<set name><separator><gene>
 
3 column files have the format
 
<set name><separator><gene><separator><weight>
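
The sketch below shows one way the two layouts could be parsed and mapped to canonical gene names. It is not the cMonkey implementation: the function name read_sets_csv_sketch, the returned dictionary shape, and the treatment of the thesaurus as a plain dict from alias to canonical name are assumptions.

```python
import csv
import io
from collections import defaultdict


def read_sets_csv_sketch(infile, thesaurus, sep=","):
    """Collect {set name: {gene: weight or None}} from 2- or 3-column rows."""
    sets = defaultdict(dict)
    for row in csv.reader(infile, delimiter=sep):
        if not row:
            continue  # skip blank lines
        if len(row) == 2:
            set_name, gene = row
            weight = None  # 2-column rows carry no weight
        elif len(row) == 3:
            set_name, gene, weight = row
            weight = float(weight)
        else:
            raise ValueError("expected 2 or 3 columns, got %d" % len(row))
        # assumed: thesaurus is a dict from alias to canonical gene name;
        # unknown names are kept unchanged
        canonical = thesaurus.get(gene, gene)
        sets[set_name][canonical] = weight
    return dict(sets)


# made-up alias and gene names
thesaurus = {"alias1": "geneA"}
two_col = read_sets_csv_sketch(io.StringIO("set1,alias1\nset1,geneX\n"), thesaurus)
three_col = read_sets_csv_sketch(io.StringIO("set1;alias1;0.8\n"), thesaurus, sep=";")
```

Storing genes under their canonical names is what allows later set operations on gene names to match.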

 
Data
CANONICAL_ROWNAMES = None
CANONICAL_ROW_INDEXES = None
SET_MATRIX = None
SET_MEMBERSHIP = None
SET_SET_TYPE = None
SET_SYNONYMS = None