cmonkey.motif
/home/weiju/Projects/ISB/cmonkey-python/cmonkey/motif.py

motif.py - cMonkey motif related processing
This module captures the motif-specific scoring component
of cMonkey.
 
This file is part of cMonkey Python. Please see README and LICENSE for
more information and licensing details.

 
Modules
       
cPickle
collections
cmonkey.datamatrix
logging
cmonkey.meme
numpy
os
cmonkey.scoring
sqlite3
cmonkey.seqtools
subprocess
sys
tempfile
cmonkey.util
cmonkey.weeder

 
Classes
       
__builtin__.tuple(__builtin__.object)
ComputeScoreParams
WeederRunner
cmonkey.scoring.ScoringFunctionBase
MotifScoringFunctionBase
MemeScoringFunction
WeederScoringFunction

 
class ComputeScoreParams(__builtin__.tuple)
    ComputeScoreParams(iteration, cluster, feature_ids, seqs, used_seqs, meme_runner, min_cluster_rows, max_cluster_rows, num_motifs, previous_motif_infos, outdir, num_iterations, debug)
 
 
Method resolution order:
ComputeScoreParams
__builtin__.tuple
__builtin__.object

Methods defined here:
__getnewargs__(self)
Return self as a plain tuple.  Used by copy and pickle.
__getstate__(self)
Exclude the OrderedDict from pickling
__repr__(self)
Return a nicely formatted representation string
_asdict(self)
Return a new OrderedDict which maps field names to their values
_replace(_self, **kwds)
Return a new ComputeScoreParams object replacing specified fields with new values

Class methods defined here:
_make(cls, iterable, new=<built-in method __new__ of type object>, len=<built-in function len>) from __builtin__.type
Make a new ComputeScoreParams object from a sequence or iterable

Static methods defined here:
__new__(_cls, iteration, cluster, feature_ids, seqs, used_seqs, meme_runner, min_cluster_rows, max_cluster_rows, num_motifs, previous_motif_infos, outdir, num_iterations, debug)
Create new instance of ComputeScoreParams(iteration, cluster, feature_ids, seqs, used_seqs, meme_runner, min_cluster_rows, max_cluster_rows, num_motifs, previous_motif_infos, outdir, num_iterations, debug)

Data descriptors defined here:
__dict__
Return a new OrderedDict which maps field names to their values
cluster
Alias for field number 1
debug
Alias for field number 12
feature_ids
Alias for field number 2
iteration
Alias for field number 0
max_cluster_rows
Alias for field number 7
meme_runner
Alias for field number 5
min_cluster_rows
Alias for field number 6
num_iterations
Alias for field number 11
num_motifs
Alias for field number 8
outdir
Alias for field number 10
previous_motif_infos
Alias for field number 9
seqs
Alias for field number 3
used_seqs
Alias for field number 4

Data and other attributes defined here:
_fields = ('iteration', 'cluster', 'feature_ids', 'seqs', 'used_seqs', 'meme_runner', 'min_cluster_rows', 'max_cluster_rows', 'num_motifs', 'previous_motif_infos', 'outdir', 'num_iterations', 'debug')
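
The methods and field aliases listed above are those of a standard named tuple. The following is a minimal sketch of how such a type behaves, rebuilt here with collections.namedtuple from the field names in _fields. All values passed in are made-up placeholders, not real cMonkey data, and the sketch does not claim to reproduce how motif.py itself defines the class.

```python
from collections import namedtuple

# Field names copied from the _fields tuple above.
ComputeScoreParams = namedtuple('ComputeScoreParams', [
    'iteration', 'cluster', 'feature_ids', 'seqs', 'used_seqs',
    'meme_runner', 'min_cluster_rows', 'max_cluster_rows', 'num_motifs',
    'previous_motif_infos', 'outdir', 'num_iterations', 'debug'])

# Placeholder values, chosen only for illustration.
params = ComputeScoreParams(
    iteration=1, cluster=3, feature_ids=['gene1', 'gene2'],
    seqs={'gene1': 'ACGT', 'gene2': 'TTGA'}, used_seqs={},
    meme_runner=None, min_cluster_rows=3, max_cluster_rows=70,
    num_motifs=2, previous_motif_infos=None, outdir='out',
    num_iterations=2000, debug=False)

# A field can be read by name or by position ("Alias for field number 1").
same_cluster = params.cluster == params[1]

# _replace() builds a new tuple; the original object is left unchanged.
next_params = params._replace(iteration=2)

# _asdict() maps field names to their values.
as_dict = params._asdict()
```

Bundling the arguments into one tuple means a worker function only needs a single parameter, which fits the single-argument call shape of map-style APIs.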

Methods inherited from __builtin__.tuple:
__add__(...)
x.__add__(y) <==> x+y
__contains__(...)
x.__contains__(y) <==> y in x
__eq__(...)
x.__eq__(y) <==> x==y
__ge__(...)
x.__ge__(y) <==> x>=y
__getattribute__(...)
x.__getattribute__('name') <==> x.name
__getitem__(...)
x.__getitem__(y) <==> x[y]
__getslice__(...)
x.__getslice__(i, j) <==> x[i:j]
 
Use of negative indices is not supported.
__gt__(...)
x.__gt__(y) <==> x>y
__hash__(...)
x.__hash__() <==> hash(x)
__iter__(...)
x.__iter__() <==> iter(x)
__le__(...)
x.__le__(y) <==> x<=y
__len__(...)
x.__len__() <==> len(x)
__lt__(...)
x.__lt__(y) <==> x<y
__mul__(...)
x.__mul__(n) <==> x*n
__ne__(...)
x.__ne__(y) <==> x!=y
__rmul__(...)
x.__rmul__(n) <==> n*x
__sizeof__(...)
T.__sizeof__() -- size of T in memory, in bytes
count(...)
T.count(value) -> integer -- return number of occurrences of value
index(...)
T.index(value, [start, [stop]]) -> integer -- return first index of value.
Raises ValueError if the value is not present.

 
class MemeScoringFunction(MotifScoringFunctionBase)
    Scoring function for motifs
 
 
Method resolution order:
MemeScoringFunction
MotifScoringFunctionBase
cmonkey.scoring.ScoringFunctionBase

Methods defined here:
__init__(self, organism, membership, ratios, config_params=None)
creates a ScoringFunction
initialize(self, args)
process additional parameters
meme_runner(self)
returns the MEME runner object

Methods inherited from MotifScoringFunctionBase:
compute(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated,
since it nests Motif and MEME runs
compute_force(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated,
since it nests Motif and MEME runs
compute_pvalues(self, iteration_result, num_motifs, force)
Compute motif scores.
The result is a dictionary from cluster -> (feature_id, pvalue)
containing a sparse gene-to-pvalue mapping for each cluster
 
In order to influence the sequences
that go into meme, the user can specify a list of sequence filter
functions that have the signature
(seqs, feature_ids, distance) -> seqs
These filters are applied in the order they appear in the list.
last_cached(self)
motif_in_iteration(self, i)
TODO: change to an id that is not called 'MEME'
run_logs(self)

Methods inherited from cmonkey.scoring.ScoringFunctionBase:
check_requirements(self)
Give the scoring module an opportunity to check whether the
requirements to run are all met
do_compute(self, iteration_result, ref_matrix=None)
gene_names(self)
returns the gene names
num_clusters(self)
returns the number of clusters
pickle_path(self)
returns the function-specific pickle-path
rows_for_cluster(self, cluster)
returns the rows for the specified cluster
run_in_iteration(self, i)
scaling(self, iteration)
returns the quantile normalization scaling for the specified iteration
set_score_means(self, iteration_result, matrix)

 
class MotifScoringFunctionBase(cmonkey.scoring.ScoringFunctionBase)
    Base class for motif scoring functions that use MEME
This class of scoring function has 2 schedules:
1. run_in_iteration(i) is the normal schedule
2. motif_in_iteration(i) determines when the motif-finding tool is run
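
The two schedules can be pictured with a small sketch. Everything in it is an assumption made for illustration: the helper make_schedule, the start and interval values, and the idea that an iteration which is scheduled for scoring but not for motif finding falls back to an earlier result (suggested by the last_cached method, but not confirmed by this page).

```python
def make_schedule(start, interval):
    """Return a predicate that is true at start, start + interval, ...

    Hypothetical helper; the real schedules come from the configuration.
    """
    def in_iteration(i):
        return i >= start and (i - start) % interval == 0
    return in_iteration


# Assumed values: score every 2nd iteration, run the motif tool every 10th.
run_in_iteration = make_schedule(2, 2)
motif_in_iteration = make_schedule(10, 10)


def plan(i):
    """Describe what a motif scoring function might do in iteration i."""
    if not run_in_iteration(i):
        return 'skip'
    if motif_in_iteration(i):
        return 'run motif tool'
    return 'reuse earlier result'


decisions = [(i, plan(i)) for i in (3, 4, 10)]
```

Keeping the two predicates separate lets the expensive external tool run less often than the scoring step that consumes its output.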
 
Methods defined here:
__init__(self, id, organism, membership, ratios, seqtype, config_params=None)
creates a ScoringFunction
compute(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated,
since it nests Motif and MEME runs
compute_force(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated,
since it nests Motif and MEME runs
compute_pvalues(self, iteration_result, num_motifs, force)
Compute motif scores.
The result is a dictionary from cluster -> (feature_id, pvalue)
containing a sparse gene-to-pvalue mapping for each cluster
 
In order to influence the sequences
that go into meme, the user can specify a list of sequence filter
functions that have the signature
(seqs, feature_ids, distance) -> seqs
These filters are applied in the order they appear in the list.
last_cached(self)
motif_in_iteration(self, i)
TODO: change to an id that is not called 'MEME'
run_logs(self)

Methods inherited from cmonkey.scoring.ScoringFunctionBase:
check_requirements(self)
Give the scoring module an opportunity to check whether the
requirements to run are all met
do_compute(self, iteration_result, ref_matrix=None)
gene_names(self)
returns the gene names
num_clusters(self)
returns the number of clusters
pickle_path(self)
returns the function-specific pickle-path
rows_for_cluster(self, cluster)
returns the rows for the specified cluster
run_in_iteration(self, i)
scaling(self, iteration)
returns the quantile normalization scaling for the specified iteration
set_score_means(self, iteration_result, matrix)
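
The compute_pvalues description above says that sequence filters share the signature (seqs, feature_ids, distance) -> seqs and are applied in list order. A minimal sketch of such a chain follows. It assumes that seqs is a dict from feature id to sequence string and treats distance as an opaque value that is simply passed through. The names apply_filters, keep_requested and drop_short are hypothetical and do not exist in the module.

```python
def apply_filters(seqs, feature_ids, distance, filters):
    """Feed the output of each filter into the next one, in list order."""
    for seq_filter in filters:
        seqs = seq_filter(seqs, feature_ids, distance)
    return seqs


def keep_requested(seqs, feature_ids, distance):
    """Hypothetical filter: keep only sequences whose id was requested."""
    wanted = set(feature_ids)
    return {fid: seq for fid, seq in seqs.items() if fid in wanted}


def drop_short(seqs, feature_ids, distance):
    """Hypothetical filter: drop sequences shorter than 4 bases."""
    return {fid: seq for fid, seq in seqs.items() if len(seq) >= 4}


# Placeholder input data.
seqs = {'g1': 'ACGTAC', 'g2': 'AC', 'g3': 'GGGGTT'}
filtered = apply_filters(seqs, ['g1', 'g2'], (-20, 150),
                         [keep_requested, drop_short])
```

Because every filter returns the same kind of mapping it receives, filters can be added, removed or reordered without changing the code that calls them.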

 
class WeederRunner
    Wrapper around Weeder so we can use the multiprocessing module.
The function basically runs Weeder on the specified set of sequences,
converts its output to a MEME output file and runs MAST on the MEME output
to generate a MEME run result.
 
Methods defined here:
__call__(self, params)
call the runner like a function
__init__(self, meme_suite, config_params, remove_tempfiles=True)
create a runner object
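
The pattern described for WeederRunner, an object that is configured once in __init__ and then called like a function, can be sketched as follows. UpperCaseRunner is a hypothetical stand-in that only transforms a string; it does not run Weeder, MEME or MAST. The built-in map() is used here for simplicity. multiprocessing.Pool.map has the same call shape, and it requires the callable to be picklable, which an instance of a module-level class with picklable attributes is.

```python
class UpperCaseRunner(object):
    """Hypothetical stand-in for a runner object; not the real WeederRunner."""

    def __init__(self, suffix):
        # Configuration is stored once, when the runner is created.
        self.suffix = suffix

    def __call__(self, params):
        # Each call handles one work item using the stored configuration.
        return params.upper() + self.suffix


runner = UpperCaseRunner('!')

# The runner is passed wherever a one-argument function is expected.
results = list(map(runner, ['acgt', 'ttga']))
```

Compared with a plain function, the callable object carries its settings along with it, so the per-item argument can stay small.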

 
class WeederScoringFunction(MotifScoringFunctionBase)
    Motif scoring function that runs Weeder instead of MEME
 
 
Method resolution order:
WeederScoringFunction
MotifScoringFunctionBase
cmonkey.scoring.ScoringFunctionBase

Methods defined here:
__init__(self, organism, membership, ratios, config_params=None)
creates a scoring function
check_requirements(self)
meme_runner(self)
returns the MEME runner object

Methods inherited from MotifScoringFunctionBase:
compute(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated,
since it nests Motif and MEME runs
compute_force(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated,
since it nests Motif and MEME runs
compute_pvalues(self, iteration_result, num_motifs, force)
Compute motif scores.
The result is a dictionary from cluster -> (feature_id, pvalue)
containing a sparse gene-to-pvalue mapping for each cluster
 
In order to influence the sequences
that go into meme, the user can specify a list of sequence filter
functions that have the signature
(seqs, feature_ids, distance) -> seqs
These filters are applied in the order they appear in the list.
last_cached(self)
motif_in_iteration(self, i)
TODO: change to an id that is not called 'MEME'
run_logs(self)

Methods inherited from cmonkey.scoring.ScoringFunctionBase:
do_compute(self, iteration_result, ref_matrix=None)
gene_names(self)
returns the gene names
num_clusters(self)
returns the number of clusters
pickle_path(self)
returns the function-specific pickle-path
rows_for_cluster(self, cluster)
returns the rows for the specified cluster
run_in_iteration(self, i)
scaling(self, iteration)
returns the quantile normalization scaling for the specified iteration
set_score_means(self, iteration_result, matrix)

 
Functions
       
cluster_seqs(params)
Retrieves the sequences for a cluster. Designed to run in pool.map()
compute_cluster_score(params)
This function computes the MEME score for a cluster
compute_mean_score(pvalue_matrix, membership, organism)
cluster-specific mean scores
get_remove_atgs_filter(distance)
returns a remove ATG filter
get_remove_low_complexity_filter(meme_suite)
Factory method that returns a low complexity filter
meme_json(run_result)
pvalues2matrix(all_pvalues, num_clusters, gene_names, reverse_map)
converts a map from {cluster: {feature: pvalue}} to a scoring matrix
unique_filter(seqs, feature_ids)
returns a map that contains only the keys that are in
feature_ids and only contains unique sequences
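
One way to read the unique_filter description is sketched below. It is an illustration, not the module's implementation. It assumes that seqs maps feature ids to hashable sequence values and that, among duplicates, the first one in feature_ids order is kept. The page does not state which duplicate survives.

```python
def unique_filter_sketch(seqs, feature_ids):
    """Keep entries whose key is in feature_ids and whose sequence is new."""
    unique = {}
    seen = set()
    for feature_id in feature_ids:
        seq = seqs.get(feature_id)
        # Skip ids without a sequence and sequences that were already kept.
        if seq is not None and seq not in seen:
            seen.add(seq)
            unique[feature_id] = seq
    return unique


# Placeholder data: 'a' and 'b' share a sequence, 'x' has none.
seqs = {'a': 'ACGT', 'b': 'ACGT', 'c': 'TTTT', 'd': 'GGGG'}
kept = unique_filter_sketch(seqs, ['a', 'b', 'c', 'x'])
```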

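The conversion described for pvalues2matrix, from a nested map {cluster: {feature: pvalue}} to a matrix, can be sketched with plain lists. Several points are assumptions that this page does not confirm: rows are genes and columns are clusters, cluster numbers start at 1, reverse_map translates feature ids to gene names, and cells without a p-value receive the placeholder fill value. The real function works with the cmonkey.datamatrix module rather than nested lists.

```python
def pvalues_to_matrix_sketch(all_pvalues, num_clusters, gene_names,
                             reverse_map, fill=0.0):
    """Build a genes x clusters table from {cluster: {feature: pvalue}}."""
    row_of = {gene: row for row, gene in enumerate(gene_names)}
    matrix = [[fill] * num_clusters for _ in gene_names]
    for cluster, feature_pvalues in all_pvalues.items():
        for feature_id, pvalue in feature_pvalues.items():
            # Assumed role of reverse_map: feature id -> gene name.
            gene = reverse_map.get(feature_id, feature_id)
            if gene in row_of:
                # Assumed: clusters are numbered from 1.
                matrix[row_of[gene]][cluster - 1] = pvalue
    return matrix


# Placeholder data; 'f9' has no matching gene and is ignored.
all_pvalues = {1: {'f1': 0.01}, 2: {'f2': 0.5, 'f9': 0.2}}
matrix = pvalues_to_matrix_sketch(
    all_pvalues, 2, ['geneA', 'geneB'], {'f1': 'geneA', 'f2': 'geneB'})
```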
 
Data
MEMBERSIP = None
ORGANISM = None
SEQUENCE_FILTERS = None