Python: module cmonkey.motif

cmonkey.motif

/home/weiju/Projects/ISB/cmonkey-python/cmonkey/motif.py

motif.py - cMonkey motif related processing

This module captures the motif-specific scoring component of cMonkey.

This file is part of cMonkey Python. Please see README and LICENSE for more information and licensing details.

Modules

cPickle
collections
cmonkey.datamatrix
logging
cmonkey.meme
numpy
os
cmonkey.scoring
sqlite3
cmonkey.seqtools
subprocess
sys
tempfile
cmonkey.util
cmonkey.weeder

Classes



__builtin__.tuple(__builtin__.object)

ComputeScoreParams

WeederRunner
cmonkey.scoring.ScoringFunctionBase

MotifScoringFunctionBase

MemeScoringFunction
WeederScoringFunction

class ComputeScoreParams(__builtin__.tuple)

    ComputeScoreParams(iteration, cluster, feature_ids, seqs, used_seqs, meme_runner, min_cluster_rows, max_cluster_rows, num_motifs, previous_motif_infos, outdir, num_iterations, debug)

Method resolution order:

ComputeScoreParams

__builtin__.tuple

__builtin__.object

Methods defined here:

__getnewargs__(self)
Return self as a plain tuple.  Used by copy and pickle.

__getstate__(self)
Exclude the OrderedDict from pickling

__repr__(self)
Return a nicely formatted representation string

_asdict(self)
Return a new OrderedDict which maps field names to their values

_replace(_self, **kwds)
Return a new ComputeScoreParams object replacing specified fields with new values

Class methods defined here:

_make(cls, iterable, new=<built-in method __new__ of type object>, len=<built-in function len>) from __builtin__.type
Make a new ComputeScoreParams object from a sequence or iterable

Static methods defined here:

__new__(_cls, iteration, cluster, feature_ids, seqs, used_seqs, meme_runner, min_cluster_rows, max_cluster_rows, num_motifs, previous_motif_infos, outdir, num_iterations, debug)
Create new instance of ComputeScoreParams(iteration, cluster, feature_ids, seqs, used_seqs, meme_runner, min_cluster_rows, max_cluster_rows, num_motifs, previous_motif_infos, outdir, num_iterations, debug)

Data descriptors defined here:

__dict__

Return a new OrderedDict which maps field names to their values

cluster

Alias for field number 1

debug

Alias for field number 12

feature_ids

Alias for field number 2

iteration

Alias for field number 0

max_cluster_rows

Alias for field number 7

meme_runner

Alias for field number 5

min_cluster_rows

Alias for field number 6

num_iterations

Alias for field number 11

num_motifs

Alias for field number 8

outdir

Alias for field number 10

previous_motif_infos

Alias for field number 9

seqs

Alias for field number 3

used_seqs

Alias for field number 4

Data and other attributes defined here:

_fields = ('iteration', 'cluster', 'feature_ids', 'seqs', 'used_seqs', 'meme_runner', 'min_cluster_rows', 'max_cluster_rows', 'num_motifs', 'previous_motif_infos', 'outdir', 'num_iterations', 'debug')

Methods inherited from __builtin__.tuple:

__add__(...)
x.__add__(y) <==> x+y

__contains__(...)
x.__contains__(y) <==> y in x

__eq__(...)
x.__eq__(y) <==> x==y

__ge__(...)
x.__ge__(y) <==> x>=y

__getattribute__(...)
x.__getattribute__('name') <==> x.name

__getitem__(...)
x.__getitem__(y) <==> x[y]

__getslice__(...)
x.__getslice__(i, j) <==> x[i:j] Use of negative indices is not supported.

__gt__(...)
x.__gt__(y) <==> x>y

__hash__(...)
x.__hash__() <==> hash(x)

__iter__(...)
x.__iter__() <==> iter(x)

__le__(...)
x.__le__(y) <==> x<=y

__len__(...)
x.__len__() <==> len(x)

__lt__(...)
x.__lt__(y) <==> x<y

__mul__(...)
x.__mul__(n) <==> x*n

__ne__(...)
x.__ne__(y) <==> x!=y

__rmul__(...)
x.__rmul__(n) <==> n*x

__sizeof__(...)
T.__sizeof__() -- size of T in memory, in bytes

count(...)
T.count(value) -> integer -- return number of occurrences of value

index(...)
T.index(value, [start, [stop]]) -> integer -- return first index of value. Raises ValueError if the value is not present.
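
The ComputeScoreParams entry above is a named tuple, so its fields can be read by name or by position, and the helper methods listed (_replace, _asdict) come with it. The following is a minimal sketch that rebuilds an equivalent type from the documented _fields. The field values are made-up placeholders, not real cMonkey objects.

```python
from collections import namedtuple

# Field names copied from the _fields tuple documented above.
ComputeScoreParams = namedtuple(
    "ComputeScoreParams",
    ["iteration", "cluster", "feature_ids", "seqs", "used_seqs",
     "meme_runner", "min_cluster_rows", "max_cluster_rows", "num_motifs",
     "previous_motif_infos", "outdir", "num_iterations", "debug"])

# All values below are placeholders chosen for illustration only.
params = ComputeScoreParams(
    iteration=1, cluster=3, feature_ids=["gene_a", "gene_b"],
    seqs={"gene_a": "ACGT", "gene_b": "TTGA"}, used_seqs={},
    meme_runner=None, min_cluster_rows=3, max_cluster_rows=70,
    num_motifs=2, previous_motif_infos=None, outdir="out",
    num_iterations=2000, debug=False)

# Fields are reachable by name or by position ("Alias for field number N").
assert params.cluster == params[1]
assert params.debug == params[12]

# _replace() returns a new tuple; the original is left unchanged.
next_params = params._replace(iteration=2)

# _asdict() maps field names to values, in field order.
as_dict = params._asdict()
```

Because the record is an immutable tuple, per-cluster parameter sets can be derived from a template with _replace() without affecting the template.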

class MemeScoringFunction(MotifScoringFunctionBase)

    Scoring function for motifs

Method resolution order:

MemeScoringFunction

MotifScoringFunctionBase

cmonkey.scoring.ScoringFunctionBase

Methods defined here:

__init__(self, organism, membership, ratios, config_params=None)
creates a ScoringFunction

initialize(self, args)
process additional parameters

meme_runner(self)
returns the MEME runner object

Methods inherited from MotifScoringFunctionBase:

compute(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated, since it nests Motif and MEME runs

compute_force(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated, since it nests Motif and MEME runs

compute_pvalues(self, iteration_result, num_motifs, force)
Compute motif scores. The result is a dictionary from cluster -> (feature_id, pvalue) containing a sparse gene-to-pvalue mapping for each cluster. In order to influence the sequences that go into MEME, the user can specify a list of sequence filter functions that have the signature (seqs, feature_ids, distance) -> seqs. These filters are applied in the order they appear in the list.

last_cached(self)

motif_in_iteration(self, i)
TODO: change to an id that is not called 'MEME'

run_logs(self)

Methods inherited from cmonkey.scoring.ScoringFunctionBase:

check_requirements(self)
Give the scoring module an opportunity to check whether the requirements to run are all met

do_compute(self, iteration_result, ref_matrix=None)

gene_names(self)
returns the gene names

num_clusters(self)
returns the number of clusters

pickle_path(self)
returns the function-specific pickle-path

rows_for_cluster(self, cluster)
returns the rows for the specified cluster

run_in_iteration(self, i)

scaling(self, iteration)
returns the quantile normalization scaling for the specified iteration

set_score_means(self, iteration_result, matrix)

class MotifScoringFunctionBase(cmonkey.scoring.ScoringFunctionBase)

    Base class for motif scoring functions that use MEME. This class of scoring function has 2 schedules:
    1. run_in_iteration(i) is the normal schedule
    2. motif_in_iteration(i) determines when the motif tool is run
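
One way to picture the two schedules is as two predicates over the iteration number: one decides whether the scoring function contributes at all, the other decides whether the expensive motif tool is rerun or a cached result is reused. The sketch below is an illustration under that assumption, not the cMonkey implementation. The names every_n, TwoScheduleScoring and fake_tool are hypothetical, and the caching behaviour is only inferred from the presence of last_cached().

```python
def every_n(n, start=1):
    """Hypothetical schedule: true on iterations start, start+n, start+2n, ..."""
    return lambda i: i >= start and (i - start) % n == 0


class TwoScheduleScoring(object):
    """Illustrative stand-in for a scoring function with two schedules."""

    def __init__(self, run_schedule, motif_schedule, run_motif_tool):
        self._run_schedule = run_schedule
        self._motif_schedule = motif_schedule
        self._run_motif_tool = run_motif_tool
        self._cached = None

    def run_in_iteration(self, i):
        # Normal schedule: does this function score in iteration i?
        return self._run_schedule(i)

    def motif_in_iteration(self, i):
        # Motif schedule: is the external motif tool rerun in iteration i?
        return self._motif_schedule(i)

    def compute(self, iteration):
        if not self.run_in_iteration(iteration):
            return None
        if self.motif_in_iteration(iteration):
            # Expensive path: rerun the tool and remember the result.
            self._cached = self._run_motif_tool(iteration)
        # Cheap path: reuse whatever the last motif run produced.
        return self._cached


calls = []

def fake_tool(i):
    # Placeholder for a MEME or Weeder run; records when it was invoked.
    calls.append(i)
    return "motifs@%d" % i

scoring = TwoScheduleScoring(every_n(2), every_n(4), fake_tool)
results = [scoring.compute(i) for i in range(1, 8)]
```

In this sketch the tool runs only in iterations 1 and 5, while iterations 3 and 7 reuse the most recent result.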

Methods defined here:

__init__(self, id, organism, membership, ratios, seqtype, config_params=None)
creates a ScoringFunction

compute(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated, since it nests Motif and MEME runs

compute_force(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated, since it nests Motif and MEME runs

compute_pvalues(self, iteration_result, num_motifs, force)
Compute motif scores. The result is a dictionary from cluster -> (feature_id, pvalue) containing a sparse gene-to-pvalue mapping for each cluster. In order to influence the sequences that go into MEME, the user can specify a list of sequence filter functions that have the signature (seqs, feature_ids, distance) -> seqs. These filters are applied in the order they appear in the list.

last_cached(self)

motif_in_iteration(self, i)
TODO: change to an id that is not called 'MEME'

run_logs(self)

Methods inherited from cmonkey.scoring.ScoringFunctionBase:

check_requirements(self)
Give the scoring module an opportunity to check whether the requirements to run are all met

do_compute(self, iteration_result, ref_matrix=None)

gene_names(self)
returns the gene names

num_clusters(self)
returns the number of clusters

pickle_path(self)
returns the function-specific pickle-path

rows_for_cluster(self, cluster)
returns the rows for the specified cluster

run_in_iteration(self, i)

scaling(self, iteration)
returns the quantile normalization scaling for the specified iteration

set_score_means(self, iteration_result, matrix)
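
The compute_pvalues description above says that sequence filters share the signature (seqs, feature_ids, distance) -> seqs and are applied in list order. A minimal sketch of such a chain follows. It assumes seqs is a dictionary from feature id to sequence string and treats distance as an opaque value that is simply passed through. The helper apply_filters and both example filters are hypothetical and are not part of cmonkey.motif.

```python
def apply_filters(seqs, feature_ids, distance, filters):
    """Apply each filter in order; each one receives the previous output."""
    for sequence_filter in filters:
        seqs = sequence_filter(seqs, feature_ids, distance)
    return seqs


def keep_requested(seqs, feature_ids, distance):
    """Hypothetical filter: keep only sequences whose id was requested."""
    wanted = set(feature_ids)
    return {fid: seq for fid, seq in seqs.items() if fid in wanted}


def drop_short(seqs, feature_ids, distance):
    """Hypothetical filter: drop sequences shorter than 4 bases."""
    return {fid: seq for fid, seq in seqs.items() if len(seq) >= 4}


# Placeholder input data for illustration only.
seqs = {"g1": "ACGTAC", "g2": "AC", "g3": "TTTTGG"}
filtered = apply_filters(seqs, ["g1", "g2"], (-20, 150),
                         [keep_requested, drop_short])
```

Since every filter returns a new dictionary, the order of the list matters only for the result, not for the input, which is left untouched.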

class WeederRunner

    Wrapper around Weeder so we can use the multiprocessing module. The function basically runs Weeder on the specified set of sequences, converts its output to a MEME output file and runs MAST on the MEME output to generate a MEME run result.

Methods defined here:

__call__(self, params)
call the runner like a function

__init__(self, meme_suite, config_params, remove_tempfiles=True)
create a runner object
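
The combination of __init__ and __call__ above is the usual callable-object pattern: configuration is captured once at construction, and the instance can then be handed to a map-style function that passes one parameter object per call. The sketch below illustrates only that pattern. EchoRunner is a hypothetical stand-in that does not invoke Weeder, MEME or MAST, and the dictionary-shaped params are an assumption.

```python
class EchoRunner(object):
    """Hypothetical stand-in for a tool runner; it does not call Weeder."""

    def __init__(self, tool_name, remove_tempfiles=True):
        # Configuration is captured once, at construction time.
        self.tool_name = tool_name
        self.remove_tempfiles = remove_tempfiles

    def __call__(self, params):
        # A real runner would launch the external tool here; this sketch
        # only reports which cluster it was asked to process.
        return "%s processed cluster %d with %d sequences" % (
            self.tool_name, params["cluster"], len(params["seqs"]))


runner = EchoRunner("weeder")
jobs = [{"cluster": 1, "seqs": ["ACGT", "TTGA"]},
        {"cluster": 2, "seqs": ["GGCC"]}]

# Built-in map is used so the sketch runs anywhere; a multiprocessing
# pool's map(runner, jobs) takes the same two arguments.
results = list(map(runner, jobs))
```

Keeping all per-job data in the single params argument is what lets the same object be dispatched across worker processes.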

class WeederScoringFunction(MotifScoringFunctionBase)

    Motif scoring function that runs Weeder instead of MEME

Method resolution order:

WeederScoringFunction

MotifScoringFunctionBase

cmonkey.scoring.ScoringFunctionBase

Methods defined here:

__init__(self, organism, membership, ratios, config_params=None)
creates a scoring function

check_requirements(self)

meme_runner(self)
returns the MEME runner object

Methods inherited from MotifScoringFunctionBase:

compute(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated, since it nests Motif and MEME runs

compute_force(self, iteration_result, ref_matrix=None)
override base class compute() method, behavior is more complicated, since it nests Motif and MEME runs

compute_pvalues(self, iteration_result, num_motifs, force)
Compute motif scores. The result is a dictionary from cluster -> (feature_id, pvalue) containing a sparse gene-to-pvalue mapping for each cluster. In order to influence the sequences that go into MEME, the user can specify a list of sequence filter functions that have the signature (seqs, feature_ids, distance) -> seqs. These filters are applied in the order they appear in the list.

last_cached(self)

motif_in_iteration(self, i)
TODO: change to an id that is not called 'MEME'

run_logs(self)

Methods inherited from cmonkey.scoring.ScoringFunctionBase:

do_compute(self, iteration_result, ref_matrix=None)

gene_names(self)
returns the gene names

num_clusters(self)
returns the number of clusters

pickle_path(self)
returns the function-specific pickle-path

rows_for_cluster(self, cluster)
returns the rows for the specified cluster

run_in_iteration(self, i)

scaling(self, iteration)
returns the quantile normalization scaling for the specified iteration

set_score_means(self, iteration_result, matrix)

Functions

cluster_seqs(params)
Retrieves the sequences for a cluster. Designed to run in pool.map()

compute_cluster_score(params)
This function computes the MEME score for a cluster

compute_mean_score(pvalue_matrix, membership, organism)
cluster-specific mean scores

get_remove_atgs_filter(distance)
returns a remove ATG filter

get_remove_low_complexity_filter(meme_suite)
Factory method that returns a low complexity filter

meme_json(run_result)

pvalues2matrix(all_pvalues, num_clusters, gene_names, reverse_map)
converts a map from {cluster: {feature: pvalue}} to a scoring matrix

unique_filter(seqs, feature_ids)
returns a map that contains only the keys that are in feature_ids and only contains unique sequences
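
Two of the helpers listed above, pvalues2matrix and unique_filter, are easier to picture with small data. The sketch below is an approximation written from the one-line descriptions only, so several details are assumptions: clusters are numbered from 1, reverse_map maps a feature id to a gene name, missing entries are filled with NaN, a plain list of lists stands in for the scoring matrix, and the first feature id wins when two sequences are identical. The function names are deliberately different from the real ones.

```python
def pvalues_to_rows(all_pvalues, num_clusters, gene_names, reverse_map):
    """Turn {cluster: {feature: pvalue}} into a genes x clusters table."""
    row_index = {gene: i for i, gene in enumerate(gene_names)}
    # Assumption: cells without a p-value are marked with NaN.
    matrix = [[float("nan")] * num_clusters for _ in gene_names]
    for cluster, feature_pvalues in all_pvalues.items():
        for feature_id, pvalue in feature_pvalues.items():
            gene = reverse_map.get(feature_id, feature_id)
            if gene in row_index:
                # Assumption: cluster numbers start at 1.
                matrix[row_index[gene]][cluster - 1] = pvalue
    return matrix


def unique_sequences(seqs, feature_ids):
    """Keep requested feature ids only, dropping repeated sequences."""
    seen = set()
    result = {}
    for feature_id in feature_ids:
        seq = seqs.get(feature_id)
        if seq is not None and seq not in seen:
            seen.add(seq)
            result[feature_id] = seq
    return result


# Placeholder data for illustration only.
all_pvalues = {1: {"f1": 0.01}, 2: {"f1": 0.5, "f2": 0.02}}
matrix = pvalues_to_rows(all_pvalues, 2, ["geneA", "geneB"],
                         {"f1": "geneA", "f2": "geneB"})

unique = unique_sequences({"f1": "ACGT", "f2": "ACGT", "f3": "TTTT"},
                          ["f1", "f2"])
```

The sparse per-cluster dictionaries become one dense table with a row per gene and a column per cluster, which is the shape a scoring matrix needs.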

Data

MEMBERSIP = None
ORGANISM = None
SEQUENCE_FILTERS = None
