evcouplings.utils package¶
evcouplings.utils.app module¶
evcouplings.utils.batch module¶
Looping through batches of jobs (former submit_job.py and buildali loop)
- Authors:
- Benjamin Schubert, Thomas A. Hopf
-
class
evcouplings.utils.batch.
AClusterSubmitter
[source]¶ Bases:
evcouplings.utils.batch.ASubmitter
Abstract subclass of a cluster submitter
-
cancel
(command)[source]¶ Consumes a list of job IDs and tries to cancel them
Parameters: command (Command) – The Command object to cancel Returns: Whether the job was canceled Return type: bool
-
cancel_command
¶
-
db
¶ The persistent DB to keep track of all submitted jobs and their status
Returns: The Persistent DB Return type: PersistentDict
-
job_id_pattern
¶
-
monitor
(command)[source]¶ Returns the status of the consumed command
Parameters: command (Command) – The command object whose status is inquired Returns: The status of the Command Return type: Enum(Status)
-
monitor_command
¶
-
resource_flags
¶
-
-
class
evcouplings.utils.batch.
APluginRegister
(name, bases, nmspc)[source]¶ Bases:
abc.ABCMeta
This class allows automatic registration of new plugins.
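A minimal sketch of how a registering metaclass of this kind can work. This is not the library's implementation: the class names `PluginRegister`, `Submitter`, and `EchoSubmitter` are hypothetical, and keying the registry on a `name` class attribute is an assumption made for illustration.

```python
from abc import ABCMeta, abstractmethod


class PluginRegister(ABCMeta):
    """Hypothetical metaclass: records each class that defines `name`."""

    def __init__(cls, name, bases, nmspc):
        super().__init__(name, bases, nmspc)
        # The first class created with this metaclass owns the registry;
        # all subclasses see (and extend) the same dictionary.
        if not hasattr(cls, "registry"):
            cls.registry = {}
        # Assumption: only classes that declare `name` in their own body
        # are treated as plugins.
        if "name" in nmspc:
            cls.registry[nmspc["name"]] = cls


class Submitter(metaclass=PluginRegister):
    """Abstract interface; not registered because it defines no `name`."""

    @abstractmethod
    def submit(self, command):
        ...


class EchoSubmitter(Submitter):
    """Concrete plugin; registered automatically on class creation."""

    name = "echo"

    def submit(self, command):
        return "echo: " + command


# Look up a plugin by key, as one would with a registry such as
# {'local': ..., 'lsf': ..., 'sge': ..., 'slurm': ...}
submitter = Submitter.registry["echo"]()
```

With this pattern, defining a new subclass is enough to make it discoverable by key; no central list has to be edited.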
-
class
evcouplings.utils.batch.
ASubmitter
[source]¶ Bases:
object
Interface for all submitters
-
cancel
(command)[source]¶ Consumes a list of job IDs and tries to cancel them
Parameters: command (Command) – The Command object to cancel Returns: Whether the job was canceled Return type: bool
-
isBlocking
¶ Indicator whether the submitter is blocking or not
Returns: whether submitter blocks by calling join or not Return type: bool
-
monitor
(command)[source]¶ Returns the status of the consumed command
Parameters: command (Command) – The command object whose status is inquired Returns: The status of the Command Return type: Enum(Status)
-
registry
= {'local': <class 'evcouplings.utils.batch.LocalSubmitter'>, 'lsf': <class 'evcouplings.utils.batch.LSFSubmitter'>, 'sge': <class 'evcouplings.utils.batch.SGESubmitter'>, 'slurm': <class 'evcouplings.utils.batch.SlurmSubmitter'>}¶
-
-
class
evcouplings.utils.batch.
Command
(command, name=None, environment=None, workdir=None, resources=None)[source]¶ Bases:
object
Wrapper around the command parameters needed to execute a script
-
class
evcouplings.utils.batch.
EJob
[source]¶ Bases:
enum.Enum
An enumeration.
-
CANCEL
= 2¶
-
MONITOR
= 1¶
-
PID
= 5¶
-
STOP
= 3¶
-
SUBMIT
= 0¶
-
UPDATE
= 4¶
-
-
evcouplings.utils.batch.
EResource
¶ alias of
evcouplings.utils.batch.Enum
-
evcouplings.utils.batch.
EStatus
¶ alias of
evcouplings.utils.batch.Enum
-
class
evcouplings.utils.batch.
LSFSubmitter
(blocking=False, db_path=None)[source]¶ Bases:
evcouplings.utils.batch.AClusterSubmitter
Implements an LSF submitter
-
cancel_command
¶
-
db
¶ The persistent DB to keep track of all submitted jobs and their status
Returns: The Persistent DB Return type: PersistentDict
-
isBlocking
¶ Indicator whether the submitter is blocking or not
Returns: whether submitter blocks by calling join or not Return type: bool
-
job_id_pattern
¶
-
monitor_command
¶
-
resource_flags
¶
-
submit_command
¶
-
-
class
evcouplings.utils.batch.
LocalSubmitter
(blocking=True, db_path=None, ncpu=1)[source]¶ Bases:
evcouplings.utils.batch.ASubmitter
-
cancel
(command)[source]¶ Consumes a list of job IDs and tries to cancel them
Parameters: command (Command) – The Command object to cancel Returns: Whether the job was canceled Return type: bool
-
isBlocking
¶ Indicator whether the submitter is blocking or not
Returns: whether submitter blocks by calling join or not Return type: bool
-
-
class
evcouplings.utils.batch.
SGESubmitter
(blocking=False, db_path=None)[source]¶ Bases:
evcouplings.utils.batch.AClusterSubmitter
Implements an SGE submitter
-
cancel_command
¶
-
db
¶ The persistent DB to keep track of all submitted jobs and their status
Returns: The Persistent DB Return type: PersistentDict
-
isBlocking
¶ Indicator whether the submitter is blocking or not
Returns: whether submitter blocks by calling join or not Return type: bool
-
job_id_pattern
¶
-
monitor_command
¶
-
resource_flags
¶
-
submit_command
¶
-
-
class
evcouplings.utils.batch.
SlurmSubmitter
(blocking=False, db_path=None)[source]¶ Bases:
evcouplings.utils.batch.AClusterSubmitter
Implements a Slurm submitter
-
cancel_command
¶
-
db
¶ The persistent DB to keep track of all submitted jobs and their status
Returns: The Persistent DB Return type: PersistentDict
-
isBlocking
¶ Indicator whether the submitter is blocking or not
Returns: whether submitter blocks by calling join or not Return type: bool
-
job_id_pattern
¶
-
monitor_command
¶
-
resource_flags
¶
-
submit_command
¶
-
evcouplings.utils.calculations module¶
General calculation functions.
- Authors:
- Thomas A. Hopf
-
evcouplings.utils.calculations.
dihedral_angle
(p0, p1, p2, p3)[source]¶ Compute dihedral angle given four points
Adapted from the following source: http://stackoverflow.com/questions/20305272/dihedral-torsion-angle-from-four-points-in-cartesian-coordinates-in-python (answer by user Praxeolitic)
Parameters: - p0 (np.array) – Coordinates of first point
- p1 (np.array) – Coordinates of second point
- p2 (np.array) – Coordinates of third point
- p3 (np.array) – Coordinates of fourth point
Returns: Dihedral angle (in radians)
Return type: numpy.float
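The projection approach from the cited Stack Overflow answer can be sketched without NumPy as follows. This is a standalone illustration, not the library code: `dihedral_sketch` and the small vector helpers are hypothetical names, and points are plain 3-tuples rather than np.array.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _scale(a, s):
    return tuple(x * s for x in a)


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_sketch(p0, p1, p2, p3):
    """Dihedral angle (radians) defined by four points."""
    b0 = _sub(p0, p1)  # vector from p1 back to p0
    b1 = _sub(p2, p1)  # central bond, the rotation axis
    b2 = _sub(p3, p2)  # vector from p2 on to p3

    # Normalize the axis so projections onto it are simple dot products
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))

    # Remove the components of b0 and b2 that lie along the axis
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))

    # Angle between v and w in the plane perpendicular to the axis
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.atan2(y, x)


cis = dihedral_sketch((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_sketch((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

Using `atan2` rather than `acos` keeps the sign of the angle and avoids numerical trouble near 0 and pi.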
-
evcouplings.utils.calculations.
entropy
(X, normalize=False)[source]¶ Calculate entropy of distribution
Parameters: - X (np.array) – Vector for which entropy will be calculated
- normalize – Rescale entropy to range from 0 (“variable”, “flat”) to 1 (“conserved”)
Returns: Entropy of X
Return type:
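The effect of normalize can be illustrated with a small standalone sketch. It assumes Shannon entropy with a base-2 logarithm and the rescaling 1 - H / log2(n), where n is the length of the vector; both are assumptions chosen to match the stated range (0 for a flat distribution, 1 for a fully conserved one), not details confirmed from the library source. `entropy_sketch` is a hypothetical name.

```python
import math


def entropy_sketch(freqs, normalize=False):
    """Entropy of a frequency vector (assumed: Shannon entropy in bits)."""
    # Zero frequencies contribute nothing (limit of p * log p as p -> 0)
    h = -sum(p * math.log2(p) for p in freqs if p > 0)
    if normalize:
        # Assumed rescaling: divide by the maximum entropy log2(n) and
        # flip, so flat -> 0 and a single certain state -> 1
        return 1 - h / math.log2(len(freqs))
    return h


flat = [0.25, 0.25, 0.25, 0.25]
conserved = [1.0, 0.0, 0.0, 0.0]

raw_flat = entropy_sketch(flat)
norm_flat = entropy_sketch(flat, normalize=True)
norm_conserved = entropy_sketch(conserved, normalize=True)
```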
-
evcouplings.utils.calculations.
entropy_map
(model, normalize=True)[source]¶ Compute dictionary of positional entropies for single-site frequencies in a CouplingsModel
Parameters: - model (CouplingsModel) – Model for which entropy of sequence alignment will be computed (based on single-site frequencies f_i(A_i) contained in model)
- normalize (bool, default: True) – Normalize entropy to range 0 (variable) to 1 (conserved) instead of raw values
Returns: Map from positions in sequence (int) to entropy of column (float) in alignment
Return type:
-
evcouplings.utils.calculations.
entropy_vector
(model, normalize=True)[source]¶ Compute vector of positional entropies for single-site frequencies in a CouplingsModel
Parameters: - model (CouplingsModel) – Model for which entropy of sequence alignment will be computed (based on single-site frequencies f_i(A_i) contained in model)
- normalize (bool, default: True) – Normalize entropy to range 0 (variable) to 1 (conserved) instead of raw values
Returns: Vector of length model.L containing entropy for each position
Return type: np.array
-
evcouplings.utils.calculations.
median_absolute_deviation
(x, scale=1.4826)[source]¶ Compute median absolute deviation of a set of numbers (median of deviations from median)
Parameters: - x (list-like of float) – Numbers for which median absolute deviation will be computed
- scale (float, optional (default: 1.4826)) – Rescale median absolute deviation by this factor; default value is such that median absolute deviation will match regular standard deviation of Gaussian distribution
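As a worked illustration of the definition above (median of absolute deviations from the median, times a scale factor), here is a standalone sketch using only the standard library. `mad_sketch` is a hypothetical name, not the library function.

```python
from statistics import median


def mad_sketch(x, scale=1.4826):
    """Scaled median absolute deviation of a sequence of numbers."""
    values = list(x)
    center = median(values)
    # Median of the absolute distances to the median, then rescaled
    return scale * median(abs(v - center) for v in values)


data = [1, 2, 3, 4, 100]
unscaled = mad_sketch(data, scale=1)  # deviations are 2, 1, 0, 1, 97
scaled = mad_sketch(data)
```

The single outlier (100) barely affects the result, which is the usual reason to prefer this statistic over the standard deviation for noisy data.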
evcouplings.utils.config module¶
Configuration handling
Todo
switch ruamel.yaml to round trip loading to preserve order and comments?
- Authors:
- Thomas A. Hopf
-
exception
evcouplings.utils.config.
InvalidParameterError
[source]¶ Bases:
Exception
Exception for invalid parameter settings
-
exception
evcouplings.utils.config.
MissingParameterError
[source]¶ Bases:
Exception
Exception for missing parameters
-
evcouplings.utils.config.
check_required
(params, keys)[source]¶ Verify if required set of parameters is present in configuration
Parameters: - params (dict) – Dictionary with parameters
- keys (list-like) – Set of parameters that has to be present in params
Raises: MissingParameterError – If a required parameter is missing from params
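A minimal sketch of such a check, assuming that "present" means the key exists in the dictionary (whether a key mapped to None counts as present is not stated here and is left out). The exception class is redefined locally so the example is self-contained; `check_required_sketch` is a hypothetical name.

```python
class MissingParameterError(Exception):
    """Local stand-in for the exception documented above."""


def check_required_sketch(params, keys):
    """Raise if any of `keys` is absent from the `params` dictionary."""
    missing = [k for k in keys if k not in params]
    if missing:
        raise MissingParameterError(
            "Missing required parameters: {}".format(", ".join(missing))
        )


config = {"prefix": "output/run1", "sequence_id": "P12345"}

# Passes silently: both keys are present
check_required_sketch(config, ["prefix", "sequence_id"])

# Fails: "database" is not in the configuration
try:
    check_required_sketch(config, ["prefix", "database"])
    error_message = None
except MissingParameterError as e:
    error_message = str(e)
```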
-
evcouplings.utils.config.
iterate_files
(outcfg, subset=None)[source]¶ Generator function to iterate a list of file items in an outconfig
Parameters: Returns: Generator over tuples (file path, entry key, index). index will be None if this is a single file entry (i.e. ending with _file rather than _files).
Return type:
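The tuple format described above can be sketched as a small generator. This is an illustration under assumptions, not the library code: it assumes subset is a collection of entry keys to keep, and that entries with a None value are skipped. `iterate_files_sketch` and the example keys are hypothetical.

```python
def iterate_files_sketch(outcfg, subset=None):
    """Yield (file path, entry key, index) for file entries in a dict."""
    for key, value in outcfg.items():
        # Assumed meaning of subset: only these keys are considered
        if subset is not None and key not in subset:
            continue
        if key.endswith("_file"):
            # Single file entry: no index
            if value is not None:
                yield value, key, None
        elif key.endswith("_files"):
            # List of files: index is the position in the list
            for i, path in enumerate(value or []):
                yield path, key, i


outcfg = {
    "alignment_file": "run/align.a2m",
    "model_files": ["run/m1.model", "run/m2.model"],
    "num_sites": 120,       # not a file entry, ignored
    "optional_file": None,  # unset entry, skipped
}

all_files = list(iterate_files_sketch(outcfg))
only_models = list(iterate_files_sketch(outcfg, subset=["model_files"]))
```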
-
evcouplings.utils.config.
parse_config
(config_str, preserve_order=False)[source]¶ Parse a configuration string
Parameters: Returns: Configuration dictionary
Return type:
evcouplings.utils.constants module¶
Useful values and constants for all of package
- Authors:
- Thomas A. Hopf
evcouplings.utils.database module¶
evcouplings.utils.helpers module¶
Useful Python helpers
- Authors:
- Thomas A. Hopf, Benjamin Schubert
-
class
evcouplings.utils.helpers.
DefaultOrderedDict
(default_factory=None, **kwargs)[source]¶ Bases:
collections.OrderedDict
Source: http://stackoverflow.com/questions/36727877/inheriting-from-defaultddict-and-ordereddict Answer by http://stackoverflow.com/users/3555845/daniel
Maybe this one would be better? http://stackoverflow.com/questions/6190331/can-i-do-an-ordered-default-dict-in-python
-
class
evcouplings.utils.helpers.
PersistentDict
(filename, flag='c', mode=None, format='json', *args, **kwds)[source]¶ Bases:
dict
Persistent dictionary with an API compatible with shelve and anydbm.
The dict is kept in memory, so the dictionary operations run as fast as a regular dictionary.
Write to disk is delayed until close or sync (similar to gdbm’s fast mode).
Input file format is automatically discovered. Output file format is selectable between pickle, json, and csv. All three serialization formats are backed by fast C implementations.
-
class
evcouplings.utils.helpers.
Progressbar
(total_size, bar_length=60)[source]¶ Bases:
object
Progress bar for command line programs
Parameters:
-
evcouplings.utils.helpers.
find_segments
(data)[source]¶ Find consecutive number segments, based on Python 2.7 itertools recipe
Parameters: data (iterable) – Iterable in which to look for consecutive number segments (has to be in order)
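The itertools recipe referred to above groups items by the difference between their position and their value, which is constant within a run of consecutive numbers. A sketch, assuming the result is a list of (start, end) tuples with inclusive ends (the return format is not stated here); `find_segments_sketch` is a hypothetical name.

```python
from itertools import groupby


def find_segments_sketch(data):
    """Return (start, end) pairs for runs of consecutive integers."""
    segments = []
    # index - value stays constant while values increase by exactly 1
    for _, group in groupby(enumerate(data), lambda pair: pair[0] - pair[1]):
        run = [value for _, value in group]
        segments.append((run[0], run[-1]))
    return segments


segments = find_segments_sketch([1, 2, 3, 7, 8, 10])
```

The input has to be sorted for this to work, which matches the "has to be in order" requirement in the parameter description.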
-
evcouplings.utils.helpers.
range_overlap
(a, b)[source]¶ Source: http://stackoverflow.com/questions/2953967/built-in-function-for-computing-overlap-in-python
Function assumes that start < end for a and b
Note
Ends of range are not inclusive
Parameters: Returns: Length of overlap between ranges a and b
Return type:
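With non-inclusive ends and start < end for both ranges, the overlap length is the distance between the larger start and the smaller end, floored at zero. A standalone sketch, assuming each range is given as a (start, end) pair; `range_overlap_sketch` is a hypothetical name.

```python
def range_overlap_sketch(a, b):
    """Length of overlap between half-open ranges a and b."""
    # A negative difference means the ranges do not touch at all
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


partial = range_overlap_sketch((1, 5), (3, 8))    # shared part is [3, 5)
adjacent = range_overlap_sketch((1, 3), (3, 5))   # ends are not inclusive
contained = range_overlap_sketch((0, 10), (2, 4))
```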
-
evcouplings.utils.helpers.
render_template
(template_file, mapping)[source]¶ Render a template using jinja2 and substitute values from mapping
Parameters: Returns: Rendered template
Return type:
-
evcouplings.utils.helpers.
retry
(func, retry_max_number=None, retry_wait=None, exceptions=None, retry_action=None, fail_action=None)[source]¶ Retry to execute a function as often as requested
Parameters: - func (callable) – Function to be executed until successful
- retry_max_number (int, optional (default: None)) – Maximum number of retries. If None, will retry forever.
- retry_wait (int, optional (default: None)) – Number of seconds to wait before attempting retry
- exceptions (exception or tuple(exception)) – Single or tuple of exceptions to catch for retrying (any other exception will cause immediate fail)
- retry_action (callable) – Function to execute upon a retry
- fail_action – Function to execute upon final failure
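The parameters above suggest a loop of the following shape. This is a sketch under assumptions rather than the library's code: `retry_sketch` is a hypothetical name, exceptions defaults to Exception here (the documented default is None), and the order "retry_action, then wait" as well as re-raising the last exception after fail_action are illustrative choices.

```python
import time


def retry_sketch(func, retry_max_number=None, retry_wait=None,
                 exceptions=Exception, retry_action=None, fail_action=None):
    """Call func() until it succeeds or the retry budget is used up."""
    num_retries = 0
    while True:
        try:
            return func()
        except exceptions:
            # Out of retries: run the failure hook and re-raise
            if retry_max_number is not None and num_retries >= retry_max_number:
                if fail_action is not None:
                    fail_action()
                raise
            num_retries += 1
            if retry_action is not None:
                retry_action()
            if retry_wait is not None:
                time.sleep(retry_wait)


attempts = []


def flaky():
    """Fails on the first two calls, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ValueError("not yet")
    return "ok"


result = retry_sketch(flaky, retry_max_number=3, exceptions=ValueError)
```

Any exception type not listed in exceptions propagates immediately, matching the "immediate fail" behavior described for that parameter.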
evcouplings.utils.pipeline module¶
evcouplings.utils.summarize module¶
evcouplings.utils.system module¶
System-level calls to external tools, directory creation, etc.
- Authors:
- Thomas A. Hopf
-
exception
evcouplings.utils.system.
ExternalToolError
[source]¶ Bases:
Exception
Exception for failing external calculations
-
exception
evcouplings.utils.system.
ResourceError
[source]¶ Bases:
Exception
Exception for missing resources (files, URLs, …)
-
evcouplings.utils.system.
create_prefix_folders
(prefix)[source]¶ Create a directory tree contained in a prefix.
Parameters: prefix (str) – Prefix containing directory tree
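A prefix here combines a directory tree with a file name stem (for example output/align/protein), so only the directory part has to be created. A minimal sketch with the standard library; `create_prefix_folders_sketch` and the example paths are hypothetical.

```python
import os
import tempfile


def create_prefix_folders_sketch(prefix):
    """Create the directory part of a path prefix, if there is one."""
    dirname = os.path.dirname(prefix)
    # A bare prefix such as "protein" has no directory component
    if dirname != "":
        os.makedirs(dirname, exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    prefix = os.path.join(root, "output", "align", "protein")
    create_prefix_folders_sketch(prefix)
    # The folders exist, but the final component is left for file names
    folders_created = os.path.isdir(os.path.join(root, "output", "align"))
    stem_untouched = not os.path.exists(prefix)
```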
-
evcouplings.utils.system.
get
(url, output_path=None, allow_redirects=False)[source]¶ Download external resource
Parameters: Returns: r – Response object, use r.text to access text, r.json() to decode json, and r.content for raw bytestring
Return type: requests.models.Response
Raises:
-
evcouplings.utils.system.
get_urllib
(url, output_path)[source]¶ Download external resource to file using urllib. This function is intended for cases where get() implemented using requests cannot be used, e.g. for download from an FTP server.
Parameters:
-
evcouplings.utils.system.
insert_dir
(prefix, *dirs, rootname_subdir=True)[source]¶ Create new path by inserting additional directories into the folder tree of prefix (but keeping the filename prefix at the end).
Parameters: Returns: Extended path
Return type:
-
evcouplings.utils.system.
makedirs
(directories)[source]¶ Create directory subtree, some or all of the folders may already exist.
Parameters: directories (str) – Directory subtree to create
-
evcouplings.utils.system.
run
(cmd, stdin=None, check_returncode=True, working_dir=None, shell=False, env=None)[source]¶ Run external program as subprocess.
Parameters: - cmd (str or list of str) – Command (and optional command line arguments)
- stdin (str or byte sequence, optional (default: None)) – Input to be sent to STDIN of the process
- check_returncode (bool, optional (default=True)) – Verify if call had returncode == 0, otherwise raise ExternalToolError
- working_dir (str, optional (default: None)) – Change to this directory before running command
- shell (bool, optional (default: False)) – Invoke shell when calling subprocess (default: False)
- env (dict, optional (default: None)) – Use this environment for executing the subprocess
Returns: - int – Return code of process
- stdout – Byte string with stdout output
- stderr – Byte string of stderr output
Raises: ExternalToolError – If check_returncode is True and the return code of the process is not 0
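The documented behavior (return code plus captured stdout and stderr, with an optional check of the return code) can be sketched on top of the standard `subprocess.run`. This is an illustration, not the library's implementation: `run_sketch` is a hypothetical name, and the exception class is redefined locally to keep the example self-contained.

```python
import subprocess
import sys


class ExternalToolError(Exception):
    """Local stand-in for the exception documented above."""


def run_sketch(cmd, stdin=None, check_returncode=True,
               working_dir=None, shell=False, env=None):
    """Run a command and return (return code, stdout, stderr)."""
    # subprocess expects bytes on STDIN when no text mode is requested
    if isinstance(stdin, str):
        stdin = stdin.encode()

    proc = subprocess.run(
        cmd,
        input=stdin,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        cwd=working_dir,
        shell=shell,
        env=env,
    )

    if check_returncode and proc.returncode != 0:
        raise ExternalToolError(
            "Call failed with return code {}: {}".format(proc.returncode, cmd)
        )
    return proc.returncode, proc.stdout, proc.stderr


# Use the current Python interpreter as a portable "external tool"
return_code, stdout, stderr = run_sketch(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    stdin="hello",
)
```

Passing the command as a list with shell=False avoids shell quoting issues; shell=True is only needed when the command relies on shell features such as pipes.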
-
evcouplings.utils.system.
temp
()[source]¶ Create a temporary file
Returns: Path of temporary file Return type: str
-
evcouplings.utils.system.
tempdir
()[source]¶ Create a temporary directory
Returns: Path of temporary directory Return type: str
-
evcouplings.utils.system.
valid_file
(file_path)[source]¶ Verify if a file exists and is not empty.
Parameters: file_path (str) – Path to file to check Returns: True if file exists and is non-zero size, False otherwise. Return type: bool
-
evcouplings.utils.system.
verify_resources
(message, *args)[source]¶ Verify if a set of files exists and is not empty.
Parameters: - message (str) – Message to display with raised ResourceError
- *args (List of str) – Path(s) of file(s) to be checked
Raises: ResourceError
– If any of the resources does not exist or is empty
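The two checks above (valid_file and verify_resources) fit together naturally: the second applies the first to every path and raises on any failure. A standalone sketch with hypothetical `_sketch` names and a locally defined exception class; which paths are reported in the error message is an illustrative choice.

```python
import os
import tempfile


class ResourceError(Exception):
    """Local stand-in for the exception documented above."""


def valid_file_sketch(file_path):
    """True if the path is an existing file with non-zero size."""
    return os.path.isfile(file_path) and os.path.getsize(file_path) > 0


def verify_resources_sketch(message, *args):
    """Raise ResourceError if any given file is missing or empty."""
    invalid = [path for path in args if not valid_file_sketch(path)]
    if invalid:
        raise ResourceError("{}:\n{}".format(message, ", ".join(invalid)))


with tempfile.TemporaryDirectory() as folder:
    full = os.path.join(folder, "alignment.a2m")
    empty = os.path.join(folder, "empty.txt")
    with open(full, "w") as f:
        f.write(">seq\nACDE\n")
    open(empty, "w").close()

    full_ok = valid_file_sketch(full)
    empty_ok = valid_file_sketch(empty)

    try:
        verify_resources_sketch("Input files invalid", full, empty)
        verification_failed = False
    except ResourceError:
        verification_failed = True
```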
evcouplings.utils.update_database module¶
Command-line app to update the necessary databases
- Authors:
- Benjamin Schubert
-
evcouplings.utils.update_database.
download_ftp_file
(ftp_url, ftp_cwd, file_url, output_path, file_handling='wb', gziped=False, verbose=False)[source]¶ Downloads a gzip file from a remote FTP server and decompresses it on the fly into an output file
Parameters: - ftp_url (str) – the FTP server url
- ftp_cwd (str) – the FTP directory of the file to download
- file_url (str) – the file name that gets downloaded
- output_path (str) – the path to the output file on the local system
- file_handling (str) – the file handling mode (default: ‘wb’)
- verbose (bool) – determines whether a progressbar is printed