evcouplings.mutate package
evcouplings.mutate.calculations module
High-level mutation calculation functions for EVmutation
Todo
implement segment handling
Authors:
- Thomas A. Hopf
- Anna G. Green (generalization for multiple segments)
-
evcouplings.mutate.calculations.extract_mutations(mutation_string, offset=0, sep=', ')
Turns a string containing mutations of the format I100V into a list of tuples of the format (100, 'I', 'V'), i.e. (index, from, to).
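Parameters: - mutation_string – String containing the mutations, each of the format I100V
- offset – Value added to each index (default: 0)
- sep – Separator between mutations in the string (default: ', ')
Returns: List of tuples of the form (index + offset, from, to)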
Return type: list of tuples
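The parsing described above can be sketched in a few lines. This is a minimal stand-alone illustration, not the evcouplings implementation: the name `parse_mutations`, the regular expression, the choice to strip whitespace around each token, and raising `ValueError` on malformed input are all assumptions made for the example.

```python
import re

# Hypothetical stand-in for illustration only; not the library's code.
# One mutation = wild-type symbol, integer position, substituted symbol.
_MUTATION = re.compile(r"([A-Za-z])(-?\d+)([A-Za-z])")


def parse_mutations(mutation_string, offset=0, sep=","):
    """Parse e.g. "I100V, A5G" into (index + offset, from, to) tuples."""
    mutations = []
    for token in mutation_string.split(sep):
        # Assumption: whitespace around each mutation is not significant
        match = _MUTATION.fullmatch(token.strip())
        if match is None:
            # Assumption: malformed input is an error in this sketch
            raise ValueError(f"Invalid mutation: {token!r}")
        wt, pos, sub = match.groups()
        mutations.append((int(pos) + offset, wt, sub))
    return mutations


print(parse_mutations("I100V, A5G"))
print(parse_mutations("I100V", offset=-1))
```

The offset is useful when the numbering in the mutation strings differs from the numbering of the model by a constant shift.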
-
evcouplings.mutate.calculations.predict_mutation_table(model, table, output_column='prediction_epistatic', mutant_column='mutant', hamiltonian='full', segment=None)
Predicts all mutants in a dataframe and adds predictions as a new column.
If mutant_column is None, the dataframe index is used, otherwise the given column.
Mutations that cannot be calculated with the model object (e.g. positions not covered by the alignment, or invalid substitutions) are set to NaN.
Parameters: - model (CouplingsModel) – CouplingsModel instance used to compute mutation effects
- table (pandas.DataFrame) – DataFrame with mutants to which delta of statistical energy will be added
- mutant_column (str) – Name of column in table that contains mutants
- output_column (str) – Name of column in returned dataframe that will contain computed effects
- hamiltonian ({"full", "couplings", "fields"}, default: "full") – Use the full Hamiltonian of the exponential model (default), or only the couplings or only the fields, for the statistical energy calculation.
- segment (str, default: None) – Specify a segment identifier to use for the positions in the mutation table. This will only be used if the mutation table doesn’t already have a segments column.
Returns: Dataframe with added column (output_column) that contains computed mutation effects
Return type: pandas.DataFrame
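To make the hamiltonian options concrete, here is a toy sketch of the change in statistical energy for a single substitution in a pairwise model with fields h_i(a) and couplings J_ij(a, b). It does not use CouplingsModel; all parameter values, the dictionary layout, and the name `delta_energy` are made up for illustration, and only single mutants are handled.

```python
import math

# Toy model (all numbers are invented): three positions, alphabet {A, C}.
TARGET = {10: "A", 11: "C", 12: "A"}  # position -> wild-type symbol
FIELDS = {  # h_i(a)
    (10, "A"): 0.5, (10, "C"): -0.2,
    (11, "A"): 0.1, (11, "C"): 0.3,
    (12, "A"): 0.0, (12, "C"): -0.4,
}
COUPLINGS = {  # J_ij(a, b), stored for i < j; missing entries count as 0
    (10, 11, "A", "C"): 0.6, (10, 11, "C", "C"): -0.1,
    (10, 12, "A", "A"): 0.2, (10, 12, "C", "A"): 0.4,
}


def coupling(i, a, j, b):
    """Look up J_ij(a, b) regardless of the order of i and j."""
    if i > j:
        i, a, j, b = j, b, i, a
    return COUPLINGS.get((i, j, a, b), 0.0)


def delta_energy(pos, wt, sub, hamiltonian="full"):
    """Energy of the mutant minus energy of the wild type (single mutant)."""
    # Not covered by the model, wrong wild-type symbol, or unknown symbol
    if TARGET.get(pos) != wt or (pos, sub) not in FIELDS:
        return math.nan
    d_fields = FIELDS[(pos, sub)] - FIELDS[(pos, wt)]
    # Only terms that involve the mutated position change
    d_couplings = sum(
        coupling(pos, sub, j, b) - coupling(pos, wt, j, b)
        for j, b in TARGET.items()
        if j != pos
    )
    if hamiltonian == "fields":
        return d_fields
    if hamiltonian == "couplings":
        return d_couplings
    if hamiltonian == "full":
        return d_fields + d_couplings
    raise ValueError(f"Unknown hamiltonian: {hamiltonian!r}")


for mutant in [(10, "A", "C"), (99, "A", "C")]:
    print(mutant, delta_energy(*mutant))
```

The second mutant lies outside the toy model, so the sketch returns NaN for it, mirroring the behaviour described for mutations that cannot be calculated.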
-
evcouplings.mutate.calculations.single_mutant_matrix(model, output_column='prediction_epistatic', exclude_self_subs=True)
Create table with all possible single substitutions of target sequence in CouplingsModel object.
Parameters: - model (CouplingsModel) – Model that will be used to predict single mutants
- output_column (str, default: "prediction_epistatic") – Name of column in Dataframe that will contain predictions
- exclude_self_subs (bool, default: True) – Exclude self-substitutions (e.g. A100A) from results
Returns: DataFrame with predictions for all single mutants
Return type: pandas.DataFrame
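The enumeration behind such a table can be sketched without a model. The snippet below only lists the mutant labels; the function name `all_single_mutants`, the `first_index` parameter, and the 20-letter amino acid alphabet are assumptions for the example, whereas the real function takes the sequence and alphabet from the CouplingsModel and also computes a prediction for each mutant.

```python
# Assumption: standard 20 amino acids, no gap symbol
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def all_single_mutants(sequence, first_index=1, exclude_self_subs=True):
    """List every single substitution of `sequence` as e.g. "M1A"."""
    mutants = []
    for offset, wt in enumerate(sequence):
        pos = first_index + offset
        for sub in ALPHABET:
            if exclude_self_subs and sub == wt:
                continue  # skip self-substitutions such as A100A
            mutants.append(f"{wt}{pos}{sub}")
    return mutants


mutants = all_single_mutants("MK")
print(len(mutants), mutants[:3])
```

With self-substitutions excluded, a sequence of length L over this alphabet yields 19 × L mutants; including them yields 20 × L.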
-
evcouplings.mutate.calculations.split_mutants(x, mutant_column='mutant')
Splits mutation strings into individual columns in a DataFrame (wild-type symbol(s), position(s), substitution(s), number of mutations). This is helpful, for example, when computing average effects per position using pandas groupby() operations.
Parameters: - x (pandas.DataFrame) – Table with mutants
- mutant_column (str, default: "mutant") – Column which contains mutants, set to None to use index of DataFrame
Returns: DataFrame with added columns “num_subs”, “pos”, “wt” and “subs” that contain the number of mutations, and split mutation strings (if higher-order mutations, symbols/numbers are comma-separated)
Return type: pandas.DataFrame
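A plain-Python sketch of the same idea follows: split each mutant string into the four fields named above, then average effects per position. It is an illustration under assumptions, not the library code: the helper `split_mutant`, the comma as separator for higher-order mutants, the effect values, and the use of a dictionary in place of a DataFrame with groupby() are all choices made for the example.

```python
import re
from collections import defaultdict
from statistics import mean

_MUTATION = re.compile(r"([A-Za-z])(-?\d+)([A-Za-z])")


def split_mutant(mutant, sep=","):
    """Split e.g. "A10C,C11A" into num_subs, pos, wt and subs fields."""
    wt, pos, subs = [], [], []
    for token in mutant.split(sep):
        match = _MUTATION.fullmatch(token.strip())
        if match is None:
            raise ValueError(f"Invalid mutation: {token!r}")
        wt.append(match.group(1))
        pos.append(match.group(2))
        subs.append(match.group(3))
    # Higher-order mutants keep comma-separated symbols and numbers
    return {
        "num_subs": len(pos),
        "pos": ",".join(pos),
        "wt": ",".join(wt),
        "subs": ",".join(subs),
    }


# Invented effect values, keyed by mutant string
effects = {"A10C": -1.2, "A10G": -0.4, "C11A": 0.3, "A10C,C11A": -0.9}

# Average effect per position, using single mutants only
by_pos = defaultdict(list)
for mutant, effect in effects.items():
    row = split_mutant(mutant)
    if row["num_subs"] == 1:
        by_pos[row["pos"]].append(effect)
mean_effect = {pos: mean(values) for pos, values in by_pos.items()}

print(split_mutant("A10C,C11A"))
print(mean_effect)
```

In this sketch positions stay strings so that single and higher-order mutants share one representation; converting single-mutant positions to integers is a reasonable alternative.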
evcouplings.mutate.protocol module
Sequence statistical energy and mutation effect computation protocols
Authors:
- Thomas A. Hopf
- Anna G. Green (complex)
-
evcouplings.mutate.protocol.complex(**kwargs)
Protocol: Mutation effect prediction and visualization for protein complexes
Parameters: kwargs (mandatory keyword arguments) – See the list of keys passed to check_required in the source code
Returns: outcfg – Output configuration of the pipeline, including the following fields:
- mutation_matrix_file
- [mutation_dataset_predicted_file]
Return type: dict
-
evcouplings.mutate.protocol.run(**kwargs)
Run mutation protocol
Parameters: kwargs (mandatory keyword arguments):
- protocol: EC protocol to run
- prefix: Output prefix for all generated files
Returns: outcfg – Output configuration of stage (see individual protocol for fields)
Return type: dict
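One common way to implement such an entry point is a lookup table from protocol names to functions. The sketch below shows that pattern only and makes no claim about how evcouplings implements it: the bodies of `standard` and `complex_`, this simplified `check_required` helper, the `PROTOCOLS` table, and the output file name are all hypothetical.

```python
def check_required(kwargs, keys):
    """Hypothetical helper: fail early if mandatory settings are missing."""
    missing = [key for key in keys if key not in kwargs]
    if missing:
        raise KeyError(f"Missing required settings: {missing}")


def standard(**kwargs):
    """Placeholder for a monomer protocol."""
    check_required(kwargs, ["prefix"])
    # Made-up file name, for illustration only
    return {"mutation_matrix_file": kwargs["prefix"] + "_matrix.csv"}


def complex_(**kwargs):
    """Placeholder for a complex protocol (named to avoid the builtin)."""
    check_required(kwargs, ["prefix"])
    return {"mutation_matrix_file": kwargs["prefix"] + "_matrix.csv"}


# Hypothetical mapping from protocol name to implementation
PROTOCOLS = {"standard": standard, "complex": complex_}


def run(**kwargs):
    """Dispatch to the protocol named in kwargs["protocol"]."""
    check_required(kwargs, ["protocol"])
    name = kwargs["protocol"]
    if name not in PROTOCOLS:
        raise ValueError(f"Unknown protocol: {name!r}")
    return PROTOCOLS[name](**kwargs)


outcfg = run(protocol="standard", prefix="output/test")
print(outcfg)
```

Keeping the table in one place means a new protocol only needs a function and one entry in the mapping.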
-
evcouplings.mutate.protocol.standard(**kwargs)
Protocol: Mutation effect calculation and visualization for protein monomers
TODO: eventually merge with complexes to make a protocol agnostic to the number of segments
Parameters: kwargs (mandatory keyword arguments) – See the list of keys passed to check_required in the source code
Returns: outcfg – Output configuration of the pipeline, including the following fields:
- mutation_matrix_file
- [mutation_dataset_predicted_file]
Return type: dict