#!/usr/bin/env python # coding: utf-8 # # *This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta.notebooks); # content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).* # # < [*De Novo* Parametric Backbone Design](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/06.06-Introduction-to-Parametric-backbone-design.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [**Point Mutation Scan**](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/06.08-Point-Mutation-Scan.ipynb) >

# # *De Novo* Protein Design with PyRosetta # Keywords: # # ## Overview # **Make sure you are in the directory with the `.pdb` files:** # # `cd google_drive/MyDrive/student-notebooks/` # *Warning*: This notebook uses `pyrosetta.distributed.viewer` code, which runs in `jupyter notebook` and might not run if you're using `jupyterlab`. # *Note:* This Jupyter notebook requires the PyRosetta distributed layer which is obtained by building PyRosetta with the `--serialization` flag or installing PyRosetta from the RosettaCommons conda channel (for more information, visit: http://www.pyrosetta.org/dow). # In[ ]: get_ipython().system('pip install pyrosettacolabsetup') import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta() import pyrosetta; pyrosetta.init() import Bio.Data.IUPACData as IUPACData import Bio.SeqUtils import logging logging.basicConfig(level=logging.INFO) import os import pyrosetta import pyrosetta.distributed import pyrosetta.distributed.viewer as viewer import site import sys # Initialize PyRosetta: # In[ ]: flags = """ -linmem_ig 10 -ignore_unrecognized_res 1 -mute core.select.residue_selector.SecondaryStructureSelector -mute core.select.residue_selector.PrimarySequenceNeighborhoodSelector -mute protocols.DsspMover """ pyrosetta.distributed.init(flags) # Let's setup the pose of a *de novo* helical bundle from the PDB for downstream design: https://www.rcsb.org/structure/5J0J # In[ ]: start_pose = pyrosetta.io.pose_from_file("inputs/5J0J.clean.pdb") pose = start_pose.clone() # ## Design Strategy # # Minimize the crystal structure coordinates, then convert chain "A" to poly-alanine, and then perform one-sided protein-protein interface design designing chain A while only re-packing the homotrimer interface residues of chains B and C! Therefore, we are designing a homotrimer into a heterotrimer. Furthermore, prevent backbone torsions from minimizing and only minimize the side-chains of the homotrimer interface and all of chain A using the `FastDesign` mover! After design, repack and minimize all side-chains. # Prior to and after design, we want to relax the protein with a scorefunction that packs the rotamers and minimizes the side-chain degrees of freedom while optimizing for realistic energies. If you were to allow backbone minimization (which you may optionally choose to), we would want use a Cartesian scorefunction (in this case, `ref2015_cart.wts`) which automatically sets `cart_bonded` scoreterm to a weight of 1.0, which helps to close chain breaks in the backbone. Similarly, you might want to also turn on the `coordinate_constraint` scoreterm to penalize deviations of the backbone coordinates from their initial coordinates during minimization. In this tutorial, we demonstrate that concept but will prevent backbone torsions from being minimized (however, feel free to turn on backbone minimization!): # In[ ]: relax_scorefxn = pyrosetta.create_score_function("ref2015_cart.wts") relax_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.coordinate_constraint, 1.0) print("The starting pose total_score is {}".format(relax_scorefxn(start_pose))) # For *design*, if we allowed backbone minimzation we again would turn on the `coordinate_constraint` scoreterm to penalize deviations of the backbone coordinates from their initial coordinates during minimization. Additionally, we will use "non-pairwise decomposable" scoreterms to guide the packer trajectories (i.e. fixed backbone sequence design trajectories) in favor of our hypothetical design requirements to solve our biological problem. In this tutorial, we will make use of the following additional scoreterms during design: # - `aa_composition`: This scoring term is intended for use during design, to penalize deviations from a desired residue type composition. Applies to any amino acid composition requirements specified by the user within a ResidueSelector. This scoreterm also applies to the `AddHelixSequenceConstraints` mover which sets up ideal sequence constraints for each helix in a pose or in a selection. # - `voids_penalty`: This scoring term is intended for use during design, to penalize buried voids or cavities and to guide the packer to design solutions in which all buried volume is filled with side-chains. # - `aa_repeat`: This wholebody scoring term is intended for use during design, to penalize long stretches in which the same residue type repeats over and over (e.g. poly-Q sequences). # - `buried_unsatisfied_penalty`: This scoring term is intended for use during design, to provide a penalty for buried hydrogen bond donors or acceptors that are unsatisfied. # - `netcharge`: This scoring term is intended for use during design, to penalize deviations from a desired net charge in a pose or in a selection. # - `hbnet`: This scoring term is intended for use during design, to provide a bonus for hydrogen bond network formation. # In[ ]: design_scorefxn = pyrosetta.create_score_function("ref2015_cart.wts") design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.coordinate_constraint, 10.0) design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.aa_composition, 1.0) design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.voids_penalty, 0.25) design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.aa_repeat, 1.0) design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.buried_unsatisfied_penalty, 1.0) design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.hbnet, 1.0) design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.netcharge, 1.0) print("The starting pose total_score is {}".format(design_scorefxn(start_pose))) # By using the `relax_scorefxn` before and after the `design_scorefxn`, we ensure that these "non-pairwise decomposable" scoreterms are not forcing unrealistic rotamers that would otherwise not be held in place without these additional scoreterms. # Prior to any deviation from crystal structure coordinates, apply coordinate constraints to all of the backbone heavy atoms: # In[ ]: true_selector = pyrosetta.rosetta.core.select.residue_selector.TrueResidueSelector() # Select all residues # Apply a virtual root onto the pose to prevent large lever-arm effects while minimizing with coordinate constraints virtual_root = pyrosetta.rosetta.protocols.simple_moves.VirtualRootMover() virtual_root.set_removable(True) virtual_root.set_remove(False) virtual_root.apply(pose) # Construct the CoordinateConstraintGenerator coord_constraint_gen = pyrosetta.rosetta.protocols.constraint_generator.CoordinateConstraintGenerator() coord_constraint_gen.set_id("contrain_all_backbone_atoms!") coord_constraint_gen.set_ambiguous_hnq(False) coord_constraint_gen.set_bounded(False) coord_constraint_gen.set_sidechain(False) coord_constraint_gen.set_sd(1.0) # Sets a standard deviation of contrained atoms to (an arbitrary) 1.0 Angstroms RMSD. Set higher or lower for different results. coord_constraint_gen.set_ca_only(False) coord_constraint_gen.set_residue_selector(true_selector) # Apply the CoordinateConstraintGenerator using the AddConstraints mover add_constraints = pyrosetta.rosetta.protocols.constraint_generator.AddConstraints() add_constraints.add_generator(coord_constraint_gen) add_constraints.apply(pose) # Prior to design, minimize with the `FastRelax` mover to optimize the pose within the `relax_scorefxn` scorefunction. Note: this takes ~1min 10s # In[ ]: tf = pyrosetta.rosetta.core.pack.task.TaskFactory() tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline()) tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent()) tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides()) tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset( pyrosetta.rosetta.core.pack.task.operation.PreventRepackingRLT(), true_selector)) # Set to RestrictToRepackingRLT for slower/better results mm = pyrosetta.rosetta.core.kinematics.MoveMap() mm.set_bb(False) # Set to true if desired mm.set_chi(True) mm.set_jump(False) fast_relax = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn_in=relax_scorefxn, standard_repeats=1) # 2-5 repeats suggested for real applications fast_relax.cartesian(True) fast_relax.set_task_factory(tf) fast_relax.set_movemap(mm) fast_relax.minimize_bond_angles(True) fast_relax.minimize_bond_lengths(True) fast_relax.min_type("lbfgs_armijo_nonmonotone") fast_relax.ramp_down_constraints(False) if not os.getenv("DEBUG"): get_ipython().run_line_magic('time', 'fast_relax.apply(pose)') # Optionally, instead of running this you could reload the saved pose from a previously run trajectory: #pose = pyrosetta.pose_from_file("minimized_start_pose.pdb") # Let's check the delta `total_score` per residue after minimizing in the `relax_scorefxn` scorefunction: # In[ ]: if not os.getenv("DEBUG"): initial_score_res = relax_scorefxn(start_pose)/start_pose.size() final_score_res = relax_scorefxn(pose)/pose.size() delta_total_score_res = final_score_res - initial_score_res print("{0} kcal/(mol*res) - {1} kcal/(mol*res) = {2} kcal/(mol*res)".format(final_score_res, initial_score_res, delta_total_score_res)) # We can see that the crystal structure coordinates were not quite optimal according to the `relax_scorefxn` scorefunction. So which model is correct? # # By how many Angstroms RMSD did the backbone Cα atoms move? # In[ ]: YOUR-CODE-HERE # For downstream analysis, we want to save the pose: # In[ ]: minimized_start_pose = pose.clone() pyrosetta.dump_pdb(minimized_start_pose, "outputs/minimized_start_pose.pdb") # ## *De Novo* Protein Design # # Prior to designing chain A, first let's make chain A poly-alanine so that we can re-design the sidechains onto the backbone: # In[ ]: # Since we will do direct pose manipulation, first remove the constraints remove_constraints = pyrosetta.rosetta.protocols.constraint_generator.RemoveConstraints() remove_constraints.add_generator(coord_constraint_gen) remove_constraints.apply(pose) # Obtain chain A and convert it to poly-alanine # In[ ]: keep_chA = pyrosetta.rosetta.protocols.grafting.simple_movers.KeepRegionMover(res_start=str(pose.chain_begin(1)), res_end=str(pose.chain_end(1))) keep_chA.apply(pose) polyA_chA = pyrosetta.rosetta.protocols.pose_creation.MakePolyXMover(aa="ALA", keep_pro=0, keep_gly=0, keep_disulfide_cys=0) polyA_chA.apply(pose) # Obtain chains B and C # In[ ]: pose_chBC = minimized_start_pose.clone() keep_chBC = pyrosetta.rosetta.protocols.grafting.simple_movers.KeepRegionMover(res_start=str(pose_chBC.chain_begin(2)), res_end=str(pose_chBC.chain_end(3)-1)) keep_chBC.apply(pose_chBC) # Append chains B and C onto the poly-alanine version of chain A # In[ ]: pyrosetta.rosetta.core.pose.append_pose_to_pose(pose1=pose, pose2=pose_chBC, new_chain=True) # Pose is now considered to have only 2 chains. # In[ ]: pose.num_chains() # Let's re-establish that there are 3 chains. # In[ ]: switch_chains = pyrosetta.rosetta.protocols.simple_moves.SwitchChainOrderMover() switch_chains.chain_order("12") switch_chains.apply(pose) print(pose.pdb_info()) print("Now the number of chains = {}".format(pose.num_chains())) # Re-apply backbone coordinate constraints: # In[ ]: virtual_root.apply(pose) add_constraints.apply(pose) # Have a look at the new pose, chain A in which is ready to be designed! # Next, we need to apply certain movers with our design specifications to activate certain non-pairwise decomposable scoreterms in the `design_scorefxn` scorefunction, that will be implemented when we run the `FastDesign` mover (or any downstream mover that calls the packer). # # We will frequently be using the following residue selectors: # In[ ]: chain_A = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("A") chain_BC = pyrosetta.rosetta.core.select.residue_selector.NotResidueSelector(chain_A) # Apply the `AddCompositionConstraintMover` mover, which utilizes the `aa_composition` scoreterm. Applying the following mover to the pose imposes the design constraints that the ResidueSelector have 40% aliphatic or aromatic residues other than leucine (i.e. ALA, PHE, ILE, MET, PRO, VAL, TRP, or TYR), and 5% leucines. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/AACompositionEnergy # In[ ]: add_composition_constraint = pyrosetta.rosetta.protocols.aa_composition.AddCompositionConstraintMover() add_composition_constraint.create_constraint_from_file_contents(""" PENALTY_DEFINITION OR_PROPERTIES AROMATIC ALIPHATIC NOT_TYPE LEU FRACT_DELTA_START -0.05 FRACT_DELTA_END 0.05 PENALTIES 1 0 1 # The above two lines mean that if we're 5% below or 5% above the desired content, we get a 1-point penalty. FRACTION 0.4 # Forty percent aromatic or aliphatic, but not leucine BEFORE_FUNCTION CONSTANT AFTER_FUNCTION CONSTANT END_PENALTY_DEFINITION PENALTY_DEFINITION TYPE LEU DELTA_START -1 DELTA_END 1 PENALTIES 1 0 1 FRACTION 0.05 # Five percent leucine BEFORE_FUNCTION CONSTANT AFTER_FUNCTION CONSTANT END_PENALTY_DEFINITION """) add_composition_constraint.add_residue_selector(chain_A) add_composition_constraint.apply(pose) # Apply the `AddHelixSequenceConstraints` mover, which utilizes the `aa_composition` scoreterm. By default, this mover adds five types of sequence constraints to the designable residues in each alpha helix in the pose. Any of these behaviours may be disabled or modified by invoking advanced options, but no advanced options need be set in most cases. The five types of sequence constraints are: # # - A strong sequence constraint requiring at least two negatively-charged residues in the first (N-terminal) three residues of each alpha-helix. # # - A strong sequence constraint requiring at least two positively-charged residues in the last (C-terminal) three residues of each alpha-helix. # # - A weak but strongly ramping sequence constraint penalizing helix-disfavoring residue types (by default, Asn, Asp, Ser, Gly, Thr, and Val) throughout each helix. (A single such residue is sometimes tolerated, but the penalty for having more than one residue in this category increases quadratically with the count of helix-disfavouring residues.) # # - A weak sequence constraint coaxing the helix to have 10% alanine. Because this constraint is weak, deviations from this value are tolerated, but this should prevent an excessive abundance of alanine residues. # # - A weak sequence constraint coaxing the helix to have at least 25% hydrophobic content. This constraint is also weak, so slightly less hydrophobic helices will be tolerated to some degree. Note that alanine is not considered to be "hydrophobic" within Rosetta. # # ### For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/AACompositionEnergy # In[ ]: add_helix_sequence_constraints = pyrosetta.rosetta.protocols.aa_composition.AddHelixSequenceConstraintsMover() add_helix_sequence_constraints.set_residue_selector(chain_A) add_helix_sequence_constraints.apply(pose) # Note: the `aa_repeat` scoreterm works out-of-the-box, and does not need to be applied to the pose to work, it just needs to have a weight of >0 in the scorefunction used by the packer. It imposes a penalty for each stretch of repeating amino acids, with the penalty value depending nonlinearly on the length of the repeating stretch. By default, 1- or 2-residue stretches incur no penalty, 3-residue stretches incur a penalty of +1, 4-residue stretches incur a penalty of +10, and 5-residue stretches or longer incur a penalty of +100. Since the term is sequence-based, it is really only useful for design -- that is, it will impose an identical penalty for a fixed-sequence pose, regardless its conformation. This also means that the term has no conformational derivatives: the minimizer ignores it completely. The term is not pairwise-decomposible, but has been made packer-compatible, so it can direct the sequence composition during a packer run. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/Repeat-stretch-energy # Similarly, the `voids_penalty` scoreterm does not need to be applied to the pose to work, it just needs to have a weight of >0 in the scorefunction used by the packer. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/VoidsPenaltyEnergy # Similarly, the `buried_unsatisfied_penalty` scoreterm does not need to be applied to the pose to work, it just needs to have a weight of >0 in the scorefunction used by the packer. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/BuriedUnsatPenalty # Similarly, the `hbnet` scoreterm does not need to be applied to the pose to work, it just needs to have a weight of >0 (ideally between 1.0 to 10.0) in the scorefunction used by the packer. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/HBNetEnergy # Apply the `AddNetChargeConstraintMover` mover, which utilizes the `netcharge` scoreterm. In this case, we require that the net charge in chain A must be exactly 0. # In[ ]: add_net_charge_constraint = pyrosetta.rosetta.protocols.aa_composition.AddNetChargeConstraintMover() add_net_charge_constraint.create_constraint_from_file_contents(""" DESIRED_CHARGE 0 #Desired net charge is zero. PENALTIES_CHARGE_RANGE -1 1 #Penalties are listed in the observed net charge range of -1 to +1. PENALTIES 1 0 1 #The penalties are 1 for an observed charge of -1, 0 for an observed charge of 0, and 1 for an observed charge of +1. BEFORE_FUNCTION QUADRATIC #Ramp quadratically for observed net charges of -2 or less. AFTER_FUNCTION QUADRATIC #Ramp quadratically for observed net charges of +2 or greater. """) add_net_charge_constraint.add_residue_selector(chain_A) add_net_charge_constraint.apply(pose) # Specify a custom `relaxscript` that optimizes the `ramp_repack_min` weights to prevent too many alanines from being designed (demonstrated at Pre-RosettaCON 2018): # Specify TaskOperations to be applied to chain A. In this case, let's use the latest LayerDesign implementation (Note: this is still being actively developed) using the XmlObjects class. # In[ ]: layer_design_task = pyrosetta.rosetta.protocols.rosetta_scripts.XmlObjects.create_from_string(""" """).get_task_operation("layer_design") # Also prepare a ResidueSelector for the heterotrimer interface within chains B and C, and the rest of chains B and C. We will use these with `RestrictAbsentCanonicalAASRLT`, `RestrictToRepackingRLT`, and `PreventRepackingRLT` TaskOperations: # In[ ]: interface = pyrosetta.rosetta.core.select.residue_selector.InterGroupInterfaceByVectorSelector() interface.group1_selector(chain_A) interface.group2_selector(chain_BC) chain_BC_interface = pyrosetta.rosetta.core.select.residue_selector.AndResidueSelector(interface, chain_BC) not_chain_BC_interface = pyrosetta.rosetta.core.select.residue_selector.NotResidueSelector(chain_BC_interface) chain_BC_not_interface = pyrosetta.rosetta.core.select.residue_selector.AndResidueSelector(not_chain_BC_interface, chain_BC) # For minimization, we also need the following ResidueSelector: # In[ ]: chain_A_and_BC_interface = pyrosetta.rosetta.core.select.residue_selector.OrResidueSelector(chain_A, chain_BC_interface) # Make sure each ResidueSelector selects the regions as desired (in the following case, the `ResidueSelector` `interface` is visualized: # In[ ]: view = viewer.init(pose) \ + viewer.setStyle() \ + viewer.setStyle(residue_selector=interface, colorscheme="greyCarbon") view() # Create TaskFactory to be used with the `FastRelax` mover (We use this instead of the FastDesign mover as the constructor allows us to set a relax script): # In[ ]: tf.clear() tf = pyrosetta.rosetta.core.pack.task.TaskFactory() tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline()) tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent()) tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides()) # Prevent repacking on chain_BC_not_interface tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset( pyrosetta.rosetta.core.pack.task.operation.PreventRepackingRLT(), chain_BC_not_interface)) # Repack only on chain_BC_interface tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset( pyrosetta.rosetta.core.pack.task.operation.RestrictToRepackingRLT(), chain_BC_interface)) # Enable design on chain_A aa_to_design = pyrosetta.rosetta.core.pack.task.operation.RestrictAbsentCanonicalAASRLT() aa_to_design.aas_to_keep("ACDEFGHIKLMNPQRSTVWY") tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset( aa_to_design, chain_A)) # Apply layer design tf.push_back(layer_design_task) # Convert the task factory into a PackerTask packer_task = tf.create_task_and_apply_taskoperations(pose) # View the PackerTask print(packer_task) # The PackerTask looks as intended. Now setup the `MoveMapFactory`: # In[ ]: # Set up a MoveMapFactory mmf = pyrosetta.rosetta.core.select.movemap.MoveMapFactory() mmf.all_bb(setting=False) # Set to true if needed mmf.all_bondangles(setting=False) mmf.all_bondlengths(setting=False) mmf.all_chi(setting=True) mmf.all_jumps(setting=False) mmf.set_cartesian(setting=False) # Set movemap actions to turn on or off certain torsions, overriding the above defaults enable = pyrosetta.rosetta.core.select.movemap.move_map_action.mm_enable disable = pyrosetta.rosetta.core.select.movemap.move_map_action.mm_disable # Set custom minimizable torsions mmf.add_bondangles_action(action=enable, selector=chain_A_and_BC_interface) mmf.add_bondlengths_action(action=enable, selector=chain_A_and_BC_interface) mmf.add_chi_action(action=enable, selector=chain_A_and_BC_interface) mmf.add_bondangles_action(action=disable, selector=chain_BC_not_interface) mmf.add_bondlengths_action(action=disable, selector=chain_BC_not_interface) mmf.add_chi_action(action=disable, selector=chain_BC_not_interface) # Now let's double-check some more `pose` information to verify that we are ready for `FastDesign`: # In[ ]: display_pose = pyrosetta.rosetta.protocols.fold_from_loops.movers.DisplayPoseLabelsMover() display_pose.tasks(tf) display_pose.movemap_factory(mmf) display_pose.apply(pose) # We are ready to setup the `FastRelax` mover: # In[ ]: #fast_design = pyrosetta.rosetta.protocols.denovo_design.movers.FastDesign(scorefxn_in=design_scorefxn, standard_repeats=1) # 2-5 repeats suggested for real applications fast_design = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn_in=design_scorefxn, standard_repeats=1, script_file="KillA2019") fast_design.cartesian(False) fast_design.set_task_factory(tf) fast_design.set_movemap_factory(mmf) fast_design.min_type("lbfgs_armijo_nonmonotone") fast_design.ramp_down_constraints(False) # Note that this takes ~37min 13s: # In[ ]: if not os.getenv("DEBUG"): get_ipython().run_line_magic('time', 'fast_design.apply(pose)') # Optionally, instead of running this you could reload the saved pose from a previously run trajectory: pose = pyrosetta.pose_from_file("expected_outputs/designed_pose.pdb") # Save this pose for downstream reference: # In[ ]: if not os.getenv("DEBUG"): designed_pose = pose.clone() #pyrosetta.dump_pdb(designed_pose, "outputs/designed_pose.pdb") # Now that we have re-designed chain A, it is strongly recommended to use `FastRelax` with a Cartesian scorefunction to repack and minimize with a realistic scorefuction. Note: this takes ~6min 27s # In[ ]: tf.clear() tf = pyrosetta.rosetta.core.pack.task.TaskFactory() tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline()) tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent()) tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides()) tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset( pyrosetta.rosetta.core.pack.task.operation.RestrictToRepackingRLT(), true_selector)) mm = pyrosetta.rosetta.core.kinematics.MoveMap() mm.set_bb(False) # Set to true if desired mm.set_chi(True) mm.set_jump(False) fast_relax = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn_in=relax_scorefxn, standard_repeats=1) # 2-5 repeats suggested for real applications fast_relax.cartesian(True) fast_relax.set_task_factory(tf) fast_relax.set_movemap(mm) fast_relax.minimize_bond_angles(True) fast_relax.minimize_bond_lengths(True) fast_relax.min_type("lbfgs_armijo_nonmonotone") # Cartisian scorefunction fast_relax.ramp_down_constraints(False) # To run the FastRelax trajectory. if not os.getenv("DEBUG"): get_ipython().run_line_magic('time', 'fast_relax.apply(pose)') #Or for speed, we will load the pose from a previous trajectory. pose = pyrosetta.pose_from_file("expected_outputs/designed_relaxed_pose.pdb") # Save the pose for downstream analysis: # In[ ]: designed_relaxed_pose = pose.clone() pyrosetta.dump_pdb(designed_relaxed_pose, "expected_outputs/designed_relaxed_pose.pdb") # Let's compare sequences to see what happened: # In[ ]: print(minimized_start_pose.chain_sequence(1)) print(designed_relaxed_pose.chain_sequence(1)) # View the resulting design! # In[ ]: chA = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("A") chB = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("B") chC = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("C") view = sum( [ viewer.init(designed_relaxed_pose), viewer.setStyle(cartoon_color="lightgrey", radius=0.25), viewer.setSurface(residue_selector=chA, colorscheme="orangeCarbon", opacity=0.5, surface_type="VDW"), viewer.setSurface(residue_selector=chB, color="greenCarbon", opacity=0.5, surface_type="VDW"), viewer.setSurface(residue_selector=chB, color="violetCarbon", opacity=0.5, surface_type="VDW"), viewer.setDisulfides(radius=0.25), viewer.setZoom(factor=1.5) ] ) view() # ## Analysis: # By how many Angstroms RMSD did the backbone Cα atoms move? # # # In[ ]: # What is the delta `total_score` from `minimized_start_pose` to `designed_relaxed_pose`? # In[ ]: # What is the per-residue energy difference for each mutated position between `minimized_start_pose` and `designed_relaxed_pose`? # In[ ]: # Are the sequence constraints imposed by the `aa_repeat` scoreterm satisfied? Re-write the following python code that counts the number of residue types in chain A to check for the longest stretch of each residue type in the primary amino acid sequence in chain A: # In[ ]: for aa in IUPACData.protein_letters: aa_selector = pyrosetta.rosetta.core.select.residue_selector.ResidueNameSelector(str.upper(Bio.SeqUtils.seq3(aa))) aa_and_chain_A = pyrosetta.rosetta.core.select.residue_selector.AndResidueSelector(chain_A, aa_selector) sel_res_count_metric = pyrosetta.rosetta.core.simple_metrics.metrics.SelectedResidueCountMetric() sel_res_count_metric.set_residue_selector(aa_and_chain_A) print(aa, int(sel_res_count_metric.calculate(designed_relaxed_pose))) # Does chain A have 40% percent aromatic or aliphatic (but not leucine) and 5% leucine, satisfying the `aa_composition` scoreterm? For additional practice, re-write the following python implementation using PyRosetta ResidueSelectors and SimpleMetrics: # In[ ]: num_aro_ali = 0 num_leucine = 0 for aa in pose.chain_sequence(1): if aa in "WYFAVIMP": num_aro_ali += 1 if aa == "L": num_leucine += 1 print("The % aromatic residues in chain A is {0}%".format((num_aro_ali*100)/len(pose.chain_sequence(1)))) print("The % leucine in chain A is {0}%".format((num_leucine*100)/len(pose.chain_sequence(1)))) # Are the sequence constraints imposed by the `AddHelixSequenceConstraints` mover with the `aa_composition` scoreterm satisfied? # Uses sasa (solvent-accessible surface area) to asses whether there are there more or less voids in `designed_relaxed_pose` as compared to `minimized_start_pose`, satisfying the `voids_penalty` scoreterm. # In[ ]: tf.clear() tf = pyrosetta.rosetta.core.pack.task.TaskFactory() tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset( pyrosetta.rosetta.core.pack.task.operation.RestrictToRepackingRLT(), true_selector)) sasa = pyrosetta.rosetta.protocols.simple_filters.TaskAwareSASAFilter() sasa.task_factory(tf) print("The sasa of minimized_start_pose is {}".format(sasa.score(minimized_start_pose))) print("The sasa of designed_relaxed_pose is {}".format(sasa.score(designed_relaxed_pose))) # Is the net charge of chain A equal to exactly zero, satisfying the `netcharge` scoreterm? # In[ ]: # One method to calculate net charge of chain A num_negative = 0 num_positive = 0 for aa in pose.chain_sequence(1): if aa in "DE": num_negative += 1 if aa in "KR": num_positive += 1 print("The net charge of chain A = {0}".format(num_positive - num_negative)) # Another method to calculate net charge of chain A test_pose = pose.clone() keep_chA = pyrosetta.rosetta.protocols.grafting.simple_movers.KeepRegionMover(res_start=str(test_pose.chain_begin(1)), res_end=str(test_pose.chain_end(1))) keep_chA.apply(test_pose) net_charge = pyrosetta.rosetta.protocols.simple_filters.NetChargeFilter() print("The net charge of chain A = {0}".format(net_charge.score(test_pose))) # Are there any buried unsatisfied polar atoms, satisfying the `buried_unsatisfied_penalty` scoreterm? # In[ ]: uhb = pyrosetta.rosetta.protocols.rosetta_scripts.XmlObjects.create_from_string(""" """) # In[ ]: print("The number of unsatisfied side-chain heavy atoms in minimized_start_pose is {}".format(uhb.get_filter("uhb_sc").score(minimized_start_pose))) print("The number of unsatisfied backbone heavy atoms in minimized_start_pose is {}".format(uhb.get_filter("uhb_bb").score(minimized_start_pose))) print("The number of unsatisfied side-chain heavy atoms in designed_relaxed_pose is {}".format(uhb.get_filter("uhb_sc").score(designed_relaxed_pose))) print("The number of unsatisfied backbone heavy atoms in designed_relaxed_pose is {}".format(uhb.get_filter("uhb_bb").score(designed_relaxed_pose))) # Inspect the tracer output from the `BuriedUnsatHbonds` filter, and inspect `designed_relaxed_pose` very closely in PyMol or py3Dmol. Would you agree with the `BuriedUnsatHbonds` filter? How does the number of buried unsatisfied heavy atoms compare to `minimized_start_pose`? # Packer results are stochastic based on a random number generator, that is after running `pyrosetta.init()` you see: # >rosetta:core.init.random: RandomGenerator:init: Normal mode, seed=937978431 RG_type=mt19937 # # How do your results compare with your neighbors' results? Ideally you would run the same protocol hundreds of times, and filter them down using Rosetta filters that recaptiulate your design requirements to a number of designs that you can then experimentally validate to answer your biological question. # # See Also # # Note these may not be available in PyRosetta through code or even by xml (remodel), but they are extremely useful tools when doing denovo protein design - and you should be aware of them. # # - **RosettaRemodel** # - https://www.rosettacommons.org/docs/latest/application_documentation/design/rosettaremodel # # # - **Sewing** # # - https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/composite_protocols/sewing/SEWING # # # - **Parametric Design** # - Previous Workshop! # In[ ]: # # < [*De Novo* Parametric Backbone Design](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/06.06-Introduction-to-Parametric-backbone-design.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [**Point Mutation Scan**](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/06.08-Point-Mutation-Scan.ipynb) >