Make sure you are in the directory with the .pdb
files:
cd google_drive/MyDrive/student-notebooks/
Warning: This notebook uses pyrosetta.distributed.viewer
code, which runs in jupyter notebook
and might not run if you're using jupyterlab
.
Note: This Jupyter notebook requires the PyRosetta distributed layer which is obtained by building PyRosetta with the --serialization
flag or installing PyRosetta from the RosettaCommons conda channel (for more information, visit: http://www.pyrosetta.org/dow).
!pip install pyrosettacolabsetup
import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta()
import pyrosetta; pyrosetta.init()
import Bio.Data.IUPACData as IUPACData
import Bio.SeqUtils
import logging
logging.basicConfig(level=logging.INFO)
import os
import pyrosetta
import pyrosetta.distributed
import pyrosetta.distributed.viewer as viewer
import site
import sys
Initialize PyRosetta:
flags = """
-linmem_ig 10
-ignore_unrecognized_res 1
-mute core.select.residue_selector.SecondaryStructureSelector
-mute core.select.residue_selector.PrimarySequenceNeighborhoodSelector
-mute protocols.DsspMover
"""
pyrosetta.distributed.init(flags)
Let's setup the pose of a de novo helical bundle from the PDB for downstream design: https://www.rcsb.org/structure/5J0J
start_pose = pyrosetta.io.pose_from_file("inputs/5J0J.clean.pdb")
pose = start_pose.clone()
Minimize the crystal structure coordinates, then convert chain "A" to poly-alanine, and then perform one-sided protein-protein interface design designing chain A while only re-packing the homotrimer interface residues of chains B and C! Therefore, we are designing a homotrimer into a heterotrimer. Furthermore, prevent backbone torsions from minimizing and only minimize the side-chains of the homotrimer interface and all of chain A using the FastDesign
mover! After design, repack and minimize all side-chains.
Prior to and after design, we want to relax the protein with a scorefunction that packs the rotamers and minimizes the side-chain degrees of freedom while optimizing for realistic energies. If you were to allow backbone minimization (which you may optionally choose to), we would want use a Cartesian scorefunction (in this case, ref2015_cart.wts
) which automatically sets cart_bonded
scoreterm to a weight of 1.0, which helps to close chain breaks in the backbone. Similarly, you might want to also turn on the coordinate_constraint
scoreterm to penalize deviations of the backbone coordinates from their initial coordinates during minimization. In this tutorial, we demonstrate that concept but will prevent backbone torsions from being minimized (however, feel free to turn on backbone minimization!):
relax_scorefxn = pyrosetta.create_score_function("ref2015_cart.wts")
relax_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.coordinate_constraint, 1.0)
print("The starting pose total_score is {}".format(relax_scorefxn(start_pose)))
For design, if we allowed backbone minimzation we again would turn on the coordinate_constraint
scoreterm to penalize deviations of the backbone coordinates from their initial coordinates during minimization. Additionally, we will use "non-pairwise decomposable" scoreterms to guide the packer trajectories (i.e. fixed backbone sequence design trajectories) in favor of our hypothetical design requirements to solve our biological problem. In this tutorial, we will make use of the following additional scoreterms during design:
aa_composition
: This scoring term is intended for use during design, to penalize deviations from a desired residue type composition. Applies to any amino acid composition requirements specified by the user within a ResidueSelector. This scoreterm also applies to the AddHelixSequenceConstraints
mover which sets up ideal sequence constraints for each helix in a pose or in a selection.voids_penalty
: This scoring term is intended for use during design, to penalize buried voids or cavities and to guide the packer to design solutions in which all buried volume is filled with side-chains.aa_repeat
: This wholebody scoring term is intended for use during design, to penalize long stretches in which the same residue type repeats over and over (e.g. poly-Q sequences).buried_unsatisfied_penalty
: This scoring term is intended for use during design, to provide a penalty for buried hydrogen bond donors or acceptors that are unsatisfied.netcharge
: This scoring term is intended for use during design, to penalize deviations from a desired net charge in a pose or in a selection.hbnet
: This scoring term is intended for use during design, to provide a bonus for hydrogen bond network formation.design_scorefxn = pyrosetta.create_score_function("ref2015_cart.wts")
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.coordinate_constraint, 10.0)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.aa_composition, 1.0)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.voids_penalty, 0.25)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.aa_repeat, 1.0)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.buried_unsatisfied_penalty, 1.0)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.hbnet, 1.0)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.netcharge, 1.0)
print("The starting pose total_score is {}".format(design_scorefxn(start_pose)))
By using the relax_scorefxn
before and after the design_scorefxn
, we ensure that these "non-pairwise decomposable" scoreterms are not forcing unrealistic rotamers that would otherwise not be held in place without these additional scoreterms.
Prior to any deviation from crystal structure coordinates, apply coordinate constraints to all of the backbone heavy atoms:
true_selector = pyrosetta.rosetta.core.select.residue_selector.TrueResidueSelector() # Select all residues
# Apply a virtual root onto the pose to prevent large lever-arm effects while minimizing with coordinate constraints
virtual_root = pyrosetta.rosetta.protocols.simple_moves.VirtualRootMover()
virtual_root.set_removable(True)
virtual_root.set_remove(False)
virtual_root.apply(pose)
# Construct the CoordinateConstraintGenerator
coord_constraint_gen = pyrosetta.rosetta.protocols.constraint_generator.CoordinateConstraintGenerator()
coord_constraint_gen.set_id("contrain_all_backbone_atoms!")
coord_constraint_gen.set_ambiguous_hnq(False)
coord_constraint_gen.set_bounded(False)
coord_constraint_gen.set_sidechain(False)
coord_constraint_gen.set_sd(1.0) # Sets a standard deviation of contrained atoms to (an arbitrary) 1.0 Angstroms RMSD. Set higher or lower for different results.
coord_constraint_gen.set_ca_only(False)
coord_constraint_gen.set_residue_selector(true_selector)
# Apply the CoordinateConstraintGenerator using the AddConstraints mover
add_constraints = pyrosetta.rosetta.protocols.constraint_generator.AddConstraints()
add_constraints.add_generator(coord_constraint_gen)
add_constraints.apply(pose)
Prior to design, minimize with the FastRelax
mover to optimize the pose within the relax_scorefxn
scorefunction. Note: this takes ~1min 10s
tf = pyrosetta.rosetta.core.pack.task.TaskFactory()
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
pyrosetta.rosetta.core.pack.task.operation.PreventRepackingRLT(), true_selector)) # Set to RestrictToRepackingRLT for slower/better results
mm = pyrosetta.rosetta.core.kinematics.MoveMap()
mm.set_bb(False) # Set to true if desired
mm.set_chi(True)
mm.set_jump(False)
fast_relax = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn_in=relax_scorefxn, standard_repeats=1) # 2-5 repeats suggested for real applications
fast_relax.cartesian(True)
fast_relax.set_task_factory(tf)
fast_relax.set_movemap(mm)
fast_relax.minimize_bond_angles(True)
fast_relax.minimize_bond_lengths(True)
fast_relax.min_type("lbfgs_armijo_nonmonotone")
fast_relax.ramp_down_constraints(False)
if not os.getenv("DEBUG"):
%time fast_relax.apply(pose)
# Optionally, instead of running this you could reload the saved pose from a previously run trajectory:
#pose = pyrosetta.pose_from_file("minimized_start_pose.pdb")
Let's check the delta total_score
per residue after minimizing in the relax_scorefxn
scorefunction:
if not os.getenv("DEBUG"):
initial_score_res = relax_scorefxn(start_pose)/start_pose.size()
final_score_res = relax_scorefxn(pose)/pose.size()
delta_total_score_res = final_score_res - initial_score_res
print("{0} kcal/(mol*res) - {1} kcal/(mol*res) = {2} kcal/(mol*res)".format(final_score_res, initial_score_res, delta_total_score_res))
We can see that the crystal structure coordinates were not quite optimal according to the relax_scorefxn
scorefunction. So which model is correct?
By how many Angstroms RMSD did the backbone Cα atoms move?
YOUR-CODE-HERE
For downstream analysis, we want to save the pose:
minimized_start_pose = pose.clone()
pyrosetta.dump_pdb(minimized_start_pose, "outputs/minimized_start_pose.pdb")
Prior to designing chain A, first let's make chain A poly-alanine so that we can re-design the sidechains onto the backbone:
# Since we will do direct pose manipulation, first remove the constraints
remove_constraints = pyrosetta.rosetta.protocols.constraint_generator.RemoveConstraints()
remove_constraints.add_generator(coord_constraint_gen)
remove_constraints.apply(pose)
Obtain chain A and convert it to poly-alanine
keep_chA = pyrosetta.rosetta.protocols.grafting.simple_movers.KeepRegionMover(res_start=str(pose.chain_begin(1)), res_end=str(pose.chain_end(1)))
keep_chA.apply(pose)
polyA_chA = pyrosetta.rosetta.protocols.pose_creation.MakePolyXMover(aa="ALA", keep_pro=0, keep_gly=0, keep_disulfide_cys=0)
polyA_chA.apply(pose)
Obtain chains B and C
pose_chBC = minimized_start_pose.clone()
keep_chBC = pyrosetta.rosetta.protocols.grafting.simple_movers.KeepRegionMover(res_start=str(pose_chBC.chain_begin(2)), res_end=str(pose_chBC.chain_end(3)-1))
keep_chBC.apply(pose_chBC)
Append chains B and C onto the poly-alanine version of chain A
pyrosetta.rosetta.core.pose.append_pose_to_pose(pose1=pose, pose2=pose_chBC, new_chain=True)
Pose is now considered to have only 2 chains.
pose.num_chains()
Let's re-establish that there are 3 chains.
switch_chains = pyrosetta.rosetta.protocols.simple_moves.SwitchChainOrderMover()
switch_chains.chain_order("12")
switch_chains.apply(pose)
print(pose.pdb_info())
print("Now the number of chains = {}".format(pose.num_chains()))
Re-apply backbone coordinate constraints:
virtual_root.apply(pose)
add_constraints.apply(pose)
Have a look at the new pose, chain A in which is ready to be designed!
Next, we need to apply certain movers with our design specifications to activate certain non-pairwise decomposable scoreterms in the design_scorefxn
scorefunction, that will be implemented when we run the FastDesign
mover (or any downstream mover that calls the packer).
We will frequently be using the following residue selectors:
chain_A = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("A")
chain_BC = pyrosetta.rosetta.core.select.residue_selector.NotResidueSelector(chain_A)
Apply the AddCompositionConstraintMover
mover, which utilizes the aa_composition
scoreterm. Applying the following mover to the pose imposes the design constraints that the ResidueSelector have 40% aliphatic or aromatic residues other than leucine (i.e. ALA, PHE, ILE, MET, PRO, VAL, TRP, or TYR), and 5% leucines. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/AACompositionEnergy
add_composition_constraint = pyrosetta.rosetta.protocols.aa_composition.AddCompositionConstraintMover()
add_composition_constraint.create_constraint_from_file_contents("""
PENALTY_DEFINITION
OR_PROPERTIES AROMATIC ALIPHATIC
NOT_TYPE LEU
FRACT_DELTA_START -0.05
FRACT_DELTA_END 0.05
PENALTIES 1 0 1 # The above two lines mean that if we're 5% below or 5% above the desired content, we get a 1-point penalty.
FRACTION 0.4 # Forty percent aromatic or aliphatic, but not leucine
BEFORE_FUNCTION CONSTANT
AFTER_FUNCTION CONSTANT
END_PENALTY_DEFINITION
PENALTY_DEFINITION
TYPE LEU
DELTA_START -1
DELTA_END 1
PENALTIES 1 0 1
FRACTION 0.05 # Five percent leucine
BEFORE_FUNCTION CONSTANT
AFTER_FUNCTION CONSTANT
END_PENALTY_DEFINITION
""")
add_composition_constraint.add_residue_selector(chain_A)
add_composition_constraint.apply(pose)
Apply the AddHelixSequenceConstraints
mover, which utilizes the aa_composition
scoreterm. By default, this mover adds five types of sequence constraints to the designable residues in each alpha helix in the pose. Any of these behaviours may be disabled or modified by invoking advanced options, but no advanced options need be set in most cases. The five types of sequence constraints are:
A strong sequence constraint requiring at least two negatively-charged residues in the first (N-terminal) three residues of each alpha-helix.
A strong sequence constraint requiring at least two positively-charged residues in the last (C-terminal) three residues of each alpha-helix.
A weak but strongly ramping sequence constraint penalizing helix-disfavoring residue types (by default, Asn, Asp, Ser, Gly, Thr, and Val) throughout each helix. (A single such residue is sometimes tolerated, but the penalty for having more than one residue in this category increases quadratically with the count of helix-disfavouring residues.)
A weak sequence constraint coaxing the helix to have 10% alanine. Because this constraint is weak, deviations from this value are tolerated, but this should prevent an excessive abundance of alanine residues.
A weak sequence constraint coaxing the helix to have at least 25% hydrophobic content. This constraint is also weak, so slightly less hydrophobic helices will be tolerated to some degree. Note that alanine is not considered to be "hydrophobic" within Rosetta.
add_helix_sequence_constraints = pyrosetta.rosetta.protocols.aa_composition.AddHelixSequenceConstraintsMover()
add_helix_sequence_constraints.set_residue_selector(chain_A)
add_helix_sequence_constraints.apply(pose)
Note: the aa_repeat
scoreterm works out-of-the-box, and does not need to be applied to the pose to work, it just needs to have a weight of >0 in the scorefunction used by the packer. It imposes a penalty for each stretch of repeating amino acids, with the penalty value depending nonlinearly on the length of the repeating stretch. By default, 1- or 2-residue stretches incur no penalty, 3-residue stretches incur a penalty of +1, 4-residue stretches incur a penalty of +10, and 5-residue stretches or longer incur a penalty of +100. Since the term is sequence-based, it is really only useful for design -- that is, it will impose an identical penalty for a fixed-sequence pose, regardless its conformation. This also means that the term has no conformational derivatives: the minimizer ignores it completely. The term is not pairwise-decomposible, but has been made packer-compatible, so it can direct the sequence composition during a packer run. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/Repeat-stretch-energy
Similarly, the voids_penalty
scoreterm does not need to be applied to the pose to work, it just needs to have a weight of >0 in the scorefunction used by the packer. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/VoidsPenaltyEnergy
Similarly, the buried_unsatisfied_penalty
scoreterm does not need to be applied to the pose to work, it just needs to have a weight of >0 in the scorefunction used by the packer. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/BuriedUnsatPenalty
Similarly, the hbnet
scoreterm does not need to be applied to the pose to work, it just needs to have a weight of >0 (ideally between 1.0 to 10.0) in the scorefunction used by the packer. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/HBNetEnergy
Apply the AddNetChargeConstraintMover
mover, which utilizes the netcharge
scoreterm. In this case, we require that the net charge in chain A must be exactly 0.
add_net_charge_constraint = pyrosetta.rosetta.protocols.aa_composition.AddNetChargeConstraintMover()
add_net_charge_constraint.create_constraint_from_file_contents("""
DESIRED_CHARGE 0 #Desired net charge is zero.
PENALTIES_CHARGE_RANGE -1 1 #Penalties are listed in the observed net charge range of -1 to +1.
PENALTIES 1 0 1 #The penalties are 1 for an observed charge of -1, 0 for an observed charge of 0, and 1 for an observed charge of +1.
BEFORE_FUNCTION QUADRATIC #Ramp quadratically for observed net charges of -2 or less.
AFTER_FUNCTION QUADRATIC #Ramp quadratically for observed net charges of +2 or greater.
""")
add_net_charge_constraint.add_residue_selector(chain_A)
add_net_charge_constraint.apply(pose)
Specify a custom relaxscript
that optimizes the ramp_repack_min
weights to prevent too many alanines from being designed (demonstrated at Pre-RosettaCON 2018):
Specify TaskOperations to be applied to chain A. In this case, let's use the latest LayerDesign implementation (Note: this is still being actively developed) using the XmlObjects class.
layer_design_task = pyrosetta.rosetta.protocols.rosetta_scripts.XmlObjects.create_from_string("""
<RESIDUE_SELECTORS>
<Layer name="surface" select_core="false" select_boundary="false" select_surface="true" use_sidechain_neighbors="true"/>
<Layer name="boundary" select_core="false" select_boundary="true" select_surface="false" use_sidechain_neighbors="true"/>
<Layer name="core" select_core="true" select_boundary="false" select_surface="false" use_sidechain_neighbors="true"/>
<SecondaryStructure name="sheet" overlap="0" minH="3" minE="2" include_terminal_loops="false" use_dssp="true" ss="E"/>
<SecondaryStructure name="entire_loop" overlap="0" minH="3" minE="2" include_terminal_loops="true" use_dssp="true" ss="L"/>
<SecondaryStructure name="entire_helix" overlap="0" minH="3" minE="2" include_terminal_loops="false" use_dssp="true" ss="H"/>
<And name="helix_cap" selectors="entire_loop">
<PrimarySequenceNeighborhood lower="1" upper="0" selector="entire_helix"/>
</And>
<And name="helix_start" selectors="entire_helix">
<PrimarySequenceNeighborhood lower="0" upper="1" selector="helix_cap"/>
</And>
<And name="helix" selectors="entire_helix">
<Not selector="helix_start"/>
</And>
<And name="loop" selectors="entire_loop">
<Not selector="helix_cap"/>
</And>
</RESIDUE_SELECTORS>
<TASKOPERATIONS>
<DesignRestrictions name="layer_design">
<Action selector_logic="surface AND helix_start" aas="EHKPQR"/>
<Action selector_logic="surface AND helix" aas="EHKQR"/>
<Action selector_logic="surface AND sheet" aas="DEHKNQRST"/>
<Action selector_logic="surface AND loop" aas="DEGHKNPQRST"/>
<Action selector_logic="boundary AND helix_start" aas="ADEIKLMNPQRSTVWY"/>
<Action selector_logic="boundary AND helix" aas="ADEIKLMNQRSTVWY"/>
<Action selector_logic="boundary AND sheet" aas="DEFIKLNQRSTVWY"/>
<Action selector_logic="boundary AND loop" aas="ADEFGIKLMNPQRSTVWY"/>
<Action selector_logic="core AND helix_start" aas="AFILMPVWY"/>
<Action selector_logic="core AND helix" aas="AFILMVWY"/>
<Action selector_logic="core AND sheet" aas="FILVWY"/>
<Action selector_logic="core AND loop" aas="AFGILMPVWY"/>
<Action selector_logic="helix_cap" aas="DNST"/>
</DesignRestrictions>
</TASKOPERATIONS>
""").get_task_operation("layer_design")
Also prepare a ResidueSelector for the heterotrimer interface within chains B and C, and the rest of chains B and C. We will use these with RestrictAbsentCanonicalAASRLT
, RestrictToRepackingRLT
, and PreventRepackingRLT
TaskOperations:
interface = pyrosetta.rosetta.core.select.residue_selector.InterGroupInterfaceByVectorSelector()
interface.group1_selector(chain_A)
interface.group2_selector(chain_BC)
chain_BC_interface = pyrosetta.rosetta.core.select.residue_selector.AndResidueSelector(interface, chain_BC)
not_chain_BC_interface = pyrosetta.rosetta.core.select.residue_selector.NotResidueSelector(chain_BC_interface)
chain_BC_not_interface = pyrosetta.rosetta.core.select.residue_selector.AndResidueSelector(not_chain_BC_interface, chain_BC)
For minimization, we also need the following ResidueSelector:
chain_A_and_BC_interface = pyrosetta.rosetta.core.select.residue_selector.OrResidueSelector(chain_A, chain_BC_interface)
Make sure each ResidueSelector selects the regions as desired (in the following case, the ResidueSelector
interface
is visualized:
view = viewer.init(pose) \
+ viewer.setStyle() \
+ viewer.setStyle(residue_selector=interface, colorscheme="greyCarbon")
view()
Create TaskFactory to be used with the FastRelax
mover (We use this instead of the FastDesign mover as the constructor allows us to set a relax script):
tf.clear()
tf = pyrosetta.rosetta.core.pack.task.TaskFactory()
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides())
# Prevent repacking on chain_BC_not_interface
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
pyrosetta.rosetta.core.pack.task.operation.PreventRepackingRLT(), chain_BC_not_interface))
# Repack only on chain_BC_interface
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
pyrosetta.rosetta.core.pack.task.operation.RestrictToRepackingRLT(), chain_BC_interface))
# Enable design on chain_A
aa_to_design = pyrosetta.rosetta.core.pack.task.operation.RestrictAbsentCanonicalAASRLT()
aa_to_design.aas_to_keep("ACDEFGHIKLMNPQRSTVWY")
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
aa_to_design, chain_A))
# Apply layer design
tf.push_back(layer_design_task)
# Convert the task factory into a PackerTask
packer_task = tf.create_task_and_apply_taskoperations(pose)
# View the PackerTask
print(packer_task)
The PackerTask looks as intended. Now setup the MoveMapFactory
:
# Set up a MoveMapFactory
mmf = pyrosetta.rosetta.core.select.movemap.MoveMapFactory()
mmf.all_bb(setting=False) # Set to true if needed
mmf.all_bondangles(setting=False)
mmf.all_bondlengths(setting=False)
mmf.all_chi(setting=True)
mmf.all_jumps(setting=False)
mmf.set_cartesian(setting=False)
# Set movemap actions to turn on or off certain torsions, overriding the above defaults
enable = pyrosetta.rosetta.core.select.movemap.move_map_action.mm_enable
disable = pyrosetta.rosetta.core.select.movemap.move_map_action.mm_disable
# Set custom minimizable torsions
mmf.add_bondangles_action(action=enable, selector=chain_A_and_BC_interface)
mmf.add_bondlengths_action(action=enable, selector=chain_A_and_BC_interface)
mmf.add_chi_action(action=enable, selector=chain_A_and_BC_interface)
mmf.add_bondangles_action(action=disable, selector=chain_BC_not_interface)
mmf.add_bondlengths_action(action=disable, selector=chain_BC_not_interface)
mmf.add_chi_action(action=disable, selector=chain_BC_not_interface)
Now let's double-check some more pose
information to verify that we are ready for FastDesign
:
display_pose = pyrosetta.rosetta.protocols.fold_from_loops.movers.DisplayPoseLabelsMover()
display_pose.tasks(tf)
display_pose.movemap_factory(mmf)
display_pose.apply(pose)
We are ready to setup the FastRelax
mover:
#fast_design = pyrosetta.rosetta.protocols.denovo_design.movers.FastDesign(scorefxn_in=design_scorefxn, standard_repeats=1) # 2-5 repeats suggested for real applications
fast_design = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn_in=design_scorefxn, standard_repeats=1, script_file="KillA2019")
fast_design.cartesian(False)
fast_design.set_task_factory(tf)
fast_design.set_movemap_factory(mmf)
fast_design.min_type("lbfgs_armijo_nonmonotone")
fast_design.ramp_down_constraints(False)
Note that this takes ~37min 13s:
if not os.getenv("DEBUG"):
%time fast_design.apply(pose)
# Optionally, instead of running this you could reload the saved pose from a previously run trajectory:
pose = pyrosetta.pose_from_file("expected_outputs/designed_pose.pdb")
Save this pose for downstream reference:
if not os.getenv("DEBUG"):
designed_pose = pose.clone()
#pyrosetta.dump_pdb(designed_pose, "outputs/designed_pose.pdb")
Now that we have re-designed chain A, it is strongly recommended to use FastRelax
with a Cartesian scorefunction to repack and minimize with a realistic scorefuction. Note: this takes ~6min 27s
tf.clear()
tf = pyrosetta.rosetta.core.pack.task.TaskFactory()
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
pyrosetta.rosetta.core.pack.task.operation.RestrictToRepackingRLT(), true_selector))
mm = pyrosetta.rosetta.core.kinematics.MoveMap()
mm.set_bb(False) # Set to true if desired
mm.set_chi(True)
mm.set_jump(False)
fast_relax = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn_in=relax_scorefxn, standard_repeats=1) # 2-5 repeats suggested for real applications
fast_relax.cartesian(True)
fast_relax.set_task_factory(tf)
fast_relax.set_movemap(mm)
fast_relax.minimize_bond_angles(True)
fast_relax.minimize_bond_lengths(True)
fast_relax.min_type("lbfgs_armijo_nonmonotone") # Cartisian scorefunction
fast_relax.ramp_down_constraints(False)
# To run the FastRelax trajectory.
if not os.getenv("DEBUG"):
%time fast_relax.apply(pose)
#Or for speed, we will load the pose from a previous trajectory.
pose = pyrosetta.pose_from_file("expected_outputs/designed_relaxed_pose.pdb")
Save the pose for downstream analysis:
designed_relaxed_pose = pose.clone()
pyrosetta.dump_pdb(designed_relaxed_pose, "expected_outputs/designed_relaxed_pose.pdb")
Let's compare sequences to see what happened:
print(minimized_start_pose.chain_sequence(1))
print(designed_relaxed_pose.chain_sequence(1))
View the resulting design!
chA = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("A")
chB = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("B")
chC = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("C")
view = sum(
[
viewer.init(designed_relaxed_pose),
viewer.setStyle(cartoon_color="lightgrey", radius=0.25),
viewer.setSurface(residue_selector=chA, colorscheme="orangeCarbon", opacity=0.5, surface_type="VDW"),
viewer.setSurface(residue_selector=chB, color="greenCarbon", opacity=0.5, surface_type="VDW"),
viewer.setSurface(residue_selector=chB, color="violetCarbon", opacity=0.5, surface_type="VDW"),
viewer.setDisulfides(radius=0.25),
viewer.setZoom(factor=1.5)
]
)
view()
By how many Angstroms RMSD did the backbone Cα atoms move?
What is the delta total_score
from minimized_start_pose
to designed_relaxed_pose
?
What is the per-residue energy difference for each mutated position between minimized_start_pose
and designed_relaxed_pose
?
Are the sequence constraints imposed by the aa_repeat
scoreterm satisfied? Re-write the following python code that counts the number of residue types in chain A to check for the longest stretch of each residue type in the primary amino acid sequence in chain A:
for aa in IUPACData.protein_letters:
aa_selector = pyrosetta.rosetta.core.select.residue_selector.ResidueNameSelector(str.upper(Bio.SeqUtils.seq3(aa)))
aa_and_chain_A = pyrosetta.rosetta.core.select.residue_selector.AndResidueSelector(chain_A, aa_selector)
sel_res_count_metric = pyrosetta.rosetta.core.simple_metrics.metrics.SelectedResidueCountMetric()
sel_res_count_metric.set_residue_selector(aa_and_chain_A)
print(aa, int(sel_res_count_metric.calculate(designed_relaxed_pose)))
Does chain A have 40% percent aromatic or aliphatic (but not leucine) and 5% leucine, satisfying the aa_composition
scoreterm? For additional practice, re-write the following python implementation using PyRosetta ResidueSelectors and SimpleMetrics:
num_aro_ali = 0
num_leucine = 0
for aa in pose.chain_sequence(1):
if aa in "WYFAVIMP":
num_aro_ali += 1
if aa == "L":
num_leucine += 1
print("The % aromatic residues in chain A is {0}%".format((num_aro_ali*100)/len(pose.chain_sequence(1))))
print("The % leucine in chain A is {0}%".format((num_leucine*100)/len(pose.chain_sequence(1))))
Are the sequence constraints imposed by the AddHelixSequenceConstraints
mover with the aa_composition
scoreterm satisfied?
Uses sasa (solvent-accessible surface area) to asses whether there are there more or less voids in designed_relaxed_pose
as compared to minimized_start_pose
, satisfying the voids_penalty
scoreterm.
tf.clear()
tf = pyrosetta.rosetta.core.pack.task.TaskFactory()
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
pyrosetta.rosetta.core.pack.task.operation.RestrictToRepackingRLT(), true_selector))
sasa = pyrosetta.rosetta.protocols.simple_filters.TaskAwareSASAFilter()
sasa.task_factory(tf)
print("The sasa of minimized_start_pose is {}".format(sasa.score(minimized_start_pose)))
print("The sasa of designed_relaxed_pose is {}".format(sasa.score(designed_relaxed_pose)))
Is the net charge of chain A equal to exactly zero, satisfying the netcharge
scoreterm?
# One method to calculate net charge of chain A
num_negative = 0
num_positive = 0
for aa in pose.chain_sequence(1):
if aa in "DE":
num_negative += 1
if aa in "KR":
num_positive += 1
print("The net charge of chain A = {0}".format(num_positive - num_negative))
# Another method to calculate net charge of chain A
test_pose = pose.clone()
keep_chA = pyrosetta.rosetta.protocols.grafting.simple_movers.KeepRegionMover(res_start=str(test_pose.chain_begin(1)), res_end=str(test_pose.chain_end(1)))
keep_chA.apply(test_pose)
net_charge = pyrosetta.rosetta.protocols.simple_filters.NetChargeFilter()
print("The net charge of chain A = {0}".format(net_charge.score(test_pose)))
Are there any buried unsatisfied polar atoms, satisfying the buried_unsatisfied_penalty
scoreterm?
uhb = pyrosetta.rosetta.protocols.rosetta_scripts.XmlObjects.create_from_string("""
<SCOREFXNS>
<ScoreFunction name="fa_default" weights="ref2015"/>
</SCOREFXNS>
<FILTERS>
<BuriedUnsatHbonds name="uhb_sc" use_reporter_behavior="true" use_hbnet_behavior="false" scorefxn="fa_default" report_bb_heavy_atom_unsats="false" report_sc_heavy_atom_unsats="true" cutoff="99999" print_out_info_to_pdb="false" ignore_surface_res="true" residue_surface_cutoff="20.0" ignore_bb_heavy_unsats="false" confidence="0"/>
<BuriedUnsatHbonds name="uhb_bb" use_reporter_behavior="true" use_hbnet_behavior="false" scorefxn="fa_default" report_bb_heavy_atom_unsats="true" report_sc_heavy_atom_unsats="false" cutoff="99999" print_out_info_to_pdb="false" ignore_surface_res="true" residue_surface_cutoff="20.0" ignore_bb_heavy_unsats="false" confidence="0"/>
</FILTERS>
""")
print("The number of unsatisfied side-chain heavy atoms in minimized_start_pose is {}".format(uhb.get_filter("uhb_sc").score(minimized_start_pose)))
print("The number of unsatisfied backbone heavy atoms in minimized_start_pose is {}".format(uhb.get_filter("uhb_bb").score(minimized_start_pose)))
print("The number of unsatisfied side-chain heavy atoms in designed_relaxed_pose is {}".format(uhb.get_filter("uhb_sc").score(designed_relaxed_pose)))
print("The number of unsatisfied backbone heavy atoms in designed_relaxed_pose is {}".format(uhb.get_filter("uhb_bb").score(designed_relaxed_pose)))
Inspect the tracer output from the BuriedUnsatHbonds
filter, and inspect designed_relaxed_pose
very closely in PyMol or py3Dmol. Would you agree with the BuriedUnsatHbonds
filter? How does the number of buried unsatisfied heavy atoms compare to minimized_start_pose
?
Packer results are stochastic based on a random number generator, that is after running pyrosetta.init()
you see:
rosetta:core.init.random: RandomGenerator:init: Normal mode, seed=937978431 RG_type=mt19937
How do your results compare with your neighbors' results? Ideally you would run the same protocol hundreds of times, and filter them down using Rosetta filters that recaptiulate your design requirements to a number of designs that you can then experimentally validate to answer your biological question.
Note these may not be available in PyRosetta through code or even by xml (remodel), but they are extremely useful tools when doing denovo protein design - and you should be aware of them.
RosettaRemodel
https://www.rosettacommons.org/docs/latest/application_documentation/design/rosettaremodel
Sewing
Parametric Design
Previous Workshop!