GALigandDock Protocol with pyrosetta.distributed Using the beta_cart.wts Scorefunction
Warning: This notebook uses pyrosetta.distributed.viewer code, which runs in Jupyter Notebook and might not run if you're using JupyterLab.
Note: This Jupyter notebook requires the PyRosetta distributed layer. Please make sure to activate the PyRosetta.notebooks conda environment before running this notebook. The kernel is set to use this environment.
import logging
logging.basicConfig(level=logging.INFO)
import matplotlib
%matplotlib inline
import os
import pandas as pd
import pyrosetta
import pyrosetta.distributed
import pyrosetta.distributed.io as io
import pyrosetta.distributed.viewer as viewer
import pyrosetta.distributed.packed_pose as packed_pose
import pyrosetta.distributed.tasks.rosetta_scripts as rosetta_scripts
import seaborn
seaborn.set()
import sys
When using the -beta_cart flag, load the TPA.am1-bcc.gp.params file, which has gen_potential atom typing and AM1-BCC partial charges:
pdb_filename = "inputs/test_lig.pdb"
ligand_params = "inputs/TPA.am1-bcc.gp.params"
flags = f"""
-ignore_unrecognized_res 1
-extra_res_fa {ligand_params}
-beta_cart
-out:level 200
"""
pyrosetta.distributed.init(flags)
pose_obj = io.pose_from_file(filename=pdb_filename)
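The flags above are written as a multi-line f-string, which is easy to read but awkward to log or compare. As a minimal sketch, the helper below collapses such a string into a single space-separated line. The name flatten_flags is hypothetical and not part of PyRosetta; the flag values are copied from the cell above.

```python
def flatten_flags(flags: str) -> str:
    """Collapse a multi-line flags string into one space-separated line.

    Hypothetical helper for illustration; it is not a PyRosetta function.
    """
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines, and drops empty leading/trailing pieces.
    return " ".join(flags.split())


ligand_params = "inputs/TPA.am1-bcc.gp.params"
flags = f"""
-ignore_unrecognized_res 1
-extra_res_fa {ligand_params}
-beta_cart
-out:level 200
"""

flat = flatten_flags(flags)
print(flat)
```

Printing the flattened string makes it easy to confirm that the ligand params path was substituted into -extra_res_fa as intended.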
Now we change the scorefunction in our RosettaScripts script to beta_cart.wts, the weights of which were optimized on protein-ligand complexes using ligands with AM1-BCC partial charges generated with Amber's antechamber.
GALigandDock within RosettaScripts normally writes multiple .pdb files to disk when run from the command line. However, when using MultioutputRosettaScriptsTask in pyrosetta.distributed, the outputs are captured in memory within this Jupyter session!
xml = f"""
<ROSETTASCRIPTS>
<SCOREFXNS>
<ScoreFunction name="fa_standard" weights="beta_cart.wts"/>
</SCOREFXNS>
<MOVERS>
<GALigandDock name="dock"
scorefxn="fa_standard"
scorefxn_relax="fa_standard"
grid_step="0.25"
padding="5.0"
hashsize="8.0"
subhash="3"
nativepdb="{pdb_filename}"
final_exact_minimize="sc"
random_oversample="10"
rotprob="0.9"
rotEcut="100"
sidechains="auto"
initial_pool="{pdb_filename}">
<Stage repeats="10" npool="50" pmut="0.2" smoothing="0.375" rmsdthreshold="2.5" maxiter="50" pack_cycles="100" ramp_schedule="0.1,1.0"/>
<Stage repeats="10" npool="50" pmut="0.2" smoothing="0.375" rmsdthreshold="1.5" maxiter="50" pack_cycles="100" ramp_schedule="0.1,1.0"/>
</GALigandDock>
</MOVERS>
<PROTOCOLS>
<Add mover="dock"/>
</PROTOCOLS>
</ROSETTASCRIPTS>
"""
xml_obj = rosetta_scripts.MultioutputRosettaScriptsTask(xml)
xml_obj.setup()
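Because the docking run is long, it can help to catch simple XML mistakes first. The sketch below uses only Python's standard xml.etree.ElementTree module on a trimmed copy of the script above. It checks that the text is well-formed XML and that every mover named under PROTOCOLS is defined under MOVERS. It does not validate against the RosettaScripts schema; that remains the job of xml_obj.setup().

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the script above (most GALigandDock attributes omitted
# for brevity); only the structure matters for this check.
xml_text = """
<ROSETTASCRIPTS>
  <SCOREFXNS>
    <ScoreFunction name="fa_standard" weights="beta_cart.wts"/>
  </SCOREFXNS>
  <MOVERS>
    <GALigandDock name="dock" scorefxn="fa_standard" scorefxn_relax="fa_standard">
      <Stage repeats="10" npool="50" rmsdthreshold="2.5"/>
      <Stage repeats="10" npool="50" rmsdthreshold="1.5"/>
    </GALigandDock>
  </MOVERS>
  <PROTOCOLS>
    <Add mover="dock"/>
  </PROTOCOLS>
</ROSETTASCRIPTS>
"""

# fromstring raises xml.etree.ElementTree.ParseError on malformed XML.
root = ET.fromstring(xml_text.strip())

# Collect the attributes of each docking stage for a quick visual check.
stages = [stage.attrib for stage in root.iter("Stage")]

# Every mover referenced in PROTOCOLS should be defined in MOVERS.
mover_names = {mover.get("name") for mover in root.find("MOVERS")}
protocol_movers = [add.get("mover") for add in root.find("PROTOCOLS")]
missing = [name for name in protocol_movers if name not in mover_names]

print(f"{len(stages)} stages, undefined movers: {missing}")
```

An empty list of undefined movers only shows that the names are consistent; attribute names and values are still checked by Rosetta itself.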
Calling a MultioutputRosettaScriptsTask on a pose returns a Python generator object. Therefore, we need to call list() or set() on the result to actually run the protocol and collect its outputs.
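The lazy behavior of generators can be seen in a small pure-Python sketch. The function fake_docking_task is a hypothetical stand-in and does not call PyRosetta; it only shows that no work happens until the generator is consumed, for example by list().

```python
calls = []  # records each time the generator body actually does work


def fake_docking_task(n_outputs):
    """Hypothetical stand-in for a multi-output task (not a PyRosetta API)."""
    for i in range(n_outputs):
        calls.append(i)       # simulate producing one docked pose
        yield {"decoy": i}


gen = fake_docking_task(3)    # creates the generator; nothing has run yet
ran_before = len(calls)

results = list(gen)           # consuming the generator runs the body
ran_after = len(calls)

print(ran_before, ran_after, results)
```

The same pattern explains why the docking cell wraps the task call in list(): without it, the cell would return immediately with an unconsumed generator.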
Warning: the following cell takes ~45 minutes of CPU time.
if not os.getenv("DEBUG"):
    %time results = list(xml_obj(pose_obj))
GALigandDock trajectories:
if not os.getenv("DEBUG"):
    df = pd.DataFrame.from_records(packed_pose.to_dict(results))
    df
After running GALigandDock, we can plot the ligand binding energy landscape:
if not os.getenv("DEBUG"):
    matplotlib.rcParams["figure.figsize"] = [12.0, 8.0]
    seaborn.scatterplot(x="lig_rms", y="total_score", data=df)
Let's look at the docked ligand pose with the lowest total_score!
if not os.getenv("DEBUG"):
    ppose_lowest_total_score = results[df.sort_values(by="total_score").index[0]]
    view = viewer.init(ppose_lowest_total_score)
    view.add(viewer.setStyle())
    view.add(viewer.setStyle(command=({"hetflag": True}, {"stick": {"colorscheme": "brownCarbon", "radius": 0.2}})))
    view.add(viewer.setSurface(residue_selector=pyrosetta.rosetta.core.select.residue_selector.ChainSelector("E"), opacity=0.7, color='white'))
    view.add(viewer.setHydrogenBonds())
    view()
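The lookup used above, which sorts by total_score and takes the first index, amounts to finding the position of the minimum score. A minimal pure-Python sketch of the same idea follows. The score records are made-up placeholder values for illustration, not results from this docking run.

```python
# Made-up placeholder scores; real values come from packed_pose.to_dict(results).
records = [
    {"total_score": -310.2, "lig_rms": 4.1},
    {"total_score": -325.7, "lig_rms": 0.8},
    {"total_score": -318.9, "lig_rms": 2.3},
]

# Position of the record with the lowest (most favorable) total_score.
best_index = min(range(len(records)), key=lambda i: records[i]["total_score"])
best = records[best_index]

print(best_index, best)
```

With a pandas DataFrame that has the default integer index, df["total_score"].idxmin() returns the same position without sorting the whole table.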