Keywords: pose_from_sequence(), random move, scoring move, Metropolis, assign(), Pose()
!pip install pyrosettacolabsetup
import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta()
import pyrosetta; pyrosetta.init()
from pyrosetta import *
from pyrosetta.teaching import *
init()
In this workshop, you will be folding a 10 residue protein by building a simple de novo folding algorithm. Start by initializing PyRosetta as usual.
Create a simple poly-alanine pose
with 10 residues for testing your folding algorithm. Store the pose in a variable called "polyA."
YOUR-CODE-HERE
polyA.pdb_info().name("polyA")
Question:
Check the backbone dihedrals of a few residues (except the first and last) using the .phi()
and .psi()
methods in Pose
. What are the values of $\phi$ and $\psi$ dihedrals? You should see ideal bond lengths and angles, but the dihedrals may not be as realistic.
YOUR-CODE-HERE
OPTIONAL:
We may want to visualize folding as it happens. Before starting with the folding protocol, instantiate a PyMOL mover and use a UNIQUE port number between 10,000 and 65,536. We will retain history in order to view the entire folding process by utilizing the .keep_history()
method. Make sure it says PyMOL <---> PyRosetta link started!
on its command line.
pmm = PyMOLMover()
pmm.keep_history(True)
Use the PyMOL mover to view the polyA
Pose
. You should see a long thread-like structure in PyMOL.
pmm.apply(polyA)
Now, write a program that implements a Monte Carlo algorithm to optimize the protein conformation. You can do this here in the notebook, or you may use a code editor to write a .py
file and execute in a Python or iPython shell.
Our main program will include 100 iterations of making a random trial move, scoring the protein, and accepting/rejecting the move. Therefore, we can break this algorithm down into three smaller subroutines: random, score, and decision.
For the random trial move, write a subroutine to choose one residue at random using random.randint()
and then randomly perturb either the φ or ψ angles by a random number chosen from a Gaussian distribution. Use the Python built-in function random.gauss()
from the random
library with a mean of the current angle and a standard deviation of 25°. After changing the torsion angle, use pmm.apply(polyA)
to update the structure in PyMOL.
import math
import random
def randTrial(your_pose):
YOUR-CODE-HERE
return your_pose
For the scoring step, we need to create a scoring function and make a subroutine that simply returns the numerical energy score of the pose.
sfxn = get_fa_scorefxn()
def score(your_pose):
YOUR-CODE-HERE
For the decision step, we need to make a subroutine that either accepts or rejects the new conformatuon based on the Metropolis criterion. The Metropolis criterion has a probability of accepting a move as $P = \exp( -\Delta G / kT )$. When $ΔE ≥ 0$, the Metropolis criterion probability of accepting the move is $P = \exp( -\Delta G / kT )$. When $ΔE < 0$, the Metropolis criterion probability of accepting the move is $P = 1$. Use $kT = 1$ Rosetta Energy Unit (REU).
def decision(before_pose, after_pose):
YOUR-CODE-HERE
Now we can put these three subroutines together in our main program! Write a loop in the main program so that it performs 100 iterations of: making a random trial move, scoring the protein, and accepting/rejecting the move.
After each iteration of the search, output the current pose energy and the lowest energy ever observed. The final output of this program should be the lowest energy conformation that is achieved at any point during the simulation. Be sure to use low_pose.assign(pose)
rather than low_pose = pose
, since the latter will only copy a pointer to the original pose.
def basic_folding(your_pose):
"""Your basic folding algorithm that completes 100 Monte-Carlo iterations on a given pose"""
lowest_pose = Pose() # Create an empty pose for tracking the lowest energy pose.
YOUR-CODE-HERE
return lowest_pose
Finally, output the last pose and the lowest-scoring pose observed and view them in PyMOL. Plot the energy and lowest-energy observed vs. cycle number. What are the energies of the initial, last, and lowest-scoring pose? Is your program working? Has it converged to a good solution?
basic_folding(polyA)
Here's an example of the PyMOL view:
from IPython.display import Image
Image('./Media/folding.gif',width='300')
Using the program you wrote for Workshop #2, force the $A_{10}$ sequence into an ideal α-helix.
Questions: Does this helical structure have a lower score than that produced by your folding algorithm above? What does this mean about your sampling or discrimination?
Since your program is a stochastic search algorithm, it may not produce an ideal structure consistently, so try running the simulation multiple times or with a different number of cycles (if necessary). Using a kT of 1, your program may need to make up to 500,000 iterations.