Notebook

This notebook contains material from PyRosetta; content is available on Github.

< Introduction to Folding | Contents | Index | Low-Res Scoring and Fragments >

Basic Folding Algorithm¶

Keywords: pose_from_sequence(), random move, scoring move, Metropolis, assign(), Pose()

In [ ]:

!pip install pyrosettacolabsetup
import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta()
import pyrosetta; pyrosetta.init()
from pyrosetta import *
from pyrosetta.teaching import *
init()

Building the Pose¶

In this workshop, you will be folding a 10 residue protein by building a simple de novo folding algorithm. Start by initializing PyRosetta as usual.

Create a simple poly-alanine pose with 10 residues for testing your folding algorithm. Store the pose in a variable called "polyA."

In [ ]:

YOUR-CODE-HERE

polyA.pdb_info().name("polyA")

Question: Check the backbone dihedrals of a few residues (except the first and last) using the .phi() and .psi() methods in Pose. What are the values of $\phi$ and $\psi$ dihedrals? You should see ideal bond lengths and angles, but the dihedrals may not be as realistic.

In [ ]:

YOUR-CODE-HERE

OPTIONAL: We may want to visualize folding as it happens. Before starting with the folding protocol, instantiate a PyMOL mover and use a UNIQUE port number between 10,000 and 65,536. We will retain history in order to view the entire folding process by utilizing the .keep_history() method. Make sure it says PyMOL <---> PyRosetta link started! on its command line.

In [ ]:

pmm = PyMOLMover()
pmm.keep_history(True)

Use the PyMOL mover to view the polyA Pose. You should see a long thread-like structure in PyMOL.

In [ ]:

pmm.apply(polyA)

Building A Basic de Novo Folding Algorithm¶

Now, write a program that implements a Monte Carlo algorithm to optimize the protein conformation. You can do this here in the notebook, or you may use a code editor to write a .py file and execute in a Python or iPython shell.

Our main program will include 100 iterations of making a random trial move, scoring the protein, and accepting/rejecting the move. Therefore, we can break this algorithm down into three smaller subroutines: random, score, and decision.

Step 1: Random Move¶

For the random trial move, write a subroutine to choose one residue at random using random.randint() and then randomly perturb either the φ or ψ angles by a random number chosen from a Gaussian distribution. Use the Python built-in function random.gauss() from the random library with a mean of the current angle and a standard deviation of 25°. After changing the torsion angle, use pmm.apply(polyA) to update the structure in PyMOL.

In [ ]:

import math
import random

def randTrial(your_pose):
YOUR-CODE-HERE
    return your_pose

Step 2: Scoring Move¶

For the scoring step, we need to create a scoring function and make a subroutine that simply returns the numerical energy score of the pose.

In [ ]:

sfxn = get_fa_scorefxn()

def score(your_pose):
    YOUR-CODE-HERE

Step 3: Accepting/Rejecting Move¶

For the decision step, we need to make a subroutine that either accepts or rejects the new conformatuon based on the Metropolis criterion. The Metropolis criterion has a probability of accepting a move as $P = \exp( -\Delta G / kT )$. When $ΔE ≥ 0$, the Metropolis criterion probability of accepting the move is $P = \exp( -\Delta G / kT )$. When $ΔE < 0$, the Metropolis criterion probability of accepting the move is $P = 1$. Use $kT = 1$ Rosetta Energy Unit (REU).

In [ ]:

def decision(before_pose, after_pose):
    YOUR-CODE-HERE

Step 4: Execution¶

Now we can put these three subroutines together in our main program! Write a loop in the main program so that it performs 100 iterations of: making a random trial move, scoring the protein, and accepting/rejecting the move.

After each iteration of the search, output the current pose energy and the lowest energy ever observed. The final output of this program should be the lowest energy conformation that is achieved at any point during the simulation. Be sure to use low_pose.assign(pose) rather than low_pose = pose, since the latter will only copy a pointer to the original pose.

In [ ]:

def basic_folding(your_pose):
    """Your basic folding algorithm that completes 100 Monte-Carlo iterations on a given pose"""
    
    lowest_pose = Pose() # Create an empty pose for tracking the lowest energy pose.
    
    YOUR-CODE-HERE
    return lowest_pose

Finally, output the last pose and the lowest-scoring pose observed and view them in PyMOL. Plot the energy and lowest-energy observed vs. cycle number. What are the energies of the initial, last, and lowest-scoring pose? Is your program working? Has it converged to a good solution?

In [ ]:

basic_folding(polyA)

Here's an example of the PyMOL view:

In [ ]:

from IPython.display import Image
Image('./Media/folding.gif',width='300')

Exercise 1: Comparing to Alpha Helices¶

Using the program you wrote for Workshop #2, force the $A_{10}$ sequence into an ideal α-helix.

Questions: Does this helical structure have a lower score than that produced by your folding algorithm above? What does this mean about your sampling or discrimination?

Exercise 2: Optimizing Algorithm¶

Since your program is a stochastic search algorithm, it may not produce an ideal structure consistently, so try running the simulation multiple times or with a different number of cycles (if necessary). Using a kT of 1, your program may need to make up to 500,000 iterations.

In [ ]:

< Introduction to Folding | Contents | Index | Low-Res Scoring and Fragments >