Before you turn this problem in, make sure everything runs as expected. First, restart the kernel (in the menubar, select Kernel$\rightarrow$Restart) and then run all cells (in the menubar, select Cell$\rightarrow$Run All).
Make sure you fill in any place that says "YOUR CODE HERE" or "YOUR ANSWER HERE", as well as your name and collaborators below:
NAME = ""
COLLABORATORS = ""
Keywords: centroid, SwitchResidueTypeSetMover(), create_score_function(), score3, fa_standard, ScoreFunction(), set_weight(), read_fragment_file(), ClassicFragmentMover()
# Notebook setup
import sys
if 'google.colab' in sys.modules:
    !pip install pyrosettacolabsetup
    import pyrosettacolabsetup
    pyrosettacolabsetup.mount_pyrosetta_install()
    print("Notebook is set for PyRosetta use in Colab. Have fun!")
from pyrosetta import *
from pyrosetta.teaching import *
init()
Make sure you are in the directory with the pdb files:
cd google_drive/My\ Drive/student-notebooks/
Following the treatment of Simons et al. (1999), Rosetta can score a protein conformation using a low-resolution representation. This will make the energy calculation faster.
Load chain A of Ras, a protein from Workshop 3. Also calculate the full-atom energy of the pose.
pose = pyrosetta.pose_from_pdb("6Q21_A.pdb")
sfxn = pyrosetta.get_score_function()
sfxn(pose)
# YOUR CODE HERE
raise NotImplementedError()
Question: Print residue 5. Note the number of atoms and coordinates of residue 5.
print(pose.residue(5))
# YOUR CODE HERE
raise NotImplementedError()
Now, convert the pose to the centroid form by using a SwitchResidueTypeSetMover object and the apply method:
switch = SwitchResidueTypeSetMover("centroid")
switch.apply(pose)
print(pose.residue(5))
Question: How many atoms are now in residue 5? How is this different than before switching it into centroid mode?
# YOUR CODE HERE
raise NotImplementedError()
Score the new, centroid-based pose by creating and using the standard centroid score function "score3".
cen_sfxn = pyrosetta.create_score_function("score3")
cen_sfxn(pose)
Question: What is the new total score? What scoring terms are included in "score3" (print the cen_sfxn)? Do these match Simons?
# YOUR CODE HERE
raise NotImplementedError()
Convert the pose back to all-atom form by using another switch object, SwitchResidueTypeSetMover("fa_standard").
fa_switch = SwitchResidueTypeSetMover("fa_standard")
fa_switch.apply(pose)
print(pose.residue(5))
Question: Confirm that you have all the atoms back. Are the atoms in the same coordinate position as before?
# YOUR CODE HERE
raise NotImplementedError()
Go back and adjust your folding algorithm to use centroid mode. Create a ScoreFunction that uses only van der Waals (fa_atr and fa_rep) and hbond_sr_bb energy score terms.
Question: How much faster does your program run?
polyA = pyrosetta.pose_from_sequence('A' * 10)
polyA.pdb_info().name("polyA")
# Apply the SwitchResidueTypeSetMover to the pose polyA
# YOUR CODE HERE
raise NotImplementedError()
# Create new score function with only VDW and hbond_sr_bb energy score terms.
# YOUR CODE HERE
raise NotImplementedError()
# Use the basic_folding function in the previous chapter,
# overwrite your scoring subroutine, and run the program.
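One way to answer the timing question is to time both versions of your folding routine with the same helper. The sketch below uses only the standard library. `slow_fold` and `fast_fold` are hypothetical stand-ins (not PyRosetta code) for your full-atom and centroid folding functions, so replace them with your own.

```python
import time

def time_call(func, *args, repeats=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` calls of func."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical placeholders: swap in your full-atom and centroid folding calls.
def slow_fold():
    sum(i * i for i in range(300_000))

def fast_fold():
    sum(i * i for i in range(10_000))

t_slow = time_call(slow_fold)
t_fast = time_call(fast_fold)
speedup = t_slow / t_fast
print(f"full-atom: {t_slow:.4f} s, centroid: {t_fast:.4f} s, speedup: {speedup:.1f}x")
```

Taking the best of several repeats reduces noise from other processes. For a fair comparison, run both versions for the same number of iterations.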
Movers
Not counting the PyMOLMover, which is a special case, SwitchResidueTypeSetMover is the first example we have seen of a Mover class in PyRosetta. Every Mover object in PyRosetta has been designed to apply specific and complex changes (or “moves”) to a pose. Every Mover must be “constructed” and have any options set before being applied to a pose with the apply() method. SwitchResidueTypeSetMover has a relatively simple construction with only the single option "centroid". (Some Movers, as we shall see, require no options and are programmed to operate with default values.)
Look at the provided 3mer.frags fragments. These fragments are generated from the Robetta server (http://robetta.bakerlab.org/fragmentsubmit.jsp) for a given sequence. You should see sets of three lines describing each fragment.
Questions: For the first fragment, which PDB file does it come from? Is this fragment helical, sheet, in a loop, or a combination? What are the φ, ψ, and ω angles of the middle residue of the first fragment window?
Create a new subroutine in your folding code for an alternate random move based upon a “fragment insertion”. A fragment insertion is the replacement of the torsion angles for a set of consecutive residues with new torsion angles pulled at random from a fragment library file. Prior to calling the subroutine, load the set of fragments from the fragment file:
from pyrosetta.rosetta.core.fragment import *
fragset = ConstantLengthFragSet(3)
fragset.read_fragment_file("3mer.frags")
# YOUR CODE HERE
raise NotImplementedError()
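The idea of a fragment insertion can be sketched without PyRosetta by treating the backbone as a plain list of (φ, ψ, ω) tuples. This is a toy illustration only: `insert_fragment`, the `toy_fragments` dictionary, and the idealized angles are assumptions made for the example, not the fragment file format or the PyRosetta API.

```python
import random

HELIX = (-60.0, -45.0, 180.0)       # idealized helical (phi, psi, omega), for illustration
EXTENDED = (180.0, 180.0, 180.0)    # fully extended backbone

def insert_fragment(torsions, fragments, rng):
    """Replace the torsions of one random window with a random fragment.

    torsions:  list of (phi, psi, omega) tuples, one per residue
    fragments: dict mapping a 0-based window start to a list of candidate
               fragments, each a list of (phi, psi, omega) tuples
    Returns a new torsion list and the chosen window start.
    """
    start = rng.choice(sorted(fragments))        # pick a window
    fragment = rng.choice(fragments[start])      # pick a fragment for that window
    new_torsions = list(torsions)                # leave the input unchanged
    new_torsions[start:start + len(fragment)] = fragment
    return new_torsions, start

# Toy data: a 5-residue extended chain and one helical 3-mer per window.
pose_torsions = [EXTENDED] * 5
toy_fragments = {start: [[HELIX] * 3] for start in range(3)}

rng = random.Random(1)
new_torsions, window_start = insert_fragment(pose_torsions, toy_fragments, rng)
print(window_start, new_torsions)
```

In your folding code the same move would set the pose's torsion angles instead of editing a list, and the trial conformation would then be scored and accepted or rejected as before.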
Next, we will construct another Mover object, this time a FragmentMover, using the above fragment set and a MoveMap object as options. A MoveMap specifies which degrees of freedom are allowed to change in the pose when the Mover is applied (in this case, all backbone torsion angles):
from pyrosetta.rosetta.protocols.simple_moves import ClassicFragmentMover
movemap = MoveMap()
movemap.set_bb(True)
mover_3mer = ClassicFragmentMover(fragset, movemap)
# YOUR CODE HERE
raise NotImplementedError()
Note that when a MoveMap is constructed, all degrees of freedom are set to False initially. If you still have a PyMOLMover instantiated (named pmm in the code here), you can quickly visualize which degrees of freedom will be allowed by sending your move map to PyMOL with:
test_pose = pyrosetta.pose_from_sequence("RFPMMSTFKVLLCGAVLSRIDAG")
pmm.apply(test_pose)
pmm.send_movemap(test_pose, movemap)
# YOUR CODE HERE
raise NotImplementedError()
Each time this mover is applied, it will select a random 3-mer window and insert only the backbone torsion angles from a random matching fragment in the fragment set. Here is an example using the above test_pose:
mover_3mer.apply(test_pose)
pmm.apply(test_pose)
# YOUR CODE HERE
raise NotImplementedError()
Question: When you change your random move in your poly-alanine folding algorithm to a fragment insertion, how much faster is your protocol? Does it converge to a protein-like conformation more quickly?
Fold a 10-mer poly-alanine using 100 independent trajectories, using any variant of the folding algorithm that you like. (A trajectory is a path through the conformation space traveled during the calculation. The end result of each independent trajectory is called a “decoy”. Given enough sampling, the lowest energy decoy may correspond to the global minimum.) Create a Ramachandran plot using the lowest-scoring conformations (decoys) from all 100 independent trajectories. Repeat this for a 10-mer poly-glycine. How do the plots differ? Compare with the plots in Richardson’s article.
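The bookkeeping for independent trajectories can be sketched with a toy problem: run each trajectory with its own random seed, keep its final decoy, and then pick the lowest-scoring one. `toy_trajectory` is a hypothetical stand-in for your folding routine (a downhill random walk on the energy E(x) = x²) and does not use PyRosetta.

```python
import random

def toy_trajectory(seed, n_steps=200):
    """Hypothetical stand-in for one folding run; returns the final decoy."""
    rng = random.Random(seed)            # independent random stream per trajectory
    x = rng.uniform(-5.0, 5.0)           # random starting conformation
    score = x * x
    for _ in range(n_steps):
        trial = x + rng.gauss(0.0, 0.5)  # small random perturbation
        if trial * trial < score:        # accept only downhill moves in this toy
            x, score = trial, trial * trial
    return {"seed": seed, "x": x, "score": score}

# One decoy per independent trajectory.
decoys = [toy_trajectory(seed) for seed in range(100)]

# The lowest-scoring decoy is the best guess at the global minimum.
best = min(decoys, key=lambda d: d["score"])
print(best["seed"], best["score"])
```

For the Ramachandran plot, you would store the φ and ψ angles of each final pose instead of the toy coordinate x.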
Test your folding program’s ability to predict a real fold from scratch. Choose a small protein to keep the computation time down, such as Hox-B1 homeobox protein (1B72) or RecA (2REB). How many iterations and how many independent trajectories do you need to run to find a good structure?
Modify your folding program to include a simulated annealing temperature schedule, decaying exponentially from kT = 100 to kT = 0.1 over the course of the search. Again, fold a test protein. Does this approach work better?
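One possible form of an exponential schedule is kT(i) = kT_start · (kT_end / kT_start)^(i / (N − 1)), which equals kT_start at the first step and kT_end at the last. The helper below is a minimal sketch of that formula. The function name and parameters are illustrative, not part of PyRosetta.

```python
def annealing_kT(step, n_steps, kT_start=100.0, kT_end=0.1):
    """Exponentially decaying temperature for 0-based `step` out of `n_steps`."""
    if n_steps < 2:
        return kT_end
    fraction = step / (n_steps - 1)      # 0.0 at the first step, 1.0 at the last
    return kT_start * (kT_end / kT_start) ** fraction

# Example: print the temperature at a few points of a 1000-step search.
n_steps = 1000
for step in (0, 250, 500, 750, 999):
    print(step, round(annealing_kT(step, n_steps), 3))
```

Inside the folding loop, the returned value would replace the fixed kT used in the Metropolis acceptance test for that iteration.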
Modify your folding program to remove the Metropolis criterion and instead accept trial moves only when the energy decreases. Plot energy vs. iteration and examine the final output structures from multiple runs. How are convergence and performance affected? Why?
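The difference between the two acceptance rules can be seen on a toy one-dimensional landscape with a local minimum and a deeper global minimum. Everything in this sketch (the `ENERGY` table, `run`, and the two accept functions) is a made-up illustration and not PyRosetta code.

```python
import math
import random

# Toy landscape on grid positions 0..10: local minimum at 2, global minimum at 7.
ENERGY = [5, 3, 1, 3, 5, 4, 2, 0, 2, 4, 6]

def metropolis_accept(delta_e, kT, rng):
    """Always accept downhill moves; accept uphill with probability exp(-dE/kT)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / kT)

def greedy_accept(delta_e):
    """Accept only moves that lower the energy."""
    return delta_e < 0

def run(accept, n_steps, rng, start=2):
    """Random walk with +/-1 steps; returns the final position and best energy seen."""
    x = start
    best = ENERGY[x]
    for _ in range(n_steps):
        trial = x + rng.choice((-1, 1))
        if not 0 <= trial < len(ENERGY):
            continue                      # reject moves that leave the grid
        if accept(ENERGY[trial] - ENERGY[x]):
            x = trial
            best = min(best, ENERGY[x])
    return x, best

rng = random.Random(0)
greedy_x, greedy_best = run(greedy_accept, 20000, rng)
metro_x, metro_best = run(lambda d: metropolis_accept(d, 1.0, rng), 20000, rng)
print("greedy:", greedy_x, greedy_best)
print("metropolis:", metro_x, metro_best)
```

Starting in the local minimum, the downhill-only rule can never cross the barrier, while the Metropolis rule occasionally accepts uphill steps and so can reach the deeper well.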
[Introductory] What are the limitations of these types of folding algorithms?
[Advanced] How might you design an intermediate-resolution representation of side chains that has more detail than the centroid approach yet is faster than the full-atom approach? Which types of residues would most benefit from this type of representation?