
(Embarrassingly) Parallel TIS vs. RETIS: Setup

The OPS DefaultScheme is designed to provide reasonable default behaviors for TIS. These include replica exchange moves, path reversal moves, the minus interface move, as well as shooting. In general, replica exchange TIS is a more efficient way to sample than TIS without replica exchange. However, replica exchange TIS is much harder to parallelize, because the duration of each trajectory is not known before running the trajectory.

Therefore, in some cases you may want to sample each interface independently. This allows a naïve parallelization, since each interface is its own simulation.

In this notebook, we'll make move schemes for a custom version of RETIS (similar to the default scheme, but without the minus move) and for running each interface of a TIS simulation independently (in parallel). We'll use the OPS CLI to run them, and in the next notebook we'll compare the results.

We'll use the same simple toy model as we used in notebooks 5 and 6.

In [ ]:

import openpathsampling as paths
from openpathsampling import strategies

In [ ]:

import sys
if sys.version_info < (3, 8):
    f_version = ""
else:
    f_version = "_38"

In [ ]:

storage = paths.Storage(f"./inputs/2_state_toy{f_version}.nc", mode='r')
state_A = storage.volumes['A']
state_B = storage.volumes['B']
cv = storage.cvs['x']
engine = storage.engines['toy_engine']
initial_conditions = storage.tags['initial_conditions']

In [ ]:

interfaces = paths.VolumeInterfaceSet(cv=cv, minvals=float("-inf"), 
                                      maxvals=[-0.6, -0.45, -0.375, -0.3, -0.2])
tis_network = paths.MISTISNetwork([(state_A, interfaces, state_B)]).named("tis")

All move schemes need a "global" level strategy. This is usually the OrganizeByMoveGroupStrategy. We can re-use the same strategy in all our move schemes.

In [ ]:

global_strategy = strategies.OrganizeByMoveGroupStrategy()

In [ ]:

# this is basically the DefaultScheme without the minus interface move
retis_scheme = paths.MoveScheme(network=tis_network).named("retis")
retis_scheme.append([
    strategies.OneWayShootingStrategy(engine=engine),
    strategies.PathReversalStrategy(),
    strategies.NearestNeighborRepExStrategy(),
    global_strategy
])

The OneWayShootingStrategy includes an ensembles option, which selects specific ensembles to use. The (normal TIS) ensembles sampled by the TIS network are in the attribute sampling_ensembles. (Aside: other ensembles, such as the minus interface ensembles, are in the special_ensembles attribute.) This means that we can create a shooting strategy that only samples a single ensemble (instead of the default, which is to sample all the sampling_ensembles in the network).

In [ ]:

ens_0_strategy = strategies.OneWayShootingStrategy(
    ensembles=[tis_network.sampling_ensembles[0]],
    engine=engine
)

From here, we can make a move scheme for each interface. I'll do that manually, although in practice you might come up with a loop to make this easier:

In [ ]:

scheme_0 = paths.MoveScheme(tis_network).named("scheme_0")
scheme_1 = paths.MoveScheme(tis_network).named("scheme_1")
scheme_2 = paths.MoveScheme(tis_network).named("scheme_2")
scheme_3 = paths.MoveScheme(tis_network).named("scheme_3")
scheme_4 = paths.MoveScheme(tis_network).named("scheme_4")

In [ ]:

# YOUR TURN: Make the correct strategies and append things to the scheme
# 1. Create a OneWayShootingStrategy for each ensemble
# 2. Append the global_strategy and the appropriate shooting strategy to each scheme

Running TIS

Again, we'll use the command line interface to run the TIS. So the first stage is to save the relevant things to a setup file, and then we can use the --scheme option in the pathsampling command to select which scheme to run.

First, we check that all our schemes match our initial conditions:

In [ ]:

_ = retis_scheme.initial_conditions_from_trajectories(initial_conditions)

In [ ]:

_ = scheme_0.initial_conditions_from_trajectories(initial_conditions)

In [ ]:

_ = scheme_1.initial_conditions_from_trajectories(initial_conditions)

In [ ]:

_ = scheme_2.initial_conditions_from_trajectories(initial_conditions)

In [ ]:

_ = scheme_3.initial_conditions_from_trajectories(initial_conditions)

In [ ]:

_ = scheme_4.initial_conditions_from_trajectories(initial_conditions)

In [ ]:

# saving everything will take a few minutes
parallel_setup = paths.Storage("parallel_setup.nc", mode='w')
parallel_setup.tags['initial_conditions'] = initial_conditions
parallel_setup.save(retis_scheme)
parallel_setup.save(scheme_0)
parallel_setup.save(scheme_1)
parallel_setup.save(scheme_2)
parallel_setup.save(scheme_3)
parallel_setup.save(scheme_4)
parallel_setup.close()

In OPS, each individual move, such as an attempt to swap a specific pair of replicas, counts as a Monte Carlo step. So in order to make a fair comparison of the approaches with and without replica exchange, we want to ensure that they both have about the same number of shooting moves for each ensemble.

However, the MoveScheme can give us an estimate of how many total moves are required to get a certain number of moves of a certain mover. To get 300 trials of the 0th (only!) shooting mover in scheme_0, how many total steps do we need?

In [ ]:

scheme_0.n_steps_for_trials(scheme_0.movers['shooting'][0], 300)

That answer is probably pretty obvious. But what about our RETIS scheme? How many total steps do we need (on average) to get 300 trials of the 0th shooting mover from that move scheme?

In [ ]:

# YOUR TURN: Answer the question above

This should be a significantly larger number, and it is due to the many replica exchange and path reversal moves in that move scheme, as well as the fact that there are multiple ensembles to sample.
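
To see where the larger number comes from, here is a minimal sketch of the arithmetic, assuming each MC step picks exactly one mover with probability proportional to its weight. The helper name steps_for_trials and the weights (1.0 per shooting mover, 0.5 per path reversal mover, 0.5 per replica exchange mover) are illustrative assumptions, not values read from OPS. For your actual scheme, the scheme's own n_steps_for_trials method is the number to trust.

```python
# Illustrative arithmetic only: the weights are assumptions, not OPS internals.

def steps_for_trials(mover_weight, all_weights, n_trials):
    """Expected total MC steps needed to get `n_trials` attempts of one mover.

    Assumes each step selects exactly one mover, with probability
    proportional to that mover's weight.
    """
    total_weight = sum(all_weights)
    return n_trials * total_weight / mover_weight


n_ensembles = 5

# Hypothetical weights for a RETIS-like scheme with 5 ensembles:
shooting = [1.0] * n_ensembles        # one shooting mover per ensemble
reversal = [0.5] * n_ensembles        # one path reversal mover per ensemble
repex = [0.5] * (n_ensembles - 1)     # one swap mover per neighboring pair

# A single-interface scheme has only its one shooting mover.
single_interface_steps = steps_for_trials(1.0, [1.0], 300)

# The RETIS-like scheme shares its steps among all movers.
retis_like_steps = steps_for_trials(1.0, shooting + reversal + repex, 300)

print("single interface:", single_interface_steps)
print("RETIS-like:", retis_like_steps)
```

Under these assumed weights, the extra steps come from two sources: the other four shooting movers, and the path reversal and replica exchange movers.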

Now let's run the simulations. First, we equilibrate the initial condition, since it is a transition trajectory, which is highly unlikely in inner interfaces. You can do this with:

$ openpathsampling equilibrate parallel_setup.nc -o equil_retis.nc --scheme retis --extra-steps 50

That will first run until the first decorrelated path (no frames in common with the initial trajectory), and then run an additional 50 MC steps. The results will be saved in the equil_retis.nc file.
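
The "decorrelated" criterion can be illustrated with a toy model. The sketch below is not how OPS implements equilibration: trajectories are just lists of integer frame labels, every trial move is accepted, and the names is_decorrelated, make_toy_shooter, and toy_equilibrate are made up for this example. It only shows the logic of running until no frames are shared with the initial trajectory and then running a fixed number of extra steps.

```python
import itertools
import random


def is_decorrelated(initial_frames, current_frames):
    """True when the two trajectories have no frames in common."""
    return set(initial_frames).isdisjoint(current_frames)


def make_toy_shooter(seed=0):
    """Return a toy one-way shooting move acting on lists of frame labels."""
    rng = random.Random(seed)
    fresh = itertools.count(1000)  # labels for newly generated frames

    def shoot(traj):
        idx = rng.randrange(len(traj))  # shooting point
        if rng.random() < 0.5:
            # forward shot: keep frames up to the shooting point, redo the rest
            n_new = len(traj) - idx - 1
            return traj[:idx + 1] + [next(fresh) for _ in range(n_new)]
        # backward shot: redo frames before the shooting point, keep the rest
        return [next(fresh) for _ in range(idx)] + traj[idx:]

    return shoot


def toy_equilibrate(initial, shoot, extra_steps):
    """Run until the first decorrelated path, then `extra_steps` more moves."""
    current = list(initial)
    steps_to_decorrelate = 0
    while not is_decorrelated(initial, current):
        current = shoot(current)
        steps_to_decorrelate += 1
    for _ in range(extra_steps):
        current = shoot(current)
    return steps_to_decorrelate, current


initial = list(range(10))
n_decorr, final = toy_equilibrate(initial, make_toy_shooter(), extra_steps=50)
print("steps until decorrelated:", n_decorr)
```

Because each one-way shot keeps part of the previous path, a single move can never decorrelate the trajectory in this toy model. At least two moves are needed.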

Then you can run the full simulation with:

$ openpathsampling pathsampling equil_retis.nc -o retis.nc -n $NSTEPS > retis.out &

where you should replace $NSTEPS with the number of steps you found for RETIS above.

If you're not familiar, the > retis.out redirects the output to the file retis.out (you can tail retis.out to see progress updates), and the & at the end of the command forces the command to run in the background, so that you can issue more commands from the same command line (i.e., run multiple things in parallel).

Why don't you need to specify a --scheme with the second command? (Hint: use the openpathsampling contents command on equil_retis.nc and parallel_setup.nc. How many move schemes are saved in each?)

Do the same for scheme_0, scheme_1, scheme_2, scheme_3 and scheme_4, running the path sampling with 300 steps each. In this way, you will be running all 5 interfaces in parallel.
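
If you would rather not type ten commands by hand, a small helper can generate them. The sketch below only builds and prints the command lines, reusing the subcommands and options shown above. The helper name build_commands and the output file names (equil_scheme_0.nc, scheme_0.nc, and so on) are assumptions made for this example, so rename them to suit your own layout.

```python
SETUP_FILE = "parallel_setup.nc"


def build_commands(scheme_name, n_steps, extra_steps=50):
    """Build the equilibrate and pathsampling commands for one scheme.

    File names are derived from the scheme name; this naming convention
    is an assumption of this sketch, not something OPS requires.
    """
    equil_file = f"equil_{scheme_name}.nc"
    output_file = f"{scheme_name}.nc"
    equilibrate = [
        "openpathsampling", "equilibrate", SETUP_FILE,
        "-o", equil_file,
        "--scheme", scheme_name,
        "--extra-steps", str(extra_steps),
    ]
    pathsampling = [
        "openpathsampling", "pathsampling", equil_file,
        "-o", output_file,
        "-n", str(n_steps),
    ]
    return equilibrate, pathsampling


# Print the two commands for each of the five single-interface schemes.
for i in range(5):
    equil_cmd, run_cmd = build_commands(f"scheme_{i}", n_steps=300)
    print(" ".join(equil_cmd))
    print(" ".join(run_cmd))
```

Each list could also be passed to subprocess.run or subprocess.Popen from the standard library. For any given scheme, the equilibration must finish before its path sampling starts, but the five schemes are independent of one another.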
