Models leading to diffusion equations, see the section diffu:app, are usually based on reasoning with averaged physical quantities such as concentration, temperature, and velocity. The underlying physical processes involve complicated microscopic movement of atoms and molecules, but an average of a large number of molecules is performed in a small volume before the modeling starts, and the averaged quantity inside this volume is assigned as a point value at the centroid of the volume. This means that concentration, temperature, and velocity at a space-time point represent averages around the point in a small time interval and small spatial volume.
Random walk is a fundamentally different kind of modeling procedure compared to the reasoning behind partial differential equations. The idea in random walk is to have a large number of "particles" that undergo random movements. Averaging can then be used afterwards to compute macroscopic quantities like concentration. The "particles" and their random movement represent a very simplified microscopic behavior of molecules, much simpler and computationally much more efficient than direct molecular simulation, yet the random walk model has proven very powerful for describing a wide range of phenomena, including heat conduction, quantum mechanics, polymer chains, population genetics, neuroscience, games of chance, and pricing of financial instruments.
It can be shown that random walk, when averaged, produces models that are mathematically equivalent to diffusion equations. This is the primary reason why we treat random walk in this chapter: two very different algorithms (finite difference stencils and random walk) solve the same type of problems. The simplicity of the random walk algorithm makes it particularly attractive for solving diffusion equations on massively parallel computers. The exposition here is as simple as possible, and a good and thorough derivation of the models is provided by Hjorth-Jensen [hjorten].
Imagine that we have some particles that perform random moves, either to the right or to the left. We may flip a coin to decide the movement of each particle, say head implies movement to the right and tail means movement to the left. Each move is one unit length. Physicists use the term random walk for this type of movement. The movement is also known as drunkard's walk. You may try this yourself: flip the coin and make one step to the left or right, and repeat the process.
We introduce the symbol $N$ for the number of steps in a random walk. Figure shows four different random walks with $N=200$.
Ensemble of 4 random walks, each with 200 steps.
Let $S_k$ be the stochastic variable representing a step to the left or to the right in step number $k$. We have that $S_k=-1$ with probability $p$ and $S_k=1$ with probability $q=1-p$. The variable $S_k$ is known as a Bernoulli variable. The expectation of $S_k$ is

$$ \E{S_k} = p\cdot(-1) + q\cdot 1 = q - p = 1 - 2p,$$

and the variance is

$$ \Var{S_k} = \E{S_k^2} - (\E{S_k})^2 = 1 - (q-p)^2 = 4pq .$$
The position after $k$ steps is another stochastic variable

$$ \bar X_k = \sum_{j=1}^{k} S_j .$$

The expected position is

$$ \E{\bar X_k} = \sum_{j=1}^{k}\E{S_j} = k(q-p) .$$

All the $S_k$ variables are independent. The variance therefore becomes

$$ \Var{\bar X_k} = \sum_{j=1}^{k}\Var{S_j} = 4pqk .$$

We see that $\Var{\bar X_k}$ is proportional to the number of steps $k$. For the very important case $p=q=\frac{1}{2}$, $\E{\bar X_k}=0$ and $\Var{\bar X_k}=k$.
How can we estimate $\E{\bar X_k}$ and $\Var{\bar X_k}$ empirically? We must have many random walks of the type in Figure. For a given $k$, say $k=100$, we find all the values of $\bar X_k$, name them $\bar x_{0,k}$, $\bar x_{1,k}$, $\bar x_{2,k}$, and so on. With $W$ random walks, the empirical estimate of $\E{\bar X_k}$ is the average,

$$ \E{\bar X_k} \approx \frac{1}{W}\sum_{i=0}^{W-1} \bar x_{i,k},$$

while an empirical estimate of $\Var{\bar X_k}$ is

$$ \Var{\bar X_k} \approx \frac{1}{W}\sum_{i=0}^{W-1} (\bar x_{i,k})^2 - \left(\frac{1}{W}\sum_{i=0}^{W-1} \bar x_{i,k}\right)^2 .$$

That is, we take the statistics for a given $k$ across the ensemble of random walks ("vertically" in Figure). The key quantities to record are $\sum_i \bar x_{i,k}$ and $\sum_i \bar x_{i,k}^2$.
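These estimates translate directly into code. Below is a minimal sketch (the helper ensemble_stats and the array walks are our own, for illustration) that records the two sums and computes the empirical mean and variance at every step:

import numpy as np

def ensemble_stats(walks):
    """Empirical mean and variance of walk positions at each step.

    walks[i,k] holds the position of walk number i after step k.
    """
    W = walks.shape[0]                   # Number of walks
    sum_x  = np.sum(walks,    axis=0)    # sum_i x_{i,k} for each k
    sum_x2 = np.sum(walks**2, axis=0)    # sum_i x_{i,k}^2 for each k
    mean = sum_x/W
    var = sum_x2/W - mean**2
    return mean, var

# Demo: 10000 walks, 200 steps each, p = q = 1/2
steps = np.where(np.random.uniform(size=(10000, 200)) <= 0.5, -1, 1)
walks = np.cumsum(steps, axis=1)     # walks[i,k]: position after k+1 steps
mean, var = ensemble_stats(walks)    # mean[k] close to 0, var[k] close to k+1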
Python has a random module for drawing random numbers, and this module has a function uniform(a, b) for drawing a uniformly distributed random number in the interval $[a,b)$. If an event happens with probability $p$, we can simulate this on the computer by drawing a random number $r$ in $[0,1)$, because then $r\leq p$ with probability $p$ and $r>p$ with probability $1-p$:
import random

p = 0.5                   # Example probability
r = random.uniform(0, 1)
if r <= p:
    pass                  # Event happens
else:
    pass                  # Event does not happen
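As a quick sanity check (a sketch of ours, not part of the algorithm), the relative frequency of the event over many draws should approach $p$:

import numpy as np

p = 0.3
r = np.random.uniform(0, 1, size=10**6)
print(np.mean(r <= p))   # Prints a number close to 0.3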
A random walk with $N$ steps, starting at $x_0$, where we move to the left with probability $p$ and to the right with probability $1-p$, can now be implemented by
import random, numpy as np

def random_walk1D(x0, N, p):
    """1D random walk with 1 particle and N steps."""
    # Store the position after step k in position[k]
    position = np.zeros(N+1)
    position[0] = x0
    current_pos = x0
    for k in range(N):
        r = random.uniform(0, 1)
        if r <= p:
            current_pos -= 1
        else:
            current_pos += 1
        position[k+1] = current_pos
    return position
Since $N$ is supposed to be large and we want to repeat the process for many particles, we should speed up the code as much as possible. Vectorization is the obvious technique here: we draw all the random numbers at once with the aid of numpy, and then we formulate vector operations to get rid of the loop over the steps (k). The numpy.random module has vectorized versions of the functions in Python's built-in random module. For example, numpy.random.uniform(a, b, N) returns N random numbers uniformly distributed between a (included) and b (not included).
We can then make an array of all the steps in a random walk: if the random number is less than or equal to $p$, the step is $-1$, otherwise the step is $1$:
r = np.random.uniform(0, 1, size=N)
steps = np.where(r <= p, -1, 1)
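For instance (a small made-up demo), with p=0.5 and drawn numbers 0.2, 0.8, and 0.5, the steps become -1, 1, and -1 (note that r equal to p counts as a move to the left):

import numpy as np

p = 0.5
r = np.array([0.2, 0.8, 0.5])
steps = np.where(r <= p, -1, 1)
print(steps)   # [-1  1 -1]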
The value of position[k] is the sum of all steps up to step k. Such sums are often needed in vectorized algorithms and are therefore available through the numpy.cumsum function:
>>> import numpy as np
>>> np.cumsum(np.array([1,3,4,6]))
array([ 1,  4,  8, 14])
The resulting array in this demo has elements $1$, $1+3=4$, $1+3+4=8$, and $1+3+4+6=14$.
We can now vectorize the random_walk1D function:
def random_walk1D_vec(x0, N, p):
    """Vectorized version of random_walk1D."""
    # Store the position after step k in position[k]
    position = np.zeros(N+1)
    position[0] = x0
    r = np.random.uniform(0, 1, size=N)
    steps = np.where(r <= p, -1, 1)
    position[1:] = x0 + np.cumsum(steps)
    return position
This code runs about 10 times faster than the scalar version. With a parallel numpy library, the code can also automatically take advantage of hardware for parallel computing, because each of the four array operations can be trivially parallelized.
During software development with random numbers it is advantageous to always generate the same sequence of random numbers, as this helps debugging. To fix the sequence, we set the seed of the random number generator to some chosen integer, e.g.,
np.random.seed(10)
Calls to random_walk1D_vec give positions of the particle as depicted in Figure. The particle starts at the origin and moves with $p=\frac{1}{2}$. Since the seed is the same, the plot to the left is just a magnification of the first 1,000 steps in the plot to the right.
1,000 (left) and 50,000 (right) steps of a random walk.
When we have a scalar and a vectorized code, it is always a good idea to develop a unit test for checking that they produce the same result. A problem in the present context is that the two versions apply two different random number generators. For a test to be meaningful, we need to fix the seed and use the same generator. This means that the scalar version must either use np.random or have this as an option. An option is the most flexible choice:
import random

def random_walk1D(x0, N, p, random=random):
    ...
    r = random.uniform(0, 1)
Using random=np.random, the r variable gets computed by np.random.uniform, and the sequence of random numbers will be the same as in the vectorized version that employs the same generator (given that the seed is also the same). A proper test function may be to check that the positions in the walk are the same in the scalar and vectorized implementations:
def test_random_walk1D():
    # For fixed seed, check that scalar and vectorized versions
    # produce the same result
    x0 = 2; N = 4; p = 0.6
    np.random.seed(10)
    scalar_computed = random_walk1D(x0, N, p, random=np.random)
    np.random.seed(10)
    vectorized_computed = random_walk1D_vec(x0, N, p)
    assert (scalar_computed == vectorized_computed).all()
Note that we employ == for arrays with real numbers. This is normally an inadequate test due to rounding errors, but in the present case all arithmetic consists of adding or subtracting one, so these operations are expected to have no rounding errors. Comparing two numpy arrays with == results in a boolean array, so we need to call the all() method to ensure that all elements are True, i.e., that all elements in the two arrays match each other pairwise.
The original random walk algorithm can be said to work with dimensionless coordinates $\bar x_i = -N + i$, $i=0,1,\ldots, 2N$ ($\bar x_i\in [-N,N]$), and $\bar t_n=n$, $n=0,1,\ldots,N$. A mesh with spacings $\Delta x$ and $\Delta t$ with dimensions can be introduced by

$$ x_i = \bar x_i\Delta x,\quad t_n = \bar t_n\Delta t .$$

If we implement the algorithm with dimensionless coordinates, we can just use this rescaling to obtain the movement in a coordinate system without unit spacings.
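A minimal sketch of this rescaling (the names dx, dt, x, and t are ours, and the spacing values are arbitrary examples):

import numpy as np

N = 1000                 # Number of steps
dx, dt = 0.01, 0.05      # Example physical spacings
i = np.arange(2*N + 1)   # Indices of the dimensionless mesh
x = (-N + i)*dx          # Physical mesh points x_i
t = np.arange(N + 1)*dt  # Physical time points t_n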
Let $P^{n+1}_i$ be the probability of finding the particle at mesh point $\bar x_i$ at time $\bar t_{n+1}$. We can reach mesh point $(i,n+1)$ in two ways: either coming in from the left from $(i-1,n)$ or from the right from $(i+1,n)$. Each has probability $\frac{1}{2}$ (if we assume $p=q=\frac{1}{2}$). The fundamental equation for $P^{n+1}_i$ is

$$ P^{n+1}_i = \frac{1}{2}P^{n}_{i-1} + \frac{1}{2}P^{n}_{i+1} . \tag{1}$$
(This equation is easiest to understand if one looks at the random walk as a Markov process and applies the transition probabilities, but this is beyond scope of the present text.)
Subtracting $P^{n}_i$ from (1) results in

$$ P^{n+1}_i - P^{n}_i = \frac{1}{2}\left(P^{n}_{i-1} - 2P^{n}_i + P^{n}_{i+1}\right) .$$
Readers who have seen the Forward Euler discretization of a 1D diffusion equation recognize this scheme as very close to such a discretization. We have

$$ \frac{\partial}{\partial t}P(x_i,t_n) \approx \frac{P^{n+1}_i - P^{n}_i}{\Delta t},$$

or in dimensionless coordinates

$$ \frac{\partial}{\partial \bar t}P(\bar x_i,\bar t_n) \approx P^{n+1}_i - P^{n}_i .$$

Similarly, we have

$$ \frac{\partial^2}{\partial x^2}P(x_i,t_n) \approx \frac{P^{n}_{i-1} - 2P^{n}_i + P^{n}_{i+1}}{\Delta x^2},$$

$$ \frac{\partial^2}{\partial \bar x^2}P(\bar x_i,\bar t_n) \approx P^{n}_{i-1} - 2P^{n}_i + P^{n}_{i+1} .$$

Equation (1) is therefore equivalent to the dimensionless diffusion equation

$$ \frac{\partial P}{\partial \bar t} = \frac{1}{2}\frac{\partial^2 P}{\partial \bar x^2}, \tag{2}$$

or the diffusion equation

$$ \frac{\partial P}{\partial t} = \dfc\frac{\partial^2 P}{\partial x^2}, \tag{3}$$

with diffusion coefficient

$$ \dfc = \frac{\Delta x^2}{2\Delta t} .$$
This derivation shows the tight link between random walk and diffusion. If we keep track of where the particle is, and repeat the process many times, or run the algorithms for lots of particles, the histogram of the positions will approximate the solution of the diffusion equation for the local probability $P^n_i$.
Suppose all the random walks start at the origin. Then the initial condition for the probability distribution is the Dirac delta function $\delta(x)$. The solution of (2) can be shown to be

$$ \bar P(\bar x,\bar t) = \frac{1}{\sqrt{4\pi\dfc \bar t}} e^{-\frac{\bar x^2}{4\dfc \bar t}}, \tag{4}$$
where $\dfc = \frac{1}{2}$.
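This link can be checked numerically: run many walks starting at the origin with $p=\frac{1}{2}$, histogram the positions after $N$ steps, and compare with the Gaussian (4), which has variance equal to $N$ in dimensionless time. A rough sketch (the function name and the parameter values are ours):

import numpy as np

def endpoint_distribution(N=800, num_walks=10000):
    """Empirical distribution of X_N vs the Gaussian solution (4)."""
    steps = np.where(
        np.random.uniform(size=(num_walks, N)) <= 0.5, -1, 1)
    x_end = np.sum(steps, axis=1)      # Positions after N steps
    hist, edges = np.histogram(x_end, bins=51, density=True)
    centers = 0.5*(edges[:-1] + edges[1:])
    # (4) in dimensionless form: Gaussian with variance t = N
    P = 1/np.sqrt(2*np.pi*N)*np.exp(-centers**2/(2*N))
    return centers, hist, P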
Our next task is to implement an ensemble of walks (for statistics, see the section Statistical considerations) and also provide data from the walks such that we can compute the probabilities of the positions as introduced in the previous section. An appropriate representation of the probabilities $P^n_i$ is histograms (with $i$ along the $x$ axis) for a few selected values of $n$. To estimate the expectation and variance of the random walks, the section Statistical considerations points to recording $\sum_j x_{j,k}$ and $\sum_j x_{j,k}^2$, where $x_{j,k}$ is the position at time/step level $k$ in random walk number $j$. The histogram of positions needs the individual values $x_{j,k}$ for all $j$ values and some selected $k$ values.
We introduce position[k] to hold $\sum_j x_{j,k}$, position2[k] to hold $\sum_j (x_{j,k})^2$, and pos_hist[j,k] to hold $x_{j,k}$. A selection of $k$ values can be specified by saying how many, num_times, and letting them be equally spaced through time:

pos_hist_times = [(N//num_times)*i for i in range(num_times)]

This is one of the few situations where we want integer division (//) or real division rounded to an integer. For example, N=8 and num_times=4 give the time levels 0, 2, 4, and 6.
Our scalar implementation of running num_walks random walks may go like this:
def random_walks1D(x0, N, p, num_walks=1, num_times=1,
                   random=random):
    """Simulate num_walks random walks from x0 with N steps."""
    position  = np.zeros(N+1)    # Accumulated positions
    position[0] = x0*num_walks
    position2 = np.zeros(N+1)    # Accumulated positions**2
    position2[0] = x0**2*num_walks
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    for n in range(num_walks):
        num_times_counter = 0
        current_pos = x0
        for k in range(N):
            if k in pos_hist_times:
                pos_hist[n,num_times_counter] = current_pos
                num_times_counter += 1
            # current_pos corresponds to step k+1
            r = random.uniform(0, 1)
            if r <= p:
                current_pos -= 1
            else:
                current_pos += 1
            position [k+1] += current_pos
            position2[k+1] += current_pos**2
    return position, position2, pos_hist, np.array(pos_hist_times)
We have already vectorized a single random walk. The additional challenge here is to vectorize the computation of the data for the histogram, pos_hist, but given the selected steps in pos_hist_times, we can find the corresponding positions by indexing with the list pos_hist_times: walk[pos_hist_times], which are to be inserted in pos_hist[n,:].
def random_walks1D_vec1(x0, N, p, num_walks=1, num_times=1):
    """Vectorized version of random_walks1D."""
    position  = np.zeros(N+1)    # Accumulated positions
    position2 = np.zeros(N+1)    # Accumulated positions**2
    walk = np.zeros(N+1)         # Positions of current walk
    walk[0] = x0
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    for n in range(num_walks):
        r = np.random.uniform(0, 1, size=N)
        steps = np.where(r <= p, -1, 1)
        walk[1:] = x0 + np.cumsum(steps)  # Positions of this walk
        position  += walk
        position2 += walk**2
        pos_hist[n,:] = walk[pos_hist_times]
    return position, position2, pos_hist, np.array(pos_hist_times)
Looking at the vectorized version above, we still have one potentially long Python loop over n. Normally, num_walks will be much larger than N. The vectorization of the loop over N certainly speeds up the program, but if we think of vectorization as also a way to parallelize the code, all the independent walks (the n loop) can be executed in parallel. Therefore, we should include this loop as well in the vectorized expressions, at the expense of using more memory.
We introduce the array walks to hold the $N+1$ positions of all the walks: each row represents the positions of one walk.
walks = np.zeros((num_walks, N+1)) # Positions of each walk
walks[:,0] = x0
Since all the steps are independent, we can just make one long vector of enough random numbers (N*num_walks), translate these numbers to $\pm 1$, and then reshape the array such that the steps of each walk are stored in the rows.
r = np.random.uniform(0, 1, size=N*num_walks)
steps = np.where(r <= p, -1, 1).reshape(num_walks, N)
The next step is to sum up the steps in each walk. We need the np.cumsum function for this, with the argument axis=1 to indicate summation across the columns (i.e., along each row):
>>> a = np.arange(6).reshape(2,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5]])
>>> np.cumsum(a, axis=1)
array([[ 0,  1,  3],
       [ 3,  7, 12]])
Now walks can be computed by
walks[:,1:] = x0 + np.cumsum(steps, axis=1)
The position vector is the sum of all the walks. That is, we want to sum all the rows, which is obtained by
position = np.sum(walks, axis=0)
A corresponding expression computes the squares of the positions. Finally, we need to compute pos_hist, but that is a matter of grabbing some of the walks (according to pos_hist_times):
pos_hist[:,:] = walks[:,pos_hist_times]
The complete vectorized algorithm without any loop can now be summarized:
def random_walks1D_vec2(x0, N, p, num_walks=1, num_times=1):
    """Vectorized version of random_walks1D; no loops."""
    walks = np.zeros((num_walks, N+1))  # Positions of each walk
    walks[:,0] = x0
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    r = np.random.uniform(0, 1, size=N*num_walks)
    steps = np.where(r <= p, -1, 1).reshape(num_walks, N)
    walks[:,1:] = x0 + np.cumsum(steps, axis=1)
    position  = np.sum(walks,    axis=0)
    position2 = np.sum(walks**2, axis=0)
    pos_hist[:,:] = walks[:,pos_hist_times]
    return position, position2, pos_hist, np.array(pos_hist_times)
What is the gain of the vectorized implementations? One important gain is that each vectorized operation can be automatically parallelized if one applies a parallel numpy library like Numba. On a single CPU, however, the speed-up of the vectorized operations is also significant. With $N=1,000$ and 50,000 repeated walks, the two vectorized versions run about 25 and 18 times faster than the scalar version, with random_walks1D_vec1 being the fastest.
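The quoted timings depend on hardware, array sizes, and the numpy installation; a simple harness like the following sketch (the helper measure is ours) can be used to redo the comparison:

import time

def measure(func, *args, **kwargs):
    """Return the execution time of func(*args, **kwargs) in seconds."""
    t0 = time.time()
    func(*args, **kwargs)
    return time.time() - t0

x0 = 0; N = 1000; p = 0.5; num_walks = 50000
t_scalar = measure(random_walks1D, x0, N, p, num_walks)
t_vec1   = measure(random_walks1D_vec1, x0, N, p, num_walks)
t_vec2   = measure(random_walks1D_vec2, x0, N, p, num_walks)
print('scalar: %.1f s, vec1: %.1f s, vec2: %.1f s' %
      (t_scalar, t_vec1, t_vec2))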
Our first attempt at vectorization removed the loop over the $N$ steps in a single walk. However, the number of walks is usually much larger than $N$, because of the need for accurate statistics. Therefore, we should rather remove the loop over all walks. It turns out, from our efficiency experiments, that the function random_walks1D_vec2 (with no loops) is slower than random_walks1D_vec1. This is a bit surprising and may be explained by less efficiency in the statements involving very large arrays, containing all steps for all walks at once.
From a parallelization and improved vectorization point of view, it would be more natural to switch the sequence of the loops in the serial code such that the shortest loop is the outer loop:
def random_walks1D2(x0, N, p, num_walks=1, num_times=1, ...):
    ...
    current_pos = x0 + np.zeros(num_walks)
    num_times_counter = -1

    for k in range(N):
        if k in pos_hist_times:
            num_times_counter += 1
            store_hist = True
        else:
            store_hist = False

        for n in range(num_walks):
            # current_pos corresponds to step k+1
            r = random.uniform(0, 1)
            if r <= p:
                current_pos[n] -= 1
            else:
                current_pos[n] += 1
            position [k+1] += current_pos[n]
            position2[k+1] += current_pos[n]**2
            if store_hist:
                pos_hist[n,num_times_counter] = current_pos[n]
    return position, position2, pos_hist, np.array(pos_hist_times)
The vectorized version of this code, where we just vectorize the loop over n, becomes
def random_walks1D2_vec1(x0, N, p, num_walks=1, num_times=1):
    """Vectorized version of random_walks1D2."""
    position  = np.zeros(N+1)    # Accumulated positions
    position2 = np.zeros(N+1)    # Accumulated positions**2
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    current_pos = x0 + np.zeros(num_walks)  # All walks start at x0
    num_times_counter = -1

    for k in range(N):
        if k in pos_hist_times:
            num_times_counter += 1
            store_hist = True  # Store histogram data for this k
        else:
            store_hist = False

        # Move all walks one step
        r = np.random.uniform(0, 1, size=num_walks)
        steps = np.where(r <= p, -1, 1)
        current_pos += steps
        position [k+1] = np.sum(current_pos)
        position2[k+1] = np.sum(current_pos**2)
        if store_hist:
            pos_hist[:,num_times_counter] = current_pos
    return position, position2, pos_hist, np.array(pos_hist_times)
This function runs significantly faster than the random_walks1D_vec1 function above, typically 1.7 times faster. The code is also more appropriate in a parallel computing context, since each vectorized statement can work with data of size num_walks over the compute units, repeated N times (compared with data of size N, repeated num_walks times, in random_walks1D_vec1).
The scalar code with switched loops, random_walks1D2, runs a bit slower than the original code in random_walks1D, so with the longest loop as the inner loop, the vectorized function random_walks1D2_vec1 is almost 60 times faster than the scalar counterpart, while the code random_walks1D_vec2 without loops is only around 18 times faster. Taking into account the very large arrays required by the latter function, we end up with random_walks1D2_vec1 as the preferred implementation.
During program development, it is highly recommended to carry out computations by hand for, e.g., N=4 and num_walks=3. Normally, this is done by executing the program with these parameters and checking with pen and paper that the computations make sense. The next step is to use this test for correctness in a formal test function.
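A sketch of such a manual inspection session (the printed labels are ours):

import numpy as np

np.random.seed(10)
pos, pos2, hist, hist_times = random_walks1D(
    x0=0, N=4, p=0.5, num_walks=3, num_times=4, random=np.random)
print('accumulated positions:', pos)
print('accumulated squares:', pos2)
print('histogram data:')
print(hist)
print('histogram times:', hist_times)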
First, we need to check that the simulation of multiple random walks in random_walks1D, random_walks1D_vec1, and random_walks1D_vec2 reproduces the single walk computed by random_walk1D and random_walk1D_vec, if the seed is the same. Second, we run three random walks (N=4) with the scalar and the two vectorized versions and check that the returned arrays are identical.
For this type of test to be successful, we must be sure that exactly the same set of random numbers is used in the three versions, a fact that requires the same random number generator and the same seed, of course, but also the same sequence of computations. This is not obviously the case with the three random_walks1D* functions we have presented. The critical issue in random_walks1D_vec1 is that the first random numbers are used for the first walk, the second set of random numbers is used for the second walk, and so on, to be compatible with how the random numbers are used in the function random_walks1D. For the function random_walks1D_vec2 the situation is a bit more complicated since we generate all the random numbers at once. However, the critical step now is the reshaping of the array returned from np.where: we must reshape as (num_walks, N) to ensure that the first N random numbers are used for the first walk, the next N numbers are used for the second walk, and so on.
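The claim about the ordering of the random numbers can be verified directly: with the same seed, drawing N numbers per walk in a loop yields exactly the same numbers as drawing N*num_walks numbers at once and reshaping to (num_walks, N). A minimal check (our own):

import numpy as np

N = 4; num_walks = 3
np.random.seed(10)
per_walk = np.array(
    [np.random.uniform(0, 1, size=N) for n in range(num_walks)])
np.random.seed(10)
in_one_batch = np.random.uniform(
    0, 1, size=N*num_walks).reshape(num_walks, N)
assert (per_walk == in_one_batch).all()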
We arrive at the test function below.
def test_random_walks1D():
    # For fixed seed, check that scalar and vectorized versions
    # produce the same result
    x0 = 0; N = 4; p = 0.5

    # First, check that random_walks1D for 1 walk reproduces
    # the walk in random_walk1D
    num_walks = 1
    np.random.seed(10)
    computed = random_walks1D(
        x0, N, p, num_walks, random=np.random)
    np.random.seed(10)
    expected = random_walk1D(
        x0, N, p, random=np.random)
    assert (computed[0] == expected).all()

    # Same for vectorized versions
    np.random.seed(10)
    computed = random_walks1D_vec1(x0, N, p, num_walks)
    np.random.seed(10)
    expected = random_walk1D_vec(x0, N, p)
    assert (computed[0] == expected).all()
    np.random.seed(10)
    computed = random_walks1D_vec2(x0, N, p, num_walks)
    np.random.seed(10)
    expected = random_walk1D_vec(x0, N, p)
    assert (computed[0] == expected).all()

    # Second, check multiple walks: scalar == vectorized
    num_walks = 3
    num_times = N
    np.random.seed(10)
    serial_computed = random_walks1D(
        x0, N, p, num_walks, num_times, random=np.random)
    np.random.seed(10)
    vectorized1_computed = random_walks1D_vec1(
        x0, N, p, num_walks, num_times)
    np.random.seed(10)
    vectorized2_computed = random_walks1D_vec2(
        x0, N, p, num_walks, num_times)
    # positions: [0, 1, 0, 1, 2]
    # Can test without tolerance since everything is +/- 1
    return_values = ['pos', 'pos2', 'pos_hist', 'pos_hist_times']
    for s, v, r in zip(serial_computed,
                       vectorized1_computed,
                       return_values):
        msg = '%s: %s (serial) vs %s (vectorized)' % (r, s, v)
        assert (s == v).all(), msg
    for s, v, r in zip(serial_computed,
                       vectorized2_computed,
                       return_values):
        msg = '%s: %s (serial) vs %s (vectorized)' % (r, s, v)
        assert (s == v).all(), msg
Such test functions are indispensable for further development of the code, as we can at any time check that the basic computations remain correct. This is particularly important in stochastic simulations, since without test functions and fixed seeds we always experience variations from run to run, and it can be very difficult to spot bugs through averaged statistical quantities.
Assuming now that the code works, we can just scale up the number of steps in each walk and the number of walks. The latter influences the accuracy of the statistical estimates. Figure shows the impact of the number of walks on the expectation, which should approach zero. Figure displays the corresponding estimate of the variance of the position, which should grow linearly with the number of steps. It does, seemingly very accurately, but notice that the scale on the $y$ axis is much larger than for the expectation, so irregularities due to the stochastic nature of the process become much less visible in the variance plots. The probability of finding a particle at a certain position at time (or step) 800 is shown in Figure. The dashed red line is the theoretical distribution (4) arising from solving the diffusion equation (2) instead. As always, we realize that one needs significantly more statistical samples to estimate a histogram accurately than the expectation or variance.
Estimated expected value for 1000 steps, using 100 walks (upper left), 10,000 (upper right), 100,000 (lower left), and 1,000,000 (lower right).
Estimated variance over 1000 steps, using 100 walks (upper left), 10,000 (upper right), 100,000 (lower left), and 1,000,000 (lower right).
Estimated probability distribution at step 800, using 100 walks (upper left), 10,000 (upper right), 100,000 (lower left), and 1,000,000 (lower right).
If we want to study (very) long time series of random walks, it can be convenient to plot the position in a terminal window, with the time axis pointing downwards. The module avplotter in SciTools has a class Plotter for plotting functions in the terminal window with the aid of ascii symbols only. Below is the code required to visualize a simple random walk, starting at the origin, and regarded as finished when the point $x=-1$ is reached. We use a spacing $\Delta x = 0.05$ (so $x=-1$ corresponds to $i=-20$).
def run_random_walk():
    from scitools.avplotter import Plotter
    import time, numpy as np
    p = Plotter(-1, 1, width=75)   # Horizontal axis: 75 chars wide
    dx = 0.05
    np.random.seed(10)

    x = 0
    while True:
        random_step = 1 if np.random.random() > 0.5 else -1
        x = x + dx*random_step
        if x < -1:
            break  # Destination reached!
        print(p.plot(0, x))

        # Allow Ctrl+c to abort the simulation
        try:
            time.sleep(0.1)  # Wait for interrupt
        except KeyboardInterrupt:
            print('Interrupted by Ctrl+c')
            break
Observe that we implement an infinite loop, but allow a smooth interrupt of the program by Ctrl+c through Python's KeyboardInterrupt exception. This is a useful recipe that can be applied on many occasions! The output typically looks like
* |
* |
* |
* |
* |
* |
* |
* |
* |
* |
* |
* |
* |
* |
* |
* |
* |
* |
Positions beyond the limits of the $x$ axis appear with their numeric value.
A long file contains the complete ascii plot corresponding to the function run_random_walk above.
The (dimensionless) position in a random walk, $\bar X_k$, can be expressed as a stochastic difference equation:

$$ \bar X_k = \bar X_{k-1} + s,\quad \bar X_0 = 0, \tag{5}$$

where $s$ is a Bernoulli variable, taking on the two values $s=-1$ and $s=1$ with equal probability:

$$ P(s=-1) = P(s=1) = \frac{1}{2} .$$

The $s$ variable in a step is independent of the $s$ variable in other steps.
The difference equation expresses essentially the sum of independent Bernoulli variables. Because of the central limit theorem, $\bar X_k$ will then be normally distributed with expectation $k\E{s}$ and variance $k\Var{s}$. The expectation and variance of a Bernoulli variable with values $r=0$ and $r=1$ are $p$ and $p(1-p)$, respectively. The variable $s=2r-1$ then has expectation $2\E{r}-1=2p-1=0$ and variance $2^2\Var{r}=4p(1-p)=1$ for $p=\frac{1}{2}$. The position $\bar X_k$ is then normally distributed with zero expectation and variance $k$, as we found in the section Statistical considerations.
The central limit theorem tells us that, as long as $k$ is not small, the distribution of $\bar X_k$ remains the same if we replace the Bernoulli variable $s$ by any other stochastic variable with the same expectation and variance. In particular, we may let $s$ be a standardized Gaussian variable (zero mean, unit variance).
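A quick numerical illustration of this replacement (a sketch with our own parameter choices): a walk with standardized Gaussian steps has, for moderate and large $k$, the same statistics as the Bernoulli walk.

import numpy as np

N = 1000; num_walks = 10000
gauss_steps = np.random.normal(0, 1, size=(num_walks, N))
x_end = np.sum(gauss_steps, axis=1)   # Positions after N steps
print(np.mean(x_end))   # Close to 0
print(np.var(x_end))    # Close to N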
Dividing (5) by $\Delta t$ gives

$$ \frac{\bar X_k - \bar X_{k-1}}{\Delta t} = \frac{s}{\Delta t} .$$

In the limit $\Delta t\rightarrow 0$, $s/\Delta t$ approaches a white noise stochastic process. With $\bar X(t)$ as the continuous process in the limit $\Delta t\rightarrow 0$ ($\bar X_k\rightarrow \bar X(t_k)$), we formally get the stochastic differential equation

$$ d\bar X = dW,$$

where $W(t)$ is a Wiener process. Then $\bar X$ is also a Wiener process. It follows from the stochastic ODE $d\bar X=dW$ that the probability distribution of $\bar X$ is given by the Fokker-Planck equation (2). In other words, the key results for random walk we found earlier can alternatively be derived via a stochastic ordinary differential equation and its related Fokker-Planck equation.
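A Wiener process is straightforward to simulate: its increments over a time step $\Delta t$ are independent and normally distributed with zero mean and variance $\Delta t$. A minimal sketch (the time interval and step size are our own example values):

import numpy as np

T = 10.0; dt = 0.01
N = int(round(T/dt))
dW = np.random.normal(0, np.sqrt(dt), size=N)   # Wiener increments
W = np.concatenate(([0], np.cumsum(dW)))        # W(0) = 0
X = W   # dX = dW with X(0) = 0 makes X the Wiener process itself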
The most obvious generalization of 1D random walk to two spatial dimensions is to allow movements to the north, east, south, and west, with equal probability $\frac{1}{4}$.
def random_walk2D(x0, N, p, random=random):
    """2D random walk with 1 particle and N moves: N, E, W, S."""
    # Store the position after step k in position[k]
    d = len(x0)
    position = np.zeros((N+1, d))
    position[0,:] = x0
    current_pos = np.array(x0, dtype=float)
    for k in range(N):
        r = random.uniform(0, 1)
        if r <= 0.25:
            current_pos += np.array([0, 1])   # Move north
        elif 0.25 < r <= 0.5:
            current_pos += np.array([1, 0])   # Move east
        elif 0.5 < r <= 0.75:
            current_pos += np.array([0, -1])  # Move south
        else:
            current_pos += np.array([-1, 0])  # Move west
        position[k+1,:] = current_pos
    return position
The left plot in Figure provides an example of 200 steps with this kind of walk. We may refer to this walk as a walk on a rectangular mesh, as we move from any spatial mesh point $(i,j)$ to one of its four neighbors in the rectangular directions: $(i+1,j)$, $(i-1,j)$, $(i,j+1)$, or $(i,j-1)$.
Random walks in 2D with 200 steps: rectangular mesh (left) and diagonal mesh (right).
From a programming point of view, especially when implementing a random walk in any number of dimensions, it is more natural to consider a walk in the diagonal directions NW, NE, SW, and SE. On a two-dimensional spatial mesh it means that we go from $(i,j)$ to either $(i+1,j+1)$, $(i-1,j+1)$, $(i+1,j-1)$, or $(i-1,j-1)$. With such a diagonal mesh (see right plot in Figure) we can draw a Bernoulli variable for the step in each spatial direction and trivially write code that works in any number of spatial dimensions:
def random_walkdD(x0, N, p, random=random):
    """Any-D (diagonal) random walk with 1 particle and N moves."""
    # Store the position after step k in position[k]
    d = len(x0)
    position = np.zeros((N+1, d))
    position[0,:] = x0
    current_pos = np.array(x0, dtype=float)
    for k in range(N):
        for i in range(d):
            r = random.uniform(0, 1)
            if r <= p:
                current_pos[i] -= 1
            else:
                current_pos[i] += 1
        position[k+1,:] = current_pos
    return position
A vectorized version is desired. We follow the ideas from the section Playing around with some code, but each step is now a vector in $d$ spatial dimensions. We therefore need to draw $Nd$ random numbers in r, compute steps in the various directions through np.where(r <= p, -1, 1) (each step being $-1$ or $1$), and then reshape this array to an $N\times d$ array of step vectors. Doing an np.cumsum summation along axis 0 will add the vectors, as this demo shows:
>>> a = np.arange(6).reshape(3,2)
>>> a
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> np.cumsum(a, axis=0)
array([[0, 1],
       [2, 4],
       [6, 9]])
With such summation of step vectors, we get all the positions to be filled in the position array:
def random_walkdD_vec(x0, N, p):
    """Vectorized version of random_walkdD."""
    d = len(x0)
    # Store the position after step k in position[k]
    position = np.zeros((N+1, d))
    position[0] = np.array(x0, dtype=float)
    r = np.random.uniform(0, 1, size=N*d)
    steps = np.where(r <= p, -1, 1).reshape(N, d)
    position[1:,:] = x0 + np.cumsum(steps, axis=0)
    return position
An ensemble of many random walks in $d$ dimensions follows the same pattern as in the 1D case. A scalar version reads

def random_walksdD(x0, N, p, num_walks=1, num_times=1,
                   random=random):
    """Simulate num_walks random walks from x0 with N steps."""
    d = len(x0)
    position  = np.zeros((N+1, d))   # Accumulated positions
    position2 = np.zeros((N+1, d))   # Accumulated positions**2
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times, d))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    for n in range(num_walks):
        num_times_counter = 0
        current_pos = np.array(x0, dtype=float)
        for k in range(N):
            if k in pos_hist_times:
                pos_hist[n,num_times_counter,:] = current_pos
                num_times_counter += 1
            # current_pos corresponds to step k+1
            for i in range(d):
                r = random.uniform(0, 1)
                if r <= p:
                    current_pos[i] -= 1
                else:
                    current_pos[i] += 1
            position [k+1,:] += current_pos
            position2[k+1,:] += current_pos**2
    return position, position2, pos_hist, np.array(pos_hist_times)
Significant speed-ups can be obtained by vectorization. We get rid of the loops in the previous function and arrive at the following vectorized code.
def random_walksdD_vec(x0, N, p, num_walks=1, num_times=1):
    """Vectorized version of random_walksdD; no loops."""
    d = len(x0)
    walks = np.zeros((num_walks, N+1, d))  # Positions of each walk
    walks[:,0,:] = x0
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times, d))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    r = np.random.uniform(0, 1, size=N*num_walks*d)
    steps = np.where(r <= p, -1, 1).reshape(num_walks, N, d)
    walks[:,1:,:] = x0 + np.cumsum(steps, axis=1)
    position  = np.sum(walks,    axis=0)
    position2 = np.sum(walks**2, axis=0)
    pos_hist[:,:,:] = walks[:,pos_hist_times,:]
    return position, position2, pos_hist, np.array(pos_hist_times)