
Interactive Spatial Autocorrelation

Dani Arribas-Bel (@darribas)

In [17]:

# If you are running this notebook interactively, please run 
# all of the other cells before attempting to play with the
# interactive version
%matplotlib inline

import seaborn as sns
from libpysal.weights import lat2W, lag_spatial
from pysal.model import spreg
import numpy as np
import matplotlib.pyplot as plt
from scipy.linalg import inv

def draw_map(lamb):
    '''
    Draw a map with a synthetic variable generated from 
    a spatially autoregressive DGP (data-generating process) of strength `lamb`
    
    ...
    
    Arguments
    ---------
    lamb     : float
               Strength of the SAR process that generates 
               the synthetic data to map
    
    Returns
    -------
    None
    '''
    s = 20
    n = s**2
    w = lat2W(s, s, rook=False)
    w.transform = 'R'
    e = np.random.random((n, 1))
    u = inv(np.eye(n) - lamb * w.full()[0])
    u = np.dot(u, e)
    ul = lag_spatial(w, u)
    u = (u - u.mean()) / np.std(u)
    ul = (ul - ul.mean()) / np.std(ul)
    gu = u.reshape((s, s))
    # Figure
    f = plt.figure(figsize=(15, 8))
    ax1 = f.add_subplot(121)
    ax1.matshow(gu, cmap=plt.cm.YlGn)
    ax1.set_frame_on(False)
    ax1.axes.get_xaxis().set_visible(False)
    ax1.axes.get_yaxis().set_visible(False)
    #---
    ax2 = f.add_subplot(122)
    ols = spreg.OLS(ul, u)
    tag = "b = %.3f"%ols.betas[1][0]
    # Pass x and y by keyword: recent seaborn releases no longer accept them positionally
    sc = sns.regplot(x=u.ravel(), y=ul.ravel(), ax=ax2)
    sc = sns.regplot(x=u.ravel(), y=ul.ravel(), ax=ax2,
                     scatter=False, label=tag)
    ax2.axvline(0, c='0.5', linewidth=0.5)
    ax2.axhline(0, c='0.5', linewidth=0.5)
    ax2.legend()
    plt.xlabel('u')
    plt.ylabel('W u')
    plt.suptitle(r"$\lambda$ = %.2f" % lamb)
    plt.show()
    return None

Spatial autocorrelation

"Everything is related to everything else, but near things are more related than distant things"

Waldo Tobler (1970)

Spatial autocorrelation is the degree to which similarity in values between observations in a dataset is related to similarity in their locations. It resembles the traditional correlation between two variables, which tells us how the values of one variable change as a function of those of the other. It is also analogous to its time-series counterpart, which relates the value of a variable at a given point in time to its values in previous periods. Spatial autocorrelation relates the value of the variable of interest at a given location to the values of the same variable at surrounding locations.

A key idea in this context is that of spatial randomness: a situation in which the location of an observation gives no information whatsoever about its value. In other words, a variable is spatially random if it follows no discernible pattern over space. Spatial autocorrelation can thus be formally defined as the "absence of spatial randomness", which leaves room for two main classes of autocorrelation, similar to the traditional case:

Positive spatial autocorrelation: similar values tend to group together in similar locations.
Negative spatial autocorrelation: similar values tend to be dispersed and further apart from each other.
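To make the two classes concrete, the following minimal sketch computes Moran's I, a common summary statistic of spatial autocorrelation, for two toy 4x4 grids. It is an illustration under assumptions that are not part of the notebook: the helper `morans_i_rook` is hypothetical, it uses binary rook contiguity (cells sharing an edge), and it relies only on NumPy.

```python
import numpy as np

def morans_i_rook(grid):
    """Moran's I on a regular lattice with binary rook contiguity (hypothetical helper)."""
    z = grid - grid.mean()
    # Cross-products of deviations for every pair of edge-sharing cells
    horizontal = (z[:, :-1] * z[:, 1:]).sum()
    vertical = (z[:-1, :] * z[1:, :]).sum()
    n_pairs = z[:, :-1].size + z[:-1, :].size
    return z.size * (horizontal + vertical) / (n_pairs * (z ** 2).sum())

# Similar values grouped together: left half 0, right half 1
halves = np.repeat([[0.0, 0.0, 1.0, 1.0]], 4, axis=0)
# Similar values kept apart: alternating 0/1 pattern
checkerboard = np.indices((4, 4)).sum(axis=0) % 2.0

print(f"clustered halves: {morans_i_rook(halves):.3f}")   # clustered halves: 0.667
print(f"checkerboard: {morans_i_rook(checkerboard):.3f}")  # checkerboard: -1.000
```

The grouped pattern yields a positive value and the checkerboard a negative one, matching the two definitions above.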

This notebook illustrates the concept using some of the widgets available in the Jupyter Notebook. Thanks to the interactivity they afford, it is possible to modify the degree of spatial autocorrelation embedded in a synthetic variable that is generated, mapped, and displayed in the so-called Moran Plot.

Interactive example

Executing the following code cell sets up an interactive exploration of spatial autocorrelation. Move the horizontal slider, which ranges from -0.9 to 0.9, to control how spatially correlated the resulting map will be: 0 implies complete spatial randomness; 0.9 is a very high degree of positive spatial autocorrelation; and -0.9 is a very high degree of negative spatial autocorrelation.
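Reading the `draw_map` code cell, the `lamb` slider appears to play the role of $\lambda$ in a spatial autoregressive (SAR) process. In the notation assumed here (it is not used elsewhere in the notebook), $W$ is the row-standardised weights matrix, $\varepsilon$ is the vector of random draws, and $I$ is the identity matrix:

```latex
u = \lambda W u + \varepsilon
\quad\Longrightarrow\quad
(I - \lambda W)\, u = \varepsilon
\quad\Longrightarrow\quad
u = (I - \lambda W)^{-1} \varepsilon
```

With $\lambda = 0$ the variable reduces to the noise itself, which is why 0 corresponds to spatial randomness. Because $W$ is row-standardised, $I - \lambda W$ is invertible for any $|\lambda| < 1$, a condition the slider range satisfies.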

Once you have selected a value for the `lamb` (for lambda, $\lambda$) parameter, the visualization displays two panels:

On the left-hand side, you can see a lattice map representing the location of each observation and the value assigned to each of them, encoded in a color gradient ranging from light yellow (lowest) to dark green (highest).
On the right-hand side, you can see the Moran Plot for the dataset generated. This is a graphical device that displays the value of a variable in each location (X axis) against its "spatial lag" (Y axis), that is, the average value in the "neighborhood" of each observation.
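The spatial lag and the slope of the Moran Plot can be sketched with NumPy alone. This is an illustration under stated assumptions, not the notebook's own implementation: `queen_weights` is a hypothetical helper that builds a dense, row-standardised queen contiguity matrix (cells sharing an edge or a corner), and the variable is a simple gradient rather than the simulated data.

```python
import numpy as np

def queen_weights(s):
    """Row-standardised queen contiguity matrix for an s x s lattice (hypothetical helper)."""
    rows, cols = np.indices((s, s)).reshape(2, -1)
    # Two cells are queen neighbours when they differ by at most one row and one column
    near = ((np.abs(rows[:, None] - rows[None, :]) <= 1)
            & (np.abs(cols[:, None] - cols[None, :]) <= 1))
    w = near.astype(float)
    np.fill_diagonal(w, 0.0)  # a cell is not its own neighbour
    return w / w.sum(axis=1, keepdims=True)  # each row sums to 1

s = 5
w = queen_weights(s)
values = np.arange(s * s, dtype=float)  # smooth gradient over the lattice

# Spatial lag: average value among each cell's neighbours.
# The top-left cell (value 0) has neighbours 1, 5 and 6.
print(f"{(w @ values)[0]:.2f}")  # 4.00

# Moran Plot: standardised values (X axis) against their spatial lag (Y axis)
z = (values - values.mean()) / values.std()
lag = w @ z
slope = (z @ lag) / (z @ z)  # least-squares slope, since z has zero mean
print(slope > 0)  # True: a gradient is positively autocorrelated
```

With row-standardised weights, this slope coincides with Moran's I, so a steeper positive line in the Moran Plot indicates stronger positive spatial autocorrelation.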

Go ahead and play with setting different degrees of spatial autocorrelation in the underlying data:

In [18]:

# IMPORTANT! To run this cell you first need to run the
# cell that defines the function `draw_map`, which is the
# first code cell at the top of this notebook
from ipywidgets import interact
_ = interact(draw_map, lamb=(-0.9, 0.9))


How does the value set for lamb affect the pattern in the map on the left?

How does the Moran Plot change as the map becomes more or less clustered?

More detail

If you want to learn more about the process behind the interactive above, you can check out:

This companion notebook
Block F of the GDS Course

License

NOTE: this notebook is a modified version of this other one and is part of a GitHub repository at the following address:

https://github.com/darribas/int_sp_auto

A static HTML version can be found here.

Binder notebook for interactive spatial autocorrelation, by Dani Arribas-Bel, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.