Epidemiology 202

Network Models

Bruno Gonçalves
www.data4sci.com
@bgoncalves, @data4sci

In [1]:

from collections import Counter
from pprint import pprint

import pandas as pd
import numpy as np

import matplotlib
import matplotlib.pyplot as plt 

import networkx as nx

import scipy
from scipy.optimize import curve_fit

import tqdm as tq
from tqdm.notebook import tqdm

import watermark

import epidemik
from epidemik import EpiModel, NetworkEpiModel

%load_ext watermark
%matplotlib inline

We start by print out the versions of the libraries we're using for future reference

In [2]:

%watermark -n -v -m -g -iv

Python implementation: CPython
Python version       : 3.11.7
IPython version      : 8.12.3

Compiler    : Clang 14.0.6 
OS          : Darwin
Release     : 23.4.0
Machine     : arm64
Processor   : arm
CPU cores   : 16
Architecture: 64bit

Git hash: 6f3f47aa98b5141fcddbadd07afeedaa91c4cfb7

watermark : 2.4.3
pandas    : 2.1.4
matplotlib: 3.8.0
tqdm      : 4.66.2
networkx  : 3.1
scipy     : 1.11.4
epidemik  : 0.0.6
numpy     : 1.26.4

Load default figure style

In [3]:

plt.style.use('./d4sci.mplstyle')
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

A simple Model¶

Let us start with a network where eveyrone is connected to everybody else

In [4]:

np.random.seed(1234)

In [5]:

N = 300
beta = 0.05
G_full = nx.erdos_renyi_graph(N, p=1.)

And an SI model

In [6]:

SI_full = NetworkEpiModel(G_full)
SI_full.add_interaction('S', 'I', 'I', beta)
SI_full

Out[6]:

Epidemic Model with 2 compartments and 1 transitions:

S + I = I 0.050000

In [7]:

print("kavg=", SI_full.kavg_)
print(SI_full.transitions.edges(data=True))

kavg= 299.0
[('S', 'I', {'agent': 'I', 'rate': 0.05})]

We perform 100 runs

In [8]:

def simulate_runs(model, Nruns):
    values = []

    for i in tqdm(range(Nruns), total=Nruns):
        model.simulate(100, seeds={30: 'I'})
        values.append(model.I)

    values =  pd.DataFrame(values).T
    values.columns = np.arange(values.shape[1])
    
    return values

In [9]:

Nruns = 100
values_full = simulate_runs(SI_full, Nruns)

  0%|          | 0/100 [00:00<?, ?it/s]

And plot them. Each run has it's own stochastic path, despite the strong connectivity constraint

In [10]:

fig, ax = plt.subplots(1)

values_full.median(axis=1).plot(ax=ax, color=colors[0])
values_full.plot(ax=ax, color=colors[0], lw=.5)
ax.get_legend().remove()
ax.set_xlim(0, 20)
ax.legend(['Median Run', 'Individual Runs'])
ax.set_xlabel('Time')
ax.set_ylabel('Population')

Out[10]:

Text(0, 0.5, 'Population')

In [11]:

kmean = SI_full.kavg_
print(kmean)

299.0

We can obtain a specific average degree by setting: $$p=\frac{\langle k\rangle}{N}$$ We choose this specific value so that we match the average degree of the BA model below

In [12]:

G_small = nx.erdos_renyi_graph(N, p=0.0198)

In [13]:

SI_small = NetworkEpiModel(G_small)
SI_small.add_interaction('S', 'I', 'I', beta)
SI_small

Out[13]:

Epidemic Model with 2 compartments and 1 transitions:

S + I = I 0.050000

In [14]:

SI_small.kavg_

Out[14]:

5.926666666666667

In [15]:

values_small = simulate_runs(SI_small, Nruns)

  0%|          | 0/100 [00:00<?, ?it/s]

In [16]:

fig, ax = plt.subplots(1)

values_small.median(axis=1).plot(ax=ax, color=colors[1])
values_small.plot(ax=ax, color=colors[1], lw=.3)

ax.get_legend().remove()
ax.legend(['Median Run', 'Individual Runs'])
ax.set_xlabel('Time')
ax.set_ylabel('Population')

Out[16]:

Text(0, 0.5, 'Population')

In [17]:

time = np.arange(0, 100)
yt = lambda time, beta: N/(1+(N-1)*np.exp(-beta*N*time))

In [18]:

coef, covariance = curve_fit(yt, time, values_small.median(axis=1).values)

/opt/anaconda3/lib/python3.11/site-packages/scipy/optimize/_minpack_py.py:1010: OptimizeWarning: Covariance of the parameters could not be estimated
  warnings.warn('Covariance of the parameters could not be estimated',

In [19]:

coef

Out[19]:

array([1.])

In [20]:

fig, ax = plt.subplots(1)

ax.plot(time, yt(time, *coef), color=colors[5])
values_small.median(axis=1).plot(ax=ax, color=colors[1])
values_small.plot(ax=ax, color=colors[1], lw=.3)

ax.get_legend().remove()
ax.legend(['Fit', 'Median Run', 'Individual Runs'])
ax.set_xlabel('Time')
ax.set_ylabel('Population')

Out[20]:

Text(0, 0.5, 'Population')

We can compare the two scenarios above by plotting the median number of infected node as a function of time. Not surprisingly, the network with the smallest average connectivity is slower

In [21]:

fig, ax = plt.subplots(1)
values_small.median(axis=1).plot(color=colors[1], ax=ax, label='ER Network')
values_full.median(axis=1).plot(color=colors[0], ax=ax, label='Full Network')
ax.legend()
ax.set_xlabel('Time')
ax.set_ylabel('Population')

Out[21]:

Text(0, 0.5, 'Population')

In [22]:

plt.plot(dict(G_small.degree()).values())

ax = plt.gca()
ax.set_xlabel('Node')
ax.set_ylabel('Degree')

Out[22]:

Text(0, 0.5, 'Degree')

But what about two networks with exactly the same average degree, but different topologies? Let's run a Barabasi albert model

In [23]:

BA = nx.barabasi_albert_graph(N, m=3)

In [24]:

SI_BA = NetworkEpiModel(BA)
SI_BA.add_interaction('S', 'I', 'I', beta)
SI_BA

Out[24]:

Epidemic Model with 2 compartments and 1 transitions:

S + I = I 0.050000

The average degree is exactly the same

In [25]:

SI_BA.kavg_

Out[25]:

5.94

In [26]:

values_BA = simulate_runs(SI_BA, Nruns)

  0%|          | 0/100 [00:00<?, ?it/s]

In [27]:

fig, ax = plt.subplots(1)

values_BA.median(axis=1).plot(color=colors[2], ax=ax)
values_BA.plot(ax=ax, color=colors[2], lw=.1)

ax.get_legend().remove()
ax.legend(['Median Run', 'Individual Runs'])

ax.set_xlabel('Time')
ax.set_ylabel('Population')

Out[27]:

Text(0, 0.5, 'Population')

In [28]:

coef, covariance = curve_fit(yt, time, values_BA.median(axis=1).values)

/opt/anaconda3/lib/python3.11/site-packages/scipy/optimize/_minpack_py.py:1010: OptimizeWarning: Covariance of the parameters could not be estimated
  warnings.warn('Covariance of the parameters could not be estimated',

In [29]:

coef

Out[29]:

array([1.])

In [30]:

fig, ax = plt.subplots(1)

ax.plot(time, yt(time, *coef), color=colors[5])
values_BA.median(axis=1).plot(ax=ax, color=colors[2])
values_BA.plot(ax=ax, color=colors[2], lw=.3)

ax.get_legend().remove()
ax.legend(['Fit', 'Median Run', 'Individual Runs'])
ax.set_xlabel('Time')
ax.set_ylabel('Population')

Out[30]:

Text(0, 0.5, 'Population')

In [31]:

plt.plot(dict(BA.degree()).values())
ax = plt.gca()
ax.set_xlabel('Node')
ax.set_ylabel('Degree')

Out[31]:

Text(0, 0.5, 'Degree')

Topology comparison¶

In [32]:

fig, ax = plt.subplots(1)
values_small.median(axis=1).plot(color=colors[1], ax=ax, label='ER Network')
values_BA.median(axis=1).plot(color=colors[2], ax=ax, label='BA Network')
ax.legend()

ax.set_xlabel('Time')
ax.set_ylabel('Population')

Out[32]:

Text(0, 0.5, 'Population')

In [33]:

fig, ax = plt.subplots(1)
values_full.median(axis=1).plot(color=colors[0], ax=ax, label='Full Network')
values_small.median(axis=1).plot(color=colors[1], ax=ax, label='ER Network')
values_BA.median(axis=1).plot(color=colors[2], ax=ax, label='BA Network')
ax.legend()

ax.set_xlabel('Time')
ax.set_ylabel('Population')

Out[33]:

Text(0, 0.5, 'Population')