NTDS'19 tutorial 4: Manipulating graphs with NetworkX¶

Effrosyni Simou, EPFL LTS4. Adapted from NTDS'17 NetworkX demo.

In this session we will get introduced to NetworkX, explore some of the most common network models, look at their basic properties and compare them.

1 Creating graphs using network models¶

There are many libraries that deal with creation and manipulation of graph data. We will use NetworkX to create basic network models, as they are already implemented in the library. The full documentation of NetworkX 2.3 (installed in your ntds_2019 environment) can be found online.

In [ ]:

%matplotlib inline

import collections

import numpy as np
from scipy import spatial
from matplotlib import pyplot as plt
import networkx as nx

Create an Erdős-Rényi graph with $N=100$ vertices, and a probability of connecting each pair of vertices equal to $p=0.15$.

In [ ]:

N = 100  # number of nodes
p = 0.15  # probability of connection
er = nx.erdos_renyi_graph(N, p)

You can retrieve the adjacency matrix of the graph, from the Graph object er as follows:

In [ ]:

er_adj = nx.adjacency_matrix(er, range(N))
er_adj = er_adj.todense()

You can now visualise the adjacency matrix:

In [ ]:

plt.spy(er_adj);

2 Plotting graphs¶

With NetworkX and Matplotlib we can also plot a graph. For example, we can plot the Erdős-Rényi graph that we created before as follows:

In [ ]:

nx.draw(er)

2.1 Exercise¶

Create a Barabasi-Albert graph and a Watts-Strogatz graph and plot them.

In [ ]:

# Create a Barabasi-Albert graph.
ba =  # your code here

In [ ]:

# Create a Watts-Strogartz graph.
ws =  # your code here

3 Modifying graphs¶

It's easy to add or remove edges, but also nodes. If we add an edge between nodes that don't yet exist, they will be automatically created.

In [ ]:

er.add_node(100)

In [ ]:

er.nodes()

Similarly, you can add and remove a collection of nodes or edges, and add and remove one node or edge:

Adding nodes with:
- G.add_node: One node at a time
- G.add_nodes_from: A container of nodes
Adding edges with:
- G.add_edge: One edge at a time
- G.add_edges_from: A container of edges
Removing nodes with:
- G.remove_node: One node at a time
- G.remove_nodes_from: A container of nodes
Removing edges with:
- G.remove_edge: One edge at a time
- G.remove_edges_from: A container of edges

You can get the number of edges with G.size().

Add an edge between two non-existant vertices. Remove all nodes up to node 50. Draw the graph after each change.

In [ ]:

er.add_edge(101, 102)
nx.draw(er)

In [ ]:

er.remove_nodes_from(range(50))
nx.draw(er)
er.nodes()

In [ ]:

er.size()

4 Degree distribution¶

G.degree() returns a DegreeView object with pairs of nodes and their degree. If we specify a node, G.degree(node) will return the degree of that node.

Create an Erdős-Rényi network and plot a histogram of node degrees.

In [ ]:

N = 100  # number of nodes
p = 0.15  # probability of connection
er = nx.erdos_renyi_graph(N, p)

In [ ]:

d = er.degree()

In [ ]:

print(d)

In [ ]:

# Erdős-Rényi node degree histogram.
degree_sequence = sorted([d for n, d in er.degree()], reverse=True)  # degree sequence
degreeCount = collections.Counter(degree_sequence)
deg, cnt = zip(*degreeCount.items())

fig, ax = plt.subplots()
ax.bar(deg, cnt)
ax.set_title("Degree Histogram")
ax.set_ylabel("Count")
ax.set_xlabel("Degree");

4.1 Fitting a distribution¶

Try to fit a Poisson distribution.

In [ ]:

# Poisson distribution.
def poisson(mu, k):
    return np.exp(-mu) * mu**k * (np.math.factorial(k)**-1)

In [ ]:

# Erdős-Rényi node degree histogram.
degree_sequence = sorted([d for n, d in er.degree()], reverse=True)  # degree sequence
degreeCount = collections.Counter(degree_sequence)
deg, cnt = zip(*degreeCount.items())

fig, ax = plt.subplots()
ax.bar(deg, cnt, label='Histogram')

# Poisson distribution
mu = 2 * er.size() / 100
k = np.linspace(1, 25, 25)
deg = [100 * poisson(mu, i) for i in k]
ax.plot(k, deg, color='r', label='Poisson distribution')

ax.legend()
ax.set_title("Degree Histogram")
ax.set_ylabel("Count")
ax.set_xlabel("Degree");

4.2 Exercise¶

Do it for the Barabasi-Albert and Watts-Strogatz networks.

In [ ]:

# your code here

5 Creating a graph that approximates a manifold¶

We can represent data laying on a manifold (sampled from a manifold) as a graph of connected samples.

Generate 100 two-dimensional data points from a uniform random distribution in $[0, 1]$. These will be the coordinates of your nodes.

In [ ]:

N = 100
nodes_coords = np.random.rand(N, 2)

Two nodes are connected if the Euclidean distance between them is smaller than a threshold. In that case, the weight of the edge is set to $$w(i,j) = \exp \left( -{\frac{\operatorname{dist}^2(i,j)}{2\sigma^2}} \right),$$ for some kernel width $\sigma$.

In [ ]:

sigma = 0.9
threshold = 0.2

In [ ]:

def gaussian_kernel(dist, sigma):
    return np.exp(-dist**2 / (2*sigma**2))

In [ ]:

dist = spatial.distance.pdist(nodes_coords, metric='euclidean')
dist = spatial.distance.squareform(dist)
adj = gaussian_kernel(dist, sigma)
adj -= np.identity(N)
adj[dist > threshold] = 0

In [ ]:

plt.spy(adj);

Plot the graph with NetworkX.

Hints:

nx.from_numpy_array(adj) creates a graph object from an adjacency matrix (in numpy form)
nx.draw(G,pos) will draw vertices at coordinates specified in pos. Variable pos is a dictionary assigning a pair of coordinates to each node.

In [ ]:

g = nx.from_numpy_matrix(adj)
plt.spy(nx.adjacency_matrix(g).todense());

In [ ]:

pos = dict(zip(range(N), nodes_coords))
nx.draw(g, pos)

Plot a degree distribution of this graph. What can you say about the distribution?

In [ ]:

# node degree histogram
degree_sequence = sorted([d for n, d in g.degree()], reverse=True)  # degree sequence
degreeCount = collections.Counter(degree_sequence)
deg, cnt = zip(*degreeCount.items())

fig, ax = plt.subplots()
ax.bar(deg, cnt)
ax.set_title("Degree Histogram")
ax.set_ylabel("Count")
ax.set_xlabel("Degree");