Recall the network from last lecture:
We identified edge $(B,D)$ as a good candidate to remove to create two clusters based on betweenness.
Today we'll discuss an alternative clustering approach based on graph cuts.
What makes a good cut?
The cut-set $\{(H,C)\}$ is smaller than the set $\{(B,D), (C,G)\}$.
Why might we not like cut $\{(H,C)\}$?
A good cut is:
- small: few edges cross between the two sides, and
- balanced: neither side of the partition is too small.
How to quantify these?
$Vol(S)$: the volume of a set of nodes $S$ is the number of edges with at least one endpoint in $S$.
$Vol(\{A,B,C\})=6$
Let $Cut(S,T)$ be the number of edges in the cut-set of the cut $C=(S,T)$ (the cut size).
The normalized cut value for $(S,T)$ is:
$$NCV(S,T) = \frac{Cut(S,T)}{Vol(S)} + \frac{Cut(S,T)}{Vol(T)}$$
Example:
Consider the cut where $S=\{H\}$ and $T=\{A,B,C,D,E,F,G\}$.
Example:
Consider the cut where $S=\{A,B,C,H\}$ and $T=\{D,E,F,G\}$.
So, if one part of the cut has a small volume, its term in the sum is large, and the normalized cut value is large.
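These quantities are easy to compute with networkx. Below is a minimal sketch (the helper names `vol`, `cut`, and `ncv` are ours, and the example graph is the 7-node graph used later in this lecture, not last lecture's network with $H$):

import networkx as nx

def vol(graph, S):
    # Volume: number of edges with at least one endpoint in S.
    return sum(1 for u, v in graph.edges() if u in S or v in S)

def cut(graph, S, T):
    # Cut size: number of edges with one endpoint in S and the other in T.
    return sum(1 for u, v in graph.edges()
               if (u in S and v in T) or (u in T and v in S))

def ncv(graph, S, T):
    # Normalized cut value of the cut (S, T).
    c = cut(graph, S, T)
    return c / vol(graph, S) + c / vol(graph, T)

g = nx.Graph([('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'D'),
              ('D', 'E'), ('D', 'F'), ('D', 'G'), ('E', 'F'), ('G', 'F')])
print(ncv(g, {'A', 'B', 'C'}, {'D', 'E', 'F', 'G'}))  # 1/4 + 1/6 ~= 0.4167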
Problem: How do we identify the cut with the smallest normalized cut value?
Like most good things, it is NP-hard.
Linear algebra to the rescue!
Below, we describe a way to approximate the optimal cuts using eigenvalue decomposition.
**Representing Graphs with Matrices**
%matplotlib inline
import matplotlib.pyplot as plt
import networkx as nx
graph = nx.Graph()
graph.add_edges_from([('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'D'), ('D', 'E'), ('D', 'F'), ('D', 'G'), ('E', 'F'), ('G', 'F')])
nx.draw(graph, with_labels=True)
# Print the adjacency matrix.
def adjacency_matrix(graph):
    # Adjacency matrix with rows/columns in sorted node order.
    return nx.adjacency_matrix(graph, sorted(graph.nodes()))
adjacency = adjacency_matrix(graph)
print('Adjacency matrix:\n', adjacency.todense())
Adjacency matrix:
 [[0 1 1 0 0 0 0]
 [1 0 1 1 0 0 0]
 [1 1 0 0 0 0 0]
 [0 1 0 0 1 1 1]
 [0 0 0 1 0 1 0]
 [0 0 0 1 1 0 1]
 [0 0 0 1 0 1 0]]
**What data structure should we use to store the adjacency matrix?**
E.g., a list of (row, column) tuples for the nonzero entries: [(0,1), (0,2), (1,2), ...]
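In fact, `nx.adjacency_matrix` returns a scipy sparse matrix, and converting it to COO (coordinate) format exposes exactly that list of index pairs. A small sketch (the variable names are illustrative):

import networkx as nx
g = nx.Graph([('A', 'B'), ('A', 'C'), ('B', 'C')])
a = nx.adjacency_matrix(g, sorted(g.nodes()))
coo = a.tocoo()  # coordinate format: parallel arrays of row/column indices
print(list(zip(coo.row, coo.col)))  # one (i, j) pair per nonzero entry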
# Print the degree matrix.
import numpy as np
def degree_matrix(graph):
    # Diagonal matrix of node degrees, in sorted node order.
    degrees = dict(graph.degree())
    return np.diag([degrees[n] for n in sorted(graph.nodes())])
degree = degree_matrix(graph)
print('Degree matrix:\n', degree)
Degree matrix:
 [[2 0 0 0 0 0 0]
 [0 3 0 0 0 0 0]
 [0 0 2 0 0 0 0]
 [0 0 0 4 0 0 0]
 [0 0 0 0 2 0 0]
 [0 0 0 0 0 3 0]
 [0 0 0 0 0 0 2]]
Laplacian matrix $L = D - A$
def laplacian_matrix(graph):
    return degree_matrix(graph) - adjacency_matrix(graph)
laplacian = laplacian_matrix(graph)
print('Laplacian matrix:\n', laplacian)
Laplacian matrix:
 [[ 2 -1 -1  0  0  0  0]
 [-1  3 -1 -1  0  0  0]
 [-1 -1  2  0  0  0  0]
 [ 0 -1  0  4 -1 -1 -1]
 [ 0  0  0 -1  2 -1  0]
 [ 0  0  0 -1 -1  3 -1]
 [ 0  0  0 -1  0 -1  2]]
Properties of the Laplacian matrix:
- symmetric (for an undirected graph),
- every row and column sums to zero,
- positive semi-definite: all eigenvalues are real and non-negative,
- the all-ones vector is an eigenvector with eigenvalue 0.
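A quick sanity check of the first two properties on the Laplacian computed above:

print(laplacian.sum(axis=1))             # every row sums to 0
print((laplacian == laplacian.T).all())  # symmetric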
**Recall that a matrix is a type of linear transformation.**
For an arbitrary vector $v$, what does $Lv$ mean?
# For a smaller graph.
graph2 = nx.Graph()
graph2.add_edges_from([('A', 'B'), ('B', 'C')])
nx.draw(graph2, with_labels=True)
laplacian2 = laplacian_matrix(graph2)
print(laplacian2)
[[ 1 -1  0]
 [-1  2 -1]
 [ 0 -1  1]]
Let $v$ be a map from nodes to real values, i.e., a vector
$v = (f(A), f(B), f(C))$
Then, $Lv$ is:
$$ \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} f(A)\\ f(B)\\ f(C)\\ \end{pmatrix} = \begin{pmatrix} f(A) - f(B)\\ -f(A) + 2f(B) - f(C)\\ -f(B) + f(C) \end{pmatrix}$$
More generally:
$$Lv[i] = \deg(i)\,f(i) - \sum_{j \in N(i)} f(j) = \deg(i)\left(f(i) - \text{average of } f \text{ over the neighbors of } i\right)$$
print(laplacian2.dot([1,2,3]))
[[-1 0 1]]
print(laplacian2.dot([3,2,1]))
[[ 1 0 -1]]
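We can verify this identity directly by computing $\deg(i)\,(f(i) - \text{neighbor average})$ for each node; a small sketch using `graph2` from above:

import numpy as np
f = {'A': 1, 'B': 2, 'C': 3}
for node in sorted(graph2.nodes()):
    nbr_avg = np.mean([f[n] for n in graph2.neighbors(node)])
    print(node, graph2.degree(node) * (f[node] - nbr_avg))
# A -1.0, B 0.0, C 1.0 -- matching laplacian2.dot([1, 2, 3])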
**Review of Linear Algebra**
A vector $\mathbf{v}$ of dimension $n$ is an eigenvector of a square $(n×n)$ matrix $A$ if and only if $$ A \mathbf{v} = \lambda \mathbf{v} $$
where $\lambda$ is a scalar, called an eigenvalue.
In other words, applying $A$ to $\mathbf{v}$ simply scales $\mathbf{v}$ by $\lambda$.
Assume $A$ has $n$ linearly independent eigenvectors, $\{\mathbf{v}_1 \ldots \mathbf{v}_n\}$.
Let $V$ be a square matrix where column $i$ is $\mathbf{v}_i$.
Then $A$ can be factorized as $A = V \Lambda V^{-1}$, where
$\Lambda$ is a diagonal matrix of eigenvalues: $\Lambda[i, i]=\lambda_i$.
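We can check this factorization numerically on a small symmetric matrix (an illustrative sketch, separate from the lecture code):

import numpy as np
A = np.array([[2., 1.], [1., 2.]])
w, V = np.linalg.eigh(A)  # eigenvalues (ascending), eigenvectors (columns)
print(np.allclose(A, V.dot(np.diag(w)).dot(np.linalg.inv(V))))  # True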
What happens when we compute the eigenvalue decomposition of the Laplacian $L$?
Since every row of $L$ sums to zero, the all-ones vector is always an eigenvector with eigenvalue 0:
print(laplacian2.dot([1,1,1]))
[[0 0 0]]
What about the second eigenvector?
# Library to compute eigenvectors:
from numpy.linalg import eigh
help(eigh)
Help on function eigh in module numpy.linalg:

eigh(a, UPLO='L')
    Return the eigenvalues and eigenvectors of a Hermitian or symmetric matrix.

    Returns two objects, a 1-D array containing the eigenvalues of `a`, and a
    2-D square array or matrix (depending on the input type) of the
    corresponding eigenvectors (in columns).

    Returns
    -------
    w : (..., M) ndarray
        The eigenvalues in ascending order, each repeated according to its
        multiplicity.
    v : {(..., M, M) ndarray, (..., M, M) matrix}
        The column ``v[:, i]`` is the normalized eigenvector corresponding to
        the eigenvalue ``w[i]``.
    [...]
eig_vals, eig_vectors = eigh(laplacian)
print('eigen values\n%s' % str(eig_vals))
print('eigen vectors\n%s' % str(eig_vectors))
eigen values
[ -1.43917928e-15   3.98320868e-01   2.00000000e+00   3.00000000e+00
   3.33987689e+00   4.00000000e+00   5.26180225e+00]
eigen vectors
[[ -3.77964473e-01   4.92886500e-01   7.61771655e-17  -7.07106781e-01  -3.20722630e-01  -8.40213177e-17  -1.06502348e-01]
 [ -3.77964473e-01   2.96559521e-01   1.92527950e-16   1.88737914e-15   7.50451469e-01   6.04607112e-16   4.53891948e-01]
 [ -3.77964473e-01   4.92886500e-01  -1.92527950e-16   7.07106781e-01  -3.20722630e-01  -4.36564477e-16  -1.06502348e-01]
 [ -3.77964473e-01  -2.14220282e-01   4.01736195e-17   9.99200722e-16   3.86384151e-01   2.68521841e-16  -8.13609130e-01]
 [ -3.77964473e-01  -3.56037413e-01   7.07106781e-01  -3.88578059e-16  -1.65130120e-01  -4.08248290e-01   1.90907293e-01]
 [ -3.77964473e-01  -3.56037413e-01   7.02923822e-17  -3.33066907e-16  -1.65130120e-01   8.16496581e-01   1.90907293e-01]
 [ -3.77964473e-01  -3.56037413e-01  -7.07106781e-01  -8.04911693e-16  -1.65130120e-01  -4.08248290e-01   1.90907293e-01]]
print('first eigen value=%g' % eig_vals[0])
print('first (normalized) eigen vector=\n%s' % eig_vectors[:,0])
first eigen value=-1.43918e-15
first (normalized) eigen vector=
[[-0.37796447]
 [-0.37796447]
 [-0.37796447]
 [-0.37796447]
 [-0.37796447]
 [-0.37796447]
 [-0.37796447]]
The eigenvectors have been normalized to unit length: $ v = \frac{v'}{\|v'\|} $
v = np.array([1, 1, 1, 1, 1, 1, 1])
v / (np.sqrt(np.sum(v**2)))
array([ 0.37796447, 0.37796447, 0.37796447, 0.37796447, 0.37796447, 0.37796447, 0.37796447])
print('second eigen value=%.2f' % eig_vals[1])
print('second eigen vector=\n%s' % eig_vectors[:,1])
nx.draw(graph, with_labels=True)
second eigen value=0.40
second eigen vector=
[[ 0.4928865 ]
 [ 0.29655952]
 [ 0.4928865 ]
 [-0.21422028]
 [-0.35603741]
 [-0.35603741]
 [-0.35603741]]
What do you notice about the second eigenvector?
(Note that the first vector component corresponds to node 'A', the second to 'B', etc.)
We can think of an eigenvector as a cluster assignment for the nodes.
E.g., nodes with negative values are in cluster 1, nodes with positive values are in cluster 2.
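For example, a sketch of that sign-based split using the second eigenvector computed above (`np.asarray(...).flatten()` turns the matrix column into a 1-D array):

nodes = sorted(graph.nodes())
second = np.asarray(eig_vectors[:, 1]).flatten()
cluster1 = [n for n, x in zip(nodes, second) if x < 0]
cluster2 = [n for n, x in zip(nodes, second) if x >= 0]
print(cluster1, cluster2)  # ['D', 'E', 'F', 'G'] ['A', 'B', 'C']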
From eigenvalues to clustering

Method 1: split at zero; nodes with negative entries in the second eigenvector go in one cluster, the rest in the other (as in the sketch above).

Method 2: sort the nodes by their second-eigenvector values and sweep over the possible split points, keeping the split with the smallest normalized cut value.
Partitioning into more than 2 components?
print('third eigen value=%.2f' % eig_vals[2])
print('third eigen vector=\n%s' % eig_vectors[:,2])
third eigen value=2.00
third eigen vector=
[[  7.61771655e-17]
 [  1.92527950e-16]
 [ -1.92527950e-16]
 [  4.01736195e-17]
 [  7.07106781e-01]
 [  7.02923822e-17]
 [ -7.07106781e-01]]
$\mathbf{v}_1: \{A,B,C\}, \{D,E,F,G\}$
$\mathbf{v}_2: \{A,B,D,E,F\}, \{C,G\}$
print('fourth eigen value=%.2f' % eig_vals[3])
print('fourth eigen vector=\n%s' % eig_vectors[:,3])
fourth eigen value=3.00
fourth eigen vector=
[[ -7.07106781e-01]
 [  1.88737914e-15]
 [  7.07106781e-01]
 [  9.99200722e-16]
 [ -3.88578059e-16]
 [ -3.33066907e-16]
 [ -8.04911693e-16]]
$\mathbf{v}_1: \{A,B,C\}, \{D,E,F,G\}$
$\mathbf{v}_2: \{A,B,D,E,F\}, \{C,G\}$
$\mathbf{v}_3: \{A,E,F,G\}, \{B,C,D\}$
Method 3: use several eigenvectors at once. Represent each node as a point whose coordinates are its entries in the 2nd, 3rd, ... eigenvectors, then cluster the resulting points (see the plot and the k-means sketch below).
# get first row of eigenvector matrix.
eig_vectors[0,:]
matrix([[ -3.77964473e-01, 4.92886500e-01, 7.61771655e-17, -7.07106781e-01, -3.20722630e-01, -8.40213177e-17, -1.06502348e-01]])
# get second row of eigenvector matrix.
eig_vectors[1,:]
matrix([[ -3.77964473e-01, 2.96559521e-01, 1.92527950e-16, 1.88737914e-15, 7.50451469e-01, 6.04607112e-16, 4.53891948e-01]])
# Plot each node using its 2nd and 3rd eigenvector values as coordinates.
def plot_by_eigenvectors(eig_vectors):
    plt.figure()
    for i, name in enumerate(['A', 'B', 'C', 'D', 'E', 'F', 'G']):
        v1 = eig_vectors[i, 1]  # 2nd eigenvector value
        v2 = eig_vectors[i, 2]  # 3rd eigenvector value
        plt.annotate(name, xy=(v1, v2))
        print(name, v1, v2)
    plt.xlim(-.5, .5)
    plt.ylim(-.8, .8)
    plt.xlabel('second eigenvector value')
    plt.ylabel('third eigenvector value')
    plt.show()
plot_by_eigenvectors(eig_vectors)
A 0.492886499631 7.61771654839e-17
B 0.296559521215 1.92527950444e-16
C 0.492886499631 -1.92527950444e-16
D -0.214220281556 4.01736194763e-17
E -0.356037412973 0.707106781187
F -0.356037412973 7.02923822129e-17
G -0.356037412973 -0.707106781187
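To turn this embedding into clusters (completing Method 3), one option is to run k-means on the 2-D points. A sketch, assuming scikit-learn is available (it is not used elsewhere in this lecture):

from sklearn.cluster import KMeans  # assumption: scikit-learn is installed
points = np.asarray(eig_vectors[:, 1:3])  # rows: (2nd, 3rd) eigenvector values per node
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
print(dict(zip(sorted(graph.nodes()), labels)))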