Manifold Learning with Isomap

$\newcommand{\dotp}[2]{\langle #1, #2 \rangle}$ $\newcommand{\enscond}[2]{\lbrace #1, #2 \rbrace}$ $\newcommand{\pd}[2]{ \frac{ \partial #1}{\partial #2} }$ $\newcommand{\umin}[1]{\underset{#1}{\min}\;}$ $\newcommand{\umax}[1]{\underset{#1}{\max}\;}$ $\newcommand{\umin}[1]{\underset{#1}{\min}\;}$ $\newcommand{\uargmin}[1]{\underset{#1}{argmin}\;}$ $\newcommand{\norm}[1]{\|#1\|}$ $\newcommand{\abs}[1]{\left|#1\right|}$ $\newcommand{\choice}[1]{ \left\{ \begin{array}{l} #1 \end{array} \right. }$ $\newcommand{\pa}[1]{\left(#1\right)}$ $\newcommand{\diag}[1]{{diag}\left( #1 \right)}$ $\newcommand{\qandq}{\quad\text{and}\quad}$ $\newcommand{\qwhereq}{\quad\text{where}\quad}$ $\newcommand{\qifq}{ \quad \text{if} \quad }$ $\newcommand{\qarrq}{ \quad \Longrightarrow \quad }$ $\newcommand{\ZZ}{\mathbb{Z}}$ $\newcommand{\CC}{\mathbb{C}}$ $\newcommand{\RR}{\mathbb{R}}$ $\newcommand{\EE}{\mathbb{E}}$ $\newcommand{\Zz}{\mathcal{Z}}$ $\newcommand{\Ww}{\mathcal{W}}$ $\newcommand{\Vv}{\mathcal{V}}$ $\newcommand{\Nn}{\mathcal{N}}$ $\newcommand{\NN}{\mathcal{N}}$ $\newcommand{\Hh}{\mathcal{H}}$ $\newcommand{\Bb}{\mathcal{B}}$ $\newcommand{\Ee}{\mathcal{E}}$ $\newcommand{\Cc}{\mathcal{C}}$ $\newcommand{\Gg}{\mathcal{G}}$ $\newcommand{\Ss}{\mathcal{S}}$ $\newcommand{\Pp}{\mathcal{P}}$ $\newcommand{\Ff}{\mathcal{F}}$ $\newcommand{\Xx}{\mathcal{X}}$ $\newcommand{\Mm}{\mathcal{M}}$ $\newcommand{\Ii}{\mathcal{I}}$ $\newcommand{\Dd}{\mathcal{D}}$ $\newcommand{\Ll}{\mathcal{L}}$ $\newcommand{\Tt}{\mathcal{T}}$ $\newcommand{\si}{\sigma}$ $\newcommand{\al}{\alpha}$ $\newcommand{\la}{\lambda}$ $\newcommand{\ga}{\gamma}$ $\newcommand{\Ga}{\Gamma}$ $\newcommand{\La}{\Lambda}$ $\newcommand{\si}{\sigma}$ $\newcommand{\Si}{\Sigma}$ $\newcommand{\be}{\beta}$ $\newcommand{\de}{\delta}$ $\newcommand{\De}{\Delta}$ $\newcommand{\phi}{\varphi}$ $\newcommand{\th}{\theta}$ $\newcommand{\om}{\omega}$ $\newcommand{\Om}{\Omega}$

This tour explores the Isomap algorithm for manifold learning.

The Isomap algorithm is introduced in:

A Global Geometric Framework for Nonlinear Dimensionality Reduction, J. B. Tenenbaum, V. de Silva and J. C. Langford, Science 290 (5500): 2319-2323, 22 December 2000.


Graph Approximation of Manifolds

Manifold learning consists in approximating the parameterization of a manifold represented as a point cloud.

First we load a simple 3D point cloud, the famous Swiss Roll.

Number of points.

In [3]:
n = 1000;

Random positions on the parametric domain.

In [4]:
x = rand(2,n);

Mapping on the manifold.

In [5]:
v = 3*pi/2 * (.1 + 2*x(1,:));
X  = zeros(3,n);
X(2,:) = 20 * x(2,:);
X(1,:) = - cos( v ) .* v;
X(3,:) = sin( v ) .* v;
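For readers following along in Python, the same construction can be sketched with NumPy (illustrative only; the tour itself is in MATLAB, and variable names mirror the code above):

```python
import numpy as np

n = 1000
rng = np.random.default_rng(0)
x = rng.random((2, n))                 # random samples on the 2D parameter domain

# map the parameters onto the Swiss roll in 3D
v = 3 * np.pi / 2 * (0.1 + 2 * x[0, :])
X = np.zeros((3, n))
X[1, :] = 20 * x[1, :]
X[0, :] = -np.cos(v) * v
X[2, :] = np.sin(v) * v
```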

Parameter for display.

In [6]:
ms = 50;
lw = 1.5;
v1 = -15; v2 = 20;

Display the point cloud.

In [7]:
scatter3(X(1,:),X(2,:),X(3,:),ms,v, 'filled'); 
colormap jet(256);
view(v1,v2); axis('equal'); axis('off');

Compute the pairwise Euclidean distance matrix.

In [8]:
D1 = repmat(sum(X.^2,1),n,1);
D1 = sqrt(D1 + D1' - 2*X'*X);
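This cell uses the identity |x-y|^2 = |x|^2 + |y|^2 - 2<x,y>. A NumPy sketch of the same computation, checked against a direct evaluation of the pairwise norms (illustrative, on a small random cloud):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 50))       # 50 points in R^3, one per column
n = X.shape[1]

# squared norms replicated over rows, as in the MATLAB repmat call
D1 = np.tile((X**2).sum(axis=0), (n, 1))
D1 = np.sqrt(np.maximum(D1 + D1.T - 2 * X.T @ X, 0))  # clip tiny negative roundoff

# direct computation of the same pairwise distances
direct = np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)
```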

Number of NN for the graph.

In [9]:
k = 6;

Compute the k-NN connectivity.

In [10]:
[DNN,NN] = sort(D1);
NN = NN(2:k+1,:);
DNN = DNN(2:k+1,:);
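The same k-NN extraction can be sketched in NumPy: sort each column of the distance matrix and drop the first row, which is the point itself at distance zero (illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 40))
n = X.shape[1]
k = 6

# pairwise distances, then sort each column: row 0 is the point itself
D1 = np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)
idx = np.argsort(D1, axis=0)
NN = idx[1:k + 1, :]                                  # indices of the k nearest neighbors
DNN = np.take_along_axis(D1, idx, axis=0)[1:k + 1, :] # their distances, ascending
```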

Adjacency matrix, and weighted adjacency.

In [11]:
B = repmat(1:n, [k 1]);
A = sparse(B(:), NN(:), ones(k*n,1));

Weighted adjacency (the metric on the graph).

In [12]:
W = sparse(B(:), NN(:), DNN(:));
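The two `sparse` constructions above have a direct SciPy analogue: each edge goes from vertex |B(i,j)| to its neighbor |NN(i,j)|, with weight 1 for the adjacency and weight |DNN(i,j)| for the metric. An illustrative sketch (assumes SciPy is available):

```python
import numpy as np
from scipy.sparse import coo_matrix

n, k = 40, 6
rng = np.random.default_rng(3)
X = rng.standard_normal((3, n))

# k-NN graph as before
D1 = np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)
idx = np.argsort(D1, axis=0)
NN = idx[1:k + 1, :]
DNN = np.take_along_axis(D1, idx, axis=0)[1:k + 1, :]

B = np.tile(np.arange(n), (k, 1))      # source vertex of each edge
A = coo_matrix((np.ones(k * n), (B.ravel(), NN.ravel())), shape=(n, n))
W = coo_matrix((DNN.ravel(), (B.ravel(), NN.ravel())), shape=(n, n))
```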

Display the graph.

In [13]:
options.lw = lw; options.ps = 0.01;
clf; hold on;
scatter3(X(1,:),X(2,:),X(3,:),ms,v, 'filled'); 
plot_graph(A, X, options);
colormap jet(256);
view(v1,v2); axis('equal'); axis('off');

Floyd Algorithm to Compute Pairwise Geodesic Distances

A simple algorithm to compute the geodesic distances between all pairs of points on a graph is Floyd's iterative algorithm. Its complexity is |O(n^3)|, where |n| is the number of points. It is thus quite slow for sparse graphs, for which running Dijkstra's algorithm from every vertex costs only |O(n^2*log(n))|.

Floyd's algorithm iterates the following update rule, for |k=1,...,n|:

|D(i,j) <- min(D(i,j), D(i,k)+D(k,j))|,

with the initialization |D(i,j)=W(i,j)| if |W(i,j)>0|, and |D(i,j)=Inf| if |W(i,j)=0|.
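The update rule can be vectorized over |(i,j)| for each |k|. A NumPy sketch (illustrative; not the tour's MATLAB solution), tested on a tiny chain graph 0 - 1 - 2 with unit edges:

```python
import numpy as np

def floyd(W):
    """All-pairs geodesic distances on a weighted graph.

    W[i, j] > 0 is the edge length between i and j; 0 means no edge.
    """
    n = W.shape[0]
    D = np.where(W > 0, W, np.inf)   # initialization: Inf where no edge
    np.fill_diagonal(D, 0)           # zero distance from a point to itself
    for k in range(n):
        # D(i,j) <- min(D(i,j), D(i,k) + D(k,j)), vectorized over (i,j)
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D

W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
D = floyd(W)
```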

Make the graph symmetric.

In [14]:
D = full(W);
D = (D+D')/2;

Initialize the matrix.

In [15]:
D(D==0) = Inf;

Add the connection between a point and itself (zero distance on the diagonal).

In [16]:
D(1:n+1:end) = 0; % set the diagonal to zero (it was set to Inf above)

Exercise 1

Implement Floyd's algorithm to compute the full distance matrix |D|, where |D(i,j)| is the geodesic distance between points |i| and |j| on the graph.

In [18]:
%% Insert your code here.

Find index of vertices that are not connected to the main manifold.

In [19]:
Iremove = find(D(:,1)==Inf);

Remove the remaining Inf values (disconnected components).

In [20]:
D(D==Inf) = 0;

Isomap with Classical Multidimensional Scaling

Isomap performs the dimensionality reduction by applying multidimensional scaling.

Please refer to the tour on Bending Invariants for details on Classical MDS (strain minimization).

Exercise 2

Perform classical MDS to compute the 2D flattening: compute the centered kernel, diagonalize it, and plot the graph.

In [22]:
%% Insert your code here.
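One possible NumPy sketch of classical MDS (double-centered kernel, then diagonalization; illustrative, not the tour's solution), verified on four corners of a unit square whose planar layout MDS recovers exactly:

```python
import numpy as np

def classical_mds(D, dim=2):
    """Classical MDS (strain minimization) from a pairwise distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering operator
    K = -0.5 * J @ (D**2) @ J               # centered kernel
    vals, vecs = np.linalg.eigh(K)          # eigenvalues in ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]  # largest first
    # embed with the top eigenvectors, scaled by sqrt of the eigenvalues
    return (vecs[:, :dim] * np.sqrt(np.maximum(vals[:dim], 0))).T

pts = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
Y = classical_mds(D, dim=2)
```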

Redress the points using the two leading eigenvectors of the covariance matrix (PCA correction).

In [23]:
[U,L] = eig(Xstrain*Xstrain' / n);
Xstrain1 = U'*Xstrain;

Remove problematic points.

In [24]:
Xstrain1(:,Iremove) = Inf;

Display the final result of the dimensionality reduction.

In [25]:
clf; hold on;
scatter(Xstrain1(1,:),Xstrain1(2,:),ms,v, 'filled'); 
plot_graph(A, Xstrain1, options);
colormap jet(256);
axis('equal'); axis('off');

For comparison, the ideal locations on the parameter domain.

In [26]:
Y = cat(1, v, X(2,:));
Y(1,:) = rescale(Y(1,:), min(Xstrain(1,:)), max(Xstrain(1,:)));
Y(2,:) = rescale(Y(2,:), min(Xstrain(2,:)), max(Xstrain(2,:)));
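Here |rescale(x,a,b)| is a toolbox helper; I assume it affinely maps the values of |x| so that their minimum goes to |a| and their maximum to |b|. A minimal Python equivalent under that assumption:

```python
import numpy as np

def rescale(x, a=0.0, b=1.0):
    """Affinely map x so its min goes to a and its max to b
    (assumed behavior of the toolbox helper of the same name)."""
    m, M = x.min(), x.max()
    return a + (b - a) * (x - m) / (M - m)

y = rescale(np.array([2.0, 4.0, 6.0]), -1.0, 1.0)
```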

Display the ideal graph on the reduced parameter domain.

In [27]:
clf; hold on;
scatter(Y(1,:),Y(2,:),ms,v, 'filled'); 
plot_graph(A,  Y, options);
colormap jet(256);
axis('equal'); axis('off');