This notebook presents some tensor decomposition techniques that are at the foundation of our anisotropic PDE discretizations on Cartesian grids. The general objective is to write a given symmetric positive definite matrix $D$ in the form $$ D = \sum_{0 \leq i < I} \lambda_i e_i e_i^T, $$ with non-negative weights $\lambda_i$ and integer offsets $e_i$. From this decomposition, various numerical schemes can be designed, for both first and second order, linear and non-linear PDEs.
The techniques used for constructing the above decomposition are non-trivial, and are related to classical yet subtle tools of discrete geometry. This notebook is meant to illustrate some of their properties.
This notebook is limited to dimensions $d\in \{2,3\}$. Tensor decomposition in dimension $d \in \{4,5\}$ requires another set of techniques (and, in practice, a call to a C++ library), which are discussed in II Tensor decomposition, dimensions 4 and 5.
References
The tensor decomposition presented in this notebook is a central ingredient of the following paper:
Fehrenbach, J., & Mirebeau, J.-M. (2014). Sparse non-negative stencils for anisotropic diffusion. Journal of Mathematical Imaging and Vision, 49(1), 123–147. https://doi.org/10.1007/s10851-013-0446-3
Summary of volume Algorithmic tools, this series of notebooks.
Main summary of the Adaptive Grid Discretizations book of notebooks, including the other volumes.
Acknowledgement. Some of the experiments presented in these notebooks are part of ongoing research with Ludovic Métivier and Da Chen.
Copyright Jean-Marie Mirebeau, Centre Borelli, ENS Paris-Saclay, CNRS, University Paris-Saclay
import sys; sys.path.insert(0,"..") # Allow import of agd from parent directory (useless if conda package installed)
#from Miscellaneous import TocTools; print(TocTools.displayTOC('TensorSelling','Algo'))
from agd import LinearParallel as lp
from agd import FiniteDifferences as fd
from agd import Selling
from agd.Plotting import savefig; #savefig.dirName = 'Figures/TensorSelling'
from agd import AutomaticDifferentiation as ad
The library imported as lp is a set of routines meant to facilitate the simultaneous manipulation of numerous small vectors and matrices. It is based on numpy, and implements only a small number of linear algebra tools, as illustrated below.
import numpy as np; xp = np; allclose = np.allclose
import matplotlib.pyplot as plt
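For concreteness, here is a minimal sketch of the lp conventions, using only routines that appear later in this notebook: vectors and matrices are stored along the first axes, and operations broadcast over the trailing "field" dimensions.
# Minimal illustration (sketch) of the lp conventions.
A = np.random.standard_normal((2,2,5))  # a field of five 2x2 matrices
B = lp.dot_AA(lp.transpose(A), A)       # the five products A^T A, computed at once
assert B.shape == (2,2,5)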
Uncomment the following line to run the notebook on the GPU. (This is only for compatibility testing, since no intensive computation is involved in this specific notebook.)
#xp,plt,allclose=map(ad.cupy_friendly,(xp,plt,allclose))
In order to illustrate our tensor decomposition facilities, we will rely on randomly generated symmetric positive definite tensors. They are built as $$ M = A^T A + \varepsilon \mathrm{Id} $$ where $A$ has independent standard Gaussian entries.
def MakeRandomTensor(dim,shape = tuple(),relax=0.01):
identity = fd.as_field(np.eye(dim),shape,depth=2)
A = np.random.standard_normal( (dim,dim) + shape )
M = lp.dot_AA(lp.transpose(A),A) + relax*identity
return xp.asarray(M) # Convert to GPU array if needed
# For reproducibility, we fix the random seed
np.random.seed(42)
# Generate a 2x2 random psd tensor
D2 = MakeRandomTensor(2)
The Selling.Decomposition routine applies to a symmetric positive definite matrix $D$ of size $d \times d$, with $d\leq 3$. It returns non-negative coefficients $\lambda_i \geq 0$ and integer offsets $e_i \in Z^d$.
A discussion on the inner workings of this decomposition is presented in the next section.
coefs,offsets = Selling.Decomposition(D2)
The matrix can be reconstructed by the formula $$ D = \sum_{0 \leq i < I} \lambda_i e_i e_i^T. $$
def Reconstruct(coefs,offsets):
return (coefs*lp.outer_self(offsets)).sum(2)
assert allclose(D2,Reconstruct(coefs,offsets))
There are $I = d(d+1)/2$ coefficients and offsets (i.e. $I=3$ if $d=2$, and $I=6$ if $d=3$). Note that this is more than in the similar-looking eigen-decomposition of a matrix, which uses only $d$ coefficients and unit vectors. However, Selling's offsets have integer entries, hence are suitable for the construction of finite difference schemes.
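For comparison, here is the eigen-decomposition of the same matrix (a quick illustration): it has only $d=2$ terms, but its vectors generally have irrational entries.
# Eigen-decomposition, for comparison: only d terms, but the eigenvectors
# have non-integer entries, unsuitable as finite difference offsets.
eigvals, eigvecs = np.linalg.eigh(D2)   # D2 = sum_i eigvals[i] v_i v_i^T
assert allclose(D2, Reconstruct(eigvals, eigvecs))
print("Eigenvectors (columns) : \n", eigvecs)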
print("Selling decomposition of matrix : \n", D2)
print("Coefficients : ", coefs)
print("Offsets : \n", offsets)
Selling decomposition of matrix :
 [[0.67622539 0.91777115]
 [0.91777115 2.34873696]]
Coefficients :  [0.43467964 0.94787431 0.24154575]
Offsets :
 [[-1  0  1]
 [-1 -1  2]]
plt.axis('equal'); plt.title("Decomposition offsets for the matrix D2")
plt.quiver(*np.zeros(offsets.shape),*offsets,angles='xy',scale_units='xy',scale=1);
plt.scatter(*offsets); plt.scatter(*(-offsets));
As previously, we generate a 3x3 random psd tensor, decompose it and validate the absence of reconstruction error.
D3 = MakeRandomTensor(3)
coefs,offsets = Selling.Decomposition(D3)
assert allclose(D3,Reconstruct(coefs,offsets))
print("Selling decomposition of matrix : \n", D3)
print("Coefficients : ", coefs)
print("Offsets : \n", offsets)
Selling decomposition of matrix :
 [[ 0.86853982 -0.08963958 -0.06552819]
 [-0.08963958  0.50213052 -0.73715916]
 [-0.06552819 -0.73715916  2.85683026]]
Coefficients :  [0.15516777 0.08963958 0.62373247 0.23502864 1.49444606 0.1774623 ]
Offsets :
 [[-1  1 -1  0  0  0]
 [ 0 -1  0  1  0 -1]
 [ 1  1  0 -2  1  1]]
Selling's tensor decomposition algorithm requires the matrix to be positive definite. It involves a loop, whose number of iterations grows to infinity as the matrix degenerates.
A maximum number of iterations is set by default, to a reasonably large value, so that non-convergence typically indicates an error on the user's side, namely a matrix that is not positive definite.
In the following cell, we define a matrix which is positive definite but strongly anisotropic: $$ \begin{pmatrix} 1 & \varepsilon\\ \varepsilon & 2 \varepsilon^2 \end{pmatrix} $$
# Construct some matrix with extremely large condition number
eps = 1/200.
D2_bad = np.array([[1,eps],[eps,2*eps**2]])
In addition, the anisotropy direction of this matrix is almost but not exactly aligned with the coordinate axes, which is a worst case scenario for Selling's algorithm. As a result, Selling's algorithm fails to converge within the prescribed iteration limit.
#Selling's decomposition does not terminate within the iteration limit
try:
coefs,offsets = Selling.Decomposition(D2_bad)
except ValueError as e:
print(e.args)
("Selling's algorithm did not terminate in iterMax2=100 iterations",)
The iteration limit of Selling's algorithm may be increased, so as to ensure correct termination.
Selling.iterMax2 *= 2
coefs,offsets = Selling.Decomposition(D2_bad)
However, the resulting offsets are unlikely to be of use in any PDE discretization, since they are far too large.
Note also that basis reduction techniques more efficient than Selling's algorithm are available for tensors with extremely large condition numbers, e.g. Lagrange's algorithm, sketched below.
print("offsets : \n", offsets)
offsets :
 [[-99 100  -1]
 [ -1   1   0]]
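For illustration, here is a minimal sketch of Lagrange's two dimensional basis reduction, applied to the scalar product induced by $D$. This sketch is not part of the API shown in this notebook; note that its iteration count grows only logarithmically with the condition number.
# Sketch of Lagrange's (Gauss's) basis reduction w.r.t. the scalar product <x,D y>.
def lagrange_reduce(D):
    b0, b1 = np.array([1,0]), np.array([0,1])
    while True:
        if lp.dot_VAV(b0,D,b0) > lp.dot_VAV(b1,D,b1): b0, b1 = b1, b0
        k = int(np.round(lp.dot_VAV(b0,D,b1) / lp.dot_VAV(b0,D,b0)))
        if k == 0: return b0, b1   # b1 cannot be shortened using b0 : reduced
        b1 = b1 - k*b0

print("D-reduced basis of Z^2 :", *lagrange_reduce(D2_bad))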
Our implementation of Selling's algorithm automatically vectorizes over the trailing dimensions of its input, beyond the first two which index the matrix entries.
# Generate a 10x10 field of random 2x2 spd tensors
D2_field = MakeRandomTensor(2,(10,10))
coefs, offsets = Selling.Decomposition(D2_field)
The shape of the matrix field is $(d,d, n_1,\cdots,n_k)$, whereas the coefficients are $(I,n_1,\cdots,n_k)$ and the offsets $(d,I,n_1,\cdots,n_k)$ where $I = d (d+1)/2$.
print(f"{D2_field.shape}, {coefs.shape}, {offsets.shape}")
(2, 2, 10, 10), (3, 10, 10), (2, 3, 10, 10)
assert allclose(D2_field,Reconstruct(coefs,offsets))
This section illustrates the main mathematical concept underlying Selling's decomposition, known as obtuse superbases of the lattice $Z^d$.
A superbase of $Z^d$ is a special kind of coordinate system $(b_0,\cdots,b_d)$, spanning the lattice of integer points and with some redundancy. More precisely, one requires $$ \begin{aligned} b_0+\cdots+b_d &= 0,\\ \det(b_1,\cdots,b_d) &= \pm 1. \end{aligned} $$ Such a superbase is said to be $D$-obtuse, where $D$ is a given $d\times d$ symmetric positive definite matrix, if $$ <b_i,D b_j> \leq 0 $$ for all distinct $i,j\in \{0,\cdots,d\}$.
Selling's decomposition of a matrix $D \in S_d^{++}$ is based on Selling's formula, applied to a $D$-obtuse superbase, which is itself produced by Selling's algorithm.
The following method applies Selling's algorithm to a given positive definite tensor $D \in S_2^{++}$, producing a suitable superbase of $Z^2$.
sb = Selling.ObtuseSuperbase(D2)
sb
array([[-1., -1.,  2.],
       [ 1., -0., -1.]])
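The defining properties of a superbase can be checked numerically (a quick sketch): the columns sum to zero, and any two of them have determinant $\pm 1$.
# Sanity check (sketch) that sb is a superbase of Z^2.
assert allclose(sb.sum(axis=1), 0)             # b_0 + b_1 + b_2 = 0
assert allclose(np.abs(lp.det(sb[:,1:])), 1)   # det(b_1,b_2) = ±1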
The generated superbase $(b_0,\cdots,b_d)$ is designed to be $D$-obtuse; in other words, it obeys $$ <b_i,D b_j> \leq 0 $$ for all distinct $i,j\in \{0,\cdots,d\}$.
[lp.dot_VAV(sb[:,i],D2,sb[:,np.mod(i+1,3)]) for i in range(3)]
[-0.2415457537440081, -0.4346796384446704, -0.9478743070679034]
Selling's formula yields the desired decomposition of the given symmetric positive definite matrix $D$ $$ D = -\sum_{0\leq i<j \leq d} <b_i,D b_j> v_{ij} v_{ij}^T, $$ where one defines $v_{ij} = \pm b_k^\perp$ whenever $\{i,j,k\} = \{0,1,2\}$.
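As a sanity check, Selling's formula can be implemented directly from the superbase (a short sketch, using the lp.perp rotation helper that also appears later in this notebook).
# Direct implementation of Selling's formula, dimension two (sketch).
coefs_manual = np.array([-lp.dot_VAV(sb[:,j],D2,sb[:,k])
                         for j,k in [(1,2),(0,2),(0,1)]])  # weight of b_i^perp
offsets_manual = lp.perp(sb).astype(int)                   # offsets b_i^perp
assert allclose(D2, Reconstruct(coefs_manual, offsets_manual))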
Note that the decomposition obtained from this superbase coincides with the one returned earlier by Selling.Decomposition, which constructs an obtuse superbase internally when none is provided:
coefs,offsets = Selling.Decomposition(D2,sb=sb)
print("Coefficients : ", coefs)
print("Offsets : \n", offsets)
Coefficients :  [0.43467964 0.94787431 0.24154575]
Offsets :
 [[-1  0  1]
 [-1 -1  2]]
It is possible to produce a tensor decomposition from an arbitrary superbase, but the coefficients are non-negative only if the superbase is obtuse.
sb = Selling.CanonicalSuperbase(D2) # Only takes the dimension and array type from the argument
coefs,offsets = Selling.Decomposition(D2,sb=sb)
print("Coefficients : ", coefs)
print("Offsets : \n", offsets)
Coefficients :  [-0.91777115  3.26650811  1.59399654]
Offsets :
 [[ 1  0 -1]
 [-1  1  0]]
assert np.allclose(D2,Reconstruct(coefs,offsets))
We follow similar steps in dimension $d=3$. Selling's algorithm, applied to the given matrix $D$, yields a specific superbase of $Z^3$.
sb = Selling.ObtuseSuperbase(D3)
sb
array([[ 1., -0.,  0., -1.],
       [ 0.,  1.,  1., -2.],
       [ 0.,  1.,  0., -1.]])
The superbase produced by Selling's algorithm is $D$-obtuse, similarly to the two dimensional case.
[lp.dot_VAV(sb[:,i],D3,sb[:,j]) for i in range(4) for j in range(i)]
[-0.1551677694190755, -0.08963958172347006, -0.2350286383841237, -0.6237324732734056, -1.4944460568297004, -0.1774622969114411]
Selling's formula yields the desired decomposition of the given positive definite matrix $D \in S_3^{++}$: $$ D = -\sum_{0\leq i<j \leq d} <b_i,D b_j> v_{ij} v_{ij}^T, $$ where one defines $v_{ij} = \pm b_k \wedge b_l$ whenever $\{i,j,k,l\} = \{0,1,2,3\}$.
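The same sanity check in dimension three (a short sketch): the weight $-<b_i,D b_j>$ is paired with the cross product of the two complementary superbase elements. Signs are irrelevant for the reconstruction, since the offsets appear through rank one outer products.
# Direct implementation of Selling's formula, dimension three (sketch).
pairs = [(i,j) for i in range(4) for j in range(i)]
compl = [[k for k in range(4) if k not in p] for p in pairs]  # complementary pairs
coefs_manual = np.array([-lp.dot_VAV(sb[:,i],D3,sb[:,j]) for i,j in pairs])
offsets_manual = np.stack([lp.cross(sb[:,k],sb[:,l]) for k,l in compl], axis=-1)
assert allclose(D3, Reconstruct(coefs_manual, offsets_manual.astype(int)))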
Note that the offsets coincide, up to sign and reordering, with the cross products of pairs of superbase elements, as checked below:
coefs,offsets = Selling.Decomposition(D3)
print("Coefficients : ", coefs)
print("Offsets : \n", offsets)
Coefficients :  [0.15516777 0.08963958 0.62373247 0.23502864 1.49444606 0.1774623 ]
Offsets :
 [[-1  1 -1  0  0  0]
 [ 0 -1  0  1  0 -1]
 [ 1  1  0 -2  1  1]]
#Comparing with the cross products of the superbase elements.
ad.array(
[lp.cross(sb[:,i],sb[:,j]) for i in range(4) for j in range(i)]
).astype(int).T
array([[ 0,  0,  1,  0, -1,  1],
       [ 1,  0,  0, -1,  1,  0],
       [-1, -1,  0,  2, -1, -1]])
Selling's decomposition of tensors, presented in this notebook, has a number of qualities that make it particularly suitable for PDE discretizations. More precisely, its offsets are bounded in terms of the anisotropy ratio, its coefficients depend continuously (indeed, piecewise linearly) on the decomposed tensor, and its offsets span the lattice $Z^d$; these properties are illustrated below.
We denote the anisotropy ratio of a symmetric positive definite matrix $D$ by $$ \mu(D) := \sqrt{\|D\| \|D^{-1}\|}. $$ In other words, this is the square root of the condition number of $D$.
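A small helper computing this quantity from the eigenvalues (a hypothetical illustration, not part of the agd API shown in this notebook):
# Hypothetical helper: the anisotropy ratio from the eigenvalues of D.
def anisotropy_ratio(D):
    """mu(D) = sqrt(lambda_max(D)/lambda_min(D)), for D in S_d^{++}."""
    λ = np.linalg.eigvalsh(np.moveaxis(D,(0,1),(-2,-1)))  # ascending eigenvalues
    return np.sqrt(λ[...,-1]/λ[...,0])

print("mu(D2) =", anisotropy_ratio(D2))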
We illustrate these properties in dimension $d=2$, by considering the following family of matrices $D(\theta,\mu) \in S_2^{++}$: $$ D(\theta,\mu) := \mu^2 e(\theta) e(\theta)^T + e(\theta)^\perp (e(\theta)^\perp)^T, $$ whose anisotropy ratio equals $\mu\geq 1$ (i.e. whose condition number is $\mu^2$), and whose anisotropy direction is aligned with the vector $e(\theta):=(\cos \theta,\sin \theta)$, $\theta \in R$.
θ = xp.linspace(0,np.pi/2,100)
μ = 10
D2_rotating = (
μ**2 * lp.outer_self(xp.array([np.cos(θ),np.sin(θ)]))
+ lp.outer_self(xp.array([-np.sin(θ),np.cos(θ)]))
)
coefs,offsets = Selling.Decomposition(D2_rotating)
The offsets $(e_i)_{1 \leq i \leq I}$ involved in Selling's decomposition of a tensor $D$ obey $$ \|e_i\| \leq C \mu(D), $$ where $C$ is an absolute constant.
offsets_norms = xp.linalg.norm(offsets,axis=0)
print("Sqrt of condition number : ", μ)
print("Largest offset norm : ", np.max(offsets_norms))
Sqrt of condition number :  10
Largest offset norm :  5.0990195135927845
plt.plot(θ,np.max(offsets_norms,axis=0));
We can rewrite Selling's decomposition in a manner independent of any specific ordering of the offsets, as follows: $$ D = \sum_{e \in Z^d} \lambda^e(D) e e^T. $$ One can then prove that the coefficient $\lambda^e(D)$ of Selling's tensor decomposition, for a given offset $e \in Z^d$, depends continuously on the parameter $D$, as illustrated below.
decomp = Selling.GatherByOffset(θ,coefs,offsets)
fig = plt.figure(figsize=(8,4))
for offset,(angle,coef) in decomp.items():
plt.plot(angle,coef)
plt.legend(decomp.keys(),ncol=3);
savefig(fig,"Coefs_Sel2_rot.pdf")
The lattice $Z^d$ is spanned, by integer linear combinations, by the offsets $(e_i)_{1 \leq i \leq I}$ appearing in the decomposition of an arbitrary positive definite tensor $D$. In other words, for each $x \in Z^d$ there exist coefficients $k_1,\cdots, k_I \in Z$ such that $$ x = k_1 e_1 + \cdots+ k_I e_I. $$ In addition, one may select this decomposition so that the weight $\lambda_i$ of $e_i$ is positive whenever $k_i \neq 0$.
This property guarantees that the graph underlying e.g. the discretization of an anisotropic Laplacian is locally connected, hence that spurious modes such as chessboard artifacts do not appear.
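For concreteness, here is a sketch of such a discretization, assuming the second order finite difference helper fd.Diff2(u,offsets,gridscale) used elsewhere in this series of notebooks.
# Sketch: anisotropic Laplacian u -> Tr(D grad^2 u), discretized via Selling.
# Each term uses the centered second difference along the integer offset e_i :
# <e_i, grad^2 u(x) e_i> ~ (u(x+h e_i) - 2 u(x) + u(x-h e_i))/h^2.
def anisotropic_laplacian(u, D, gridscale):
    coefs, offsets = Selling.Decomposition(D)
    return (coefs * fd.Diff2(u, offsets, gridscale)).sum(axis=0)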
From a mathematical standpoint, the spanning property can be deduced from the construction of the decomposition in terms of obtuse superbases. Numerically, we can check it by finding a subset of the offsets whose determinant equals $\pm 1$.
lp.det(offsets[:,0:2])
array([ 1, 1, -1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, -1, -1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, -1, 1, -1])
coefs,offsets = Selling.Decomposition(D3)
lp.det(offsets[:,0:3])
-1
Selling's coefficients depend in a piecewise linear manner on the decomposed tensor, since by Selling's formula they read $-<b_i,D b_j>$, where $b_i,b_j$ are elements of a $D$-obtuse superbase $(b_0,\cdots,b_d)$. This property is exploited in some (somewhat sophisticated) numerical schemes, e.g. for the Pucci and Monge-Ampère PDEs.
def Interpolate(a,b,T=np.linspace(0,1,100)):
return T, np.moveaxis(ad.array([(1-t)*a + t*b for t in T]),0,-1)
np.random.seed(42)
t,D2t = Interpolate(MakeRandomTensor(2),MakeRandomTensor(2))
decomp = Selling.GatherByOffset(t,*Selling.Decomposition(D2t))
fig = plt.figure(figsize=(8,4))
for offset,(time,coef) in decomp.items(): plt.plot(time,coef)
plt.legend(decomp.keys());
np.random.seed(1)
t,D3t = Interpolate(MakeRandomTensor(3),MakeRandomTensor(3))
decomp = Selling.GatherByOffset(t,*Selling.Decomposition(D3t))
fig = plt.figure(figsize=(8,4)); plt.title("Coefficients of Selling's decomposition (d=3)")
for offset,(time,coef) in decomp.items(): plt.plot(time,coef)
plt.legend(decomp.keys(),ncol=3)
savefig(fig,"Coefs_Sel3.pdf")
The coefficients of Selling's decomposition are piecewise linear, hence have Lipschitz regularity, but not better. In some applications it is preferable, at least from the theoretical standpoint, to define a decomposition whose coefficients have a Lipschitz gradient, or whose square root is Lipschitz. We describe such a construction in the following.
Two auxiliary functions are needed for our smooth variant of Selling's decomposition. The first one is a regularized substitute for the absolute value, smoothed within the interval $[-1,1]$.
def sabs(x):
"""Regularized absolute value. Guarantee : 0 <= result-|x| <= 1/2"""
x = np.abs(x) # Actually useless for this specific application
return np.where(x<=1, 0.5*(1+x**2), x)
# return np.where(x<=1, (3+6*x**2-x**4)/8,x) # Alternative with higher regularity
x=np.linspace(-2,2,100)
plt.plot(x,sabs(x),label="sabs")
plt.plot(x,np.abs(x),label="abs")
plt.title("Smoothed absolute value")
plt.legend();
The second auxiliary function is a regularized substitute for the median of three positive reals $\rho_0,\rho_1,\rho_2$. It is defined in terms of two quantities, denoted $s$ and $q$, which are chosen for their invariance under Selling's superbase operations.
def smed(ρ0,ρ1,ρ2):
"""Regularized median (a.k.a. ρ1) assuming ρ0<=ρ1<=ρ2.
Guarantee : ρ1/(2*sqrt(2)) <= result < ρ1"""
s,q = ρ0*ρ1+ρ1*ρ2+ρ2*ρ0, (ρ2-ρ1)**2 # Invariant quantities under Selling superbase flip
return 0.5*s/np.sqrt(q+2*s)
t = np.linspace(0,2*np.pi,200)
ρ0,ρ1,ρ2 = np.sort([1.5+np.cos(t),0.5*np.sin(t)**2,np.cos(3*t)+2],axis=0)
plt.plot(t,ρ0, t,ρ1, t,ρ2, t,smed(ρ0,ρ1,ρ2))
plt.title("Smoothed median")
plt.legend(['ρ0','ρ1','ρ2', 'smed']);
Consider a positive definite matrix $D$, and a $D$-obtuse superbase $(v_0,v_1,v_2)$. Define the offsets $e_i = v_i^\perp$, and the Selling weights $\rho_i = -<v_{i-1},D v_{i+1}>$, with indices taken periodically. Assume that $\rho_0\leq \rho_1\leq \rho_2$, up to permuting the superbase elements.
Our regularized Selling decomposition takes the following form, with $m = \mathrm{smed}(\rho_0,\rho_1,\rho_2)$. $$ D = \sum_{0 \leq i \leq 2} \rho_i e_i e_i^\top + \frac 1 2 (m\, \mathrm{sabs}(\rho_0/m)-\rho_0)(e_0 e_0^\top - 2 e_1 e_1^\top - 2 e_2 e_2^\top + (e_1-e_2)(e_1-e_2)^\top). $$ Note that the first term $\sum_{0 \leq i \leq 2} \rho_i e_i e_i^\top$ is Selling's standard decomposition of $D$. The second term amounts to zero, since $e_0 e_0^\top - 2 e_1 e_1^\top - 2 e_2 e_2^\top + (e_1-e_2)(e_1-e_2)^\top = 0$, but it introduces the additional offset $e_1-e_2$ and modifies the weights in a smooth manner.
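This vanishing identity can be verified numerically (a quick sketch, for arbitrary vectors satisfying $e_0+e_1+e_2=0$):
# Numerical check (sketch) that e0 e0^T - 2 e1 e1^T - 2 e2 e2^T
# + (e1-e2)(e1-e2)^T vanishes whenever e0+e1+e2 = 0.
e1, e2 = np.random.standard_normal((2,2)); e0 = -(e1+e2)
Z = (lp.outer_self(e0) - 2*lp.outer_self(e1) - 2*lp.outer_self(e2)
     + lp.outer_self(e1-e2))
assert allclose(Z, 0.)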
def sdecomp(D):
"""Smooth variant of Selling's two dimensional decomposition"""
v = np.moveaxis(Selling.ObtuseSuperbase(D),1,0)
ρ = np.array([-lp.dot_VAV(v[1],D,v[2]),-lp.dot_VAV(v[0],D,v[2]),-lp.dot_VAV(v[0],D,v[1])])
order = np.argsort(ρ,axis=0)
v = np.take_along_axis(v,order[:,None],axis=0); ρ=np.take_along_axis(ρ,order,axis=0)
m = smed(*ρ)
    w = np.maximum(0,m*sabs(ρ[0]/m)-ρ[0]) # Non-negative even without the max, up to roundoff error
return (ad.array([ρ[0]+w/2, ρ[1]-w, ρ[2]-w, w/2]),
lp.perp(np.moveaxis(ad.array([v[0],v[1],v[2],v[1]-v[2]]),0,1)).astype(int) )
Recall that Selling's decomposition has piecewise linear coefficients.
np.random.seed(42)
t,D2t = Interpolate(MakeRandomTensor(2),MakeRandomTensor(2))
if xp is not np: D2t = D2t.get() # Bug with cupy 9.0 and np.zeros_like below
fig = plt.figure(figsize=(8,4)); plt.title("Coefficients of Selling's decomposition")
decomp = Selling.GatherByOffset(t,*Selling.Decomposition(D2t))
for offset,(time,coef) in decomp.items(): plt.plot(time,coef)
plt.legend(decomp.keys()); savefig(fig,"Coefs_Sel2.pdf")
In contrast, the smooth decomposition constructed in this section has coefficients with a bounded second order derivative.
fig = plt.figure(figsize=(8,4)); plt.title("Coefficients of the smooth decomposition")
decomp = Selling.GatherByOffset(t,*sdecomp(D2t))
for offset,(time,coef) in decomp.items(): plt.plot(time,coef)
plt.legend(decomp.keys()); savefig(fig,"Coefs_Smooth.pdf")
In particular, their square root is Lipschitz, a property used in some convergence analyses.
fig = plt.figure(figsize=(8,4)); plt.title("Sqrt of coefficients, smooth decomposition")
for offset,(time,coef) in decomp.items(): plt.plot(time,np.sqrt(coef))
plt.legend(decomp.keys(),loc="upper right"); savefig(fig,"SqCoefs_Smooth.pdf")
Selling's decomposition is piecewise linear, and is non-differentiable close to the identity matrix. In contrast, it can be useful in some applications to have a non-negative matrix decomposition that is smooth (as the previous one) or, even better, linear in a neighborhood of the identity matrix. We provide the following routine, which produces such a linear and non-negative decomposition for all positive definite matrices $D$ such that $$ \kappa \lambda_{\max}(D) \leq \lambda_{\min}(D), \quad \kappa:= \frac {d-1}{d+1}, $$ using a fixed stencil of $d^2$ points. It appears that no decomposition with a fixed stencil and linear weights applies to a larger neighborhood of the identity matrix, defined by a smaller constant $\kappa$ (unless there is an error in my calculations).
def linear_decomp_offsets(d):
"""The stencil for this linear decomposition is fixed and has $d^2$ elements"""
e = xp.eye(d)
return np.concatenate(
(e,xp.array([e[i]+s*e[j] for i in range(d) for j in range(i) for s in (-1,1)])),axis=0).T
def linear_decomp_coefs(D):
"""These weights depend linearly on the matrix $D$"""
d=len(D)
α=(d+1)/(2*d); β=-1/(2*d); γ=1/(4*d); δ=1/4; # ε=(d-1)/(d+1)
t = sum(D[i,i] for i in range(d))
return np.concatenate((ad.array([(α-β)*D[i,i]+β*t for i in range(d)]),
ad.array([γ*(D[i,i]+D[j,j])+s*δ*(D[i,j]+D[j,i]) for i in range(d)
for j in range(i) for s in (-1,1)]) ),axis=0)
print("Two dimensional stencil:\n",linear_decomp_offsets(2))
print("Three dimensional stencil:\n",linear_decomp_offsets(3))
Two dimensional stencil:
 [[ 1.  0. -1.  1.]
 [ 0.  1.  1.  1.]]
Three dimensional stencil:
 [[ 1.  0.  0. -1.  1. -1.  1.  0.  0.]
 [ 0.  1.  0.  1.  1.  0.  0. -1.  1.]
 [ 0.  0.  1.  0.  0.  1.  1.  1.  1.]]
We numerically check the announced properties of this linear decomposition, in dimensions $2$ and $3$, on a hundred matrices.
d=2; D = MakeRandomTensor(d,(100,),relax=0.5)
coefs,offsets = linear_decomp_coefs(D),linear_decomp_offsets(d)
# Check reconstruction
assert allclose(Reconstruct(coefs,offsets[...,None]),D)
# Check positivity for sufficiently well conditioned matrices
poscoef = np.min(coefs,axis=0)>=0
λmin,λmax = np.linalg.eigvalsh(np.moveaxis(D,-1,0)).T
illcond = λmin < (d-1)/(d+1)*λmax
assert np.all(poscoef | illcond)
d=3; D = MakeRandomTensor(d,(100,),relax=0.5)
coefs,offsets = linear_decomp_coefs(D),linear_decomp_offsets(d)
# Check reconstruction
assert np.allclose(Reconstruct(coefs,offsets[...,None]),D)
# Check positivity for sufficiently well conditioned matrices
poscoef = np.min(coefs,axis=0)>=0
λmin,_,λmax = np.linalg.eigvalsh(np.moveaxis(D,-1,0)).T
illcond = λmin < (d-1)/(d+1)*λmax
assert np.all(poscoef | illcond)
Selling's decomposition. This decomposition is piecewise linear. The code is compatible with automatic differentiation, but the AD part is unstable at points of non-differentiability, which include the identity matrix.
np.random.seed(42)
D0 = ad.Dense.identity(constant=xp.eye(2)) # identity matrix + symbolic first order perturbation
D0 = (D0+D0.T)/2 # Symmetrize the first order part
D1 = D0 + 1e-4*xp.asarray(np.random.rand(2,2)-0.5) # A perturbation
The matrices $D_0$ and $D_1$ are close, and so are their Selling decompositions. (Note that the stencil for $D_0$, the identity matrix, is not uniquely determined, but the Selling weight associated to the modified stencil elements is zero anyway.) However, the first order AD parts are different.
λ0,e0 = Selling.Decomposition(D0)
λ1,e1 = Selling.Decomposition(D1)
# In other cases, λ1 may need to be permuted, to account for arbitrary ordering of e0,e1
λdiff = λ0-λ1
assert np.max(np.abs(λdiff.value)) < 1e-4 # The coefficients values are close
np.max(np.abs(λdiff.coef)) # The coefficient derivatives are different
1.0
Smoothed variant of Selling's decomposition. The previous instability phenomenon is avoided with the smooth variant of the decomposition.
Note that the ordering of the stencil elements is not uniquely determined. This can easily be dealt with by putting them into some canonical ordering, as sketched below.
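For instance, one may normalize each offset's sign and then sort the offsets, permuting the coefficients accordingly (a hypothetical helper, sketched here for dimension two; applying it to AD coefficient arrays assumes they support fancy indexing).
# Hypothetical canonical ordering (sketch): fix each offset's sign so that
# its first nonzero coordinate is positive, then sort the offsets (columns).
def canonical_order(coefs, offsets):
    sign = np.where(offsets[0]!=0, np.sign(offsets[0]), np.sign(offsets[1]))
    offsets = offsets*sign           # e_i and -e_i play the same role
    order = np.lexsort(offsets)      # sorts columns, last row as primary key
    return coefs[order], offsets[:,order]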
λ0,e0 = sdecomp(D0)
λ1,e1 = sdecomp(D1)
# λ1 is reversed here to account for reversed ordering of the offsets e0 and e1
# (An arbitrary permutation is possible in general)
λdiff = λ0-λ1[::-1]
assert np.max(np.abs(λdiff.value))<1e-4 # The coefficients are close
assert np.max(np.abs(λdiff.coef))<1e-4 # The coefficient derivatives are close
Linear variant. The linear decomposition presented above only applies to positive definite matrices whose condition number is sufficiently small. Since it is linear, however, not only are the coefficients close: their derivatives are identical. The offsets are, of course, identical as well.
λ0 = linear_decomp_coefs(D0)
λ1 = linear_decomp_coefs(D1)
λdiff = λ0-λ1 # Always the same ordering here
assert np.max(np.abs(λdiff.value))<1e-4 # The coefficients are close
assert np.max(np.abs(λdiff.coef))==0 # Their derivatives are identical