This notebook presents some tensor decomposition techniques that are at the foundation of our anisotropic PDE discretizations on Cartesian grids. The general objective is to write a given symmetric positive definite matrix $D$ in the form $$ D = \sum_{0 \leq i < I} \lambda_i e_i e_i^T, $$ with non-negative weights $\lambda_i$ and integer offsets $e_i$. From this decomposition, various numerical schemes can be designed, for both first and second order, linear and non-linear PDEs.
The techniques used for constructing the above decomposition are non-trivial, and are related to classical yet subtle tools of discrete geometry. This notebook is meant to illustrate some of their properties.
This notebook is limited to dimensions $d\in \{2,3\}$. Tensor decomposition in dimension $d \in \{4,5\}$ requires another set of techniques (and, in practice, a call to a C++ library), which are discussed in II Tensor decomposition, dimensions 4 and 5.
References
The tensor decomposition presented in this notebook is a central ingredient of the following paper:
Fehrenbach, J., & Mirebeau, J.-M. (2014). Sparse non-negative stencils for anisotropic diffusion. Journal of Mathematical Imaging and Vision, 49(1), 123–147. https://doi.org/10.1007/s10851-013-0446-3
Summary of volume Algorithmic tools, this series of notebooks.
Main summary of the Adaptive Grid Discretizations book of notebooks, including the other volumes.
Acknowledgement. Some of the experiments presented in these notebooks are part of ongoing research with Ludovic Métivier and Da Chen.
Copyright Jean-Marie Mirebeau, Centre Borelli, ENS Paris-Saclay, CNRS, University Paris-Saclay
import sys; sys.path.insert(0,"..") # Allow import of agd from parent directory (useless if conda package installed)
#from Miscellaneous import TocTools; print(TocTools.displayTOC('TensorSelling','Algo'))
from agd import LinearParallel as lp
from agd import FiniteDifferences as fd
from agd import Selling
from agd.Plotting import savefig; #savefig.dirName = 'Figures/TensorSelling'
from agd import AutomaticDifferentiation as ad
The library imported as lp is a set of routines meant to facilitate the simultaneous manipulation of numerous small vectors and matrices. It is based on numpy, and implements only a small number of linear algebra tools, as illustrated below.
import numpy as np; xp = np; allclose = np.allclose
import matplotlib.pyplot as plt
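For concreteness, here is a minimal sketch of the lp conventions, using only routines that appear later in this notebook: vectors and matrices are stored along the first axes, and operations broadcast over the trailing "field" dimensions.
# Minimal illustration (sketch) of the lp conventions.
A = np.random.standard_normal((2,2,5))  # a field of five 2x2 matrices
B = lp.dot_AA(lp.transpose(A), A)       # the five products A^T A, computed at once
assert B.shape == (2,2,5)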
Uncomment the following line to run the notebook on the GPU. (This is only for compatibility testing, since no intensive computation is involved in this specific notebook.)
#xp,plt,allclose=map(ad.cupy_friendly,(xp,plt,allclose))
In order to illustrate our tensor decomposition facilities, we will rely on randomly generated symmetric positive definite tensors. They are built as $$ M = A^T A + \varepsilon \mathrm{Id} $$ where $A$ has independent standard Gaussian entries.
def MakeRandomTensor(dim,shape = tuple(),relax=0.01):
identity = fd.as_field(np.eye(dim),shape,depth=2)
A = np.random.standard_normal( (dim,dim) + shape )
M = lp.dot_AA(lp.transpose(A),A) + relax*identity
return xp.asarray(M) # Convert to GPU array if needed
# For reproducibility, we fix the random seed
np.random.seed(42)
# Generate a 2x2 random psd tensor
D2 = MakeRandomTensor(2)
The Selling.Decomposition routine applies to a symmetric positive definite matrix $D$ of size $d \times d$, with $d\leq 3$. It returns non-negative coefficients $\lambda_i \geq 0$ and integer offsets $e_i \in Z^d$.
A discussion on the inner workings of this decomposition is presented in the next section.
coefs,offsets = Selling.Decomposition(D2)
The matrix can be reconstructed by the formula $$ D = \sum_{0 \leq i < I} \lambda_i e_i e_i^T. $$
def Reconstruct(coefs,offsets):
return (coefs*lp.outer_self(offsets)).sum(2)
assert allclose(D2,Reconstruct(coefs,offsets))
There are $I = d(d+1)/2$ coefficients and offsets (i.e. $I=3$ if $d=2$, and $I=6$ if $d=3$). Note that this is more than in the similar-looking eigen-decomposition of a matrix, which uses only $d$ coefficients and unit vectors. However, Selling's offsets have integer entries, hence are suitable for the construction of finite difference schemes.
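For comparison, here is the eigen-decomposition of the same matrix (a quick illustration): it has only $d=2$ terms, but its vectors generally have irrational entries.
# Eigen-decomposition, for comparison: only d terms, but the eigenvectors
# have non-integer entries, unsuitable as finite difference offsets.
eigvals, eigvecs = np.linalg.eigh(D2)   # D2 = sum_i eigvals[i] v_i v_i^T
assert allclose(D2, Reconstruct(eigvals, eigvecs))
print("Eigenvectors (columns) : \n", eigvecs)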
print("Selling decomposition of matrix : \n", D2)
print("Coefficients : ", coefs)
print("Offsets : \n", offsets)
Selling decomposition of matrix :
 [[0.67622539 0.91777115]
 [0.91777115 2.34873696]]
Coefficients :  [0.43467964 0.94787431 0.24154575]
Offsets :
 [[-1  0  1]
 [-1 -1  2]]
plt.axis('equal'); plt.title("Decomposition offsets for the matrix D2")
plt.quiver(*np.zeros(offsets.shape),*offsets,angles='xy',scale_units='xy',scale=1);
plt.scatter(*offsets); plt.scatter(*(-offsets));
As previously, we generate a 3x3 random psd tensor, decompose it and validate the absence of reconstruction error.
D3 = MakeRandomTensor(3)
coefs,offsets = Selling.Decomposition(D3)
assert allclose(D3,Reconstruct(coefs,offsets))
print("Selling decomposition of matrix : \n", D3)
print("Coefficients : ", coefs)
print("Offsets : \n", offsets)
Selling decomposition of matrix :
 [[ 0.86853982 -0.08963958 -0.06552819]
 [-0.08963958  0.50213052 -0.73715916]
 [-0.06552819 -0.73715916  2.85683026]]
Coefficients :  [0.15516777 0.08963958 0.62373247 0.23502864 1.49444606 0.1774623 ]
Offsets :
 [[-1  1 -1  0  0  0]
 [ 0 -1  0  1  0 -1]
 [ 1  1  0 -2  1  1]]
Selling's tensor decomposition algorithm requires the matrix to be positive definite. It involves a loop, whose number of iterations grows to infinity as the matrix degenerates.
A maximum number of iterations is set by default, to a reasonably large value, so that non-convergence typically indicates an error on the user's side, namely a matrix that is not positive definite.
In the following cell, we define a matrix which is positive definite but strongly anisotropic: $$ \begin{pmatrix} 1 & \varepsilon\\ \varepsilon & 2 \varepsilon^2 \end{pmatrix} $$
# Construct some matrix with extremely large condition number
eps = 1/200.
D2_bad = np.array([[1,eps],[eps,2*eps**2]])
In addition, the anisotropy direction of this matrix is almost but not exactly aligned with the coordinate axes, which is a worst case scenario for Selling's algorithm. As a result, Selling's algorithm fails to converge within the prescribed iteration limit.
#Selling's decomposition does not terminate within the iteration limit
try:
coefs,offsets = Selling.Decomposition(D2_bad)
except ValueError as e:
print(e.args)
("Selling's algorithm did not terminate in iterMax2=100 iterations",)
The iteration limit of Selling's algorithm may be increased, so as to ensure correct termination.
Selling.iterMax2 *= 2
coefs,offsets = Selling.Decomposition(D2_bad)
However, the resulting offsets are unlikely to be of use in any PDE discretization, since they are far too large.
Note also that basis reduction techniques more efficient than Selling's algorithm are available for tensors with extremely large condition numbers, e.g. Lagrange's algorithm, sketched below.
print("offsets : \n", offsets)
offsets :
 [[-99 100  -1]
 [ -1   1   0]]
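For illustration, here is a minimal sketch of Lagrange's two dimensional basis reduction, applied to the scalar product induced by $D$. This sketch is not part of the API shown in this notebook; note that its iteration count grows only logarithmically with the condition number.
# Sketch of Lagrange's (Gauss's) basis reduction w.r.t. the scalar product <x,D y>.
def lagrange_reduce(D):
    b0, b1 = np.array([1,0]), np.array([0,1])
    while True:
        if lp.dot_VAV(b0,D,b0) > lp.dot_VAV(b1,D,b1): b0, b1 = b1, b0
        k = int(np.round(lp.dot_VAV(b0,D,b1) / lp.dot_VAV(b0,D,b0)))
        if k == 0: return b0, b1   # b1 cannot be shortened using b0 : reduced
        b1 = b1 - k*b0

print("D-reduced basis of Z^2 :", *lagrange_reduce(D2_bad))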
Our implementation of Selling's algorithm automatically vectorizes over the trailing dimensions of its input, beyond the first two which index the matrix entries.
# Generate a 10x10 field of random 2x2 spd tensors
D2_field = MakeRandomTensor(2,(10,10))
coefs, offsets = Selling.Decomposition(D2_field)
The shape of the matrix field is $(d,d, n_1,\cdots,n_k)$, whereas the coefficients are $(I,n_1,\cdots,n_k)$ and the offsets $(d,I,n_1,\cdots,n_k)$ where $I = d (d+1)/2$.
print(f"{D2_field.shape}, {coefs.shape}, {offsets.shape}")
(2, 2, 10, 10), (3, 10, 10), (2, 3, 10, 10)
assert allclose(D2_field,Reconstruct(coefs,offsets))
This section illustrates the main mathematical concept underlying Selling's decomposition, known as obtuse superbases of the lattice $Z^d$.
A superbase of $Z^d$ is a special kind of coordinate system $(b_0,\cdots,b_d)$, spanning the lattice of integer points and with some redundancy. More precisely, one requires $$ \begin{aligned} b_0+\cdots+b_d &= 0,\\ \det(b_1,\cdots,b_d) &= \pm 1. \end{aligned} $$ Such a superbase is said to be $D$-obtuse, where $D$ is a given $d\times d$ symmetric positive definite matrix, if $$ <b_i,D b_j> \leq 0 $$ for all distinct $i,j\in \{0,\cdots,d\}$.
Selling's decomposition of a matrix $D \in S_d^{++}$ is based on Selling's formula, applied to a $D$-obtuse superbase, which is itself produced by Selling's algorithm.
The following method applies Selling's algorithm to a given positive definite tensor $D \in S_2^{++}$, producing a suitable superbase of $Z^2$.
sb = Selling.ObtuseSuperbase(D2)
sb
array([[-1., -1.,  2.],
       [ 1., -0., -1.]])
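The defining properties of a superbase can be checked numerically (a quick sketch): the columns sum to zero, and any two of them have determinant $\pm 1$.
# Sanity check (sketch) that sb is a superbase of Z^2.
assert allclose(sb.sum(axis=1), 0)             # b_0 + b_1 + b_2 = 0
assert allclose(np.abs(lp.det(sb[:,1:])), 1)   # det(b_1,b_2) = ±1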
The generated superbase $(b_0,\cdots,b_d)$ is designed to be $D$-obtuse; in other words, it obeys $$ <b_i,D b_j> \leq 0 $$ for all distinct $i,j\in \{0,\cdots,d\}$.
[lp.dot_VAV(sb[:,i],D2,sb[:,np.mod(i+1,3)]) for i in range(3)]
[-0.2415457537440081, -0.4346796384446704, -0.9478743070679034]
Selling's formula yields the desired decomposition of the given symmetric positive definite matrix $D$ $$ D = -\sum_{0\leq i<j \leq d} <b_i,D b_j> v_{ij} v_{ij}^T, $$ where one defines $v_{ij} = \pm b_k^\perp$ whenever $\{i,j,k\} = \{0,1,2\}$.
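As a sanity check, Selling's formula can be implemented directly from the superbase (a short sketch, using the lp.perp rotation helper that also appears later in this notebook).
# Direct implementation of Selling's formula, dimension two (sketch).
coefs_manual = np.array([-lp.dot_VAV(sb[:,j],D2,sb[:,k])
                         for j,k in [(1,2),(0,2),(0,1)]])  # weight of b_i^perp
offsets_manual = lp.perp(sb).astype(int)                   # offsets b_i^perp
assert allclose(D2, Reconstruct(coefs_manual, offsets_manual))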
Note that the decomposition obtained from this superbase coincides with the one returned earlier by Selling.Decomposition, which constructs an obtuse superbase internally when none is provided:
coefs,offsets = Selling.Decomposition(D2,sb=sb)
print("Coefficients : ", coefs)
print("Offsets : \n", offsets)
Coefficients :  [0.43467964 0.94787431 0.24154575]
Offsets :
 [[-1  0  1]
 [-1 -1  2]]
It is possible to produce a tensor decomposition from an arbitrary superbase, but the coefficients are non-negative only if the superbase is obtuse.
sb = Selling.CanonicalSuperbase(D2) # Only takes the dimension and array type from the argument
coefs,offsets = Selling.Decomposition(D2,sb=sb)
print("Coefficients : ", coefs)
print("Offsets : \n", offsets)
Coefficients :  [-0.91777115  3.26650811  1.59399654]
Offsets :
 [[ 1  0 -1]
 [-1  1  0]]
assert np.allclose(D2,Reconstruct(coefs,offsets))
We follow similar steps in dimension $d=3$. Selling's algorithm, applied to the given matrix $D$, yields a specific superbase of $Z^3$.
sb = Selling.ObtuseSuperbase(D3)
sb
array([[ 1., -0.,  0., -1.],
       [ 0.,  1.,  1., -2.],
       [ 0.,  1.,  0., -1.]])
The superbase produced by Selling's algorithm is $D$-obtuse, similarly to the two dimensional case.
[lp.dot_VAV(sb[:,i],D3,sb[:,j]) for i in range(4) for j in range(i)]
[-0.1551677694190755, -0.08963958172347006, -0.2350286383841237, -0.6237324732734056, -1.4944460568297004, -0.1774622969114411]
Selling's formula yields the desired decomposition of the given positive definite matrix $D \in S_3^{++}$: $$ D = -\sum_{0\leq i<j \leq d} <b_i,D b_j> v_{ij} v_{ij}^T, $$ where one defines $v_{ij} = \pm b_k \wedge b_l$ whenever $\{i,j,k,l\} = \{0,1,2,3\}$.
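The same sanity check in dimension three (a short sketch): the weight $-<b_i,D b_j>$ is paired with the cross product of the two complementary superbase elements. Signs are irrelevant for the reconstruction, since the offsets appear through rank one outer products.
# Direct implementation of Selling's formula, dimension three (sketch).
pairs = [(i,j) for i in range(4) for j in range(i)]
compl = [[k for k in range(4) if k not in p] for p in pairs]  # complementary pairs
coefs_manual = np.array([-lp.dot_VAV(sb[:,i],D3,sb[:,j]) for i,j in pairs])
offsets_manual = np.stack([lp.cross(sb[:,k],sb[:,l]) for k,l in compl], axis=-1)
assert allclose(D3, Reconstruct(coefs_manual, offsets_manual.astype(int)))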
Note that the offsets coincide, up to sign and reordering, with the cross products of pairs of superbase elements, as checked below:
coefs,offsets = Selling.Decomposition(D3)
print("Coefficients : ", coefs)
print("Offsets : \n", offsets)
Coefficients :  [0.15516777 0.08963958 0.62373247 0.23502864 1.49444606 0.1774623 ]
Offsets :
 [[-1  1 -1  0  0  0]
 [ 0 -1  0  1  0 -1]
 [ 1  1  0 -2  1  1]]
#Comparing with the cross products of the superbase elements.
ad.array(
[lp.cross(sb[:,i],sb[:,j]) for i in range(4) for j in range(i)]
).astype(int).T
array([[ 0,  0,  1,  0, -1,  1],
       [ 1,  0,  0, -1,  1,  0],
       [-1, -1,  0,  2, -1, -1]])
Selling's decomposition of tensors, presented in this notebook, has a number of qualities that make it particularly suitable for PDE discretizations. More precisely, its offsets are bounded in terms of the anisotropy ratio, its coefficients depend continuously (indeed, piecewise linearly) on the decomposed tensor, and its offsets span the lattice $Z^d$; these properties are illustrated below.
We denote the anisotropy ratio of a symmetric positive definite matrix $D$ by $$ \mu(D) := \sqrt{\|D\| \|D^{-1}\|}. $$ In other words, this is the square root of the condition number of $D$.
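A small helper computing this quantity from the eigenvalues (a hypothetical illustration, not part of the agd API shown in this notebook):
# Hypothetical helper: the anisotropy ratio from the eigenvalues of D.
def anisotropy_ratio(D):
    """mu(D) = sqrt(lambda_max(D)/lambda_min(D)), for D in S_d^{++}."""
    λ = np.linalg.eigvalsh(np.moveaxis(D,(0,1),(-2,-1)))  # ascending eigenvalues
    return np.sqrt(λ[...,-1]/λ[...,0])

print("mu(D2) =", anisotropy_ratio(D2))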
We illustrate these properties in dimension $d=2$, by considering the following family of matrices $D(\theta,\mu) \in S_2^{++}$: $$ D(\theta,\mu) := \mu^2 e(\theta) e(\theta)^T + e(\theta)^\perp (e(\theta)^\perp)^T, $$ whose anisotropy ratio equals $\mu\geq 1$ (i.e. whose condition number is $\mu^2$), and whose anisotropy direction is aligned with the vector $e(\theta):=(\cos \theta,\sin \theta)$, $\theta \in R$.
θ = xp.linspace(0,np.pi/2,100)
μ = 10
D2_rotating = (
μ**2 * lp.outer_self(xp.array([np.cos(θ),np.sin(θ)]))
+ lp.outer_self(xp.array([-np.sin(θ),np.cos(θ)]))
)
coefs,offsets = Selling.Decomposition(D2_rotating)
The offsets $(e_i)_{1 \leq i \leq I}$ involved in Selling's decomposition of a tensor $D$ obey $$ \|e_i\| \leq C \mu(D), $$ where $C$ is an absolute constant.
offsets_norms = xp.linalg.norm(offsets,axis=0)
print("Sqrt of condition number : ", μ)
print("Largest offset norm : ", np.max(offsets_norms))
Sqrt of condition number :  10
Largest offset norm :  5.0990195135927845
plt.plot(θ,np.max(offsets_norms,axis=0));
We can rewrite Selling's decomposition in a manner independent of any specific ordering of the offsets, as follows: $$ D = \sum_{e \in Z^d} \lambda^e(D) e e^T. $$ One can then prove that the coefficient $\lambda^e(D)$ of Selling's tensor decomposition, for a given offset $e \in Z^d$, depends continuously on the parameter $D$, as illustrated below.
decomp = Selling.GatherByOffset(θ,coefs,offsets)
fig = plt.figure(figsize=(8,4))
for offset,(angle,coef) in decomp.items():
plt.plot(angle,coef)
plt.legend(decomp.keys(),ncol=3);
savefig(fig,"Coefs_Sel2_rot.pdf")
The lattice $Z^d$ is spanned, by integer linear combinations, by the offsets $(e_i)_{1 \leq i \leq I}$ appearing in the decomposition of an arbitrary positive definite tensor $D$. In other words, for each $x \in Z^d$ there exist coefficients $k_1,\cdots, k_I \in Z$ such that $$ x = k_1 e_1 + \cdots+ k_I e_I. $$ In addition, one may select this decomposition so that the weight $\lambda_i$ of $e_i$ is positive whenever $k_i \neq 0$.
This property guarantees that the graph underlying e.g. the discretization of an anisotropic Laplacian is locally connected, hence that spurious modes such as chessboard artifacts do not appear.
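For concreteness, here is a sketch of such a discretization, assuming the second order finite difference helper fd.Diff2(u,offsets,gridscale) used elsewhere in this series of notebooks.
# Sketch: anisotropic Laplacian u -> Tr(D grad^2 u), discretized via Selling.
# Each term uses the centered second difference along the integer offset e_i :
# <e_i, grad^2 u(x) e_i> ~ (u(x+h e_i) - 2 u(x) + u(x-h e_i))/h^2.
def anisotropic_laplacian(u, D, gridscale):
    coefs, offsets = Selling.Decomposition(D)
    return (coefs * fd.Diff2(u, offsets, gridscale)).sum(axis=0)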
From a mathematical standpoint, the spanning property can be deduced from the construction of the decomposition in terms of obtuse superbases. Numerically, we can check it by finding a subset of the offsets whose determinant equals $\pm 1$.
lp.det(offsets[:,0:2])
array([ 1, 1, -1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, -1, -1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, -1, 1, -1])
coefs,offsets = Selling.Decomposition(D3)
lp.det(offsets[:,0:3])
-1
Selling's coefficients depend in a piecewise linear manner on the decomposed tensor, since by Selling's formula they read $-<b_i,D b_j>$, where $b_i,b_j$ are elements of a $D$-obtuse superbase $(b_0,\cdots,b_d)$. This property is exploited in some (somewhat sophisticated) numerical schemes, e.g. for the Pucci and Monge-Ampère PDEs.
def Interpolate(a,b,T=np.linspace(0,1,100)):
return T, np.moveaxis(ad.array([(1-t)*a + t*b for t in T]),0,-1)
np.random.seed(42)
t,D2t = Interpolate(MakeRandomTensor(2),MakeRandomTensor(2))
decomp = Selling.GatherByOffset(t,*Selling.Decomposition(D2t))
fig = plt.figure(figsize=(8,4))
for offset,(time,coef) in decomp.items(): plt.plot(time,coef)
plt.legend(decomp.keys());
np.random.seed(1)
t,D3t = Interpolate(MakeRandomTensor(3),MakeRandomTensor(3))
decomp = Selling.GatherByOffset(t,*Selling.Decomposition(D3t))
fig = plt.figure(figsize=(8,4)); plt.title("Coefficients of Selling's decomposition (d=3)")
for offset,(time,coef) in decomp.items(): plt.plot(time,coef)
plt.legend(decomp.keys(),ncol=3)
savefig(fig,"Coefs_Sel3.pdf")
The coefficients of Selling's decomposition are piecewise linear, hence have Lipschitz regularity, but not better. In some applications it is preferable, at least from the theoretical standpoint, to define a decomposition whose coefficients have a Lipschitz gradient, or whose square root is Lipschitz. We describe such a construction in the following.
Two auxiliary functions are needed for our smooth variant of Selling's decomposition. The first one is a regularized substitute for the absolute value, smoothed within the interval $[-1,1]$.
def sabs(x):
"""Regularized absolute value. Guarantee : 0 <= result-|x| <= 1/2"""
x = np.abs(x) # Actually useless for this specific application
return np.where(x<=1, 0.5*(1+x**2), x)
# return np.where(x<=1, (3+6*x**2-x**4)/8,x) # Alternative with higher regularity
x=np.linspace(-2,2,100)
plt.plot(x,sabs(x),label="sabs")
plt.plot(x,np.abs(x),label="abs")
plt.title("Smoothed absolute value")
plt.legend();
The second auxiliary function is a regularized substitute for the median of three positive reals $\rho_0,\rho_1,\rho_2$. It is defined in terms of two quantities, denoted $s$ and $q$, which are chosen for their invariance under Selling's superbase operations.
def smed(ρ0,ρ1,ρ2):
"""Regularized median (a.k.a. ρ1) assuming ρ0<=ρ1<=ρ2.
Guarantee : ρ1/(2*sqrt(2)) <= result < ρ1"""
s,q = ρ0*ρ1+ρ1*ρ2+ρ2*ρ0, (ρ2-ρ1)**2 # Invariant quantities under Selling superbase flip
return 0.5*s/np.sqrt(q+2*s)
t = np.linspace(0,2*np.pi,200)
ρ0,ρ1,ρ2 = np.sort([1.5+np.cos(t),0.5*np.sin(t)**2,np.cos(3*t)+2],axis=0)
plt.plot(t,ρ0, t,ρ1, t,ρ2, t,smed(ρ0,ρ1,ρ2))
plt.title("Smoothed median")
plt.legend(['ρ0','ρ1','ρ2', 'smed']);
Consider a positive definite matrix $D$, and a $D$-obtuse superbase $(v_0,v_1,v_2)$. Define the offsets $e_i = v_i^\perp$, and the Selling weights $\rho_i = -<v_{i-1},D v_{i+1}>$, with indices taken periodically. Assume that $\rho_0\leq \rho_1\leq \rho_2$, up to permuting the superbase elements.
Our regularized Selling decomposition takes the following form, with $m = \mathrm{smed}(\rho_0,\rho_1,\rho_2)$. $$ D = \sum_{0 \leq i \leq 2} \rho_i e_i e_i^\top + \frac 1 2 (m\, \mathrm{sabs}(\rho_0/m)-\rho_0)(e_0 e_0^\top - 2 e_1 e_1^\top - 2 e_2 e_2^\top + (e_1-e_2)(e_1-e_2)^\top). $$ Note that the first term $\sum_{0 \leq i \leq 2} \rho_i e_i e_i^\top$ is Selling's standard decomposition of $D$. The second term amounts to zero, since $e_0 e_0^\top - 2 e_1 e_1^\top - 2 e_2 e_2^\top + (e_1-e_2)(e_1-e_2)^\top = 0$, but it introduces the additional offset $e_1-e_2$ and modifies the weights in a smooth manner.
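This vanishing identity can be verified numerically (a quick sketch, for arbitrary vectors satisfying $e_0+e_1+e_2=0$):
# Numerical check (sketch) that e0 e0^T - 2 e1 e1^T - 2 e2 e2^T
# + (e1-e2)(e1-e2)^T vanishes whenever e0+e1+e2 = 0.
e1, e2 = np.random.standard_normal((2,2)); e0 = -(e1+e2)
Z = (lp.outer_self(e0) - 2*lp.outer_self(e1) - 2*lp.outer_self(e2)
     + lp.outer_self(e1-e2))
assert allclose(Z, 0.)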
def sdecomp(D):
"""Smooth variant of Selling's two dimensional decomposition"""
v = np.moveaxis(Selling.ObtuseSuperbase(D),1,0)
ρ = np.array([-lp.dot_VAV(v[1],D,v[2]),-lp.dot_VAV(v[0],D,v[2]),-lp.dot_VAV(v[0],D,v[1])])
order = np.argsort(ρ,axis=0)
v = np.take_along_axis(v,order[:,None],axis=0); ρ=np.take_along_axis(ρ,order,axis=0)
m = smed(*ρ)
    w = np.maximum(0,m*sabs(ρ[0]/m)-ρ[0]) # Non-negative even without the max, up to roundoff error
return (ad.array([ρ[0]+w/2, ρ[1]-w, ρ[2]-w, w/2]),
lp.perp(np.moveaxis(ad.array([v[0],v[1],v[2],v[1]-v[2]]),0,1)).astype(int) )
Recall that Selling's decomposition has piecewise linear coefficients.
np.random.seed(42)
t,D2t = Interpolate(MakeRandomTensor(2),MakeRandomTensor(2))
if xp is not np: D2t = D2t.get() # Bug with cupy 9.0 and np.zeros_like below
fig = plt.figure(figsize=(8,4)); plt.title("Coefficients of Selling's decomposition")
decomp = Selling.GatherByOffset(t,*Selling.Decomposition(D2t))
for offset,(time,coef) in decomp.items(): plt.plot(time,coef)
plt.legend(decomp.keys()); savefig(fig,"Coefs_Sel2.pdf")
In contrast, the smooth decomposition constructed in this section has coefficients with a bounded second order derivative.
fig = plt.figure(figsize=(8,4)); plt.title("Coefficients of the smooth decomposition")
decomp = Selling.GatherByOffset(t,*sdecomp(D2t))
for offset,(time,coef) in decomp.items(): plt.plot(time,coef)
plt.legend(decomp.keys()); savefig(fig,"Coefs_Smooth.pdf")
In particular, their square root is Lipschitz, a property used in some convergence analyses.
fig = plt.figure(figsize=(8,4)); plt.title("Sqrt of coefficients, smooth decomposition")
for offset,(time,coef) in decomp.items(): plt.plot(time,np.sqrt(coef))
plt.legend(decomp.keys(),loc="upper right"); savefig(fig,"SqCoefs_Smooth.pdf")
Selling's decomposition is piecewise linear, and is non-differentiable close to the identity matrix. In contrast, it can be useful in some applications to have a non-negative matrix decomposition that is smooth (as the previous one) or, even better, linear in a neighborhood of the identity matrix. We provide the following routine, which produces such a linear and non-negative decomposition for all positive definite matrices $D$ such that $$ \kappa \lambda_{\max}(D) \leq \lambda_{\min}(D), \quad \kappa:= \frac {d-1}{d+1}, $$ using a fixed stencil of $d^2$ points. It appears that no decomposition with a fixed stencil and linear weights applies to a larger neighborhood of the identity matrix, defined by a smaller constant $\kappa$ (unless there is an error in my calculations).
def linear_decomp_offsets(d):
"""The stencil for this linear decomposition is fixed and has $d^2$ elements"""
e = xp.eye(d)
return np.concatenate(
(e,xp.array([e[i]+s*e[j] for i in range(d) for j in range(i) for s in (-1,1)])),axis=0).T
def linear_decomp_coefs(D):
"""These weights depend linearly on the matrix $D$"""
d=len(D)
α=(d+1)/(2*d); β=-1/(2*d); γ=1/(4*d); δ=1/4; # ε=(d-1)/(d+1)
t = sum(D[i,i] for i in range(d))
return np.concatenate((ad.array([(α-β)*D[i,i]+β*t for i in range(d)]),
ad.array([γ*(D[i,i]+D[j,j])+s*δ*(D[i,j]+D[j,i]) for i in range(d)
for j in range(i) for s in (-1,1)]) ),axis=0)
print("Two dimensional stencil:\n",linear_decomp_offsets(2))
print("Three dimensional stencil:\n",linear_decomp_offsets(3))
Two dimensional stencil:
 [[ 1.  0. -1.  1.]
 [ 0.  1.  1.  1.]]
Three dimensional stencil:
 [[ 1.  0.  0. -1.  1. -1.  1.  0.  0.]
 [ 0.  1.  0.  1.  1.  0.  0. -1.  1.]
 [ 0.  0.  1.  0.  0.  1.  1.  1.  1.]]
We numerically check the announced properties of this linear decomposition, in dimensions $2$ and $3$, on a hundred matrices.
d=2; D = MakeRandomTensor(d,(100,),relax=0.5)
coefs,offsets = linear_decomp_coefs(D),linear_decomp_offsets(d)
# Check reconstruction
assert allclose(Reconstruct(coefs,offsets[...,None]),D)
# Check positivity for sufficiently well conditioned matrices
poscoef = np.min(coefs,axis=0)>=0
λmin,λmax = np.linalg.eigvalsh(np.moveaxis(D,-1,0)).T
illcond = λmin < (d-1)/(d+1)*λmax
assert np.all(poscoef | illcond)
d=3; D = MakeRandomTensor(d,(100,),relax=0.5)
coefs,offsets = linear_decomp_coefs(D),linear_decomp_offsets(d)
# Check reconstruction
assert np.allclose(Reconstruct(coefs,offsets[...,None]),D)
# Check positivity for sufficiently well conditioned matrices
poscoef = np.min(coefs,axis=0)>=0
λmin,_,λmax = np.linalg.eigvalsh(np.moveaxis(D,-1,0)).T
illcond = λmin < (d-1)/(d+1)*λmax
assert np.all(poscoef | illcond)
Selling's decomposition. This decomposition is piecewise linear. The code is compatible with automatic differentiation, but the AD part is unstable at points of non-differentiability, which include the identity matrix.
np.random.seed(42)
D0 = ad.Dense.identity(constant=xp.eye(2)) # identity matrix + symbolic first order perturbation
D0 = (D0+D0.T)/2 # Symmetrize the first order part
D1 = D0 + 1e-4*xp.asarray(np.random.rand(2,2)-0.5) # A perturbation
The matrices $D_0$ and $D_1$ are close, and so are their Selling decompositions. (Note that the stencil for $D_0$, the identity matrix, is not uniquely determined, but the Selling weight associated to the modified stencil elements is zero anyway.) However, the first order AD parts are different.
λ0,e0 = Selling.Decomposition(D0)
λ1,e1 = Selling.Decomposition(D1)
# In other cases, λ1 may need to be permuted, to account for arbitrary ordering of e0,e1
λdiff = λ0-λ1
assert np.max(np.abs(λdiff.value)) < 1e-4 # The coefficients values are close
np.max(np.abs(λdiff.coef)) # The coefficient derivatives are different
1.0
Smoothed variant of Selling's decomposition. The previous instability phenomenon is avoided with the smooth variant of the decomposition.
Note that the ordering of the stencil elements is not uniquely determined. This can easily be dealt with by putting them into some canonical ordering, as sketched below.
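For instance, one may normalize each offset's sign and then sort the offsets, permuting the coefficients accordingly (a hypothetical helper, sketched here for dimension two; applying it to AD coefficient arrays assumes they support fancy indexing).
# Hypothetical canonical ordering (sketch): fix each offset's sign so that
# its first nonzero coordinate is positive, then sort the offsets (columns).
def canonical_order(coefs, offsets):
    sign = np.where(offsets[0]!=0, np.sign(offsets[0]), np.sign(offsets[1]))
    offsets = offsets*sign           # e_i and -e_i play the same role
    order = np.lexsort(offsets)      # sorts columns, last row as primary key
    return coefs[order], offsets[:,order]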
λ0,e0 = sdecomp(D0)
λ1,e1 = sdecomp(D1)
# λ1 is reversed here to account for reversed ordering of the offsets e0 and e1
# (An arbitrary permutation is possible in general)
λdiff = λ0-λ1[::-1]
assert np.max(np.abs(λdiff.value))<1e-4 # The coefficients are close
assert np.max(np.abs(λdiff.coef))<1e-4 # The coefficient derivatives are close
Linear variant. The linear decomposition presented above only applies to positive definite matrices whose condition number is sufficiently small. Since it is linear, however, not only are the coefficients close: their derivatives are identical. The offsets are, of course, identical as well.
λ0 = linear_decomp_coefs(D0)
λ1 = linear_decomp_coefs(D1)
λdiff = λ0-λ1 # Always the same ordering here
assert np.max(np.abs(λdiff.value))<1e-4 # The coefficients are close
assert np.max(np.abs(λdiff.coef))==0 # Their derivatives are identical