This volume presents some of the algorithmic tools at the foundation of our numerical schemes, in particular tensor decomposition and automatic differentiation.
Tensor decomposition. Tensors come in a variety of flavors, and admit numerous decompositions and normal forms. Our numerical schemes are based on a specific technique, known as Voronoi's first reduction of quadratic forms. It allows one to decompose a symmetric positive definite tensor $D \in S_d^{++}$ in the form $$ D = \sum_{1 \leq i \leq I} \rho_i e_i e_i^T, $$ where the weights satisfy $\rho_i \geq 0$, and the offsets $e_i \in Z^d\setminus \{0\}$ have integer coordinates. In addition, among all such decompositions, Voronoi's first reduction maximizes the sum of the weights $$ \sum_{1 \leq i \leq I} \rho_i. $$ This decomposition is particularly well suited to the construction of finite difference schemes on Cartesian grids.
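In dimension two, this decomposition can be computed by Selling's algorithm, which iteratively reduces a superbase of $Z^2$ until it is obtuse with respect to $D$. The sketch below is illustrative only: the function name `selling_2d` is ours, not the AGD package's API.

```python
import numpy as np

def selling_2d(D, maxiter=100):
    """Decompose a 2x2 SPD matrix as sum_i rho_i e_i e_i^T,
    with weights rho_i >= 0 and integer offsets e_i (Selling's algorithm)."""
    D = np.asarray(D, dtype=float)
    e = np.array([[1, 0], [0, 1], [-1, -1]])  # canonical superbase: e0+e1+e2 = 0
    for _ in range(maxiter):
        obtuse = True
        for i, j, k in ((0, 1, 2), (0, 2, 1), (1, 2, 0)):
            if e[i] @ D @ e[j] > 1e-12:
                # Selling step: replace (u, v, w) with (-u, v, u-v), which
                # strictly decreases the energy sum_i <e_i, D e_i>.
                u, v = e[i].copy(), e[j].copy()
                e[i], e[k] = -u, u - v
                obtuse = False
                break
        if obtuse:
            break
    # Obtuse superbase reached: the weights are -<e_i, D e_j>, and the offsets
    # are the 90 degree rotations of the remaining superbase vector.
    perp = lambda v: np.array([-v[1], v[0]])
    triples = ((0, 1, 2), (0, 2, 1), (1, 2, 0))
    weights = np.array([-(e[i] @ D @ e[j]) for i, j, k in triples])
    offsets = np.array([perp(e[k]) for i, j, k in triples])
    return weights, offsets

weights, offsets = selling_2d([[2., 1.], [1., 3.]])
# D is recovered as sum_i weights[i] * offsets[i] offsets[i]^T
```

In higher dimension the same principle applies, but the reduction and the extraction of the weights become more involved.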
Automatic differentiation. Numerical analysis is, more often than not, about differentiating various kinds of functions. These functions may depend on just one or on a million variables, and may have varying degrees of complexity. Sometimes first order differentiation is enough, while in other circumstances second or higher order derivatives are needed.
The derivatives of a given function can in many cases be computed by hand, and implemented numerically as such. However, this manual process is time consuming and is one of the most frequent sources of programming bugs.
Automatic differentiation is a numerical tool which abstracts function differentiation. It often works by subclassing the numerical data, in our case numpy ndarray tensors, so as to replace scalars with first or second order Taylor expansions. Automatic differentiation can be extremely powerful, the tradeoff being a modest increase in CPU time against a huge reduction in development time. In my experience, one's first encounter with such tools feels almost magical.
However, in contrast with what is often claimed, automatic differentiation is not a one-size-fits-all or univocal procedure. Several variants exist, depending on the choice of the following features: the differentiation order (first or second), and the implementation strategy (dense, sparse, or reverse).
Depending on this choice, a basic scalar $a_0 \in R$ is replaced with the Taylor expansion $$ a_0 + \langle a_1, h\rangle + O(h^2), \quad \text{ or } \quad a_0 + \langle a_1, h\rangle + \frac 1 2 \langle h, a_2 h\rangle + O(h^3), $$ where $a_1 \in R^n$ is an $n$-dimensional vector, and $a_2 \in S_n(R)$ is an $n\times n$ symmetric matrix. The variable $h\in R^n$ is an abstract $n$-dimensional perturbation, which has no numerical existence, in contrast with $a_0, a_1, a_2$. First order differentiation, a.k.a. the computation of a Jacobian or gradient, is the most common among numerical packages. Second order differentiation, a.k.a. the computation of a Hessian, is required e.g. for optimization problems solved with Newton's method. If needed, higher order differentiation can be achieved by recursive first or second order differentiation.
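To make the replacement of scalars with Taylor expansions concrete, here is a toy dense first-order variant: each scalar $a_0$ becomes a pair $(a_0, a_1)$ representing $a_0 + \langle a_1, h\rangle + O(h^2)$, and arithmetic operators propagate both components. This minimal class is ours for illustration; the AGD package's own AD classes are far more complete.

```python
import numpy as np

class Dense1:
    """Toy dense first-order AD: stores a value a_0 and a gradient a_1."""
    def __init__(self, value, grad):
        self.value = value             # a_0 : the scalar value
        self.grad = np.asarray(grad)   # a_1 : gradient w.r.t. the perturbation h

    @staticmethod
    def identity(values):
        """Promote n input scalars to AD variables with canonical gradients."""
        n = len(values)
        return [Dense1(v, np.eye(n)[i]) for i, v in enumerate(values)]

    def __add__(self, other):
        if isinstance(other, Dense1):
            return Dense1(self.value + other.value, self.grad + other.grad)
        return Dense1(self.value + other, self.grad)

    def __mul__(self, other):  # Leibniz rule for the first-order term
        if isinstance(other, Dense1):
            return Dense1(self.value * other.value,
                          self.value * other.grad + other.value * self.grad)
        return Dense1(self.value * other, self.grad * other)

x, y = Dense1.identity([2., 3.])
f = x * y + x   # f(x, y) = x*y + x, so grad f = (y+1, x) = (4, 2) at (2, 3)
```

The second-order variant follows the same pattern, with an additional symmetric matrix $a_2$ propagated through each operation.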
Method | Limitation
---|---
Dense | Low dimensional input
Sparse | Simple function
Reverse | Low dimensional output
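The dense variant above stores a full length-$n$ gradient in every scalar, which is wasteful when the input dimension is large. A sparse variant instead records only the non-zero partial derivatives, for instance as an index-to-coefficient mapping. The following toy class (names are ours, not the AGD API) sketches the idea.

```python
class Sparse1:
    """Toy sparse first-order AD: the gradient is a {variable index: partial
    derivative} dict, so memory scales with the variables actually involved,
    not with the total input dimension."""
    def __init__(self, value, grad=None):
        self.value = value
        self.grad = grad if grad is not None else {}

    @staticmethod
    def variable(value, index):
        return Sparse1(value, {index: 1.0})

    def __add__(self, other):
        g = dict(self.grad)
        for i, c in other.grad.items():
            g[i] = g.get(i, 0.0) + c
        return Sparse1(self.value + other.value, g)

    def __mul__(self, other):  # Leibniz rule on the sparse coefficients
        g = {i: other.value * c for i, c in self.grad.items()}
        for i, c in other.grad.items():
            g[i] = g.get(i, 0.0) + self.value * c
        return Sparse1(self.value * other.value, g)

# A million-dimensional input costs nothing extra: only touched indices appear.
x = Sparse1.variable(2., index=0)
y = Sparse1.variable(3., index=999_999)
f = x * y + x * x   # grad f = {0: y + 2x, 999999: x}
```

This is why sparse AD is limited to reasonably simple functions: a long computation progressively fills in the sparse gradients, eroding the benefit over the dense variant.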
Which AD methods are implemented in the AGD package? We provide python routines for the six possible flavors of automatic differentiation: First/Second order, with Dense/Sparse/Reverse implementations. The HamiltonFastMarching C++ library, on the other hand, only provides First order automatic differentiation, with Dense/Reverse implementations.
Why did I not use an existing AD package? There exist many high quality automatic differentiation packages. However, in our experience, they usually specialize in a particular implementation modality: for instance, Reverse First order automatic differentiation is extremely popular (for good reason) in optimization packages. In contrast, the numerical schemes presented in these notebooks require several of the AD implementations, in particular Sparse automatic differentiation, which is not so easily found. We also need the ability to compose various AD methods, so as to differentiate a given function by parts, using Dense or Sparse differentiation depending on the code section.
Should you use the AGD package for AD? Only, I would say, if it provides functionality not available elsewhere, or if you are using one of the provided notebooks as a starting point for a personal project.
Github repository AdaptiveGridDiscretizations: run and modify the examples on your computer.
Main summary, including the other volumes of this work.
4. GPU acceleration of the TTI projection

1. Matrix multiplication and inversion
2. In place modifications and aliasing
3. CPU/GPU generic programming
#import sys; sys.path.append("..") # Allow imports from parent directory
#from Miscellaneous import TocTools; print(TocTools.displayTOCs('Algo'))