The present notebook is devoted to forward and reverse differentiation of the fast marching algorithm. We limit ourselves to isotropic fast marching, but more complex models are supported equally well; see the subsequent notebooks as well as the publication:
Limitations. For the purposes of sensitivity analysis, we avoid some enhancements of the fast marching method: if they were used, in the current implementation, then minor inaccuracies could arise in the computed sensitivities.
Summary of volume Fast Marching Methods, this series of notebooks.
Main summary of the Adaptive Grid Discretizations book of notebooks, including the other volumes.
This Python® notebook is intended as documentation and testing for the HamiltonFastMarching (HFM) library, which also has interfaces to the Matlab® and Mathematica® languages. More information on the HFM library can be found in the manuscript:
Copyright Jean-Marie Mirebeau, University Paris-Sud, CNRS, University Paris-Saclay
import sys; sys.path.insert(0,"..") # Allow import of agd from parent directory (useless if conda package installed)
#from Miscellaneous import TocTools; print(TocTools.displayTOC('Sensitivity','FMM'))
from agd import Eikonal
from agd import AutomaticDifferentiation as ad
from agd import FiniteDifferences as fd
from agd import Metrics
from agd.Interpolation import UniformGridInterpolation
from agd.Plotting import savefig; #savefig.dirName = 'Figures/Sensitivity'
norm_infinity = ad.Optimization.norm_infinity
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import scipy.optimize
def ReloadPackages():
from Miscellaneous.rreload import rreload
global Eikonal,ad,fd,Metrics
Eikonal,ad,fd,Metrics = rreload([Eikonal,ad,fd,Metrics],rootdir="../..")
We choose as a starting point the same problem as considered in the first notebook: a path planning problem involving an isotropic cost, on a two dimensional domain with obstacles. More precisely, we compute the unique viscosity solution $u: \overline \Omega \to ]-\infty,\infty]$ to an eikonal equation \begin{align*} \forall x \in \Omega, \|\nabla u(x)\| &= c(x), & \forall x \in \partial \Omega, u(x) &= \sigma(x). \end{align*} This PDE solution is known to solve the following optimal control problem \begin{equation*} u(x) = \min_{\substack{\gamma(0) \in \partial \Omega\\ \gamma(1)=x}} \sigma(\gamma(0))+ \int_0^1 c(\gamma(t)) \|\gamma'(t)\| \,\mathrm dt \end{equation*} and the minimal paths $\gamma:[0,1] \to \overline \Omega$ can be efficiently backtracked.
Choice of the numerical scheme. At this point, we need to mention that there exists two consistent discretizations of the eikonal equation, namely: $$ \|\nabla u(x)\|^2 \approx h^{-2} \sum_{1\leq i \leq d} \max \{0,u(x)-u(x-h e_i), u(x)-u(x+h e_i)\}^2, $$ and $$ \|\nabla u(x)\|^2 \approx h^{-2} \sum_{1\leq i \leq d} \sum_{s \in \{-1,1\}} \max \{0,u(x)-u(x- h s e_i)\}^2, $$ where $h$ denotes the gridscale, and $(e_1,\cdots,e_d)$ is the canonical basis of $\mathbb R^d$.
The first discretization, referred to as 'Isotropic2', is usually preferred since it is more accurate at points where the solution $u$ loses differentiability, e.g. near the cut locus (the set of points reached by two distinct minimal geodesics).
The second implementation, referred to as 'IsotropicDiff2', has the advantage of being continuously differentiable w.r.t. the values of $u$, and is thus better behaved when it comes to automatic differentiation.
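For concreteness, the two residuals can be transcribed in numpy as follows. This is only an illustrative sketch, assuming a $+\infty$ padding outside the domain; it is not the HFM library's internal implementation.

```python
import numpy as np

def scheme_residuals(u, h):
    """Sketch of the two discrete approximations of |grad u(x)|^2,
    for a 2D array u, padded with +inf outside the domain
    (illustration only, not the HFM library's internal code)."""
    up = np.pad(u, 1, constant_values=np.inf)
    diffs = [u - up[:-2, 1:-1], u - up[2:, 1:-1],   # u(x) - u(x -+ h e_1)
             u - up[1:-1, :-2], u - up[1:-1, 2:]]   # u(x) - u(x -+ h e_2)
    # 'Isotropic2': per axis, keep the largest of the two one-sided differences
    iso = sum(np.maximum(0., np.maximum(diffs[2*i], diffs[2*i+1]))**2
              for i in range(2)) / h**2
    # 'IsotropicDiff2': both one-sided differences contribute,
    # making the residual continuously differentiable w.r.t. u
    iso_diff = sum(np.maximum(0., d)**2 for d in diffs) / h**2
    return iso, iso_diff
```

On the linear function $u(x) = x_1$, both residuals equal one away from the boundary, consistently with $\|\nabla u\| = 1$.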
hfmIn = Eikonal.dictIn()
hfmIn['model']='Isotropic2' # Alternatively 'IsotropicDiff2'
Before turning to sensitivity analysis, let us recall how the software is run and its output displayed.
# Define the domain
hfmIn.SetRect(sides=[[-1,1],[0,1]],gridScale=1./100.)
# Set up the boundary conditions
hfmIn['seeds']=[[-0.5,0.3],[0.5,0.8]] # Seed position
hfmIn['seedValues']=[0.,0.5] # Boundary condition imposed at the seed. Defaults to $[0.,0.]$.
# Define the speed function
X,Y = hfmIn.Grid() # Create a coordinate system
hfmIn['cost'] = np.exp(-0.5*(X**2+Y**2)) # Define the cost function
# Insert the obstacles
disk = (X-0.3)**2 + (Y-0.3)**2 <= 0.2**2
barrier = np.logical_and(X==X[70,0], Y>=0.4)
walls = np.logical_or(disk,barrier)
hfmIn['walls']= walls
# Request the desired outputs
hfmIn['exportValues']=1. # Ask for the PDE solution
hfmIn['tips'] = [[0.,0.6],[-0.9,0.5],[0.8,0.8]] # Ask for the geodesics from these three points
hfmOut = hfmIn.Run()
Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.003066 s. Field geodesicSolver defaults to Discrete Field geodesicStep defaults to 0.25 Field geodesicWeightThreshold defaults to 0.001 Field geodesicVolumeBound defaults to 8.45
plt.figure(figsize=[6,3]); plt.title('Distance and minimal geodesics'); plt.axis('equal');
for geo in hfmOut['geodesics']: plt.plot(*geo)
plt.contourf(X,Y,hfmOut['values'],cmap='Greys');
In this section, we differentiate the front arrival times $u : \Omega \to ]-\infty,\infty[$ w.r.t variations in the cost function $c : \Omega \to ]0,\infty[$, and in the boundary conditions $\sigma : \Omega\to ]-\infty,\infty[$. More precisely, denote by $u[c,\sigma] : \Omega \to ]-\infty,\infty[$, the solution to the eikonal equation \begin{align*} \forall x \in \Omega, \| \nabla u[c,\sigma](x) \| &= c(x) & \forall x \in \partial \Omega, u[c,\sigma](x) &= \sigma(x). \end{align*}
Consider perturbation fields $\xi : \Omega \to \mathbb R$ and $\zeta : \partial \Omega \to \mathbb R$. Forward differentiation computes the first order term $\mu : \Omega \to \mathbb R$ in the Taylor expansion of the distance function, if it exists. In other words \begin{equation*} \mu(x) := \frac d {d \varepsilon} u[c+ \varepsilon \xi, \sigma+ \varepsilon \zeta] (x) \end{equation*}
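As a sanity check, consider an (assumed) one dimensional toy model, with no obstacle and a single seed at $x=0$: there $u[c,\sigma](x) = \sigma + \int_0^x c$, hence $\mu(x) = \zeta + \int_0^x \xi$. This can be verified by a direct finite-difference quotient, without calling the HFM library.

```python
import numpy as np

h = 0.01
x = np.arange(0., 1., h)
c = 1. + 0.5*np.sin(6*x)     # a positive cost function
xi = np.cos(3*x)             # cost perturbation
sigma, zeta = 0.3, 0.2       # seed value and its perturbation

def u(c, sigma):
    """1D toy solution with seed at x=0: u(x) = sigma + int_0^x c."""
    return sigma + h*np.cumsum(c)

# Since u is linear in (c, sigma), the difference quotient equals mu
# up to roundoff, for any eps
eps = 1e-4
mu_numeric = (u(c + eps*xi, sigma + eps*zeta) - u(c, sigma))/eps
mu_exact = zeta + h*np.cumsum(xi)
```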
These features are implemented in the HFM library, and can be accessed in two ways:
The first approach is more explicit and possibly pedagogical, yet the second approach is expected to be more convenient in practice.
We show how the perturbation to the cost function, and to the seed values, can be provided to the HFM library by raw explicit arguments.
cost = hfmIn['cost']
seedValues = hfmIn['seedValues']
# Define the cost perturbation(s), above named xi. We actually define three perturbations,
# xi_0 (on the right side of the domain only), xi_1 (on the left side only), and xi_2 (no perturbation)
hfmIn['costVariation']= np.stack([(X>0.)*cost, (X<=0.)*cost, 0.*X],2)
# Define the boundary condition perturbation(s), above named zeta.
# Again, similarly define three perturbations: zeta_0 (no perturbation), zeta_1 (no perturbation), and zeta_2.
hfmIn['seedValueVariation']= [[0,0],[0,0],seedValues]
hfmOut = hfmIn.Run()
Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.00303 s. Field geodesicSolver defaults to Discrete Field geodesicStep defaults to 0.25 Field geodesicWeightThreshold defaults to 0.001 Field geodesicVolumeBound defaults to 8.45
The following cell shows the effect $\mu_0$ of the first perturbation $(\xi_0,\zeta_0)$. Since $\xi_0$ is positive on the right side of the domain $\{x>0\}$, the perturbation increases the cost function there, hence also the value function. Therefore $\mu_0>0$ on $\{x>0\}$, as can be observed numerically. On the other hand $\zeta_0=0$, which means that the boundary conditions, in other words the seed values, are untouched.
plt.figure(figsize=[6,3]); plt.title(r'Value variation $\mu$, cost perturbed on the right.'); plt.axis('equal');
# The field 'valueVariation' is denoted mu in the above mathematical expression.
plt.contourf(X,Y,hfmOut['values'].gradient(0))
plt.axis('equal');plt.colorbar();
The following cell shows the effect $\mu_1$ of the second perturbation $(\xi_1,\zeta_1)$. Since $\xi_1$ is positive on the left side of the domain $\{x\leq 0\}$, the perturbation increases the cost function there, hence also the value function $u$. Therefore $\mu_1>0$ on $\{x \leq 0\}$, as can be observed numerically.
However, one can note that $\mu_1>0$ on part of the right side of the domain $\{x>0\}$ as well. That is because the corresponding minimal paths come from the left part of the domain, hence they see their cost increased by the perturbation.
fig = plt.figure(figsize=[6,3]); plt.title(r'Value variation $\mu$, cost perturbed on the left.'); plt.axis('equal');
plt.contourf(X,Y,hfmOut['values'].gradient(1))
plt.axis('equal');plt.colorbar();
savefig(fig,'ValueVariation_CostPerturbationLeft.png')
The third perturbation affects the boundary conditions only: $\xi_2=0$ and $\zeta_2 \neq 0$. More precisely, the perturbation increases the boundary condition at the right seed $x_1$ only. Therefore, as can be observed numerically, the value function $u$ increases in the Voronoi region of $x_1$ only. In other words $\mu_2>0$ at all the points for which the backtracked geodesic leads to $x_1$.
fig = plt.figure(figsize=[6,3]); plt.title(r'Value variation $\mu$, right seed value perturbed.'); plt.axis('equal');
plt.contourf(X,Y,hfmOut['values'].gradient(2))
plt.axis('equal');plt.colorbar();
savefig(fig,'ValueVariation_SeedValuePerturbationRight.png')
We end this section with a consistency test, based on a mathematical property that we describe below.
It is worth noting that the arrival time function $u$ is $1$-homogeneous w.r.t. its parameters $c$ and $\sigma$ \begin{equation*} u[\lambda c,\lambda \sigma] = \lambda u[c,\sigma]. \end{equation*} This implies a differential identity, referred to as Euler identity for homogeneous functions: \begin{equation*} \frac d {d\lambda} u[\lambda c,\lambda \sigma] = u[c,\sigma]. \end{equation*} In the test case above, we have chosen the perturbations such that $\xi_0+\xi_1+\xi_2 = c$ and $\zeta_0+\zeta_1+\zeta_2 = \sigma$. Thus denoting by $\mu_0,\mu_1,\mu_2$ the corresponding value variations, Euler's identity implies that $\mu_0+\mu_1+\mu_2 = u$, as can be observed numerically.
values = hfmOut['values'].value
values[hfmIn['walls']]=0.; # Eliminate values inside walls, which equal Infinity
# Check Euler's identity
assert np.max(np.abs(hfmOut['values'].gradient().sum(axis=0)-values)) < 1e-14
For convenience, we provide a (limited) interface between HFM library and the AutomaticDifferentiation (ad) module of the AdaptiveGridDiscretizations (agd) package. In that setting, some of the keys of the hfmInput dictionary can be provided as first order dense AD variables. Under the hood, a pre-processing and a post-processing step reformat the AD data as in the previous subsection.
# The following keys cannot be enhanced with AD information
hfmIn_ad = Eikonal.dictIn({key:hfmIn[key] for key in
['model','arrayOrdering','gridScale','dims','origin','seeds','walls','exportValues']})
Let us construct a first order symbolic perturbation with three independent components.
delta = ad.Dense.identity(shape=(3,))
We can define a cost and seed values which incorporate first order perturbations. Here we simply reproduce the previous ones.
hfmIn_ad['cost'] = cost*(1+ delta[0]*(X>0.) + delta[1]*(X<=0.))
hfmIn_ad['seedValues'] = seedValues * (1+ delta[2])
A smart run will pre-process and post-process the HFM data into the correct format.
hfmOut_ad = hfmIn_ad.Run()
Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.00302 s.
The field of values output by the smart run incorporates first order AD information.
values_ad = hfmOut_ad['values']
grad = values_ad.gradient()
fig = plt.figure(figsize=[15,2.5])
plt.subplot(1,3,1)
plt.contourf(X,Y,grad[0])
plt.subplot(1,3,2)
plt.contourf(X,Y,grad[1])
plt.subplot(1,3,3)
plt.contourf(X,Y,grad[2]);
Consider a cost function $c : \Omega \to ]0,\infty[$ and some boundary values $\sigma : \Omega \to ]-\infty,\infty]$. Reverse differentiation, for a given point $x\in \Omega$ provides two fields $\rho = \rho[x,c,\sigma] : \Omega \to \mathbb R$ and $\pi = \pi[x,c,\sigma] : \partial \Omega \to \mathbb R$ such that \begin{equation*} u[c+\varepsilon \xi,\sigma+\varepsilon \zeta](x) = u[c,\sigma](x)+ \varepsilon \Bigg(\int_\Omega \rho \xi + \int_{\partial \Omega} \pi \zeta\Bigg) + o(\varepsilon). \end{equation*} This equality holds, assuming differentiability, for any perturbation $\xi$ of the cost function $c$, and any perturbation $\zeta$ of the boundary condition $\sigma$. The fields $\rho$ and $\pi$ express how much the front arrival time value $u[c,\sigma]$ is sensitive to variations in these parameters.
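In an (assumed) one dimensional toy model with a single seed at $x=0$ and no obstacle, one has $u[c,\sigma](x) = \sigma + \int_0^x c$, so that $\rho$ is the indicator of the segment $[0,x]$ joining $x$ to the seed, and $\pi = 1$ at the seed. The following sketch, independent of the HFM library, checks the identity above in that setting.

```python
import numpy as np

h = 0.01
xs = np.arange(0., 1., h)
c = 1. + 0.5*np.sin(6*xs)   # a positive cost function
k = 60                      # evaluation point x = xs[k]

# 1D toy model with seed at 0: u[c,sigma](x) = sigma + int_0^x c.
# Hence rho = indicator of [0, x] (the path to the seed), and pi = 1.
rho = (np.arange(len(xs)) <= k).astype(float)
pi = 1.

# Finite-difference check for arbitrary perturbations (xi, zeta)
rng = np.random.default_rng(0)
xi, zeta, eps = rng.standard_normal(len(xs)), 0.7, 1e-4
u = lambda c, s: s + h*np.cumsum(c)
lhs = (u(c + eps*xi, eps*zeta)[k] - u(c, 0.)[k])/eps
rhs = h*np.sum(rho*xi) + pi*zeta
```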
Similarly to the forward case, this functionality can be accessed in two ways:
Again, the first usage is more explicit and possibly pedagogical, but the second one is expected to be much more convenient.
hfmIn['inspectSensitivity']=[ [-0.8,0.8], [0.575,0.1] ] # Ask for rho and pi related to these two points
hfmOut = hfmIn.Run()
Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.003028 s. Field geodesicSolver defaults to Discrete Field geodesicStep defaults to 0.25 Field geodesicWeightThreshold defaults to 0.001 Field geodesicVolumeBound defaults to 8.45
As illustrated in the next cell, the sensitivity $\rho = \rho[x,c,\sigma]$ of the value function $u(x)$ at a given point $x$, w.r.t. variations in the cost $c$, is (mostly) supported in a neighborhood of the minimal geodesic from $x$ to the nearest seed point. This property is actually at the foundation of one of our backtracking methods.
plt.figure(figsize=[6,3]); plt.title(r'Value sensitivity $\rho$.'); plt.axis('equal');
plt.contourf(X,Y,hfmOut['costSensitivity_0']); # rho
plt.colorbar();
The second point $x_2 = (0.575,0.1)$, for which we request the sensitivity, is located precisely on the cut locus. In other words, it is at equal distance from the two seeds (taking into account the boundary conditions). The corresponding sensitivity $\rho$ is thus supported on the neighborhood of two geodesics.
fig = plt.figure(figsize=[6,3]); plt.title(r'Value sensitivity $\rho$. Two minimal geodesics.'); plt.axis('equal');
plt.contourf(X,Y,hfmOut['costSensitivity_1']);
plt.axis('equal');plt.colorbar();
savefig(fig,'ValueSensitivity_TwoPaths.png')
The sensitivity of the value $u[c,\sigma](x)$ w.r.t. the boundary condition $\sigma$ is also returned, above denoted $\pi : \partial \Omega \to \mathbb R$. The format is \begin{equation*} [ [s_0^0,s_0^1,\pi_0], [s_1^0,s_1^1,\pi_1], ...]. \end{equation*} Here $s_0=(s_0^0,s_0^1)$ and $s_1=(s_1^0,s_1^1)$ are the seeds for which the corresponding sensitivities $\pi_0$ and $\pi_1$ are positive. If the sensitivity of $u(x)$ is requested at a generic point $x$, then the list has length one, and is of the form $[ [s^0,s^1,1] ]$, where $s=(s^0,s^1)$ is the seed linked to $x$ by the minimal path.
hfmOut['seedSensitivity_0']
array([[-0.495, 0.305, 1. ]])
However, if the sensitivity is requested for a point on the cut locus, for which there exist several minimal geodesics leading to distinct seeds, then $\pi$ is supported on several seeds.
hfmOut['seedSensitivity_1']
array([[ 0.505 , 0.805 , 0.37180042], [-0.495 , 0.305 , 0.62819958]])
A slight annoyance of the above output is that the original seeds are not returned. Instead, the returned positions correspond to the nearest point on the discretization grid. Some minor postprocessing, for instance by converting these points to multi-indices, is thus required to establish the matching.
print("Seed indices : ",hfmIn.IndexFromPoint(hfmIn['seeds'])[0])
print("Sensitivity returned as positions : ",hfmIn.IndexFromPoint(hfmOut['seedSensitivity_1'][:,0:2])[0])
# Note that the order of the seeds may not be preserved. Also, roundoff direction may change for points exactly in the middle of a cell.
Seed indices : [[ 50 29] [150 80]] Sensitivity returned as positions : [[150 80] [ 50 30]]
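The matching can be automated with a nearest-neighbour search on the indices. The arrays below are hypothetical stand-ins, mimicking the outputs printed above.

```python
import numpy as np

# Hypothetical data: original seed indices, and indices recovered from the
# positions returned in 'seedSensitivity' (order and roundoff may differ)
seed_indices = np.array([[50, 29], [150, 80]])
returned = np.array([[150, 80], [50, 30]])
weights = np.array([0.372, 0.628])   # associated sensitivities pi

# Match each returned row to the nearest original seed
dist = np.linalg.norm(returned[:, None, :] - seed_indices[None, :, :], axis=-1)
matching = dist.argmin(axis=1)   # matching[i] = original seed of returned row i

# Sanity check: the weights pi form a convex combination, since shifting
# all seed values by a constant shifts u(x) by the same constant
assert np.isclose(weights.sum(), 1.)
```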
We conclude this section with a consistency check. More precisely, we ensure that our two automatic differentiation methods yield consistent results. Both can be used to compute the scalar $\mu(x)$ appearing in the following Taylor expansion \begin{equation*} u[c+\varepsilon \xi, \sigma+\varepsilon \zeta] (x) = u[c,\sigma](x)+\varepsilon \mu(x) + o(\varepsilon), \end{equation*} where the point $x$ and the perturbations $\xi$ and $\zeta$ are given. The two differentiation methods use distinct inputs, as follows.
The inputs $x,\zeta,\xi$ and outputs $\mu, \rho, \pi$ are mathematically tied by the following identity, which we verify numerically: \begin{equation*} \mu(x) = \int_\Omega \rho \xi + \int_{\partial \Omega} \pi \zeta. \end{equation*}
valueVariation = hfmOut['values'].gradient()
index,_ = hfmIn.IndexFromPoint(hfmIn['inspectSensitivity'][1])
In the first instance of forward differentiation, boundary conditions were not perturbed ($\zeta=0$). Hence we expect, and numerically check, that $\mu(x) = \int_\Omega \rho \xi$.
mu_x = valueVariation[0,index[0],index[1]] # Evaluates mu(x)
int_dom = (hfmIn['costVariation'][:,:,0]*hfmOut['costSensitivity_1']).sum() # Evaluates int_Omega rho*xi
assert abs(mu_x-int_dom) < 1e-14
In the last instance of forward differentiation, the cost function was not perturbed ($\xi=0$). Hence we expect, and numerically check, that $\mu(x) = \int_{\partial\Omega} \pi \zeta$.
mu_x = valueVariation[2,index[0],index[1]] # Evaluates mu(x)
int_bd = np.dot(hfmOut['seedSensitivity_1'][(1,0),2] , hfmIn['seedValueVariation'][2,:]) # int_Boundary pi*zeta
assert abs(mu_x-int_bd) < 1e-14
An interface is provided with the reverse first order automatic differentiation module of the agd library.
# These variables cannot be enhanced with automatic differentiation
hfmIn_rev = Eikonal.dictIn({key:hfmIn[key] for key in
['model','arrayOrdering','gridScale','dims','origin','seeds','walls']})
We need to register the variables w.r.t. which sensitivity will be requested.
rev,(cost_rev,seedValues_rev) = ad.Reverse.empty(inputs=(cost,seedValues),input_iterables=(tuple,Eikonal.dictIn,))
Note on the input_iterables field. By default, the ad.Reverse automatic differentiation module only looks for AD information inside tuples. This behavior is here modified, through the input_iterables field, since calls to the HFM library involve a dictionary of inputs. See the notebook Reverse for details.
hfmIn_rev['cost'] = cost_rev
hfmIn_rev['seedValues'] = seedValues_rev
In order to be taken into account by reverse automatic differentiation, the solution values must be returned separately from the rest of the HFM outputs. This is achieved with the extractValues key.
hfmIn_rev['extractValues']=True
hfmOut,values_rev = rev.apply(Eikonal.dictIn.Run,hfmIn_rev)
Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.003065 s.
Let us define the objective function, which has to be a scalar function, in terms of the values.
grid = np.array((X,Y))
values_rev_interp = UniformGridInterpolation(grid,values_rev)
points = np.array([[-0.8,0.8],[0.575,0.1]]).T
val = values_rev_interp(points)
objective = 2*val[0]**2+val[1]**2
grad = rev.gradient(objective)
Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.003046 s.
cost_sensitivity,seed_sensitivity = rev.to_inputshapes(grad)
plt.title('Gradient of the objective function')
plt.contourf(*grid,cost_sensitivity);
seed_sensitivity
array([3.52533149, 0.40359865])
Reverse automatic differentiation is a two pass procedure: a first pass evaluating the function, and a second pass evaluating its jacobian.
If suitable data is cached in the first pass, then the jacobian evaluation in the second pass can avoid the full recomputation of the function value. Compare the results above and below.
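The idea can be sketched on a toy example (plain Python, unrelated to the agd internals): the first pass caches the data needed by the reverse pass, so that repeated gradient evaluations avoid re-running the forward computation.

```python
import numpy as np

class CachedHalfSquaredNorm:
    """Toy two-pass reverse differentiation of f(x) = ||x||^2 / 2."""
    def __init__(self):
        self.cache = None
        self.forward_calls = 0

    def forward(self, x):
        # First pass: evaluate f, caching the data needed by the reverse pass
        self.forward_calls += 1
        self.cache = np.asarray(x, dtype=float)
        return 0.5*np.dot(self.cache, self.cache)

    def gradient(self):
        # Second pass: grad f(x) = x, read from the cache (no recomputation)
        return self.cache

f = CachedHalfSquaredNorm()
value = f.forward([1., 2., 2.])  # evaluates and fills the cache
g1 = f.gradient()                # cheap
g2 = f.gradient()                # still cheap: the forward pass is not re-run
```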
hfmIn_rev = Eikonal.dictIn({key:hfmIn[key] for key in
['model','arrayOrdering','gridScale','dims','origin','seeds','walls','seedValues']})
hfmIn_rev['extractValues']=True
For a change, in contrast with the previous paragraph, we do not register the seed values for automatic differentiation.
rev,cost_rev = ad.Reverse.empty(inputs=cost,input_iterables=(Eikonal.dictIn,))
hfmIn_rev['cost'] = cost_rev
cache = Eikonal.Cache()
hfmOut,values_rev = rev.apply(Eikonal.dictIn.Run,hfmIn_rev,cache=cache)
Requesting cacheable data Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.003026 s. Filling cache data
Some output data is cached so as to bypass most of the fast marching computations in subsequent runs on this data.
cache.contents.keys()
dict_keys(['values', 'activeNeighs'])
We next define the objective function, and compute its sensitivity w.r.t. the cost function.
values_rev_interp = UniformGridInterpolation(grid,values_rev)
val = values_rev_interp(points)
objective = 2*val[0]**2+val[1]**2
cost_sensitivity_cache, = rev.to_inputshapes(rev.gradient(objective))
Providing cached data Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Bypassing fast marching solver based on cached data.
As mentioned in the last line, and in contrast with the previous subsection, the fast marching solver was bypassed entirely in the reverse pass thanks to the cached data. Yet the result is identical.
assert norm_infinity(cost_sensitivity-cost_sensitivity_cache,axis=None)==0.
We discuss the computation of the gradient of the value function, using a centered or upwind scheme. Then we compute its first order perturbation using forward or reverse automatic differentiation.
Let $u : \Omega \to \mathbb R$ be the value function, also referred to as the distance map, which is approximated by the fast marching algorithm. One would like to estimate the gradient $\nabla u$, which satisfies the eikonal equation $$ \| \nabla u(x) \| = c(x) $$ at any $x \in \Omega$, where $c(x)$ is the cost function. A closely related quantity is the geodesic flow, which is defined by $$ V(x) := \frac 1 {c(x)^2} \nabla u(x), $$ and for which the HFM library provides an upwind approximation.
Note on anisotropic metrics. When the metric is Riemannian or Finslerian, the gradient and the geodesic flow are no longer proportional, but are related through norm duality; see the notebook Sensitivity in Semi-Lagrangian schemes.
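In the isotropic case, the two formulas above combine into $\|V(x)\| = 1/c(x)$ wherever the eikonal equation holds. A small consistency sketch, using the assumed exact solution $u(x) = c_0 \|x\|$ for a constant cost $c_0$ (pure numpy, not library code):

```python
import numpy as np

# Exact gradient field of u(x) = c0*|x|, which solves |grad u| = c0
c0 = 2.
x, y = np.meshgrid(np.linspace(-1., 1., 10), np.linspace(-1., 1., 10))
r = np.sqrt(x**2 + y**2)            # nonzero: this grid avoids the seed x=0
grad_u = c0*np.stack([x/r, y/r])

# Geodesic flow V = grad(u)/c^2, of norm 1/c (unit distance per unit cost)
flow = grad_u/c0**2
```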
hfmIn = Eikonal.dictIn({
'model':'Isotropic2',
'arrayOrdering':'RowMajor',
'seeds':[[-0.5,0.3],[0.5,0.8]],
'seedValues':[0.,0.5],
'tips':[[0.,0.6],[-0.9,0.5],[0.8,0.8]],
'walls':np.logical_or(disk,barrier),
'exportValues':True,
'exportGeodesicFlow':True,
})
hfmIn.SetRect(sides=[[-1,1],[0,1]],gridScale=1./100.)
X,Y = hfmIn.Grid()
hfmIn['cost']=np.exp(-0.5*(X**2+Y**2))
h = hfmIn['gridScale']
hfmOut = hfmIn.Run()
Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.003013 s. Field geodesicSolver defaults to Discrete Field geodesicStep defaults to 0.25 Field geodesicWeightThreshold defaults to 0.001 Field geodesicVolumeBound defaults to 8.45
The gradient is easily recovered from the geodesic flow.
grad_hfm = hfmOut['flow']*hfmIn['cost']**2
plt.title("Gradient as computed by the HFM library")
s=5; plt.quiver(*grid[:,::s,::s],*grad_hfm[:,::s,::s]);
We may also directly recompute the gradient using finite differences. Centered finite differences are usually more precise, whereas upwind finite differences are usually more stable.
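The contrast can be seen on a one dimensional sketch at a kink of $u(x) = |x|$, mimicking the behavior at a seed: the centered difference averages the two slopes to $0$, while the upwind selection rule (the same as in the gradient_upwind function defined below) retains a slope of magnitude $1$.

```python
import numpy as np

# 1D illustration at the kink of u(x) = |x|
h = 0.1
x = np.arange(-1., 1. + h/2, h)
u = np.abs(x)

centered = (np.roll(u, -1) - np.roll(u, 1))/(2*h)    # centered difference
dup = (np.roll(u, -1) - u)/h                         # forward difference
dum = (np.roll(u, 1) - u)/h                          # backward difference
upwind = np.where(dup < dum, dup, -dum)              # upwind selection rule
# (np.roll wraps around at the endpoints; only interior values are meaningful)

k = np.argmin(u)   # index of the kink at x = 0
```

At the kink, centered[k] vanishes while |upwind[k]| = 1; away from it, both schemes agree.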
Note on Python warnings. These come from the computation of the gradient inside obstacles, where the value function is $+\infty$.
grad_centered = fd.DiffGradient(hfmOut['values'],gridScale=h)
/Users/jean-mariemirebeau/Dropbox/Programmes/GithubM1/AdaptiveGridDiscretizations/agd/FiniteDifferences.py:239: RuntimeWarning: invalid value encountered in add return sum(TakeAtOffset(u,mult*ad.asarray(offset),**kwargs)*weight
def gradient_upwind(u,h):
"""(Quasi-)Upwind gradient of an array u at gridscale h"""
offsets = np.eye(u.ndim).astype(int)
dup = fd.DiffUpwind(u, offsets, h, padding=np.inf)
dum = fd.DiffUpwind(u,-offsets, h, padding=np.inf)
return np.where(dup<dum,dup,-dum)
grad_upwind = gradient_upwind(hfmOut['values'],h)
The upwind and HFM gradients are actually identical up to machine precision, except inside obstacles and along lines where the scheme "degenerates", in the sense that there is only one active neighbor.
grad_same = np.linalg.norm(grad_upwind-grad_hfm,axis=0)<1e-12
ratio = np.logical_or(grad_same,hfmIn['walls']).sum()/grad_same.size
assert ratio>0.95
print(f"Pixel ratio where hfm and 'upwind' gradient are equal : {ratio}")
Pixel ratio where hfm and 'upwind' gradient are equal : 0.97345
plt.title("Distinct hfm and upwind gradient")
plt.contourf(*grid,grad_same);
In this example, not much distinguishes the centered and the upwind gradient.
plt.figure(figsize=(10,3))
plt.subplot(1,2,1)
plt.title("Upwind gradient")
s=5; plt.quiver(*grid[:,::s,::s],*grad_upwind[:,::s,::s]);
plt.subplot(1,2,2)
plt.title("Centered gradient")
s=5; plt.quiver(*grid[:,::s,::s],*grad_centered[:,::s,::s]);
Finite differences are compatible with automatic differentiation. For illustration, we compute the first order perturbation of the value function gradient w.r.t. the perturbations of the cost function and seed values previously considered.
hfmOut = hfmIn_ad.Run()
Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.003046 s.
grad_upwind_ad = gradient_upwind(hfmOut['values'],h)
grad_centered_ad = fd.DiffGradient(hfmOut['values'],gridScale=h)
/Users/jean-mariemirebeau/Dropbox/Programmes/GithubM1/AdaptiveGridDiscretizations/agd/AutomaticDifferentiation/Dense.py:53: RuntimeWarning: invalid value encountered in add return self.new(self.value+other.value, _add_coef(self.coef,other.coef))
Automatic differentiation has no impact on the zeroth order term, where it is defined.
assert np.all(np.logical_or(grad_upwind_ad.value==grad_upwind,np.isnan(grad_upwind)))
assert np.all(np.logical_or(grad_centered_ad.value==grad_centered,np.isnan(grad_centered)))
The improved stability of the upwind gradient, compared with the centered gradient, is slightly more visible when differentiation is involved. One sees that the centered gradient features more long spurious arrows, originating from the vicinity of obstacles or of the boundary.
perturbation_index = 0
plt.figure(figsize=[10,3])
plt.subplot(1,2,1)
plt.title("Upwind gradient perturbation")
plt.quiver(*grid[:,::s,::s],*grad_upwind_ad.gradient(perturbation_index)[:,::s,::s]);
plt.subplot(1,2,2)
plt.title("Centered gradient perturbation (omitting boundary layer)")
plt.quiver(*grid[:,::s,::s],*grad_centered_ad.gradient(perturbation_index)[:,1:-1:s,1:-1:s]);
We illustrate reverse differentiation of the value function gradient.
rev,cost_rev = ad.Reverse.empty(inputs=cost,input_iterables=(Eikonal.dictIn,))
hfmIn_rev['cost'] = cost_rev
hfmOut,values_rev = rev.apply(Eikonal.dictIn.Run,hfmIn_rev,cache=Eikonal.Cache())
Requesting cacheable data Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.003021 s. Filling cache data
grad_upwind_rev = gradient_upwind(values_rev,h)
grad_centered_rev = fd.DiffGradient(values_rev,gridScale=h)
/Users/jean-mariemirebeau/Dropbox/Programmes/GithubM1/AdaptiveGridDiscretizations/agd/AutomaticDifferentiation/Sparse.py:73: RuntimeWarning: invalid value encountered in add value = self.value+other.value
def objective(grad):
interp = UniformGridInterpolation(grid,grad)
p0,p1 = points.T
# Return horizontal component at p0, vertical component at p1
return interp(p0)[0] +interp(p1)[1]
grad_upwind_cost, = rev.to_inputshapes(rev.gradient(objective(grad_upwind_rev)))
print("---- Finished upwind, turning to centered ----")
grad_centered_cost, = rev.to_inputshapes(rev.gradient(objective(grad_centered_rev)))
Providing cached data Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Bypassing fast marching solver based on cached data. ---- Finished upwind, turning to centered ---- Providing cached data Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Bypassing fast marching solver based on cached data.
levels = [0.0001,0.001,0.01]
plt.figure(figsize=(10,3))
plt.subplot(1,2,1)
plt.title("Sensitivity of upwind gradient")
plt.contourf(*grid,grad_upwind_cost,levels=levels)
plt.subplot(1,2,2)
plt.title("Sensitivity of centered gradient")
plt.contourf(*grid,grad_centered_cost,levels=levels);
Validation by comparison with forward gradient.
grad_diff = (grad_upwind_cost * hfmIn_ad['cost']).sum().gradient() - objective(grad_upwind_ad).gradient()
assert norm_infinity(grad_diff[:2])<1e-13
# Third component ignored, because it relates to seedValueVariation, which is not considered in reverse mode here.
We illustrate automatic differentiation by solving an optimization problem posed on cost functions $c$. The objective is to maximize the distance from a point $x_*$ to the boundary, minus an integral penalty on the cost. Formally, the problem reads as follows \begin{align*} \max_{c : \Omega \to [\alpha,\beta]} \, u[c](x_*) - \gamma \int_\Omega c. \end{align*} The bounds $\alpha,\beta > 0$ imposed on the cost function, and the penalization factor $\gamma >0$ on the cost function integral, are given parameters, as well as the target point $x_*$.
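The structure of the gradient can be previewed on an (assumed) one dimensional toy analogue: with $u[c](x_k) = h \sum_{i\leq k} c_i$ and objective $J(c) = u[c](x_k) - \gamma h \sum_i c_i$, the gradient is the sensitivity $\rho$, supported on the path to the seed, minus the constant penalty term. A sketch with a finite-difference check:

```python
import numpy as np

# 1D toy analogue of the objective (not the notebook's 2D setup)
h, gamma, k = 0.01, 0.5, 60
c = np.full(100, 0.55)   # current cost iterate

def J(c):
    """u[c](x_k) - gamma * integral of c, with u[c](x_k) = h*sum_{i<=k} c_i."""
    return h*np.sum(c[:k+1]) - gamma*h*np.sum(c)

# Closed-form gradient: rho (the indicator of the path [0, x_k], times the
# cell measure h) minus the penalty term gamma*h
grad_J = h*((np.arange(len(c)) <= k).astype(float) - gamma)

# Finite-difference check of one component (exact here, since J is linear)
eps, i = 1e-6, 30
e = np.zeros(len(c)); e[i] = 1.
fd = (J(c + eps*e) - J(c))/eps
```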
Experiments of similar nature are presented in:
optIn = Eikonal.dictIn()
optIn['model']='IsotropicDiff2' # Alternatively, 'Isotropic2'
#Fast marching parameters, copied from earlier problem, with different seed
for key in ['arrayOrdering','dims','gridScale','origin','walls']:
optIn[key]=hfmIn[key]
# Problem parameters
alpha = 0.1; beta=1.; gamma=1.5;
optIn['seed']=[-0.7,0.7] # Seed position
# For better stability, we slightly spread the target x_* on the four neighbor points
targetIndices = [(180,40),(179,40),(181,40),(180,39),(180,41)];
targetWeights = [0.5,0.125,0.125,0.125,0.125]
# The unspread target, commented out below, yields less stable computations.
#targetIndices = [(180,40)]; targetWeights = [1];
# Silence the output, since HFM will be called many times
optIn['verbosity']=0
# Utilities
targetPoints = optIn.PointFromIndex(targetIndices)
allOnes = np.ones(optIn.size)
We prepare two functions returning the objective function and its jacobian.
This particular test application works better with the differentiable scheme 'IsotropicDiff2', although the usual discretization 'Isotropic2' is still usable.
def func(cost,sign=1):
optIn['cost']=cost.reshape(optIn.shape)
optIn['exportValues']=1
optIn.pop('inspectSensitivity',None)
optOut = optIn.Run()
value = sum([weight*optOut['values'][index] for index,weight in zip(targetIndices,targetWeights)])
return sign*(value - gamma*cost.sum()*optIn['gridScale']**2)
def func_deriv(cost,sign=1):
optIn['cost']=cost.reshape(optIn.shape)
optIn['exportValues']=0
# Request sensitivity for a weighted sum of values
optIn['inspectSensitivity'] = targetPoints
optIn['inspectSensitivityWeights'] = targetWeights
optIn['inspectSensitivityLengths'] = [len(targetWeights)]
optOut = optIn.Run()
return sign*(optOut['costSensitivity_0'].reshape(-1) - gamma*allOnes*optIn['gridScale']**2)
%%time
# Warning: takes up to a minute.
res = scipy.optimize.\
minimize(func,
allOnes*(alpha+beta)/2., # Initial guess
bounds=np.array((alpha*allOnes,beta*allOnes)).transpose(), # alpha <= c <= beta
jac=func_deriv,
args=(-1.), # Minimize instead of maximize
method='L-BFGS-B',options={'gtol':1e-4,'maxiter':300})
CPU times: user 994 ms, sys: 2.39 s, total: 3.39 s Wall time: 8.78 s
The optimal strategy uses a large cost around the seed, around the target, around the boundary of the obstacles, ...
fig = plt.figure(figsize=[6,3]); plt.title('Optimal cost function.'); plt.axis('equal');
plt.contourf(X,Y,res.x.reshape(hfmIn.shape));
savefig(fig,'OptimalCost.png')
At the optimal strategy, there is a continuum of optimal curves, spread all over the domain.
fig = plt.figure(figsize=[6,3]); plt.title('Sensitivity at the optimal cost.'); plt.axis('equal');
plt.contourf(X,Y,func_deriv(res.x).reshape(optIn.shape));
savefig(fig,'OptimalSensitivity.png')
optIn['tips']=optIn['inspectSensitivity']
optIn['exportValues']=1
optOut = optIn.Run()
fig = plt.figure(figsize=[6,3]); plt.title('Distance and minimal geodesics, at the optimal cost'); plt.axis('equal');
for geo in optOut['geodesics']: plt.plot(*geo)
plt.contourf(X,Y,optOut['values'],cmap='Greys');
savefig(fig,'OptimalPathCost.png')
The HFM library lets you compute the sensitivity at multiple points, possibly weighted. For instance, assume $u : \Omega \to \mathbb R$ is a distance map computed using the fast marching algorithm. Let also $x_0,x_1,x_2 \in \Omega$ and $\alpha_0,\alpha_1,\alpha_2 \in \mathbb R$. We show how to compute the sensitivity of the vector $$ (\alpha_0 u(x_0)+\alpha_1 u(x_1),\ \alpha_2 u(x_2)) $$ w.r.t. variations of the cost function.
# Define the problem
hfmIn = Eikonal.dictIn({
'model':'Isotropic2',
'cost':1.,
'seed':[0.,0.],
'exportValues':1.,
})
hfmIn.SetRect(sides=[[-1.,1.],[-1.,1.]],dimx=100)
X = hfmIn.Grid()
# Where to get the sensitivity
x0,x1,x2 = [0.1,-0.4],[0.5,0.8],[-0.5,-0.7]
alpha0,alpha1,alpha2 = 2.,3.,4.5
# Format this data for the input
hfmIn['inspectSensitivity'] = [x0,x1,x2]
hfmIn['inspectSensitivityWeights'] = [alpha0,alpha1,alpha2]
# How to group the points (here the first two are summed together, and the last one is alone)
hfmIn['inspectSensitivityLengths'] = [2.,1.]
hfmOut = hfmIn.Run()
Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.001537 s.
plt.title(r"Cost sensitivity for the weighted sum $\alpha_0 u(x_0)+\alpha_1 u(x_1)$")
plt.contourf(*X,hfmOut['costSensitivity_0']);
plt.title(r"Cost sensitivity for $\alpha_2 u(x_2)$")
plt.contourf(*X,hfmOut['costSensitivity_1']);
The reverse AD library currently only supports scalar valued objectives. In the case of a non-scalar output, multiple calls to the gradient are required. The cost remains reasonable provided the adequate data is cached in the forward pass.
hfmIn_ad = Eikonal.dictIn({key:hfmIn[key] for key in ['model','seeds','arrayOrdering','dims','origin','gridScale']})
hfmIn_ad['extractValues']=True
X = hfmIn_ad.Grid()
cost = np.ones( hfmIn_ad.shape )
rev, cost_ad = ad.Reverse.empty(inputs = cost, input_iterables = (Eikonal.dictIn,))
hfmIn_ad['cost'] = cost_ad
We provide an unnamed cache variable, which will be saved by the Reverse AD structure, like the other input arguments.
hfmOut_ad,values_ad = rev.apply(Eikonal.dictIn.Run, hfmIn_ad, cache=Eikonal.Cache() )
Requesting cacheable data Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Fast marching solver completed in 0.001551 s. Filling cache data
values_ad_interp = UniformGridInterpolation(X,values_ad)
objective = ad.array( (alpha0*values_ad_interp(x0) + alpha1*values_ad_interp(x1), alpha2*values_ad_interp(x2)) )
We have defined a multi-dimensional objective function, but so far its gradient can only be requested component by component. The cost remains bearable since the fast marching solver is bypassed thanks to the cached data.
grad_0, = rev.to_inputshapes(rev.gradient(objective[0]))
print("------")
grad_1, = rev.to_inputshapes(rev.gradient(objective[1]))
Providing cached data Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Bypassing fast marching solver based on cached data. ------ Providing cached data Field verbosity defaults to 1 Field order defaults to 1 Field seedRadius defaults to 0 Bypassing fast marching solver based on cached data.
plt.figure(figsize=[12,3])
plt.subplot(1,2,1)
plt.contourf(*X,grad_0)
plt.subplot(1,2,2)
plt.contourf(*X,grad_1);