The techniques of automatic differentiation play an essential role in the notebooks presented in this repository.
Summary of the volume Algorithmic tools, of this series of notebooks.
Main summary of the Adaptive Grid Discretizations book of notebooks, including the other volumes.
Acknowledgement. Some of the experiments presented in these notebooks are part of ongoing research with Ludovic Métivier and Da Chen.
Copyright Jean-Marie Mirebeau, Centre Borelli, ENS Paris-Saclay, CNRS, University Paris-Saclay
import sys; sys.path.insert(0,"..") # Allow importing agd from parent directory
#from Miscellaneous import TocTools; TocTools.displayTOC('ADBugs','Algo')
import numpy as np
import scipy.sparse.linalg
from matplotlib import pyplot as plt
import agd.AutomaticDifferentiation as ad
def reload_packages():
    from Miscellaneous.rreload import rreload
    global ad
    ad, = rreload([ad],rootdir='..',verbose=True)
Please use the ad.apply_linear_mapping and ad.apply_linear_inverse functions in combination with np.dot, or with the scipy solve functions.
v = ad.Dense.denseAD( np.random.standard_normal((4,)),np.random.standard_normal((4,4)))
m0 = np.random.standard_normal((4,4))
m1 = scipy.sparse.coo_matrix( ([1.,2.,3.,4.,5.],([0,2,1,2,3],[0,1,2,2,3]))).tocsr()
# Fails
#print("np.dot loses AD:",np.dot(m0,v))
#print("scipy '*' loses AD:",m1*v)
print("np.dot with AD:\n",ad.apply_linear_mapping(m0,v))
print("scipy '*' with AD:\n",ad.apply_linear_mapping(m1,v))
np.dot with AD:
 denseAD([ 0.50118769 -2.04170495  0.94570217  0.90779885],
[[-0.17839066  1.39457248 -0.26113732 -1.19811421]
 [ 2.15846549 -0.28449558 -0.65240613  2.2270235 ]
 [-0.66491683 -0.67162119  2.32948764 -0.61617014]
 [-0.30672444  0.37466936  2.82816426 -1.03297632]])
scipy '*' with AD:
 denseAD([ 0.65225313  0.08570385 -1.43024935  4.84768032],
[[-0.91835723 -1.07453281  0.46032601 -0.16568321]
 [-2.07819885 -1.98998352 -3.48063992  0.88215899]
 [-1.67178855 -2.53777686 -9.02485327  2.53313897]
 [-5.6389261   1.78281879 -3.19441496 -6.00454089]])
print("scipy solve with AD :\n",ad.apply_linear_inverse(scipy.sparse.linalg.spsolve,m1,v))
scipy solve with AD :
 denseAD([ 0.65225313  0.52912436 -0.25742019  0.19390721],
[[-0.91835723 -1.07453281  0.46032601 -0.16568321]
 [-0.71274756 -0.37017542  0.88122669 -0.30528249]
 [ 0.18319054  0.01925575 -0.73066667  0.2261545 ]
 [-0.22555704  0.07131275 -0.1277766  -0.24018164]])
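The rule applied by ad.apply_linear_inverse can be sketched with plain numpy/scipy: if u = A⁻¹ v, and v carries a Jacobian, then forward-mode differentiation gives u the Jacobian A⁻¹ (dv/dx), obtained by one extra linear solve per AD coefficient. The snippet below is an illustrative sketch of this mathematical identity, not the library's actual implementation; the variable names are ours.

```python
import numpy as np
import scipy.sparse
import scipy.sparse.linalg

# A sparse matrix A, and an AD variable represented as (value, jacobian).
A = scipy.sparse.csr_matrix(np.array([[2., 1.], [0., 3.]]))
v_value = np.array([1., 2.])
v_coef = np.eye(2)  # dv/dx : identity, i.e. v is the AD seed

# Forward-mode rule for u = A^{-1} v : solve once for the value,
# and once per column of the Jacobian.
u_value = scipy.sparse.linalg.spsolve(A, v_value)
u_coef = np.column_stack([scipy.sparse.linalg.spsolve(A, v_coef[:, j])
                          for j in range(v_coef.shape[1])])

# Sanity check against the dense inverse.
assert np.allclose(u_coef, np.linalg.inv(A.toarray()) @ v_coef)
```

This is exactly why the solver must be passed to ad.apply_linear_inverse rather than applied directly: the AD coefficients require additional solves with the same matrix.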
The AD information often consists of very large arrays. In order to save time and memory, this information is not systematically copied, nor stored in full: it may take the form of a broadcasted array, or of an alias to another array. In that case, a copy is necessary before any modification.
When an operation leaves the AD information untouched, an alias is used. This can lead to bugs if in-place modifications are applied afterward.
x=ad.Dense.identity(constant=np.array([1.,2.]))
y=x+1 # Only affects the value, not the AD information
print("Values are distinct :", x.value is y.value)
print("AD information is shared :", y.coef is x.coef)
Values are distinct : False
AD information is shared : True
A modification of the aliased variable will impact the original one.
print(x[0])
y[0]*=2
print("Caution ! Shared AD information is affected :", x[0])
denseAD(1.0,[1. 0.])
Caution ! Shared AD information is affected : denseAD(1.0,[2. 0.])
Avoid this effect by making a copy.
x=ad.Dense.identity(constant=np.array([1.,2.]))
y=(x+1).copy()
print("AD information is distinct :", y.coef is x.coef)
AD information is distinct : False
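The same hazard exists with plain numpy views, which may make this behaviour easier to remember. The following is a generic numpy sketch, independent of the agd internals:

```python
import numpy as np

a = np.arange(4.)
b = a.reshape(2, 2)   # a view: shares the underlying buffer with a
b[0, 0] = 99.         # an in-place write through the view...
print(a[0])           # ...also changes a

c = a.copy()          # an explicit copy breaks the link
c[0] = -1.
print(a[0])           # a is unaffected by modifications of c
```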
Note that a similar effect arises with the - binary operator, but not with * or /. That is because the latter modify the AD information, which therefore must be copied anyway.
x=ad.Dense.identity(constant=np.array([1.,2.]))
print("AD information is shared :", (x-1).coef is x.coef)
print("AD information is distinct :", (x*2).coef is x.coef)
print("AD information is distinct :", (x/2).coef is x.coef)
AD information is shared : True
AD information is distinct : False
AD information is distinct : False
When creating a dense AD variable, the coefficients may be non-writeable (e.g. broadcasted) arrays.
x=ad.Dense.identity(constant=np.array([[1.,2.],[3.,4.]]),shape_bound=(2,))
x.coef.flags.writeable
False
# x+=1 # Fails because non-writeable
Make a copy to solve the issue.
y=x.copy()
y.coef.flags.writeable
True
y+=1
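The non-writeable flag originates in numpy broadcasting: np.broadcast_to returns a read-only view, and copying materializes a genuine writeable array. A generic numpy sketch of the same behaviour, independent of agd:

```python
import numpy as np

# A read-only broadcast view: the (2,2) identity repeated along a new axis,
# without allocating the full (3,2,2) array.
coef = np.broadcast_to(np.eye(2), (3, 2, 2))
print(coef.flags.writeable)   # False
# coef += 1.  # would raise ValueError: read-only

coef2 = coef.copy()           # materializes a writeable array
print(coef2.flags.writeable)  # True
coef2 += 1.                   # now fine
```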
The agd library allows CPU/GPU generic programming to some extent. Here are the guidelines to make this approach work.
Create an alias xp of the numpy module.
xp = np
Activate GPU acceleration. If uncommented, the following line will replace the module xp with cupy, and modify the other arguments (here plt) in a custom manner intended for easy CPU/GPU interoperability.
#xp,plt = map(ad.cupy_friendly,[xp,plt])
Create basic arrays using xp. Basic arrays are arrays of zeros, of ones, arange, linspace, etc.
x = xp.linspace(0,2*np.pi)
Use numpy's overloading mechanisms. These mechanisms will dispatch the function calls to cupy, or to the AutomaticDifferentiation module of the agd library, depending on the data type (array from numpy, cupy, or ad).
y = np.sin(x)
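The dispatch relies on numpy's __array_ufunc__ protocol: a class implementing this method intercepts ufunc calls such as np.sin on its instances. Below is a minimal, hypothetical sketch of the mechanism only; the class Tagged is ours and is unrelated to the agd denseAD implementation.

```python
import numpy as np

class Tagged:
    """Minimal wrapper intercepting numpy ufuncs via __array_ufunc__."""
    def __init__(self, value):
        self.value = np.asarray(value)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__":
            return NotImplemented
        # Unwrap Tagged inputs, apply the ufunc, and rewrap the result.
        vals = [x.value if isinstance(x, Tagged) else x for x in inputs]
        return Tagged(ufunc(*vals, **kwargs))

t = Tagged([0., np.pi / 2])
s = np.sin(t)              # dispatched to Tagged.__array_ufunc__
print(type(s).__name__)    # Tagged
print(s.value)             # [0. 1.]
```

This is the same hook that lets np.sin operate transparently on AD and cupy arrays.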
Use ad.asarray and ad.array. Stacking arrays using np.array will not work for AD or cupy arrays.
xy = ad.array([x,y])
Use functions that accept both numpy and cupy arrays, or modify them for that purpose, as we did for the member functions of the pyplot module using the ad.cupy_friendly function.
plt.plot(*xy);
If needed, convert cupy arrays to numpy arrays, using the get member function of the cupy.ndarray class, or ad.cupy_generic.cupy_get (which leaves non-cupy variables unchanged).
type(ad.cupy_generic.cupy_get(y))
numpy.ndarray