This operator is based on simplfications of the systems presented in:
Self-adjoint, energy-conserving second-order pseudoacoustic systems for VTI and TTI media for reverse time migration and full-waveform inversion (2016)
Kenneth Bube, John Washbourne, Raymond Ergas, and Tamas Nemeth
SEG Technical Program Expanded Abstracts
https://library.seg.org/doi/10.1190/segam2016-13878451.1
The goal of this tutorial set is to generate and prove correctness of modeling and inversion capability in Devito for variable density visco- acoustics using an energy conserving form of the wave equation. We describe how the linearization of the energy conserving self adjoint system with respect to modeling parameters allows using the same modeling system for all nonlinear and linearized forward and adjoint finite difference evolutions. There are three notebooks in this series:
forward
and adjoint
modeling operations.There are similar series of notebooks implementing and testing operators for VTI and TTI anisotropy (README.md).
Below we introduce the self adjoint form of the scalar isotropic variable density visco- acoustic wave equation with a simple form of dissipation only Q attenuation. This dissipation only (no dispersion) attenuation term $\left (\frac{\displaystyle \omega}{Q}\ \partial_t\ u \right)$ is an approximation of a Maxwell Body -- that is to say viscoelasticity approximated with a spring and dashpot in series. In practice this approach for attenuating outgoing waves is very similar to the Cerjan style damping in absorbing boundaries used elsewhere in Devito (References).
The derivation of the attenuation model is not in scope for this tutorial, but one important point is that the physics in the absorbing boundary region and the interior of the model are unified, allowing the same modeling equations to be used everywhere, with physical Q values in the interior tapering to non-physical small Q at the boundaries to attenuate outgoing waves.
Symbol | Description | Dimensionality |
---|---|---|
$\omega_c = 2 \pi f_c$ | center angular frequency | constant |
$m(x,y,z)$ | P wave velocity | function of space |
$b(x,y,z)$ | buoyancy $(1 / \rho)$ | function of space |
$Q(x,y,z)$ | Attenuation at frequency $\omega_c$ | function of space |
$u(t,x,y,z)$ | Pressure wavefield | function of time and space |
$q(t,x,y,z)$ | Source term | function of time, localized in space |
$\overleftarrow{\partial_t}$ | shifted first derivative wrt $t$ | shifted 1/2 sample backward in time |
$\partial_{tt}$ | centered second derivative wrt $t$ | centered in time |
$\overrightarrow{\partial_x},\ \overrightarrow{\partial_y},\ \overrightarrow{\partial_z}$ | + shifted first derivative wrt $x,y,z$ | shifted 1/2 sample forward in space |
$\overleftarrow{\partial_x},\ \overleftarrow{\partial_y},\ \overleftarrow{\partial_z}$ | - shifted first derivative wrt $x,y,z$ | shifted 1/2 sample backward in space |
$\Delta_t, \Delta_x, \Delta_y, \Delta_z$ | sampling rates for $t, x, y , z$ | $t, x, y , z$ |
We use the arrow symbols over derivatives $\overrightarrow{\partial_x}$ as a shorthand notation to indicate that the derivative is taken at a shifted location. For example:
$\overrightarrow{\partial_x}\ u(t,x,y,z)$ indicates that the $x$ derivative of $u(t,x,y,z)$ is taken at $u(t,x+\frac{\Delta x}{2},y,z)$.
$\overleftarrow{\partial_z}\ u(t,x,y,z)$ indicates that the $z$ derivative of $u(t,x,y,z)$ is taken at $u(t,x,y,z-\frac{\Delta z}{2})$.
$\overleftarrow{\partial_t}\ u(t,x,y,z)$ indicates that the $t$ derivative of $u(t,x,y,z)$ is taken at $u(t-\frac{\Delta_t}{2},x,y,z)$.
We usually drop the $(t,x,y,z)$ notation from wavefield variables unless required for clarity of exposition, so that $u(t,x,y,z)$ becomes $u$.
Our self adjoint wave equation is written:
$$ \frac{b}{m^2} \left( \frac{\omega_c}{Q} \overleftarrow{\partial_t}\ u + \partial_{tt}\ u \right) = \overleftarrow{\partial_x}\left(b\ \overrightarrow{\partial_x}\ u \right) + \overleftarrow{\partial_y}\left(b\ \overrightarrow{\partial_y}\ u \right) + \overleftarrow{\partial_z}\left(b\ \overrightarrow{\partial_z}\ u \right) + q $$An advantage of this form is that the same system can be used to provide stable modes of propagation for all operations needed in quasi- Newton optimization:
This advantage is more important for anisotropic operators, where widely utilized non energy conserving formulations can provide unstable adjoints and thus unstable gradients for anisotropy parameters.
The self adjoint formulation is evident in the shifted spatial derivatives, with the derivative on the right side $\overrightarrow{\partial}$ shifting forward in space one-half cell, and the derivative on the left side $\overleftarrow{\partial}$ shifting backward in space one-half cell.
$\overrightarrow{\partial}$ and $\overleftarrow{\partial}$ are anti-symmetric (also known as skew symmetric), meaning that for two random vectors $x_1$ and $x_2$, correctly implemented numerical derivatives will have the following property:
$$ x_2 \cdot \left( \overrightarrow{\partial_x}\ x_1 \right) \approx -\ x_1 \cdot \left( \overleftarrow{\partial_x}\ x_2 \right) $$Below we will demonstrate this skew symmetry with a simple unit test on Devito generated derivatives.
In the following notebooks in this series, material parameters sandwiched between the derivatives -- including anisotropy parameters -- become much more interesting, but here buoyancy $b$ is the only material parameter between derivatives in our self adjoint (SA) wave equation.
We have grouped all imports used in this notebook here for consistency.
import numpy as np
from examples.seismic import RickerSource, Receiver, TimeAxis
from devito import (Grid, Function, TimeFunction, SpaceDimension, Constant,
Eq, Operator, solve, configuration, norm)
from devito.finite_differences import Derivative
from devito.builtins import gaussian_smooth
from examples.seismic.self_adjoint import setup_w_over_q
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import cm
from timeit import default_timer as timer
# These lines force images to be displayed in the notebook, and scale up fonts
%matplotlib inline
mpl.rc('font', size=14)
# Make white background for plots, not transparent
plt.rcParams['figure.facecolor'] = 'white'
# Set logging to debug, captures statistics on the performance of operators
configuration['log-level'] = 'DEBUG'
As noted above, we prove with a small 1D unit test and 8th order spatial operator that the Devito shifted first derivatives are skew symmetric. This anti-symmetry can be demonstrated for the forward and backward half cell shift first derivative operators $\overrightarrow{\partial}$ and $\overleftarrow{\partial}$ with two random vectors $x_1$ and $x_2$ by verifying the dot product test as written above.
We will use Devito to implement the following two equations with an Operator
:
And then verify the dot products are equivalent. Recall that skew symmetry introduces a minus sign in this equality:
$$ f_1 \cdot g_2 \approx - g_1 \cdot f_2 $$We use the following test for relative error (note the flipped signs in numerator and denominator due to anti- symmetry):
$$ \frac{\displaystyle f_1 \cdot g_2 + g_1 \cdot f_2} {\displaystyle f_1 \cdot g_2 - g_1 \cdot f_2}\ <\ \epsilon $$# NBVAL_IGNORE_OUTPUT
# Make 1D grid to test derivatives
n = 101
d = 1.0
shape = (n, )
spacing = (1 / (n-1), )
origin = (0., )
extent = (d * (n-1), )
dtype = np.float64
# Initialize Devito grid and Functions for input(f1,g1) and output(f2,g2)
# Note that space_order=8 allows us to use an 8th order finite difference
# operator by properly setting up grid accesses with halo cells
grid1d = Grid(shape=shape, extent=extent, origin=origin, dtype=dtype)
x = grid1d.dimensions[0]
f1 = Function(name='f1', grid=grid1d, space_order=8)
f2 = Function(name='f2', grid=grid1d, space_order=8)
g1 = Function(name='g1', grid=grid1d, space_order=8)
g2 = Function(name='g2', grid=grid1d, space_order=8)
# Fill f1 and g1 with random values in [-1,+1]
f1.data[:] = -1 + 2 * np.random.rand(n,)
g1.data[:] = -1 + 2 * np.random.rand(n,)
# Equation defining: [f2 = forward 1/2 cell shift derivative applied to f1]
equation_f2 = Eq(f2, f1.dx(x0=x+0.5*x.spacing))
# Equation defining: [g2 = backward 1/2 cell shift derivative applied to g1]
equation_g2 = Eq(g2, g1.dx(x0=x-0.5*x.spacing))
# Define an Operator to implement these equations and execute
op = Operator([equation_f2, equation_g2])
op()
# Compute the dot products and the relative error
f1g2 = np.dot(f1.data, g2.data)
g1f2 = np.dot(g1.data, f2.data)
diff = (f1g2+g1f2)/(f1g2-g1f2)
tol = 100 * np.finfo(dtype).eps
print("f1g2, g1f2, diff, tol; %+.6e %+.6e %+.6e %+.6e" % (f1g2, g1f2, diff, tol))
# At last the unit test
# Assert these dot products are float epsilon close in relative error
assert diff < 100 * np.finfo(dtype).eps
Allocating memory for f1(117,) Allocating memory for g1(117,) Operator `Kernel` generated in 0.19 s * lowering.Expressions: 0.13 s (69.7 %) Flops reduction after symbolic optimization: [62 --> 25] Allocating memory for f2(117,) Allocating memory for g2(117,) Operator `Kernel` fetched `/tmp/devito-jitcache-uid5138/b40bd0708425569dbd35e554e8d2140abd9e0072.c` in 0.12 s from jit-cache Operator `Kernel` run in 0.01 s * section0<101> with OI=0.76 computed in 0.01 s [0.37 GFlops/s] Performance[mode=advanced] arguments: {}
f1g2, g1f2, diff, tol; +9.468804e-01 -9.468804e-01 -8.793795e-16 +2.220446e-14
You can inspect the finite difference coefficients and locations for evaluation with the code shown below.
For your reference, the finite difference coefficients seen in the first two stanzas below are exactly the coefficients generated in Table 2 of Fornberg's paper Generation of Finite Difference Formulas on Arbitrarily Spaced Grids linked below (References).
Note that you don't need to inspect the generated code, but this does provide the option to use this highly optimized code in applications that do not need or require python. If you inspect the code you will notice hallmarks of highly optimized c code, including pragmas
for vectorization, and decorations
for pointer restriction and alignment.
# NBVAL_IGNORE_OUTPUT
# Show the FD coefficients generated by Devito
# for the forward 1/2 cell shifted first derivative operator
print("\n\nForward +1/2 cell shift;")
print("..................................")
print(f1.dx(x0=x+0.5*x.spacing).evaluate)
# Show the FD coefficients generated by Devito
# for the backward 1/2 cell shifted first derivative operator
print("\n\nBackward -1/2 cell shift;")
print("..................................")
print(f1.dx(x0=x-0.5*x.spacing).evaluate)
# Show code generated by Devito for applying the derivatives
print("\n\nGenerated c code;")
print("..................................")
print(op.ccode)
Forward +1/2 cell shift; .................................. -1.19628906*f1(x)/h_x + 0.000697544643*f1(x - 3.0*h_x)/h_x - 0.0095703125*f1(x - 2.0*h_x)/h_x + 0.0797526042*f1(x - 1.0*h_x)/h_x + 1.19628906*f1(x + 1.0*h_x)/h_x - 0.0797526042*f1(x + 2.0*h_x)/h_x + 0.0095703125*f1(x + 3.0*h_x)/h_x - 0.000697544643*f1(x + 4.0*h_x)/h_x Backward -1/2 cell shift; .................................. 1.19628906*f1(x)/h_x + 0.000697544643*f1(x - 4.0*h_x)/h_x - 0.0095703125*f1(x - 3.0*h_x)/h_x + 0.0797526042*f1(x - 2.0*h_x)/h_x - 1.19628906*f1(x - 1.0*h_x)/h_x - 0.0797526042*f1(x + 1.0*h_x)/h_x + 0.0095703125*f1(x + 2.0*h_x)/h_x - 0.000697544643*f1(x + 3.0*h_x)/h_x Generated c code; .................................. #define _POSIX_C_SOURCE 200809L #include "stdlib.h" #include "math.h" #include "sys/time.h" #include "xmmintrin.h" #include "pmmintrin.h" struct dataobj { void *restrict data; int * size; int * npsize; int * dsize; int * hsize; int * hofs; int * oofs; } ; struct profiler { double section0; } ; int Kernel(struct dataobj *restrict f1_vec, struct dataobj *restrict f2_vec, struct dataobj *restrict g1_vec, struct dataobj *restrict g2_vec, const double h_x, struct profiler * timers, const int x_M, const int x_m) { double (*restrict f1) __attribute__ ((aligned (64))) = (double (*)) f1_vec->data; double (*restrict f2) __attribute__ ((aligned (64))) = (double (*)) f2_vec->data; double (*restrict g1) __attribute__ ((aligned (64))) = (double (*)) g1_vec->data; double (*restrict g2) __attribute__ ((aligned (64))) = (double (*)) g2_vec->data; /* Flush denormal numbers to zero in hardware */ _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON); _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON); struct timeval start_section0, end_section0; gettimeofday(&start_section0, NULL); /* Begin section0 */ for (int x = x_m; x <= x_M; x += 1) { double r0 = 1.0/h_x; f2[x + 8] = r0*(6.97544643e-4*(f1[x + 5] - f1[x + 12]) + 9.5703125e-3*(-f1[x + 6] + f1[x + 11]) + 7.97526042e-2*(f1[x + 7] - f1[x + 10]) + 1.19628906*(-f1[x + 8] + f1[x + 9])); g2[x + 8] = r0*(6.97544643e-4*(g1[x + 4] - g1[x + 11]) + 9.5703125e-3*(-g1[x + 5] + g1[x + 10]) + 7.97526042e-2*(g1[x + 6] - g1[x + 9]) + 1.19628906*(-g1[x + 7] + g1[x + 8])); } /* End section0 */ gettimeofday(&end_section0, NULL); timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000; return 0; }
The next step in implementing our Devito modeling operator is to define the equation used to update the pressure wavefield as a function of time. What follows is a bit of algebra using the wave equation and finite difference approximations to time derivatives to express the pressure wavefield forward in time $u(t+\Delta_t)$ as a function of the current $u(t)$ and previous $u(t-\Delta_t)$ pressure wavefields.
The second order accurate centered approximation to the second time derivative involves three wavefields: $u(t-\Delta_t)$, $u(t)$, and $u(t+\Delta_t)$. In order to advance our finite difference solution in time, we solve for $u(t+\Delta_t)$.
$$ \begin{aligned} \partial_{tt}\ u &= \frac{\displaystyle u(t+\Delta_t) - 2\ u(t) + u(t-\Delta_t)}{\displaystyle \Delta_t^2} \\[5pt] u(t+\Delta_t)\ &= \Delta_t^2\ \partial_{tt}\ u + 2\ u(t) - u(t-\Delta_t) \end{aligned} $$The argument for using a backward approximation is a bit hand wavy, but goes like this: a centered or forward approximation for $\partial_{t}\ u$ would involve the term $u(t+\Delta_t)$, and hence $u(t+\Delta_t)$ would appear at two places in our time update equation below, essentially making the form implicit (although it would be easy to solve for $u(t+\Delta_t)$).
We are interested in explicit time stepping and the correct behavior of the attenuation term, and so prefer the backward approximation for $\overleftarrow{\partial_{t}}\ u$. Our experience is that the use of the backward difference is more stable than forward or centered.
The first order accurate backward approximation to the first time derivative involves two wavefields: $u(t-\Delta_t)$, and $u(t)$. We can use this expression as is.
$$ \overleftarrow{\partial_{t}}\ u = \frac{\displaystyle u(t) - u(t-\Delta_t)}{\displaystyle \Delta_t} $$Next we plug in the right hand sides for $\partial_{tt}\ u$ and $\overleftarrow{\partial_{t}}\ u$ into the the time update expression for $u(t+\Delta_t)$ from step 2.
$$ \begin{aligned} u(t+\Delta_t) &= \Delta_t^2 \frac{m^2}{b} \left[ \overleftarrow{\partial_x}\left(b\ \overrightarrow{\partial_x}\ u \right) + \overleftarrow{\partial_y}\left(b\ \overrightarrow{\partial_y}\ u \right) + \overleftarrow{\partial_z}\left(b\ \overrightarrow{\partial_z}\ u \right) + q \right] \\[10pt] & \quad -\ \Delta_t^2 \frac{\omega_c}{Q} \left( \frac{\displaystyle u(t) - u(t-\Delta_t)} {\displaystyle \Delta_t} \right) + 2\ u(t) - u(t-\Delta_t) \end{aligned} $$The last equation is how we update the pressure wavefield at each time step, and depends on $u(t)$ and $u(t-\Delta_t)$.
The main work of the finite difference explicit time stepping is evaluating the nested spatial derivative operators on the RHS of this equation. The particular advantage of Devito symbolic optimization is that Devito is able to solve for the complicated expressions that result from substituting the discrete forms of high order numerical finite difference approximations for these nested spatial derivatives.
We have now completed the maths required to implement the modeling operator. The remainder of this notebook deals with setting up and using the required Devito objects.
Define the dimensions and coordinates for the model. The computational domain of the model is surrounded by an absorbing boundary region where we implement boundary conditions to eliminate outgoing waves. We define the sizes for the interior of the model nx
and nz
, the width of the absorbing boundary region npad
, and the sizes for the entire model padded with absorbing boundaries become nxpad = nx + 2*npad
and nzpad = nz + 2*npad
.
# Define dimensions for the interior of the model
nx,nz = 751,751
dx,dz = 10.0,10.0 # Grid spacing in m
shape = (nx, nz) # Number of grid points
spacing = (dx, dz) # Domain size is now 5 km by 5 km
origin = (0., 0.) # Origin of coordinate system, specified in m.
extent = tuple([s*(n-1) for s, n in zip(spacing, shape)])
# Define dimensions for the model padded with absorbing boundaries
npad = 50 # number of points in absorbing boundary region (all sides)
nxpad,nzpad = nx+2*npad, nz+2*npad
shape_pad = np.array(shape) + 2 * npad
origin_pad = tuple([o - s*npad for o, s in zip(origin, spacing)])
extent_pad = tuple([s*(n-1) for s, n in zip(spacing, shape_pad)])
# Define the dimensions
# Note if you do not specify dimensions, you get in order x,y,z
x = SpaceDimension(name='x', spacing=Constant(name='h_x',
value=extent_pad[0]/(shape_pad[0]-1)))
z = SpaceDimension(name='z', spacing=Constant(name='h_z',
value=extent_pad[1]/(shape_pad[1]-1)))
# Initialize the Devito grid
grid = Grid(extent=extent_pad, shape=shape_pad, origin=origin_pad,
dimensions=(x, z), dtype=dtype)
print("shape; ", shape)
print("origin; ", origin)
print("spacing; ", spacing)
print("extent; ", extent)
print("")
print("shape_pad; ", shape_pad)
print("origin_pad; ", origin_pad)
print("extent_pad; ", extent_pad)
print("")
print("grid.shape; ", grid.shape)
print("grid.extent; ", grid.extent)
print("grid.spacing_map;", grid.spacing_map)
shape; (751, 751) origin; (0.0, 0.0) spacing; (10.0, 10.0) extent; (7500.0, 7500.0) shape_pad; [851 851] origin_pad; (-500.0, -500.0) extent_pad; (8500.0, 8500.0) grid.shape; (851, 851) grid.extent; (8500.0, 8500.0) grid.spacing_map; {h_x: 10.0, h_z: 10.0}
We have the following constants and fields from our self adjoint wave equation that we define as time invariant using Functions
:
Symbol | Description |
---|---|
$$m(x,z)$$ | Acoustic velocity |
$$b(x,z)=\frac{1}{\rho(x,z)}$$ | Buoyancy (reciprocal density) |
# Create the velocity and buoyancy fields.
# - We use a wholespace velocity of 1500 m/s
# - We use a wholespace density of 1 g/cm^3
# - These are scalar fields so we use Function to define them
# - We specify space_order to establish the appropriate size halo on the edges
space_order = 8
# Wholespace velocity
m = Function(name='m', grid=grid, space_order=space_order)
m.data[:] = 1.5
# Constant density
b = Function(name='b', grid=grid, space_order=space_order)
b.data[:,:] = 1.0 / 1.0
Allocating memory for m(867, 867) Allocating memory for b(867, 867)
In this notebook we run 2 seconds of simulation using the sample rate related to the CFL condition as implemented in examples/seismic/self_adjoint/utils.py
.
Important note smaller Q values in highly viscous media may require smaller temporal sampling rates than a non-viscous medium to achieve dispersion free propagation. This is a cost of the visco- acoustic modification we use here.
We also use the convenience TimeRange
as defined in examples/seismic/source.py
.
def compute_critical_dt(v):
"""
Determine the temporal sampling to satisfy CFL stability.
This method replicates the functionality in the Model class.
Note we add a safety factor, reducing dt by a factor 0.75 due to the
w/Q attentuation term.
Parameters
----------
v : Function
velocity
"""
coeff = 0.38 if len(v.grid.shape) == 3 else 0.42
dt = 0.75 * v.dtype(coeff * np.min(v.grid.spacing) / (np.max(v.data)))
return v.dtype("%.5e" % dt)
t0 = dtype(0.) # Simulation time start
tn = dtype(2000.) # Simulation time end (1 second = 1000 msec)
dt = compute_critical_dt(m)
time_range = TimeAxis(start=t0, stop=tn, step=dt)
print("Time min, max, dt, num; %10.6f %10.6f %10.6f %d" % (t0, tn, dt, int(tn//dt) + 1))
print("time_range; ", time_range)
Time min, max, dt, num; 0.000000 2000.000000 2.100000 953 time_range; TimeAxis: start=0, stop=2001.3, step=2.1, num=954
source:
examples/seismic/source.py
receivers:
PointSource
in examples/seismic/source.py
# Source in the center of the model at 10 Hz center frequency
fpeak = 0.010
src = RickerSource(name='src', grid=grid, f0=fpeak, npoint=1, time_range=time_range)
src.coordinates.data[0,0] = dx * (nx//2)
src.coordinates.data[0,1] = dz * (nz//2)
# line of receivers along the right edge of the model
rec = Receiver(name='rec', grid=grid, npoint=nz, time_range=time_range)
rec.coordinates.data[:,0] = dx * (nx//2)
rec.coordinates.data[:,1] = np.linspace(0.0, dz*(nz-1), nz)
print("src_coordinate X; %+12.4f" % (src.coordinates.data[0,0]))
print("src_coordinate Z; %+12.4f" % (src.coordinates.data[0,1]))
print("rec_coordinates X min/max; %+12.4f %+12.4f" % \
(np.min(rec.coordinates.data[:,0]), np.max(rec.coordinates.data[:,0])))
print("rec_coordinates Z min/max; %+12.4f %+12.4f" % \
(np.min(rec.coordinates.data[:,1]), np.max(rec.coordinates.data[:,1])))
# We can plot the time signature to see the wavelet
src.show()
Allocating memory for src(954, 1) Allocating memory for src_coords(1, 2) Allocating memory for rec_coords(751, 2)
src_coordinate X; +3750.0000 src_coordinate Z; +3750.0000 rec_coordinates X min/max; +3750.0000 +3750.0000 rec_coordinates Z min/max; +0.0000 +7500.0000
Next we plot the velocity and density models for illustration.
# note: flip sense of second dimension to make the plot positive downwards
plt_extent = [origin_pad[0], origin_pad[0] + extent_pad[0],
origin_pad[1] + extent_pad[1], origin_pad[1]]
vmin, vmax = 1.4, 1.7
dmin, dmax = 0.9, 1.1
plt.figure(figsize=(12,8))
plt.subplot(1, 2, 1)
plt.imshow(np.transpose(m.data), cmap=cm.jet,
vmin=vmin, vmax=vmax, extent=plt_extent)
plt.colorbar(orientation='horizontal', label='Velocity (m/msec)')
plt.plot([origin[0], origin[0], extent[0], extent[0], origin[0]],
[origin[1], extent[1], extent[1], origin[1], origin[1]],
'white', linewidth=4, linestyle=':', label="Absorbing Boundary")
plt.plot(rec.coordinates.data[:, 0], rec.coordinates.data[:, 1], \
'black', linestyle='-', label="Receiver")
plt.plot(src.coordinates.data[:, 0], src.coordinates.data[:, 1], \
'red', linestyle='None', marker='*', markersize=15, label="Source")
plt.xlabel("X Coordinate (m)")
plt.ylabel("Z Coordinate (m)")
plt.title("Velocity w/ absorbing boundary")
plt.subplot(1, 2, 2)
plt.imshow(np.transpose(1 / b.data), cmap=cm.jet,
vmin=dmin, vmax=dmax, extent=plt_extent)
plt.colorbar(orientation='horizontal', label='Density (m^3/kg)')
plt.plot([origin[0], origin[0], extent[0], extent[0], origin[0]],
[origin[1], extent[1], extent[1], origin[1], origin[1]],
'white', linewidth=4, linestyle=':', label="Absorbing Boundary")
plt.plot(rec.coordinates.data[:, 0], rec.coordinates.data[:, 1], \
'black', linestyle='-', label="Receiver")
plt.plot(src.coordinates.data[:, 0], src.coordinates.data[:, 1], \
'red', linestyle='None', marker='*', markersize=15, label="Source")
plt.xlabel("X Coordinate (m)")
plt.ylabel("Z Coordinate (m)")
plt.title("Density w/ absorbing boundary")
plt.tight_layout()
None
We have two remaining constants and fields from our SA wave equation that we need to define:
Symbol | Description |
---|---|
$$\omega_c = 2 \pi f_c$$ | Center angular frequency |
$$\frac{1}{Q(x,z)}$$ | Inverse Q model used in the modeling system |
The absorbing boundary condition strategy we use is designed to eliminate any corners or edges in the attenuation profile. We do this by making Q a function of distance from the nearest boundary.
We have implemented the function setup_w_over_q
for 2D and 3D fields in the file utils.py
, and will use it below. In Devito these fields are type Function
, a concrete implementation of AbstractFunction
.
Feel free to inspect the source at utils.py, which uses Devito's symbolic math to write a nonlinear equation describing the absorbing boundary for dispatch to automatic code generation.
Note that we will generate two Q models, one with strong attenuation (a Q value of 25) and one with moderate attenuation (a Q value of 100) -- in order to demonstrate the impact of attenuation in the plots near the end of this notebook.
# NBVAL_IGNORE_OUTPUT
# Initialize the attenuation profile for Q=25 and Q=100 models
w = 2.0 * np.pi * fpeak
print("w,fpeak; ", w, fpeak)
qmin = 0.1
wOverQ_025 = Function(name='wOverQ_025', grid=grid, space_order=space_order)
wOverQ_100 = Function(name='wOverQ_100', grid=grid, space_order=space_order)
setup_w_over_q(wOverQ_025, w, qmin, 25.0, npad)
setup_w_over_q(wOverQ_100, w, qmin, 100.0, npad)
# Plot the log of the generated Q profile
q025 = np.log10(w / wOverQ_025.data)
q100 = np.log10(w / wOverQ_100.data)
lmin, lmax = np.log10(qmin), np.log10(100)
plt.figure(figsize=(12,8))
plt.subplot(1, 2, 1)
plt.imshow(np.transpose(q025.data), cmap=cm.jet,
vmin=lmin, vmax=lmax, extent=plt_extent)
plt.colorbar(orientation='horizontal', label='log10(Q)')
plt.plot([origin[0], origin[0], extent[0], extent[0], origin[0]],
[origin[1], extent[1], extent[1], origin[1], origin[1]],
'white', linewidth=4, linestyle=':', label="Absorbing Boundary")
plt.plot([origin[0], origin[0], extent[0], extent[0], origin[0]],
[origin[1], extent[1], extent[1], origin[1], origin[1]],
'white', linewidth=4, linestyle=':', label="Absorbing Boundary")
plt.plot(rec.coordinates.data[:, 0], rec.coordinates.data[:, 1], \
'black', linestyle='-', label="Receiver")
plt.plot(src.coordinates.data[:, 0], src.coordinates.data[:, 1], \
'red', linestyle='None', marker='*', markersize=15, label="Source")
plt.xlabel("X Coordinate (m)")
plt.ylabel("Z Coordinate (m)")
plt.title("log10 of $Q=25$ model")
plt.subplot(1, 2, 2)
plt.imshow(np.transpose(q100.data), cmap=cm.jet,
vmin=lmin, vmax=lmax, extent=plt_extent)
plt.colorbar(orientation='horizontal', label='log10(Q)')
plt.plot([origin[0], origin[0], extent[0], extent[0], origin[0]],
[origin[1], extent[1], extent[1], origin[1], origin[1]],
'white', linewidth=4, linestyle=':', label="Absorbing Boundary")
plt.plot([origin[0], origin[0], extent[0], extent[0], origin[0]],
[origin[1], extent[1], extent[1], origin[1], origin[1]],
'white', linewidth=4, linestyle=':', label="Absorbing Boundary")
plt.plot(rec.coordinates.data[:, 0], rec.coordinates.data[:, 1], \
'black', linestyle='-', label="Receiver")
plt.plot(src.coordinates.data[:, 0], src.coordinates.data[:, 1], \
'red', linestyle='None', marker='*', markersize=15, label="Source")
plt.xlabel("X Coordinate (m)")
plt.ylabel("Z Coordinate (m)")
plt.title("log10 of $Q=100$ model")
plt.tight_layout()
None
Operator `WOverQ_Operator` generated in 0.16 s * lowering.Expressions: 0.06 s (38.5 %) * lowering.IET: 0.06 s (38.5 %) * specializing.IET: 0.04 s (25.7 %) * lowering.Clusters: 0.04 s (25.7 %) Flops reduction after symbolic optimization: [19 --> 19] Allocating memory for wOverQ_025(867, 867)
w,fpeak; 0.06283185307179587 0.01 gcc -O3 -g -fPIC -Wall -std=c99 -march=native -Wno-unused-result -Wno-unused-variable -Wno-unused-but-set-variable -ffast-math -shared -fopenmp /tmp/devito-jitcache-uid5138/4e7b93ebaa1230b81ccf71e59152d42d0215d34b.c -lm -o /tmp/devito-jitcache-uid5138/4e7b93ebaa1230b81ccf71e59152d42d0215d34b.so
Operator `WOverQ_Operator` jit-compiled `/tmp/devito-jitcache-uid5138/4e7b93ebaa1230b81ccf71e59152d42d0215d34b.c` in 0.24 s with `GNUCompiler` Operator `WOverQ_Operator` run in 0.02 s * section0<<851,851>,<50,851>,<50,851>,<851,50>,<851,50>,<851,851>> with OI=0.01 computed in 0.02 s [0.24 GFlops/s] Performance[mode=advanced] arguments: {} Operator `WOverQ_Operator` generated in 0.24 s * lowering.Clusters: 0.12 s (51.7 %) * schedule: 0.09 s (38.8 %) * lowering.Expressions: 0.06 s (25.9 %) * lowering.IET: 0.06 s (25.9 %) Flops reduction after symbolic optimization: [19 --> 19] Allocating memory for wOverQ_100(867, 867) Operator `WOverQ_Operator` jit-compiled `/tmp/devito-jitcache-uid5138/d17d6931e92b8d07d8e8b288e52ed3fa6cecdae4.c` in 0.23 s with `GNUCompiler`
gcc -O3 -g -fPIC -Wall -std=c99 -march=native -Wno-unused-result -Wno-unused-variable -Wno-unused-but-set-variable -ffast-math -shared -fopenmp /tmp/devito-jitcache-uid5138/d17d6931e92b8d07d8e8b288e52ed3fa6cecdae4.c -lm -o /tmp/devito-jitcache-uid5138/d17d6931e92b8d07d8e8b288e52ed3fa6cecdae4.so
Operator `WOverQ_Operator` run in 0.02 s * section0<<851,851>,<50,851>,<50,851>,<851,50>,<851,50>,<851,851>> with OI=0.01 computed in 0.02 s [0.24 GFlops/s] Performance[mode=advanced] arguments: {}
TimeFunction
¶We specify the time_order as 2, which allocates 3 time steps in the pressure wavefield. As described elsewhere, Devito will use "cyclic indexing" to index into this multi-dimensional array, mneaning that via the modulo operator, the time indices $[0, 1, 2, 3, 4, 5, ...]$ are mapped into the modulo indices $[0, 1, 2, 0, 1, 2, ...]$
This FAQ entry explains in more detail.
# Define the TimeFunction
u = TimeFunction(name="u", grid=grid, time_order=2, space_order=space_order)
# Get the symbols for dimensions for t, x, z
# We need these below in order to write the source injection and the
t,x,z = u.dimensions
If you examine the equation for the time update we derived above you will see that the source $q$ is scaled by the term $(\Delta_t^2 m^2\ /\ b)$. You will see that scaling term in the source injection below. For $\Delta_t^2$ we use the time dimension spacing symbol t.spacing**2
.
Note that source injection and receiver extraction are accomplished via linear interpolation, as implemented in SparseTimeFunction
in sparse.py.
# Source injection, with appropriate scaling
src_term = src.inject(field=u.forward, expr=src * t.spacing**2 * m**2 / b)
# Receiver extraction
rec_term = rec.interpolate(expr=u.forward)
We next transcribe the time update expression we derived above into a Devito Eq
. Then we add that expression with the source injection and receiver extraction and build an Operator
that will generate the c code for performing the modeling.
We copy the time update expression from above for clarity. Note we omit $q$ because we will be explicitly injecting the source using src_term
defined immediately above. However, for the linearized Born forward modeling operation the $q$ term is an appropriately scaled field, as shown in the next notebook in this series.
# NBVAL_IGNORE_OUTPUT
# Generate the time update equation and operator for Q=25 model
eq_time_update = (t.spacing**2 * m**2 / b) * \
((b * u.dx(x0=x+x.spacing/2)).dx(x0=x-x.spacing/2) + \
(b * u.dz(x0=z+z.spacing/2)).dz(x0=z-z.spacing/2)) + \
(2 - t.spacing * wOverQ_025) * u + \
(t.spacing * wOverQ_025 - 1) * u.backward
stencil = Eq(u.forward, eq_time_update)
# Update the dimension spacing_map to include the time dimension
# These symbols will be replaced with the relevant scalars by the Operator
spacing_map = grid.spacing_map
spacing_map.update({t.spacing : dt})
print("spacing_map; ", spacing_map)
# op = Operator([stencil] + src_term + rec_term)
op = Operator([stencil] + src_term + rec_term, subs=spacing_map)
spacing_map; {h_x: 10.0, h_z: 10.0, dt: 2.1}
Operator `Kernel` generated in 1.40 s * lowering.Expressions: 0.75 s (53.7 %) * lowering.Clusters: 0.40 s (28.7 %) Flops reduction after symbolic optimization: [298 --> 68]
The argument subs=spacing_map
passed to the operator substitutes values for the temporal and spatial dimensions into the expressions before code generation. This reduces the number of floating point operations executed by the kernel by pre-evaluating certain coefficients, and possibly absorbing the spacing scalars from the denominators of the numerical finite difference approximations into the finite difference coefficients.
If you run the two cases of passing/not passing the subs=spacing_map
argument by commenting/un-commenting the last two lines of the cell immediately above, you can inspect the difference in computed flop count for the operator. This is reported by setting Devito logging configuration['log-level'] = 'DEBUG'
and is reported during Devito symbolic optimization with the output line Flops reduction after symbolic optimization
. Note also if you inspect the generated code for the two cases, you will see extra calling parameters are required for the case without the substitution. We have compiled the flop count for 2D and 3D operators into the table below.
Dimensionality | Passing subs | Flops reduction | Delta |
---|---|---|---|
2D | False | 588 --> 81 | |
2D | True | 300 --> 68 | 13.7% |
3D | False | 875 --> 116 | |
3D | True | 442 --> 95 | 18.1% |
Note the gain in performance is around 14% for this example in 2D, and around 18% in 3D.
We use op.arguments()
to print the arguments to the operator. As noted above depending on the use of subs=spacing_map
you will see different arguments here. In the case of no subs=spacing_map
argument to the operator, you will see arguments for the dimensional spacing constants as parameters to the operator, including h_x
, h_z
, and dt
.
# NBVAL_IGNORE_OUTPUT
op.arguments()
Allocating memory for rec(954, 751) Allocating memory for u(3, 867, 867)
{'b': <cparam 'P' (0x7feba210c2b0)>, 'x_m': 0, 'x_size': 851, 'x_M': 850, 'z_m': 0, 'z_size': 851, 'z_M': 850, 'm': <cparam 'P' (0x7feba2022df0)>, 'o_x': -500.0, 'o_z': -500.0, 'rec': <cparam 'P' (0x7feba2025170)>, 'time_m': 1, 'time_size': 954, 'time_M': 952, 'p_rec_m': 0, 'p_rec_size': 751, 'p_rec_M': 750, 'rec_coords': <cparam 'P' (0x7feba2022730)>, 'd_m': 0, 'd_size': 2, 'd_M': 1, 'src': <cparam 'P' (0x7feba2021d30)>, 'p_src_m': 0, 'p_src_size': 1, 'p_src_M': 0, 'src_coords': <cparam 'P' (0x7feba2021330)>, 'u': <cparam 'P' (0x7feba20398f0)>, 't_size': 3, 'wOverQ_025': <cparam 'P' (0x7feba2038ef0)>, 'timers': <cparam 'P' (0x7feba1ee9490)>}
We use print(op)
to output the generated c code for review.
# NBVAL_IGNORE_OUTPUT
print(op)
#define _POSIX_C_SOURCE 200809L #include "stdlib.h" #include "math.h" #include "sys/time.h" #include "xmmintrin.h" #include "pmmintrin.h" struct dataobj { void *restrict data; int * size; int * npsize; int * dsize; int * hsize; int * hofs; int * oofs; } ; struct profiler { double section0; double section1; double section2; } ; int Kernel(struct dataobj *restrict b_vec, struct dataobj *restrict m_vec, const double o_x, const double o_z, struct dataobj *restrict rec_vec, struct dataobj *restrict rec_coords_vec, struct dataobj *restrict src_vec, struct dataobj *restrict src_coords_vec, struct dataobj *restrict u_vec, struct dataobj *restrict wOverQ_025_vec, const int x_M, const int x_m, const int x_size, const int z_M, const int z_m, const int z_size, const int p_rec_M, const int p_rec_m, const int p_src_M, const int p_src_m, const int time_M, const int time_m, struct profiler * timers) { double (*restrict b)[b_vec->size[1]] __attribute__ ((aligned (64))) = (double (*)[b_vec->size[1]]) b_vec->data; double (*restrict m)[m_vec->size[1]] __attribute__ ((aligned (64))) = (double (*)[m_vec->size[1]]) m_vec->data; double (*restrict rec)[rec_vec->size[1]] __attribute__ ((aligned (64))) = (double (*)[rec_vec->size[1]]) rec_vec->data; double (*restrict rec_coords)[rec_coords_vec->size[1]] __attribute__ ((aligned (64))) = (double (*)[rec_coords_vec->size[1]]) rec_coords_vec->data; double (*restrict src)[src_vec->size[1]] __attribute__ ((aligned (64))) = (double (*)[src_vec->size[1]]) src_vec->data; double (*restrict src_coords)[src_coords_vec->size[1]] __attribute__ ((aligned (64))) = (double (*)[src_coords_vec->size[1]]) src_coords_vec->data; double (*restrict u)[u_vec->size[1]][u_vec->size[2]] __attribute__ ((aligned (64))) = (double (*)[u_vec->size[1]][u_vec->size[2]]) u_vec->data; double (*restrict wOverQ_025)[wOverQ_025_vec->size[1]] __attribute__ ((aligned (64))) = (double (*)[wOverQ_025_vec->size[1]]) wOverQ_025_vec->data; double (*r26)[z_size + 3 + 4]; posix_memalign((void**)&r26, 64, sizeof(double[x_size + 3 + 4][z_size + 3 + 4])); double (*r25)[z_size + 3 + 4]; posix_memalign((void**)&r25, 64, sizeof(double[x_size + 3 + 4][z_size + 3 + 4])); /* Flush denormal numbers to zero in hardware */ _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON); _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON); for (int time = time_m, t0 = (time)%(3), t1 = (time + 1)%(3), t2 = (time + 2)%(3); time <= time_M; time += 1, t0 = (time)%(3), t1 = (time + 1)%(3), t2 = (time + 2)%(3)) { struct timeval start_section0, end_section0; gettimeofday(&start_section0, NULL); /* Begin section0 */ for (int x = x_m - 4; x <= x_M + 3; x += 1) { #pragma omp simd aligned(u:32) for (int z = z_m - 4; z <= z_M + 3; z += 1) { r26[x + 4][z + 4] = 6.97544642889625e-5*(u[t0][x + 5][z + 8] - u[t0][x + 12][z + 8]) + 9.5703125007276e-4*(-u[t0][x + 6][z + 8] + u[t0][x + 11][z + 8]) + 7.97526041715173e-3*(u[t0][x + 7][z + 8] - u[t0][x + 10][z + 8]) + 1.1962890625e-1*(-u[t0][x + 8][z + 8] + u[t0][x + 9][z + 8]); r25[x + 4][z + 4] = 6.97544642889625e-5*(u[t0][x + 8][z + 5] - u[t0][x + 8][z + 12]) + 9.5703125007276e-4*(-u[t0][x + 8][z + 6] + u[t0][x + 8][z + 11]) + 7.97526041715173e-3*(u[t0][x + 8][z + 7] - u[t0][x + 8][z + 10]) + 1.1962890625e-1*(-u[t0][x + 8][z + 8] + u[t0][x + 8][z + 9]); } } for (int x = x_m; x <= x_M; x += 1) { #pragma omp simd aligned(b,m,u,wOverQ_025:32) for (int z = z_m; z <= z_M; z += 1) { double r7 = 2.1*wOverQ_025[x + 8][z + 8] - 1; double r8 = 2 - 2.1*wOverQ_025[x + 8][z + 8]; u[t1][x + 8][z + 8] = r7*u[t2][x + 8][z + 8] + r8*u[t0][x + 8][z + 8] + (4.41*(m[x + 8][z + 8]*m[x + 8][z + 8])*(6.97544642889625e-5*(b[x + 4][z + 8]*r26[x][z + 4] + b[x + 8][z + 4]*r25[x + 4][z] - b[x + 8][z + 11]*r25[x + 4][z + 7] - b[x + 11][z + 8]*r26[x + 7][z + 4]) + 9.5703125007276e-4*(-b[x + 5][z + 8]*r26[x + 1][z + 4] - b[x + 8][z + 5]*r25[x + 4][z + 1] + b[x + 8][z + 10]*r25[x + 4][z + 6] + b[x + 10][z + 8]*r26[x + 6][z + 4]) + 7.97526041715173e-3*(b[x + 6][z + 8]*r26[x + 2][z + 4] + b[x + 8][z + 6]*r25[x + 4][z + 2] - b[x + 8][z + 9]*r25[x + 4][z + 5] - b[x + 9][z + 8]*r26[x + 5][z + 4]) + 1.1962890625e-1*(-b[x + 7][z + 8]*r26[x + 3][z + 4] - b[x + 8][z + 7]*r25[x + 4][z + 3] + b[x + 8][z + 8]*r25[x + 4][z + 4] + b[x + 8][z + 8]*r26[x + 4][z + 4])))/b[x + 8][z + 8]; } } /* End section0 */ gettimeofday(&end_section0, NULL); timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000; struct timeval start_section1, end_section1; gettimeofday(&start_section1, NULL); /* Begin section1 */ for (int p_src = p_src_m; p_src <= p_src_M; p_src += 1) { int ii_src_0 = (int)(floor(-1.0e-1*o_x + 1.0e-1*src_coords[p_src][0])); int ii_src_1 = (int)(floor(-1.0e-1*o_z + 1.0e-1*src_coords[p_src][1])); int ii_src_2 = (int)(floor(-1.0e-1*o_z + 1.0e-1*src_coords[p_src][1])) + 1; int ii_src_3 = (int)(floor(-1.0e-1*o_x + 1.0e-1*src_coords[p_src][0])) + 1; double px = (double)(-o_x - 1.0e+1*(int)(floor(-1.0e-1*o_x + 1.0e-1*src_coords[p_src][0])) + src_coords[p_src][0]); double pz = (double)(-o_z - 1.0e+1*(int)(floor(-1.0e-1*o_z + 1.0e-1*src_coords[p_src][1])) + src_coords[p_src][1]); if (ii_src_0 >= x_m - 1 && ii_src_1 >= z_m - 1 && ii_src_0 <= x_M + 1 && ii_src_1 <= z_M + 1) { double r0 = 4.41*(m[ii_src_0 + 8][ii_src_1 + 8]*m[ii_src_0 + 8][ii_src_1 + 8])*(1.0e-2*px*pz - 1.0e-1*px - 1.0e-1*pz + 1)*src[time][p_src]/b[ii_src_0 + 8][ii_src_1 + 8]; u[t1][ii_src_0 + 8][ii_src_1 + 8] += r0; } if (ii_src_0 >= x_m - 1 && ii_src_2 >= z_m - 1 && ii_src_0 <= x_M + 1 && ii_src_2 <= z_M + 1) { double r1 = 4.41*(m[ii_src_0 + 8][ii_src_2 + 8]*m[ii_src_0 + 8][ii_src_2 + 8])*(-1.0e-2*px*pz + 1.0e-1*pz)*src[time][p_src]/b[ii_src_0 + 8][ii_src_2 + 8]; u[t1][ii_src_0 + 8][ii_src_2 + 8] += r1; } if (ii_src_1 >= z_m - 1 && ii_src_3 >= x_m - 1 && ii_src_1 <= z_M + 1 && ii_src_3 <= x_M + 1) { double r2 = 4.41*(m[ii_src_3 + 8][ii_src_1 + 8]*m[ii_src_3 + 8][ii_src_1 + 8])*(-1.0e-2*px*pz + 1.0e-1*px)*src[time][p_src]/b[ii_src_3 + 8][ii_src_1 + 8]; u[t1][ii_src_3 + 8][ii_src_1 + 8] += r2; } if (ii_src_2 >= z_m - 1 && ii_src_3 >= x_m - 1 && ii_src_2 <= z_M + 1 && ii_src_3 <= x_M + 1) { double r3 = 4.41e-2*px*pz*(m[ii_src_3 + 8][ii_src_2 + 8]*m[ii_src_3 + 8][ii_src_2 + 8])*src[time][p_src]/b[ii_src_3 + 8][ii_src_2 + 8]; u[t1][ii_src_3 + 8][ii_src_2 + 8] += r3; } } /* End section1 */ gettimeofday(&end_section1, NULL); timers->section1 += (double)(end_section1.tv_sec-start_section1.tv_sec)+(double)(end_section1.tv_usec-start_section1.tv_usec)/1000000; struct timeval start_section2, end_section2; gettimeofday(&start_section2, NULL); /* Begin section2 */ for (int p_rec = p_rec_m; p_rec <= p_rec_M; p_rec += 1) { int ii_rec_0 = (int)(floor(-1.0e-1*o_x + 1.0e-1*rec_coords[p_rec][0])); int ii_rec_1 = (int)(floor(-1.0e-1*o_z + 1.0e-1*rec_coords[p_rec][1])); int ii_rec_2 = (int)(floor(-1.0e-1*o_z + 1.0e-1*rec_coords[p_rec][1])) + 1; int ii_rec_3 = (int)(floor(-1.0e-1*o_x + 1.0e-1*rec_coords[p_rec][0])) + 1; double px = (double)(-o_x - 1.0e+1*(int)(floor(-1.0e-1*o_x + 1.0e-1*rec_coords[p_rec][0])) + rec_coords[p_rec][0]); double pz = (double)(-o_z - 1.0e+1*(int)(floor(-1.0e-1*o_z + 1.0e-1*rec_coords[p_rec][1])) + rec_coords[p_rec][1]); double sum = 0.0; if (ii_rec_0 >= x_m - 1 && ii_rec_1 >= z_m - 1 && ii_rec_0 <= x_M + 1 && ii_rec_1 <= z_M + 1) { sum += (1.0e-2*px*pz - 1.0e-1*px - 1.0e-1*pz + 1)*u[t1][ii_rec_0 + 8][ii_rec_1 + 8]; } if (ii_rec_0 >= x_m - 1 && ii_rec_2 >= z_m - 1 && ii_rec_0 <= x_M + 1 && ii_rec_2 <= z_M + 1) { sum += (-1.0e-2*px*pz + 1.0e-1*pz)*u[t1][ii_rec_0 + 8][ii_rec_2 + 8]; } if (ii_rec_1 >= z_m - 1 && ii_rec_3 >= x_m - 1 && ii_rec_1 <= z_M + 1 && ii_rec_3 <= x_M + 1) { sum += (-1.0e-2*px*pz + 1.0e-1*px)*u[t1][ii_rec_3 + 8][ii_rec_1 + 8]; } if (ii_rec_2 >= z_m - 1 && ii_rec_3 >= x_m - 1 && ii_rec_2 <= z_M + 1 && ii_rec_3 <= x_M + 1) { sum += 1.0e-2*px*pz*u[t1][ii_rec_3 + 8][ii_rec_2 + 8]; } rec[time][p_rec] = sum; } /* End section2 */ gettimeofday(&end_section2, NULL); timers->section2 += (double)(end_section2.tv_sec-start_section2.tv_sec)+(double)(end_section2.tv_usec-start_section2.tv_usec)/1000000; } free(r26); free(r25); return 0; }
By setting Devito logging configuration['log-level'] = 'DEBUG'
we have enabled output of statistics related to the performance of the operator, which you will see below when the operator runs.
We will run the Operator once with the Q model as defined wOverQ_025
, and then run a second time passing the wOverQ_100
Q model. For the second run with the different Q model, we take advantage of the placeholder design patten
in the Devito Operator
.
For more information on this see the FAQ entry.
# NBVAL_IGNORE_OUTPUT
# Run the operator for the Q=25 model
print("m min/max; %+12.6e %+12.6e" % (np.min(m.data), np.max(m.data)))
print("b min/max; %+12.6e %+12.6e" % (np.min(b.data), np.max(b.data)))
print("wOverQ_025 min/max; %+12.6e %+12.6e" % (np.min(wOverQ_025.data), np.max(wOverQ_025.data)))
print("wOverQ_100 min/max; %+12.6e %+12.6e" % (np.min(wOverQ_100.data), np.max(wOverQ_100.data)))
print(time_range)
u.data[:] = 0
op(time=time_range.num-1)
# summary = op(time=time_range.num-1, h_x=dx, h_z=dz, dt=dt)
# Save the Q=25 results and run the Q=100 case
import copy
uQ25 = copy.copy(u)
recQ25 = copy.copy(rec)
u.data[:] = 0
op(time=time_range.num-1, wOverQ_025=wOverQ_100)
print("Q= 25 receiver data min/max; %+12.6e %+12.6e" %\
(np.min(recQ25.data[:]), np.max(recQ25.data[:])))
print("Q=100 receiver data min/max; %+12.6e %+12.6e" %\
(np.min(rec.data[:]), np.max(rec.data[:])))
m min/max; +1.500000e+00 +1.500000e+00 b min/max; +1.000000e+00 +1.000000e+00 wOverQ_025 min/max; +2.513274e-03 +6.283185e-01 wOverQ_100 min/max; +6.283185e-04 +6.283185e-01 TimeAxis: start=0, stop=2001.3, step=2.1, num=954 gcc -O3 -g -fPIC -Wall -std=c99 -march=native -Wno-unused-result -Wno-unused-variable -Wno-unused-but-set-variable -ffast-math -shared -fopenmp /tmp/devito-jitcache-uid5138/86598a3421d157e83eb281582174fa6eedd17660.c -lm -o /tmp/devito-jitcache-uid5138/86598a3421d157e83eb281582174fa6eedd17660.so
Operator `Kernel` jit-compiled `/tmp/devito-jitcache-uid5138/86598a3421d157e83eb281582174fa6eedd17660.c` in 0.42 s with `GNUCompiler` Operator `Kernel` run in 6.48 s Global performance indicators * Achieved 0.11 FD-GPts/s Local performance indicators * section0<<953,858,858>,<953,851,851>> with OI=0.94 computed in 6.46 s [7.31 GFlops/s, 0.11 GPts/s] * section1<<953,1>,<953,1>,<953,1>,<953,1>,<953,1>> with OI=1.93 computed in 0.01 s [0.06 GFlops/s, 0.01 GPts/s] * section2<<953,751>,<953,751>,<953,751>,<953,751>,<953,751>,<953,751>> with OI=2.42 computed in 0.02 s [2.66 GFlops/s] Performance[mode=advanced] arguments: {} Allocating memory for u(3, 867, 867) Allocating memory for rec(954, 751) Allocating memory for rec_coords(751, 2) Operator `Kernel` run in 6.49 s Global performance indicators * Achieved 0.11 FD-GPts/s Local performance indicators * section0<<953,858,858>,<953,851,851>> with OI=0.94 computed in 6.47 s [7.30 GFlops/s, 0.11 GPts/s] * section1<<953,1>,<953,1>,<953,1>,<953,1>,<953,1>> with OI=1.93 computed in 0.01 s [0.07 GFlops/s, 0.01 GPts/s] * section2<<953,751>,<953,751>,<953,751>,<953,751>,<953,751>,<953,751>> with OI=2.42 computed in 0.02 s [2.91 GFlops/s] Performance[mode=advanced] arguments: {}
Q= 25 receiver data min/max; -2.184589e+01 +4.205808e+01 Q=100 receiver data min/max; -2.200673e+01 +4.218462e+01
# Continuous integration hooks
# We ensure the norm of these computed wavefields is repeatable
assert np.isclose(norm(uQ25), 26.749, atol=0, rtol=1e-3)
assert np.isclose(norm(u), 161.131, atol=0, rtol=1e-3)
assert np.isclose(norm(recQ25), 368.153, atol=0, rtol=1e-3)
assert np.isclose(norm(rec), 413.414, atol=0, rtol=1e-3)
# NBVAL_IGNORE_OUTPUT
# Plot the two wavefields, normalized to Q=100 (the larger amplitude)
amax_Q25 = 1.0 * np.max(np.abs(uQ25.data[1,:,:]))
amax_Q100 = 1.0 * np.max(np.abs(u.data[1,:,:]))
print("amax Q= 25; %12.6f" % (amax_Q25))
print("amax Q=100; %12.6f" % (amax_Q100))
plt.figure(figsize=(12,8))
plt.subplot(1, 2, 1)
plt.imshow(np.transpose(uQ25.data[1,:,:] / amax_Q100), cmap="seismic",
vmin=-1, vmax=+1, extent=plt_extent)
plt.colorbar(orientation='horizontal', label='Amplitude')
plt.plot([origin[0], origin[0], extent[0], extent[0], origin[0]],
[origin[1], extent[1], extent[1], origin[1], origin[1]],
'black', linewidth=4, linestyle=':', label="Absorbing Boundary")
plt.plot(src.coordinates.data[:, 0], src.coordinates.data[:, 1], \
'red', linestyle='None', marker='*', markersize=15, label="Source")
plt.xlabel("X Coordinate (m)")
plt.ylabel("Z Coordinate (m)")
plt.title("Data for $Q=25$ model")
plt.subplot(1, 2, 2)
plt.imshow(np.transpose(u.data[1,:,:] / amax_Q100), cmap="seismic",
vmin=-1, vmax=+1, extent=plt_extent)
plt.colorbar(orientation='horizontal', label='Amplitude')
plt.plot([origin[0], origin[0], extent[0], extent[0], origin[0]],
[origin[1], extent[1], extent[1], origin[1], origin[1]],
'black', linewidth=4, linestyle=':', label="Absorbing Boundary")
plt.plot(src.coordinates.data[:, 0], src.coordinates.data[:, 1], \
'red', linestyle='None', marker='*', markersize=15, label="Source")
plt.xlabel("X Coordinate (m)")
plt.ylabel("Z Coordinate (m)")
plt.title("Data for $Q=100$ model")
plt.tight_layout()
None
amax Q= 25; 0.156892 amax Q=100; 0.940393
# NBVAL_IGNORE_OUTPUT
# Plot the two receiver gathers, normalized to Q=100 (the larger amplitude)
amax_Q25 = 0.1 * np.max(np.abs(recQ25.data[:]))
amax_Q100 = 0.1 * np.max(np.abs(rec.data[:]))
print("amax Q= 25; %12.6f" % (amax_Q25))
print("amax Q=100; %12.6f" % (amax_Q100))
plt.figure(figsize=(12,8))
plt.subplot(1, 2, 1)
plt.imshow(recQ25.data[:,:] / amax_Q100, cmap="seismic",
vmin=-1, vmax=+1, extent=plt_extent, aspect="auto")
plt.colorbar(orientation='horizontal', label='Amplitude')
plt.xlabel("X Coordinate (m)")
plt.ylabel("Z Coordinate (m)")
plt.title("Receiver gather for $Q=25$ model")
plt.subplot(1, 2, 2)
plt.imshow(rec.data[:,:] / amax_Q100, cmap="seismic",
vmin=-1, vmax=+1, extent=plt_extent, aspect="auto")
plt.colorbar(orientation='horizontal', label='Amplitude')
plt.xlabel("X Coordinate (m)")
plt.ylabel("Z Coordinate (m)")
plt.title("Receiver gather for $Q=100$ model")
plt.tight_layout()
None
amax Q= 25; 4.205808 amax Q=100; 4.218462
Note this takes a long time ... about 50 seconds, but obviates the need to solve for the time update expression as we did above.
If you would like to see the time update equation as generated by Devito symbolic optimization, uncomment the lines for the solve below.
# NBVAL_IGNORE_OUTPUT
# Define the partial_differential equation
# Note the backward shifted time derivative is obtained via u.dt(x0=t-0.5*t.spacing)
pde = (b / m**2) * (wOverQ_100 * u.dt(x0=t-0.5*t.spacing) + u.dt2) -\
(b * u.dx(x0=x+0.5*x.spacing)).dx(x0=x-0.5*x.spacing) -\
(b * u.dz(x0=z+0.5*z.spacing)).dz(x0=z-0.5*z.spacing)
# Uncomment the next 5 lines to see the equation as generated by Devito
# t1 = timer()
# stencil = Eq(u.forward, solve(pde, u.forward))
# t2 = timer()
# print("solve ran in %.4f seconds." % (t2-t1))
# stencil
This concludes the implementation of the nonlinear forward operator. This series continues in the next notebook that describes the implementation of the Jacobian linearized forward and adjoint operators.
Charles Cerjan, Dan Kosloff, Ronnie Kosloff, and Moshe Resheq
Geophysics, Vol. 50, No. 4
https://library.seg.org/doi/pdfplus/10.1190/segam2016-13878451.1
Bengt Fornberg
Mathematics of Computation, Vol. 51, No. 184
http://dx.doi.org/10.1090/S0025-5718-1988-0935077-0
https://web.njit.edu/~jiang/math712/fornberg.pdf
Kenneth Bube, John Washbourne, Raymond Ergas, and Tamas Nemeth
SEG Technical Program Expanded Abstracts
https://library.seg.org/doi/10.1190/segam2016-13878451.1