This a simple demonstration of the variogram nugget effect structure for a 1D datasets with variable spatial continuity and visualization.
we will see that the nugget effect results from random error
we will perform the calculations in 1D for fast run times and ease of visualization.
The following code loads the required libraries.
import os # to set current working directory
import numpy as np # arrays and matrix math
import matplotlib.pyplot as plt # for plotting
from matplotlib.gridspec import GridSpec # custom matrix plots
plt.rc('axes', axisbelow=True) # set axes and grids in the background for all plots
from ipywidgets import interactive # widgets and interactivity
from ipywidgets import widgets
from ipywidgets import Layout
from ipywidgets import Label
from ipywidgets import VBox, HBox
import math # for square root
from geostatspy import GSLIB # affine correction
seed = 73073
If you get a package import error, you may have to first install some of these packages. This can usually be accomplished by opening up a command window on Windows and then typing 'python -m pip install [package-name]'. More assistance is available with the respective package docs.
I always like to do this so I don't lose files and to simplify subsequent read and writes (avoid including the full address each time). Also, in this case make sure to place the required (see below) data file in this working directory.
#os.chdir("C:\PGE337") # set the working directory
We need a variogram calculator that is fast and works well with 1D.
References:
Pyrcz, M.J., Jo. H., Kupenko, A., Liu, W., Gigliotti, A.E., Salomaki, T., and Santos, J., 2021, GeostatsPy Python Package, PyPI, Python Package Index, https://pypi.org/project/geostatspy/.
def gam(array, tmin, tmax, xsiz, ysiz, ixd, iyd, nlag, isill):
"""GSLIB's GAM program (Deutsch and Journel, 1998) converted from the
original Fortran to Python by Michael Pyrcz, the University of Texas at
Austin (Jan, 2019).
:param array: 2D gridded data / model
:param tmin: property trimming limit
:param tmax: property trimming limit
:param xsiz: grid cell extents in x direction
:param ysiz: grid cell extents in y direction
:param ixd: lag offset in grid cells
:param iyd: lag offset in grid cells
:param nlag: number of lags to calculate
:param isill: 1 for standardize sill
:return: TODO
"""
if array.ndim == 2:
ny, nx = array.shape
elif array.ndim == 1:
ny, nx = len(array),1
array = array.reshape((ny,1))
nvarg = 1 # for multiple variograms repeat the program
nxy = nx * ny # TODO: not used
mxdlv = nlag
# Allocate the needed memory
lag = np.zeros(mxdlv)
vario = np.zeros(mxdlv)
hm = np.zeros(mxdlv)
tm = np.zeros(mxdlv)
hv = np.zeros(mxdlv) # TODO: not used
npp = np.zeros(mxdlv)
ivtail = np.zeros(nvarg + 2)
ivhead = np.zeros(nvarg + 2)
ivtype = np.zeros(nvarg + 2)
ivtail[0] = 0
ivhead[0] = 0
ivtype[0] = 0
# Summary statistics for the data after trimming
inside = (array > tmin) & (array < tmax)
avg = array[(array > tmin) & (array < tmax)].mean() # TODO: not used
stdev = array[(array > tmin) & (array < tmax)].std()
var = stdev ** 2.0
vrmin = array[(array > tmin) & (array < tmax)].min() # TODO: not used
vrmax = array[(array > tmin) & (array < tmax)].max() # TODO: not used
num = ((array > tmin) & (array < tmax)).sum() # TODO: not used
# For the fixed seed point, loop through all directions
for iy in range(0, ny):
for ix in range(0, nx):
if inside[iy, ix]:
vrt = array[iy, ix]
ixinc = ixd
iyinc = iyd
ix1 = ix
iy1 = iy
for il in range(0, nlag):
ix1 = ix1 + ixinc
if 0 <= ix1 < nx:
iy1 = iy1 + iyinc
if 1 <= iy1 < ny:
if inside[iy1, ix1]:
vrh = array[iy1, ix1]
npp[il] = npp[il] + 1
tm[il] = tm[il] + vrt
hm[il] = hm[il] + vrh
vario[il] = vario[il] + ((vrh - vrt) ** 2.0)
# Get average values for gam, hm, tm, hv, and tv, then compute the correct
# "variogram" measure
for il in range(0, nlag):
if npp[il] > 0:
rnum = npp[il]
lag[il] = np.sqrt((ixd * xsiz * il) ** 2 + (iyd * ysiz * il) ** 2)
vario[il] = vario[il] / float(rnum)
hm[il] = hm[il] / float(rnum)
tm[il] = tm[il] / float(rnum)
# Standardize by the sill
if isill == 1:
vario[il] = vario[il] / var
# Semivariogram
vario[il] = 0.5 * vario[il]
return lag, vario, npp
Here's the interactive interface. I make a correlated 1D data set, add noise and then calculate the histogram and variogram with and without noise.
n = 200; mean = 0.20; stdev = 0.03; nlag = 100; pnoise = 0.5;
l = widgets.Text(value=' Variogram Nugger Effect Demonstration, Prof. Michael Pyrcz, The University of Texas at Austin',
layout=Layout(width='930px', height='30px'))
pnoise = widgets.FloatSlider(min=0.0,max = 1.0,value=0.0,step = 0.05,description = 'Noise %',orientation='horizontal',style = {'description_width': 'initial'},layout=Layout(width='500px',height='30px'),continuous_update=False)
vrange = widgets.IntSlider(min=1,max = 100,value=30,step = 5,description = 'Spatial Continuity Range',orientation='horizontal',style = {'description_width': 'initial'},layout=Layout(width='500px',height='30px'),continuous_update=False)
ui = widgets.HBox([pnoise,vrange],)
ui2 = widgets.VBox([l,ui],)
def run_plot(pnoise,vrange):
psignal = 1 - pnoise
np.random.seed(seed = seed)
data0 = np.random.normal(loc=0.20,scale=0.03,size=n+1000)
kern1 = np.ones(vrange)
data1 = np.convolve(data0,kern1,mode='same')
data1_sub = GSLIB.affine(data1[500:n+500],mean,stdev)
data1_sub_rescale = GSLIB.affine(data1[500:n+500],mean,stdev*math.sqrt(psignal))
data1_sub_noise = data1_sub_rescale + np.random.normal(loc=0.0,scale = stdev*math.sqrt(pnoise),size=n)
data1_sub_noise = GSLIB.affine(data1_sub_noise,mean,stdev)
#fig, axs = plt.subplots(2,3, gridspec_kw={'width_ratios': [2, 1, 1, 1]})
fig = plt.figure()
spec = fig.add_gridspec(2, 3)
ax1 = fig.add_subplot(spec[0, :])
plt.plot(np.arange(1,n+1),data1_sub,color='blue',alpha=0.3,lw=3,label='Original')
plt.plot(np.arange(1,n+1),data1_sub_noise,color='red',alpha=0.3,lw=3,label='Original + Noise')
plt.xlim([0,n]); plt.ylim([mean-4*stdev,mean+4*stdev])
plt.xlabel('Location (m)'); plt.ylabel('Porosity (%)'); plt.title('Porosity Over Location, Original and with Random Noise')
plt.grid(); plt.legend(loc='upper right')
ax2 = fig.add_subplot(spec[1, 0])
plt.hist(data1_sub,color='blue',alpha=0.3,edgecolor='black',bins=np.linspace(mean-4*stdev,mean+4*stdev,30),
label='Original')
plt.hist(data1_sub_noise,color='red',alpha=0.3,edgecolor='black',bins=np.linspace(mean-4*stdev,mean+4*stdev,30),
label='Original + Noise')
plt.xlim([mean-4*stdev,mean+4*stdev]); plt.ylim([0,30])
plt.xlabel('Porosity (%)'); plt.ylabel('Frequency'); plt.title('Histogram')
plt.grid(); plt.legend(loc='upper right')
ax3 = fig.add_subplot(spec[1, 1])
labels = ['Signal','Noise',]
plt.pie([psignal, pnoise,],radius = 1, autopct='%1.1f%%',
colors = ['#0000FF','#FF0000'], explode = [.02,.02],wedgeprops = {"edgecolor":"k",'linewidth':1,"alpha":0.3},)
plt.title('Variance of Signal and Noise')
plt.legend(labels,loc='lower left')
ax4 = fig.add_subplot(spec[1, 2])
data1_sub_reshape = data1_sub.reshape((n,1))
lag,gamma,npp = gam(data1_sub,-9999,9999,1.0,1.0,0,1,nlag,1)
_,gamma_noise,_ = gam(data1_sub_noise,-9999,9999,1.0,1.0,0,1,nlag,1)
plt.scatter(lag,gamma,s=30,color='blue',alpha=0.3,edgecolor='black',label='Original')
plt.scatter(lag,gamma_noise,s=30,color='red',alpha=0.3,edgecolor='black',label='Original + Noise')
plt.plot([0,nlag],[1.0,1.0],color='black',ls='--')
plt.xlim([0,nlag]); plt.ylim([0,2.0]); plt.grid(); plt.legend(loc='upper right')
plt.xlabel('Lag Distance (h)'); plt.ylabel('Variogram'); plt.title('Experimental Variogram')
plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=1.6, wspace=0.1, hspace=0.3); plt.show()
# connect the function to make the samples and plot to the widgets
interactive_plot = widgets.interactive_output(run_plot, {'pnoise':pnoise,'vrange':vrange})
interactive_plot.clear_output(wait = True) # reduce flickering by delaying plot updating
Take some time to observe a random phenomenon.
We can use convolution to add spatial continuity to a random set of values
we won't go into the details, but the convolution kernel can actually be related to the variogram in sequential Gaussian simulation.
we apply an affine correction to ensure that we don't change the mean or standard deviation with the convolution, we just change the spatial continuity
since we are using convolution, it is likely that there will be edge artifacts, so we have 'cut off' the edges of the model (500 m on each side).
display(ui2, interactive_plot) # display the interactive plot
VBox(children=(Text(value=' Variogram Nugger Effect Demons…
Output(outputs=({'output_type': 'display_data', 'data': {'text/plain': '<Figure size 432x288 with 4 Axes>', 'i…
This was an interactive demonstration of the variogram nugget effect structure resulting from the addition of random noise to spatial data.
I have many other demonstrations on simulation to build spatial models with spatial continuity and many other workflows available here, along with a package for geostatistics in Python called GeostatsPy.
We hope this was helpful,
Michael
Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions
With over 17 years of experience in subsurface consulting, research and development, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and geoscientists' impact in subsurface resource development.
For more about Michael check out these links:
I hope this content is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.
Want to invite me to visit your company for training, mentoring, project review, workflow design and / or consulting? I'd be happy to drop by and work with you!
Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!
I can be reached at mpyrcz@austin.utexas.edu.
I'm always happy to discuss,
Michael
Michael Pyrcz, Ph.D., P.Eng. Associate Professor The Hildebrand Department of Petroleum and Geosystems Engineering, Bureau of Economic Geology, The Jackson School of Geosciences, The University of Texas at Austin