Optimum Decision Making in Python
Here's a simple interactive demonstration of optimum decision making in the precense of uncertainty. This should help you get started with using your uncertainty models to make decisions.
Definition: the decision that minimizes the expected loss or maximizes the expected profit
Procedure:
calculate the expected profit for each development scenario
\begin{equation} E{𝑃_𝑠} = \frac{1}{\sum_{m=1}^M \lambda_m} \sum_{\ell=1}^L \lambda_m \cdot P_{s,m} \end{equation}define the optimal development scenario as the one with highest expected profit
It is natural to assume that a narrower uncertainty distribution is better.
Yet, higher variance has more downside and more upside.
To make an optimum estimate / decision from a distribution we need a function that describes the:
The optimum decision making in the precense of uncertainty assumes:
Here's the steps to get setup in Python with the GeostatsPy package:
There are examples below with these functions. You can go here to see a list of the available functions, https://git.io/fh4eX, other example workflows and source code.
import geostatspy.GSLIB as GSLIB # GSLIB utilies, visualization and wrapper
import geostatspy.geostats as geostats # GSLIB methods convert to Python
We will also need some standard packages. These should have been installed with Anaconda 3.
import numpy as np # ndarrys for gridded data
import pandas as pd # DataFrames for tabular data
import os # set working directory, run executables
import matplotlib.pyplot as plt # for plotting
from scipy import stats # summary statistics
from scipy.stats.kde import gaussian_kde # PDF estimation from data
import math # trig etc.
import random
from ipywidgets import interactive # widgets and interactivity
from ipywidgets import widgets
from ipywidgets import Layout
from ipywidgets import Label
from ipywidgets import VBox, HBox
working_directory = "C://PGE383//" # for convenience you can set the default input file information here
bootstrap_file = 'avg_por_boot.csv' # only used if you load your bootstrap results from a file
feature_name = 'avg_por_boot'
working_directory
'C://PGE383//'
The following code includes the:
# interactive calculation of the random sample set (control of source parametric distribution and number of samples)
l = widgets.Text(value=' Optimum Decision Making Demonstration, Michael Pyrcz, Associate Professor, The University of Texas at Austin',layout=Layout(width='950px', height='30px'))
load = widgets.Checkbox(
value=False,
description='Load Data',
disabled=False,
button_style='', # 'success', 'info', 'warning', 'danger' or ''
tooltip='Description',
icon='check' # (FontAwesome names without the `fa-` prefix)
)
mean = widgets.FloatSlider(min=0.0, max = 1.0, value = 0.2,step=0.01,description = '$\mu$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
mean.style.handle_color = 'red'
stdev = widgets.FloatSlider(min=0.001, max = 0.25, value = 0.03,step=0.01,description = '$\sigma$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
stdev.style.handle_color = 'red'
file = widgets.Text(
value= working_directory + bootstrap_file,
placeholder='Type something',
description='String:',
disabled=False
)
feature = widgets.Text(
value=feature_name,
placeholder='Type something',
description='String:',
disabled=False
)
file.layout.display = "none"
feature.layout.display = "none"
ui1 = widgets.VBox([load,mean,stdev,file,feature],kwargs = {'justify_content':'center'})
slope_under = widgets.FloatSlider(min=0.01, max = 1.0,value = 0.1,step=0.01,description = '$m_{under}$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
slope_under.style.handle_color = 'blue'
power_under = widgets.FloatSlider(min=1.0, max = 5.0, value = 0.0, description = '$p_{under}$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
power_under.style.handle_color = 'blue'
ui2 = widgets.VBox([slope_under,power_under],kwargs = {'justify_content':'center'})
slope_over = widgets.FloatSlider(min=0.01, max = 1.0,value = 0.1,step=0.1,description = '$m_{over}$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
slope_over.style.handle_color = 'blue'
power_over = widgets.FloatSlider(min=1.0, max = 5.0, value = 0.0, description = '$p_{over}$',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
power_over.style.handle_color = 'blue'
ui3 = widgets.VBox([slope_over,power_over],kwargs = {'justify_content':'center'})
step = widgets.FloatLogSlider(min=-4, max = -1, value = 0.01, description = 'Step',orientation='horizontal',layout=Layout(width='230px', height='50px'),continuous_update=False)
step.style.handle_color = 'gray'
ui4 = widgets.VBox([step],kwargs = {'justify_content':'none'})
ui = widgets.HBox([ui1,ui2,ui3,ui4])
ui_title = widgets.VBox([l,ui],)
out = widgets.Output()
def update_interface(change):
if change.new == True:
mean.layout.display = "none"
stdev.layout.display = "none"
file.layout.display = "flex"
feature.layout.display = "flex"
else:
mean.layout.display = "flex"
stdev.layout.display = "flex"
file.layout.display = "none"
feature.layout.display = "none"
load.observe(update_interface, names = 'value')
# def handle_slider_change(change):
# caption.value = 'The slider value is ' + (
# 'negative' if change.new < 0 else 'nonnegative'
# )
# slider.observe(handle_slider_change, names='value')
def make_plot(load,mean,stdev,file,feature,slope_under,power_under,slope_over,power_over,step):
if load == False:
xmin = mean - 5.0 * stdev
xmax = mean + 5.0 * stdev
X = np.arange(xmin,xmax,step) # build uncertainty distribution
pdf = stats.norm.pdf(X, loc = mean, scale = stdev)
y = stats.norm.rvs(loc=mean,scale=stdev,size=10000)
else:
df = pd.read_csv(file)
y = df[feature]
#xmin = np.min(y); xmax = np.max(y)
xmin = 0; xmax = 1.0
X = np.arange(xmin,xmax,step) # build uncertainty distribution
kernel = gaussian_kde(y)
pdf = kernel(X)
delta = np.arange(-1*(xmax-xmin)*4,(xmax-xmin)*4,step) # build loss function
loss = np.zeros(len(delta))
rloss = np.zeros(len(delta)) # flip for convolve operator
for id, d in enumerate(delta):
if d < 0.0:
loss[id] = pow(abs(d),power_under)*slope_under
rloss[len(delta)-id-1] = pow(abs(d),power_under)*slope_under
else:
loss[id] = pow(d,power_over)*slope_over
rloss[len(delta)-id-1] = pow(d,power_over)*slope_over
pdf_norm = pdf / pdf.sum() # calculate expected loss
exp_loss = np.convolve(pdf_norm,loss,mode='valid')
x_exp_loss = np.arange(-0.5*step*(len(exp_loss))+mean,0.5*step*(len(exp_loss))+mean,step)
x_exp_loss = x_exp_loss[0:len(exp_loss)] # avoid rounding error
inside = np.logical_and(x_exp_loss >= xmin, x_exp_loss <= xmax)
max_loss_plot = np.max(exp_loss,where = inside,initial=0.0)
best_estimate = x_exp_loss[np.argmin(exp_loss)]
min_loss = np.min(exp_loss)
plt.subplot(131) # plot uncertainty distribution
plt.plot(X,pdf,'--',color='black',alpha=0.4,zorder=1)
hist = plt.hist(y,bins=X,edgecolor='black',color='red',alpha=0.1,density=True,stacked=True)
plt.plot([best_estimate,best_estimate],[0,np.max(hist[0])*1.2],color='black',linestyle='--')
plt.annotate('Optimum Estimate',[best_estimate,np.max(hist[0])*0.6], rotation=270)
plt.xlim(0.0,1.0); plt.xlabel("Feature, $y_1$"); plt.ylim([0,np.max(hist[0])*1.2])
plt.title('Uncertainty Distribution'); plt.ylabel('Density')
plt.grid(color='grey', ls = '-.', lw = 0.1)
plt.subplot(132) # plot loss function, loss vs. estimate error
plt.plot(delta,loss,color='blue',alpha=0.2)
# plt.plot([0,0],[0,np.max(loss)],color='black',linestyle='--',alpha=0.3)
plt.plot([0,0],[0,1.0],color='black',linestyle='--',alpha=0.3)
# plt.annotate('Underestimation',[(xmax-xmin)*-0.1,np.max(loss)*0.8], rotation=0,horizontalalignment='right')
# plt.annotate('Overestimation',[(xmax-xmin)*0.1,np.max(loss)*0.8], rotation=0,horizontalalignment='left')
plt.annotate('Underestimation',[(xmax-xmin)*-0.1,0.7], rotation=0,horizontalalignment='right')
plt.annotate('Overestimation',[(xmax-xmin)*0.1,0.7], rotation=0,horizontalalignment='left')
# plt.xlim([-1*(xmax-xmin)*3,(xmax-xmin)*3])
# plt.ylim([0,np.max(loss)])
plt.xlim([-1,1]); plt.ylim([0,1.0])
plt.xlabel('Error In Estimate'); plt.title('Loss Function'); plt.ylabel('Loss')
plt.grid(color='grey', ls = '-.', lw = 0.1)
plt.subplot(133) # plot expected loss vs. estimate
plt.plot(x_exp_loss,exp_loss,color='black',alpha=0.2)
plt.plot([best_estimate,best_estimate],[0,max_loss_plot],color='black',linestyle='--')
plt.xlim([0.0,1.0]); # plt.ylim([0,1.0])
plt.annotate('Optimum Estimate',[best_estimate,max_loss_plot*0.6], rotation=270)
# plt.plot([xmin,xmax],[min_loss,min_loss],color='black',linestyle='--',alpha=0.3)
plt.plot([0,1],[min_loss,min_loss],color='black',linestyle='--',alpha=0.3)
plt.annotate('Minimum Expected Loss',[xmin+(xmax-xmin)*0.2,min_loss*1.1], rotation=0,horizontalalignment='left')
plt.ylim([0,max_loss_plot])
plt.xlabel('Estimate, $y^*_1$'); plt.title('Expected Loss vs. Estimate'); plt.ylabel('Expected Loss')
plt.grid(color='grey', ls = '-.', lw = 0.1)
plt.subplots_adjust(left=0.0, bottom=0.0, right=3.0, top=1.2, wspace=0.2, hspace=0.2)
plt.show()
interactive_plot = widgets.interactive_output(make_plot, {'load':load,'mean':mean,'stdev':stdev,'file':file,'feature':feature,'slope_under':slope_under,'power_under':power_under,'slope_over':slope_over,'power_over':power_over,'step':step})
interactive_plot.clear_output(wait = True) # reduce flickering by delaying plot updating
display(ui_title, interactive_plot) # display the interactive plot
VBox(children=(Text(value=' Optimum Decision Making Demonstration, Michae…
Output(outputs=({'output_type': 'display_data', 'data': {'text/plain': '<Figure size 432x288 with 3 Axes>', 'i…
caption = widgets.Label(value='The values of range1 and range2 are synchronized')
slider = widgets.IntSlider(min=-5, max=5, value=1, description='Slider')
def handle_slider_change(change):
caption.value = 'The slider value is ' + (
'negative' if change.new < 0 else 'nonnegative'
)
slider.observe(handle_slider_change, names='value')
display(caption, slider)
Label(value='The values of range1 and range2 are synchronized')
IntSlider(value=1, description='Slider', max=5, min=-5)
This was a basic demonstration of optimum decision making in the precense of uncertainty.
I have other demonstrations on the basics of working with DataFrames, ndarrays, univariate statistics, plotting data, declustering, data transformations, trend modeling, multivariate analysis and many other workflows available at https://github.com/GeostatsGuy/PythonNumericalDemos and https://github.com/GeostatsGuy/GeostatsPy.
I hope this was helpful,
Michael
Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions
With over 17 years of experience in subsurface consulting, research and development, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and geoscientists' impact in subsurface resource development.
For more about Michael check out these links:
I hope this content is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.
Want to invite me to visit your company for training, mentoring, project review, workflow design and / or consulting? I'd be happy to drop by and work with you!
Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!
I can be reached at mpyrcz@austin.utexas.edu.
I'm always happy to discuss,
Michael
Michael Pyrcz, Ph.D., P.Eng. Associate Professor The Hildebrand Department of Petroleum and Geosystems Engineering, Bureau of Economic Geology, The Jackson School of Geosciences, The University of Texas at Austin