Here is an interactive workflow based on Sivia's (1996) Bayesian coin example to help you efficiently learn Bayesian updating, which is central to Bayesian probability calculations and machine learning.
I have recorded a walk-through of this interactive dashboard in my Data Science Interactive Python Demonstrations series on my YouTube channel.
Join me for a walk-through of this dashboard in 03 Data Science Interactive: Bayesian Coin Example. I'm stoked to guide you and share observations and things to try out!
I have a lecture on Bayesian Probability as part of my Data Analytics and Geostatistics course. Note, for all my recorded lectures, the interactive and well-documented workflow demonstrations are available on my GitHub repository GeostatsGuy's Python Numerical Demos.
Also, I have another interactive dashboard for the Sivia (1996) Bayesian coin example, with a walk-through on my YouTube channel.
First, let's remind ourselves of Bayesian updating with a convenient notation:
\begin{equation} P( Model | Data ) = \frac{P( Data | Model ) \cdot P( Model )}{P( Data )} \quad Posterior = \frac{Likelihood \cdot Prior}{Evidence} \end{equation}

We can marginalize or normalize to remove the need for the evidence term:
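For completeness, the evidence term itself can be recovered by marginalizing over all candidate models (written here for a discrete set of models):

\begin{equation} P( Data ) = \sum_{i} P( Data | Model_i ) \cdot P( Model_i ) \end{equation}

On the discretized grid used in the dashboard below, this is exactly what dividing each distribution by its sum accomplishes.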
\begin{equation} P( Model | Data ) \propto P( Data | Model ) \cdot P( Model ) \end{equation}

I have a coin and you need to figure out if it is a fair coin!
You start with your prior-to-data assessment of my coin, a prior distribution representing your uncertainty in the probability of heads, $P(H)$.
Then you perform a set of coin tosses to build a likelihood distribution, $P(Tosses | H)$.
Then you update the prior distribution with the likelihood distribution to get the posterior distribution, $P(H | Tosses)$.
\begin{equation} P( H | Tosses ) \propto P( Tosses | H ) \cdot P( H ) \quad Posterior \propto Likelihood \cdot Prior \end{equation}

Here are some more details to assist:
Here are the steps to get set up in Python with the GeostatsPy package:
That's all!
The following code loads the required libraries.
%matplotlib inline
from ipywidgets import interactive # widgets and interactivity
from ipywidgets import widgets
from ipywidgets import Layout
import matplotlib; import matplotlib.pyplot as plt # plotting
from matplotlib.ticker import (MultipleLocator, AutoMinorLocator) # control of axes ticks
plt.rc('axes', axisbelow=True) # set axes and grids in the background for all plots
import numpy as np # working with arrays
from scipy.stats import binom, norm # parametric distributions
Here's the code to build the dashboard.
# 4 slider bars for the model input
l = widgets.Text(value=' Sivia (1996) Bayesian Updating Coin Example Demonstration, Michael Pyrcz, Professor, The University of Texas at Austin',layout=Layout(width='950px', height='30px'))
a = widgets.FloatSlider(min=0.0, max = 1.0, value = 0.5, description = r'$\overline{P(heads)}$',continuous_update=False)
d = widgets.FloatSlider(min=0.01, max = 1.0, value = 0.1, step = 0.01, description = r'$\sigma_{P(Heads)}$',continuous_update=False)
b = widgets.FloatSlider(min = 0, max = 1.0, value = 0.5, description = r'$\frac{n_{heads}}{n_{tosses}}$',continuous_update=False)
c = widgets.IntSlider(min = 5, max = 1000, value = 100, description = r'$n_{tosses}$',continuous_update=False)
ui = widgets.HBox([a,d,b,c],)
uik = widgets.VBox([l,ui],)
def f(a, b, c, d): # function to make the plot
heads = int(c * b)
tails = c - heads
x = np.linspace(0.0, 1.0, num=1000)
prior = norm.pdf(x,loc = a, scale = d)
prior = prior / np.sum(prior)
plt.subplot(221)
plt.plot(x, prior,color='red',alpha=0.8,linewidth=3) # prior distribution of coin fairness
plt.arrow(0.48,0.04,-0.4,0,linewidth=1,facecolor='grey',edgecolor='black',head_length=.03)
plt.annotate('Increasingly Weighted Tails',[0.10,0.042],rotation=0.0)
plt.arrow(0.52,0.04,0.4,0,linewidth=1,facecolor='grey',edgecolor='black',head_length=.03)
plt.annotate('Increasingly Weighted Heads',[0.55,0.042],rotation=0.0)
plt.annotate('Double Tailed Coin',[0.02,0.02],rotation=90.0)
plt.annotate('Double Headed Coin',[0.97,0.02],rotation=90.0)
plt.xlim(0.0,1.0)
plt.xlabel('P(Heads)'); plt.ylabel('Density'); plt.title(r'$Prior$ $Distribution$ - What do you think of the coin before the tosses?')
plt.ylim(0, 0.05)
plt.axvline(x=0.5,color='grey',linestyle='--');
plt.annotate('Fair Coin',[0.49,0.02],rotation=90.0)
plt.grid()
plt.gca().xaxis.grid(True, which='major',linewidth = 1.0); plt.gca().xaxis.grid(True, which='minor',linewidth = 0.2) # add y grids
plt.gca().yaxis.grid(True, which='major',linewidth = 1.0); plt.gca().yaxis.grid(True, which='minor',linewidth = 0.2) # add y grids
plt.gca().tick_params(which='major',length=7); plt.gca().tick_params(which='minor', length=4)
plt.gca().xaxis.set_minor_locator(AutoMinorLocator()); plt.gca().yaxis.set_minor_locator(AutoMinorLocator()) # turn on minor ticks
plt.subplot(222) # results from the coin tosses
labels = ['Heads','Tails']
label_pos = [i for i, _ in enumerate(labels)]
plt.bar(labels,[heads,tails],color=['red','blue'],edgecolor='black',alpha=0.8)
plt.xticks(label_pos, labels)
plt.ylim([0,1000]); plt.ylabel('Frequency')
#plt.pie([heads, tails],labels = ['heads','tails'],radius = 0.5*(c/1000)+0.5, autopct='%1.1f%%', colors = ['#ff9999','#66b3ff'], explode = [.02,.02], wedgeprops = {"edgecolor":"k",'linewidth': 1} )
plt.title('The Experimental Result from ' + str(c) + ' Coin Tosses')
plt.gca().yaxis.grid(True, which='major',linewidth = 1.0); plt.gca().yaxis.grid(True, which='minor',linewidth = 0.2) # add y grids
plt.gca().tick_params(which='major',length=7); plt.gca().tick_params(which='minor', length=4)
plt.gca().yaxis.set_minor_locator(AutoMinorLocator())
likelihood = binom.pmf(heads,c,x)
likelihood = likelihood/np.sum(likelihood)
plt.subplot(223) # likelihood distribution given the coin tosses
plt.plot(x, likelihood,color='red',alpha=0.8,linewidth=3)
plt.arrow(0.48,0.04,-0.4,0,linewidth=1,facecolor='grey',edgecolor='black',head_length=.03)
plt.annotate('Increasingly Weighted Tails',[0.10,0.042],rotation=0.0)
plt.arrow(0.52,0.04,0.4,0,linewidth=1,facecolor='grey',edgecolor='black',head_length=.03)
plt.annotate('Increasingly Weighted Heads',[0.55,0.042],rotation=0.0)
plt.annotate('Double Tailed Coin',[0.02,0.02],rotation=90.0)
plt.annotate('Double Headed Coin',[0.97,0.02],rotation=90.0)
plt.xlim(0.0,1.0); plt.ylim(0, 0.05)
plt.xlabel('P(Tosses | Heads)'); plt.ylabel('Density'); plt.title(r'$Likelihood$ $Distribution$ - What do the coin tosses tell you about the coin?')
plt.axvline(x=0.5,color='grey',linestyle='--');
plt.annotate('Fair Coin',[0.49,0.02],rotation=90.0)
plt.grid()
plt.gca().xaxis.grid(True, which='major',linewidth = 1.0); plt.gca().xaxis.grid(True, which='minor',linewidth = 0.2) # add y grids
plt.gca().yaxis.grid(True, which='major',linewidth = 1.0); plt.gca().yaxis.grid(True, which='minor',linewidth = 0.2) # add y grids
plt.gca().tick_params(which='major',length=7); plt.gca().tick_params(which='minor', length=4)
plt.gca().xaxis.set_minor_locator(AutoMinorLocator()); plt.gca().yaxis.set_minor_locator(AutoMinorLocator()) # turn on minor ticks
post = prior * likelihood
post = post / np.sum(post)
plt.subplot(224) # posterior distribution
plt.plot(x, post,color='red',alpha=0.8,linewidth=3)
plt.arrow(0.48,0.04,-0.4,0,linewidth=1,facecolor='grey',edgecolor='black',head_length=.03)
plt.annotate('Increasingly Weighted Tails',[0.10,0.042],rotation=0.0)
plt.arrow(0.52,0.04,0.4,0,linewidth=1,facecolor='grey',edgecolor='black',head_length=.03)
plt.annotate('Increasingly Weighted Heads',[0.55,0.042],rotation=0.0)
plt.annotate('Double Tailed Coin',[0.02,0.02],rotation=90.0)
plt.annotate('Double Headed Coin',[0.97,0.02],rotation=90.0)
plt.xlim(0.0,1.0); plt.ylim(0, 0.05)
plt.xlabel('P(Heads | Tosses)'); plt.ylabel('Density'); plt.title(r"$Posterior$ $Distribution$ - What do you think of the coin after the tosses?")
plt.axvline(x=0.5,color='grey',linestyle='--');
plt.annotate('Fair Coin',[0.49,0.02],rotation=90.0)
plt.grid()
plt.gca().xaxis.grid(True, which='major',linewidth = 1.0); plt.gca().xaxis.grid(True, which='minor',linewidth = 0.2) # add y grids
plt.gca().yaxis.grid(True, which='major',linewidth = 1.0); plt.gca().yaxis.grid(True, which='minor',linewidth = 0.2) # add y grids
plt.gca().tick_params(which='major',length=7); plt.gca().tick_params(which='minor', length=4)
plt.gca().xaxis.set_minor_locator(AutoMinorLocator()); plt.gca().yaxis.set_minor_locator(AutoMinorLocator()) # turn on minor ticks
plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=1.6, wspace=0.2, hspace=0.3)
plt.show()
interactive_plot = widgets.interactive_output(f, {'a': a, 'd': d, 'b': b, 'c': c})
interactive_plot.clear_output(wait = True) # reduce flickering by delaying plot updating
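As a quick check independent of the widgets, the same prior × likelihood update can be run as a plain script. The settings below (a Gaussian prior centered on a fair coin and 60 heads observed in 100 tosses) are hypothetical values you could also set with the sliders:

```python
import numpy as np
from scipy.stats import binom, norm

x = np.linspace(0.0, 1.0, num=1000)          # grid over P(heads)

prior = norm.pdf(x, loc=0.5, scale=0.1)      # Gaussian prior, mean 0.5, st.dev. 0.1
prior = prior / np.sum(prior)                # normalize over the grid

heads, n_tosses = 60, 100                    # hypothetical experimental result
likelihood = binom.pmf(heads, n_tosses, x)   # binomial likelihood over the grid
likelihood = likelihood / np.sum(likelihood)

post = prior * likelihood                    # posterior proportional to likelihood x prior
post = post / np.sum(post)                   # normalize, removes the evidence term

print('Posterior mode P(heads) = ' + str(round(x[np.argmax(post)], 3)))
```

The posterior mode lands between the prior mean (0.5) and the observed proportion of heads (0.6), pulled toward the data; increasing $n_{tosses}$ at the same proportion pulls it further toward 0.6.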
What is the probability that Dr. Pyrcz has a fair coin? Let's solve this with Bayesian updating. The inputs:

* $\overline{P(heads)}$: mean of your prior distribution
* $\sigma_{P(Heads)}$: standard deviation of your prior distribution
* $\frac{n_{heads}}{n_{tosses}}$: proportion of heads observed over the tosses
* $n_{tosses}$: number of coin tosses
display(uik, interactive_plot) # display the interactive plot
This was a simple demonstration of interactive plotting in Python Jupyter Notebooks with the ipywidgets and matplotlib packages.
I have many other demonstrations on data analytics and machine learning, e.g. on the basics of working with DataFrames, ndarrays, univariate statistics, plotting data, declustering, data transformations, trend modeling and many other workflows available at https://github.com/GeostatsGuy/PythonNumericalDemos and https://github.com/GeostatsGuy/GeostatsPy.
I hope this was helpful,
Michael
Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions
With over 17 years of experience in subsurface consulting, research and development, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and geoscientists' impact in subsurface resource development.
For more about Michael check out these links:
I hope this content is helpful to those who want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.
Want to invite me to visit your company for training, mentoring, project review, workflow design and / or consulting? I'd be happy to drop by and work with you!
Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!
I can be reached at mpyrcz@austin.utexas.edu.
I'm always happy to discuss,
Michael
Michael Pyrcz, Ph.D., P.Eng. Professor, Cockrell School of Engineering and The Jackson School of Geosciences, The University of Texas at Austin