#!/usr/bin/env python
# coding: utf-8

# # Notebook 1: Bayes's Theorem
#
# [Bayesian Decision Analysis](https://allendowney.github.io/BayesianDecisionAnalysis/)
#
# Copyright 2021 Allen B. Downey
#
# License: [Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/)

# In[1]:


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


# ## Bayes's Theorem
#
# There are two ways to think about Bayes's Theorem:
#
# * It is a divide-and-conquer strategy for computing conditional probabilities. If it's hard to compute $P(A|B)$ directly, sometimes it is easier to compute the terms on the other side of the equation: $P(A)$, $P(B|A)$, and $P(B)$.
#
# * It is also a recipe for updating beliefs in the light of new data.
#
# When we are working with the second interpretation, we often write Bayes's Theorem with different variables. Instead of $A$ and $B$, we use $H$ and $D$, where
#
# * $H$ stands for "hypothesis", and
#
# * $D$ stands for "data".
#
# So we write Bayes's Theorem like this:
#
# $$P(H|D) = \frac{P(H) ~ P(D|H)}{P(D)}$$

# In this context, each term has a name:
#
# * $P(H)$ is the **prior probability** of the hypothesis, which represents how confident you are that $H$ is true prior to seeing the data,
#
# * $P(D|H)$ is the **likelihood** of the data, which is the probability of seeing $D$ if the hypothesis is true,
#
# * $P(D)$ is the **total probability of the data**, that is, the chance of seeing $D$ regardless of whether $H$ is true or not, and
#
# * $P(H|D)$ is the **posterior probability** of the hypothesis, which indicates how confident you should be that $H$ is true after taking the data into account.
#
# An example will make all of this clearer.

# ## The cookie problem
#
# Here's a problem I got from Wikipedia a long time ago, but it has since been edited away.
#
# > Suppose you have two bowls of cookies.
# >
# > * Bowl 1 contains 30 vanilla and 10 chocolate cookies.
# >
# > * Bowl 2 contains 20 vanilla and 20 chocolate cookies.
# >
# > You choose one of the bowls at random and, without looking into the bowl, choose one of the cookies at random. It turns out to be a vanilla cookie.
# >
# > What is the chance that you chose Bowl 1?
#
# We'll assume that there was an equal chance of choosing either bowl and an equal chance of choosing any cookie in the bowl.

# We can solve this problem using Bayes's Theorem. First, I'll define $H$ and $D$:
#
# * $H_1$ is the hypothesis that the bowl you chose is Bowl 1.
#
# * $D$ is the datum that the cookie is vanilla ("datum" is the rarely-used singular form of "data").
#
# What we want is the posterior probability of $H_1$, which is $P(H_1|D)$. It is not obvious how to compute it directly, but if we can figure out the terms on the right-hand side of Bayes's Theorem, we can get to it indirectly.

# 1. $P(H_1)$ is the prior probability of $H_1$, which is the probability of choosing Bowl 1 before we see the data. If there was an equal chance of choosing either bowl, $P(H_1)$ is $1/2$.
#
# 2. $P(D|H_1)$ is the likelihood of the data, which is the chance of getting a vanilla cookie if $H_1$ is true, in other words, the chance of getting a vanilla cookie from Bowl 1, which is $30/40$ or $3/4$.
#
# 3. $P(D)$ is the total probability of the data, which is the chance of getting a vanilla cookie whether $H_1$ is true or not.

# The prior and likelihood are relatively easy to compute.

# In[2]:


# Solution

prior = 1/2
prior


# In[3]:


# Solution

likelihood = 3/4
likelihood


# The probability of the data is more difficult.
# To compute $P(D)$, I'll use [the law of total probability](https://en.wikipedia.org/wiki/Law_of_total_probability).
#
# Let's define $H_2$ to be the hypothesis that the bowl you chose is Bowl 2.
#
# We know that either $H_1$ or $H_2$ is true (and not both), so we can write:
#
# $$P(D) = P(H_1) ~ P(D|H_1) + P(H_2) ~ P(D|H_2)$$
#
# Based on the statement of the problem, we have:
#
# * $P(H_1) = 1/2$
#
# * $P(D|H_1) = 3/4$
#
# * $P(H_2) = 1/2$
#
# * $P(D|H_2) = 1/2$

# In[4]:


# Solution

prob_data = (1/2) * (3/4) + (1/2) * (1/2)
prob_data


# Now that we have the terms on the right-hand side, we can use Bayes's Theorem to combine them.

# In[5]:


# Solution

posterior = prior * likelihood / prob_data
posterior


# The posterior probability is $0.6$, a little higher than the prior, which was $0.5$.
# So the vanilla cookie makes us a little more certain that we chose Bowl 1.
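# As a quick check, we can redo the computation with Python's `fractions` module, which keeps the exact values (this cell is an illustrative sketch, not one of the original solution cells; the `*_frac` names are introduced here):

# In[ ]:


from fractions import Fraction

# Illustrative sketch: the same update with exact arithmetic.
prior_frac = Fraction(1, 2)       # P(H1)
likelihood_frac = Fraction(3, 4)  # P(D|H1)
prob_data_frac = (Fraction(1, 2) * Fraction(3, 4)
                  + Fraction(1, 2) * Fraction(1, 2))  # P(D) = 5/8

prior_frac * likelihood_frac / prob_data_frac  # Fraction(3, 5)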
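# We can also check the answer by simulation (again an illustrative sketch, not part of the original notebook): draw a bowl and a cookie many times, keep only the vanilla draws, and see what fraction of them came from Bowl 1.

# In[ ]:


# Illustrative sketch: Monte Carlo check of the posterior probability.
np.random.seed(17)

n = 100_000
bowl = np.random.choice(['Bowl 1', 'Bowl 2'], size=n)  # choose a bowl at random
p_vanilla = np.where(bowl == 'Bowl 1', 3/4, 1/2)       # P(vanilla | bowl)
vanilla = np.random.random(n) < p_vanilla              # draw a cookie

# Among the vanilla draws, the fraction from Bowl 1 should be close to 0.6.
(bowl[vanilla] == 'Bowl 1').mean()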
# ## The Bayes table
#
# Computing the total probability of the data is often the hardest part of the problem.
# Fortunately, there is another way to solve problems like this that makes it easier: the Bayes table.
# You can write a Bayes table on paper or use a spreadsheet, but in this notebook I'll use a Pandas DataFrame.
#
# As an example, I'll use a Bayes table to solve the cookie problem. Here's an empty DataFrame with one row for each hypothesis:

# In[6]:


table = pd.DataFrame(index=['Bowl 1', 'Bowl 2'])


# Now I'll add a column to represent the priors:

# In[7]:


table['prior'] = 1/2, 1/2
table


# And a column for the likelihoods:
#
# * The chance of getting a vanilla cookie from Bowl 1 is 3/4.
#
# * The chance of getting a vanilla cookie from Bowl 2 is 1/2.

# In[8]:


table['likelihood'] = 3/4, 1/2
table


# The next step is similar to what we did with Bayes's Theorem; we multiply the priors by the likelihoods:

# In[9]:


table['unnorm'] = table['prior'] * table['likelihood']
table


# Each value in `unnorm` is the product of a prior and a likelihood.
#
# * The first element is $P(H_1) ~ P(D|H_1)$.
#
# * The second element is $P(H_2) ~ P(D|H_2)$.
#
# According to the law of total probability, the sum of those terms is the probability of the data, $P(D)$:
#
# $$P(D) = P(H_1) ~ P(D|H_1) + P(H_2) ~ P(D|H_2)$$
#
# So we can compute $P(D)$ by adding up the elements of `unnorm`:

# In[10]:


prob_data = table['unnorm'].sum()
prob_data


# Notice that we get 5/8, which is what we got by computing $P(D)$ explicitly.
#
# Now we divide by $P(D)$ to get the posterior probabilities:

# In[11]:


table['posterior'] = table['unnorm'] / prob_data
table


# The posterior probability for Bowl 1 is 0.6, which is what we got using Bayes's Theorem.
# As a bonus, we also get the posterior probability of Bowl 2, which is 0.4.
#
# When we add up the unnormalized posteriors and divide through, we force the posteriors to add up to 1. This process is called "normalization", which is why the total probability of the data is also called the "[normalizing constant](https://en.wikipedia.org/wiki/Normalizing_constant#Bayes'_theorem)".
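# As the name suggests, the normalized posteriors add up to 1 (a quick check added for illustration, not part of the original notebook):

# In[ ]:


table['posterior'].sum()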
# The function below collects these steps in a form we can reuse:

# In[12]:


import pandas as pd

def make_bayes_table(hypos, prior, likelihood):
    """Make a Bayes table.

    hypos: sequence of hypotheses
    prior: prior probabilities
    likelihood: sequence of likelihoods

    returns: DataFrame
    """
    table = pd.DataFrame(index=hypos)
    table['prior'] = prior
    table['likelihood'] = likelihood
    table['unnorm'] = table['prior'] * table['likelihood']
    prob_data = table['unnorm'].sum()
    table['posterior'] = table['unnorm'] / prob_data
    return table


# Here's how we can use this function to solve the cookie problem.

# In[13]:


hypos = 'Bowl 1', 'Bowl 2'
prior = 1/2, 1/2
likelihood = 3/4, 1/2

make_bayes_table(hypos, prior, likelihood)
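# The same function handles other data just as easily. For example (an illustrative addition, not from the original notebook), if the cookie had been chocolate, the likelihoods would be $10/40 = 1/4$ for Bowl 1 and $20/40 = 1/2$ for Bowl 2:

# In[ ]:


# Illustrative sketch: updating on a chocolate cookie instead.
make_bayes_table(hypos, prior, [1/4, 1/2])


# Now the posterior probability of Bowl 1 is $1/3$, so a chocolate cookie is evidence in favor of Bowl 2.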