This Jupyter notebook is the Python equivalent of the R code in section 10.6 R, pp. 447 - 450, Introduction to Probability, 2nd Edition, Blitzstein & Hwang.

In [1]:

```
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
```

Python/NumPy/SciPy make it easy to compare the expectations of $X$ and $g(X)$ for a given choice of $g$, and this allows us to verify some special cases of Jensen's inequality. For example, suppose we simulate 10^{4} times from the $Expo(1)$ distribution:

In [2]:

```
np.random.seed(24157817)
from scipy.stats import expon
x = expon.rvs(size=10**4)
```

`numpy.mean(numpy.log(x))`

and the latter can be approximated by `numpy.log(numpy.mean(x))`

, so compute both

In [3]:

```
meanlog = np.mean(np.log(x))
print('numpy.mean(numpy.log(x)) = {}'.format(meanlog))
logmean = np.log(np.mean(x))
print('numpy.log(numpy.mean(x)) = {}'.format(logmean))
```

numpy.mean(numpy.log(x)) = -0.5600958563379892 numpy.log(numpy.mean(x)) = 0.00800014338803644

`numpy.mean(numpy.log(x))`

is approximately −0.56 (the true value is around −0.577), while `numpy.log(numpy.mean(x))`

is approximately 0 (the true value is 0). This indeed suggests $\mathbb{E}(log \, X) \leq log \, \mathbb{E} \, X$. We could also compare `numpy.mean(x**3)`

to `numpy.mean(x)**3`

, or `numpy.mean(numpy.sqrt(x))`

to `numpy.sqrt(numpy.mean(x))`

- the possibilities are endless.

To plot the running proportion of Heads in a sequence of independent fair coin tosses, we first generate the coin tosses themselves:

In [4]:

```
np.random.seed(39088169)
from scipy.stats import binom
nsim = 300
p = 1/2
x = binom.rvs(1, p, size=nsim)
```

Then we compute $\bar{X}_n$ for each value of $n$ and store the results in `xbar`

:

In [5]:

```
# divide by sequence from 1 to nsim, inclusive
xbar = np.cumsum(x) / np.arange(1, nsim+1)
```

`numpy.cumsum(x)`

and `np.arange(1, nsim+1)`

. Finally, we plot `xbar`

against the number of coin tosses:

In [6]:

```
x = np.arange(1, nsim+1)
y = xbar
plt.figure(figsize=(12, 4))
plt.plot(x, y, '-', label=r'$\bar{x}$')
plt.hlines(p, 0, nsim,
linestyle='dotted', lw=1.1, alpha=0.5, label=r'$p={}$'.format(p))
plt.xlim([0.0, nsim])
plt.xlabel(r'$n$: number of coin tosses')
plt.yticks([0.0, 0.5, 1.0])
plt.ylabel(r'$\bar{x}_{n}$: proportion of H to $n$')
plt.title('Visualizing the Law of Large Numbers')
plt.legend()
plt.show()
```

You should see that the values of `xbar`

approach `p`

, by the law of large numbers.

A famous example of Monte Carlo integration is the Monte Carlo estimate of $\pi$. The unit disk ${(x, y): x^2 +y^2 ≤ 1}$ is inscribed in the square $[−1, 1] \times [−1, 1]$, which has area 4. If we generate a large number of points that are Uniform on the square, the proportion of points falling inside the disk is approximately equal to the ratio of the disk's area to the square’s area, which is $\pi/4$. Thus, to estimate $\pi$ we can take the proportion of points inside the circle and multiply by 4.

In `matplotlib.pyplot`

, to generate Uniform points on the 2D square, we can independently generate the $x$-coordinate and the $y$-coordinate as $Unif(−1, 1)$ r.v.s, using the results of Example 7.1.22:

In [7]:

```
np.random.seed(63245986)
from scipy.stats import uniform
nsim = 10**6
a = -1
b = 1
x = uniform.rvs(loc=a, scale=b-a, size=nsim)
y = uniform.rvs(loc=a, scale=b-a, size=nsim)
```

Let's try graphing a small portion of those $x$- and $y$-coordinates.

In [8]:

```
inside = x**2 + y**2 < 1.0
outside = ~inside
_, ax = plt.subplots(figsize=(8,8))
# we'll graph the first n co-ordinate pairs
# for points inside and points outside of
# x**2 + y**2 = 1.0
n = 5000
ax.plot(x[inside][0:n], y[inside][0:n], '.', color='#fc8d59')
ax.plot(x[outside][0:n], y[outside][0:n], '.', color='#91bfdb')
ax.set_xlim([-1.0, 1.0])
ax.set_xlabel('x')
ax.set_ylim([-1.0, 1.0])
ax.set_ylabel('y')
ax.set_title(r'Monte Carlo estimate of $\pi$: {} / {} points'.format(n, nsim))
plt.show()
```

*in* the disk, we use `numpy.sum(x**2 + y**2 < 1.0)`

. The array `x**2 + y**2 < 1.0`

is effectively an indicator vector whose i^{th} element is `True`

(equivalent to 1) if the i^{th} point falls *inside* the disk, and `False`

(equivalent to 0) otherwise, so the sum of the boolean elements is the number of points in the disk. To get our estimate of $\pi$, we convert the sum into a proportion and multiply by 4. Altogether, we have

In [9]:

```
num_points_inside = np.sum(x**2 + y**2 < 1.0)
est_pi = 4.0 * num_points_inside / nsim
print('estimated value for pi: {}'.format(est_pi))
```

estimated value for pi: 3.143172

How close was your estimate to the actual value of $\pi$?

One way to visualize the central limit theorem for a distribution of interest is to plot the distribution of $\bar{X}_{n}$ for various values of $n$, as in Figure 10.5. To do this, we first have to generate i.i.d. $X_{1}, \, \ldots, \, X_{n}$ a bunch of times from our distribution of interest. For example, suppose that our distribution of interest is $Unif(0, 1)$, and we are interested in the distribution of $\bar{X}_{12}$, i.e., we set $n = 12$. In the following code, we create a matrix of i.i.d. standard Uniforms. The matrix has 12 columns, corresponding to $X_{1}$ through $X_{12}$. Each row of the matrix is a different realization of $X_{1}$ through $X_{12}$.

In [10]:

```
np.random.seed(102334155)
nsim = 10**4
n = 12
x = uniform.rvs(size=n*nsim).reshape((nsim, n))
print('matrix x has shape: {}'.format(x.shape))
```

matrix x has shape: (10000, 12)

`x`

; we can do this by calling the `numpy.ndarray.mean`

method on `numpy.array`

object `x`

, specifying `axis=1`

to take the average of each row of matrix `x`

:

In [11]:

```
xbar = x.mean(axis=1)
```

Finally, we create a histogram:

In [12]:

```
plt.figure(figsize=(10, 4))
plt.hist(xbar, bins=20)
plt.title(r'Histogram of $\bar{X}_{12}$, $X_i \sim Unif(0,1)$')
plt.xlabel(r'$\bar{x}$')
plt.ylabel(r'Frequency')
plt.show()
```

`scipy.stats.uniform`

to `scipy.stats.expon`

, we see that for $X_j$ generated from the $Expo(1)$ distribution, the distribution of $\bar{X}_{n}$ remains skewed when $n = 12$, so a larger value of $n$ is required before the Normal approximation is adequate.

In [13]:

```
np.random.seed(165580141)
from scipy.stats import expon
n1, n2, n3 = 12, 32, 256
x1 = expon.rvs(size=n1*nsim).reshape((nsim, n1))
x2 = expon.rvs(size=n2*nsim).reshape((nsim, n2))
x3 = expon.rvs(size=n3*nsim).reshape((nsim, n3))
xbar1 = x1.mean(axis=1)
xbar2 = x2.mean(axis=1)
xbar3 = x3.mean(axis=1)
_, (ax1, ax2, ax3) = plt.subplots(1, 3, sharey=True, figsize=(14, 5))
ax1.hist(xbar1, bins=40, color='#1b9e77')
ax1.set_ylabel('Frequency')
ax1.set_title(r'Histogram of $\bar{X}_{12}$, $X_j \sim \, Expo(1)$')
ax2.hist(xbar2, bins=40, color='#d95f02')
ax2.set_title(r'Histogram of $\bar{X}_{32}$, $X_j \sim \, Expo(1)$')
ax3.hist(xbar3, bins=40, color='#7570b3')
ax3.set_title(r'Histogram of $\bar{X}_{256}$, $X_j \sim \, Expo(1)$')
plt.show()
```

Have a look at Appendix A below for another neat visualization of the CLT.

Although the Chi-Square is just a special case of the Gamma (refer to `scipy.stats.gamma`

), it still has its own functions in `scipy.stats.chi2`

: `chi2.pdf(x, n)`

and `chi2.cdf(x, n)`

return the values of the $\chi^2_{n}$ PDF and CDF at `x`

; and `chi2.rvs(n, size=nsim)`

generates `nsim`

$\chi^2_{n}$ r.v.s.

The graph below illustrates Theorem 10.4.2:

In [14]:

```
from scipy.stats import chi2, gamma
_, (ax1, ax2) = plt.subplots(1, 2, sharey=True, figsize=(14, 6))
x = np.linspace(0, 20, 1000)
n_vals = [1, 2, 4, 8, 16]
alphas = [1.0, .7, .5, .3, .2]
# graph for Chi-Square
for i,n in enumerate(n_vals):
ax1.plot(x, chi2.pdf(x, n), lw=3.2, alpha=alphas[i], color='#33aaff', label='n={}'.format(n))
ax1.set_title(r'$X \sim \chi^2_{n}$')
ax1.set_xlim((0, 20.0))
ax1.set_ylim((0.0, 0.5))
ax1.legend()
# graph for Gamma
lambd = 0.5
for i,n in enumerate(n_vals):
ax2.plot(x, gamma.pdf(x, n/2, scale=1/lambd),
lw=3.2, alpha=alphas[i], color='#ff9933', label='n={}'.format(n))
ax2.set_title(r'$X \sim Gamma(\frac{n}{2}, \frac{1}{2})$')
ax2.set_xlim((0, 20.0))
ax2.set_ylim((0.0, 0.5))
ax2.legend()
plt.suptitle((r'Theorem 10.4.2: $\chi^2_{n}$ distribution is the $Gamma(\frac{n}{2},'
r' \, \frac{1}{2})$ distribution'))
plt.show()
```

`scipy.stats.t`

. To evaluate the PDF or CDF of the $t_n$ distribution at $x$, we use `t.pdf(x, n)`

or `t.cdf(x, n)`

. To generate `nsim`

r.v.s from the $t_{n}$ distribution, we use `t.rvs(n, size=nsim)`

. Of course, `t.pdf(x, 1)`

is the same as `scipy.stats.cauchy`

's `cauchy.pdf(x)`

.

In [15]:

```
from scipy.stats import t, cauchy
_, (ax1, ax2) = plt.subplots(1, 2, sharey=True, figsize=(14, 6))
x = np.linspace(0, 20, 1000)
# graph for Student-t
y1 = t.pdf(x,1)
ax1.plot(x, y1, lw=3.2, alpha=0.8, color='#33aaff')
ax1.set_title(r'Student-$t$: $\tt{t.pdf(x, 1)}$')
# graph for Cauchy
y2 = cauchy.pdf(x)
ax2.plot(x, y2, lw=3.2, alpha=0.8, color='#ff9933')
ax2.set_title(r'Cauchy: $\tt{cauchy.pdf(x)}$')
plt.suptitle(r'Student-$t$ vs. Cauchy')
plt.show()
```

Statistician and geneticist Francis Galton invented the *quincunx*, or *bean machine*, to illustrate the Normal distribution. Here is an implementation built with IPython and D3.js (version 5.7.0), embedded right in this notebook.

In this interactive visualization of the CLR, you can:

- alter the animation by controlling the time between redraws with the
`delay`

control (from 50 to 1000 milliseconds, changes are immediately effective without restarting the animation) - change the number of bins with the
`num. bins`

control (from 1 to 25, board is redrawn upon completion of change) - change the probability $p$ that the ball will go to the right (slider moves left to 0.00, right to 1.00, board is redrawn upon completion of change)
- as the balls drop into the bins below, the heights of the bins will increase to form a histogram
- the running percentage of balls per bin is displayed on each bin and updated in real-time (
`Math.floor`

is used, so some accuracy is lost)

First, we import `display`

, `Javascript`

and `HTML`

from the `IPython.display`

module.

Next we use `display`

to load the D3.js file from the Google Hosted Libraries content delivery network. We also load the main JavaScript code in `assets/quincunx.js`

and Cascading Style Sheet definitions in `assets/quincunx.html`

.

Lastly, we embed a small JavaScript snippet into the code cell below to provide an entry point to the `quincunx`

module in `assets/quincunx.js`

.

In [16]:

```
from IPython.display import display, Javascript, HTML
display(Javascript("""
require.config({
paths: {
d3: 'https://ajax.googleapis.com/ajax/libs/d3js/5.7.0/d3.min'
}
});
"""))
display(Javascript(filename="./assets/quincunx.js"))
display(HTML(filename="./assets/quincunx.html"))
Javascript("""
(function(element){
require(['quincunx'], function(quincunx) {
quincunx(element.get(0))
});
})(element);
""")
```

Out[16]:

*Can you use the central limit theorem to explain why the distribution of particles at the bottom is approximately Normal?*

This interactive visualization extends ideas from the following blogposts: