#!/usr/bin/env python # coding: utf-8 # # B # # ## Python / NumPy / SciPy # # This Jupyter notebook is the Python equivalent of the R code in appendix B, R, pp. 561 - 565, [Introduction to Probability, 1st Edition](https://www.crcpress.com/Introduction-to-Probability/Blitzstein-Hwang/p/book/9781466575578), Blitzstein & Hwang. # # ---- # ## B.1 Vectors # # Rather than using the usual Python list (array), for probability and statistics it is more convenient to use a [NumPy](https://docs.scipy.org/doc/numpy/user/basics.creation.html) array, as there are many [advantages to using a NumPy array over a regular Python lists](https://stackoverflow.com/questions/993984/what-are-the-advantages-of-numpy-over-regular-python-lists). The rules for operating on NumPy arrays of different shape are quite different from vectors in R (please see [Broadcasting](https://docs.scipy.org/doc/numpy-1.15.0/user/basics.broadcasting.html)). # # Building on NumPy, [SciPy](https://docs.scipy.org/doc/scipy-1.2.0/reference/tutorial/basic.html) is provides functions for mathematics, science, and engineering. The [`scipy.stats`](https://docs.scipy.org/doc/scipy/reference/stats.html) module contains a large number of probability distributions as well as a growing library of statistical functions. # # # # | Command | What it does | # |---------|--------------| # | [`numpy.array([1, 1, 0, 2.7, 3.1])`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.array.html) | creates the array [1, 1, 0, 2.7, 3.1] | # | [`numpy.arange(1, 101)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.arange.html) | creates the array [1, 2, ..., 100] | # | [`numpy.arange(1, 100)**3`](https://docs.scipy.org/doc/numpy-1.15.0/user/basics.broadcasting.html#general-broadcasting-rules) | creates the array [1³, 2³, ..., 100³] | # | [`numpy.zeros(50)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.zeros.html) | creates the array [0, 0,..., 0] of length 50 | # | `numpy.arange(0, 100, 3)` | creates the array [0, 3, 6, 9, ..., 99] | # | [`v[4]`](https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.indexing.html#integer-array-indexing) | 5^th entry of vector $\mathtt{v}$ (index starts at 0) | # | [`v[[True, True, True, True, False, True, ... , True]]`](https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.indexing.html#boolean-array-indexing) | all but the 5^th entry of $\mathtt{v}$ | # | `v[[2, 0, 3]]` | 3^rd, 1^st, 4^th entries of array $\mathtt{v}$ | # | `v[v>2]` | entries of $\mathtt{v}$ that exceed 2 | # | [`numpy.where(v > 2)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.where.html) | indices of $\mathtt{v}$ such that entry exceeds 2 | # | `numpy.where(v == 7)` | indices of $\mathtt{v}$ such that entry equals 7 | # | [`v.min()`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.min.html) | minimum of $\mathtt{v}$ | # | [`v.max()`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.max.html) | maximum of $\mathtt{v}$ | # | `numpy.where(v==v.max())` | indices where $\mathtt{v.max()}$ is achieved | # | [`v.sum()`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.sum.html) | sum of the entries in $\mathtt{v}$ | # | [`v.cumsum()`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.cumsum.html) | cumulative sums of the entries in $\mathtt{v}$ | # | [`v.prod()`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.prod.html) | product of the entries in $\mathtt{v}$ | # | [`scipy.stats.rankdata(v)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rankdata.html) | ranks of the entries in $\mathtt{v}$ | # | [`v.size`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.size.html) | number of elements in array $\mathtt{v}$ | # | [`numpy.sort(v)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.sort.html) | return a sorted copy of array $\mathtt{v}$ (in increasing order) | # | [`numpy.unique(v)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.unique.html) | lists each element of $\mathtt{v}$ once, without duplicates | # | `numpy.unique(v, return_counts=True)` | tallies how many times each element of $\mathtt{v}$ occurs | # | `dict(zip(*numpy.unique(v, return_counts=True)))` | return mapping of array $\mathtt{v}$ element keys and frequency values | # | [`numpy.hstack([v, w])`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.hstack.html) | concatenates vectors $\mathtt{v}$ and $\mathtt{w}$ | # | [`numpy.union1d(v, w)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.union1d.html) | union of $\mathtt{v}$ and $\mathtt{w}$ as sets | # | [`numpy.intersect1d(v, w)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.intersect1d.html) | intersection of $\mathtt{v}$ and $\mathtt{w}$ as sets | # | `v + w` | adds $\mathtt{v}$ and $\mathtt{w}$ entrywise, if w is a scalar (see [Broadcasting](https://docs.scipy.org/doc/numpy-1.15.0/user/basics.broadcasting.html)) | # | `v * w` | multiplies $\mathtt{v}$ and $\mathtt{w}$ entrywise, if w is a scalar (see [Broadcasting](https://docs.scipy.org/doc/numpy-1.15.0/user/basics.broadcasting.html)) | # ## B.2 Matrices # # # From [Linear Algebra (scipy.linalg)](https://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html) page at SciPy.org: # * The use of the [`numpy.matrix`](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.matrix.html) class is discouraged, since it adds nothing that cannot be accomplished with 2D [`numpy.ndarray`](https://docs.scipy.org/doc/numpy-1.15.0/reference/arrays.ndarray.html) objects. # * [`scipy.linalg`](https://docs.scipy.org/doc/scipy/reference/linalg.html) contains all the functions in [`numpy.linalg`](https://docs.scipy.org/doc/numpy-1.15.1/reference/routines.linalg.html), plus some other more advanced ones... Another advantage of using `scipy.linalg` over `numpy.linalg` is that it is always compiled with BLAS/LAPACK support... Therefore, unless you don't want to add scipy as a dependency to your numpy program, _use `scipy.linalg` instead of `numpy.linalg`_ # # Therefore, these notebooks will use `numpy.ndarray` for matrices, and use `scipy.linalg` for its linear algebra functionality. # # # | Command | What it does | # |---------|--------------| # | `numpy.array([[1,5], [3,7]])` | creates the matrix $\begin{pmatrix} 1 & 5 \\ 3 & 7 \end{pmatrix}$ | # | [`A.shape`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.shape.html) | gives the dimensions of matrix `A` | # | [`numpy.diag(A)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.diag.html) | extracts the diagonal of matrix `A` | # | `numpy.diag([1, 7])` | creates the diagonal matrix $\begin{pmatrix} 1 & 0 \\ 0 & 7 \end{pmatrix}$ | # | [`numpy.stack([u, v, w])`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.stack.html) | binds arrays `u`, `v`, `w` into a matrix, as rows | # | [`numpy.column_stack([u, v, w])`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.column_stack.html) | binds arrays `u`, `v`, `w` into a matrix, as columns | # | [`A.T`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.T.html#numpy.ndarray.T) | transpose of matrix `A` | # | [`A[1,2]`](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html#indexing-arrays) | row 2, column 3 entry of matrix `A` | # | `A[1, :]` | row 2 of matrix `A` (as a vector) | # | `A[:, 2]` | column 3 of matrix `A` (as a vector) | # | `A[[0,2],:][:,[1,3]]` | submatrix of A, keeping rows 1, 3 and columns 2, 4
see [Array object, Combining advanced and basic indexing](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html#combining-advanced-and-basic-indexing) | # | `A.sum(axis=1)` | row sums of matrix `A` | # | `A.mean(axis=1)` | row averages of matrix `A` | # | `A.sum(axis=0)` | column sums of matrix `A` | # | `A.mean(axis=0)` | column averages of matrix `A` | # | [`scipy.linalg.eig(A)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.eig.html#scipy.linalg.eig) | eigenvalues and eigenvectors of matrix `A` | # | [`scipy.linalg.inv(A)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.inv.html) | matrix `A`⁻¹ | # | [`scipy.linalg.solve(A, b)`](https://docs.scipy.org/doc/scipy-1.2.0/reference/generated/scipy.linalg.solve.html) | solves `Ax = b` for `x` (where `b` is a column vector) | # | [`A @ B`](https://www.python.org/dev/peps/pep-0465/) | matrix multiplication `AB` | # | [`scipy.linalg.fractional_matrix_power(A, k)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.fractional_matrix_power.html#scipy.linalg.fractional_matrix_power) | matrix power `A`^k | # ## B.3 Math # # * NumPy has many convenient [mathematical functions](https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html), which are geared for operations on scalars, vectors and matrices. # * SciPy also offers functions similar to NumPy, but the SciPy submodules are more extensive, and, depending upon your Python distribution, may also already be optimized to take advantage of the Intel Math Kernel Library, LAPACK, and BLAS. # * Thus, unless you have clear reasons for not using SciPy, we suggest you use the `scipy` submodules whenever possible. # * Note that while many of these functions are provided in the [`math`](https://docs.python.org/3/library/math.html) standard Python module, the NumPy and SciPy functions below work out of the box on scalar values as well as vectors. # # # | Command | What it does | # |---------|--------------| # | [`numpy.abs(x)`](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.absolute.html) | $\lvert x \rvert$ | # | [`numpy.exp(x)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.exp.html) | $e^x$ | # | [`numpy.log(x)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.log.html) | $log(x)$ | # | `numpy.log(x) / numpy.log(b)` | $log_{b}(x)$ | # | [`numpy.sqrt(x)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.sqrt.html) | $\sqrt{x}$ | # | [`numpy.floor(x)`](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.floor.html) | $\lfloor x \rfloor$ | # | [`numpy.ceil(x)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ceil.html) | $\lceil x \rceil$ | # | [`scipy.special.factorial(n)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.factorial.html) | $n!$ | # | [`scipy.special.gammaln(n + 1)`](https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.special.gammaln.html) | $log(n!)$ (helps prevent overflow) | # | [`scipy.special.gamma(a)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.gamma.html) | $\Gamma(a)$ | # | `scipy.special.gammaln(a)` | $log(\Gamma(a))$ (helps prevent overflow) | # | [`scipy.special.comb(n, k)`](https://docs.scipy.org/doc/scipy-0.15.0/reference/generated/scipy.special.comb.html#scipy.special.comb) | binomial coefficient $\binom{n}{k}$ | # | `pbirthday(k)` | (see Birthday problem calculation and simulation, Ch. 1 notebook) | # | [`x**2 if x > 0 else \x**3`](https://docs.python.org/3/reference/expressions.html#conditional-expressions) | $x^2$ if $x > 0$, $x^3$ otherwise (piecewise) | # | `def f(x): return exp(-x)` | defines the function $f$ by $f(x) = e^{-x}$ | # | [`scipy.integrate.quad(f, a, b)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.quad.html) | finds $\int_{a}^{b} f(x) \, dx$ numerically | # | [`scipy.optimize.fminbound(f, a, b)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fminbound.html) | _minimizes_ $f$ numerically on [a, b] | # | [`scipy.optimize.fminbound(lambda x: -f(x), a, b)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fminbound.html) | _maximizes_ $f$ numerically on [a, b] | # | [`scipy.optimize.root(f, x0=numpy.mean([a,b]))`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.root.html) | searches numerically for a zero of $f$ in $[a, b]$
initially guessing at $\frac{a + b}{2}$| # # ## B.4 Sampling and simulation # # # | Command | What it does | # |---------|--------------| # | [`numpy.random.permutation(7)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.permutation.html#numpy.random.permutation) | random permutation of 0, 1, ..., 6 | # | `numpy.random.permutation(range(1,8))` | random permutation of 1, 2, ..., 7 | # | [`numpy.random.choice(range(1,53), size=5, replace=False)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.choice.html#numpy.random.choice) | picks 5 times from 1, 2, ..., 52 (don't replace) | # | `numpy.random.choice(list(string.ascii_lowercase),`
`size=5, replace=False)` | picks 5 random letters of the alphabet (don't replace)
see [Common `string` operations](https://docs.python.org/3/library/string.html) | # | `numpy.random.choice(range(1,4), size=5, p=p)` | picks 5 times from 1, 2, 3 with probabilities $\mathtt{p}$ (replace) | # | [`[experiment() for _ in range(10**4)]`](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions) | gathers 10⁴ executions of $\mathtt{experiment}$ function into array | # ## B.5 Plotting # # * The Python-based Jupyter notebooks here in this repository all make use of the [`matplotlib.pyplot`](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.html) state-based interface to matplotlib. # * The [`%matplotlib inline` magic command](https://ipython.readthedocs.io/en/stable/interactive/plotting.html#id1) provides a MATLAB-like way of plotting in the notebook. # * `matplotlib.pyplot` is mainly intended for interactive plots and simple cases of programmatic plot generation # * If the [`matplotlib` colormaps](https://matplotlib.org/2.0.2/api/colors_api.html) are not enough, you can find all sorts of color suggestions at [`colorbrewer2.org`](http://colorbrewer2.org/#type=sequential&scheme=BuGn&n=3). # * The single underscore variable name `_` is often used to denote a "throwaway" variable whose value is not used. # * For example, in a for loop, if the loop counter is never really used, you can write a for loop like this: # # for _ in range(10): # print("I don't care what iteration I'm on!") # # * `_` is also used to assign but ignore return values from functions. In the `matplotlib.pyplot.subplots` example below, we ignore the `fig` returned by assigning that to `_`. # * `matplotlib` being state-based, you can continue to add to a plot until [`matplotlib.pyplot.show`](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.show.html#matplotlib.pyplot.show) is called; or in a Jupyter notebook, until the end of a code cell. # # # | Command | What it does | # |---------|--------------| # | [`x = numpy.linspace(a, b, num=n)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html) | create a vector of `n` equally spaced points from `a` to `b` | # | [`matplotlib.pyplot.plot(x, f(x))`](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.plot.html) | graphs the given vector `x` and a function `f` on `x` | # | [`matplotlib.pyplot.scatter(x, f(x))`](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.scatter.html#matplotlib.pyplot.scatter) | creates scatter plot of the points $(x_i, \, y_i)$ | # | `matplotlib.pyplot.plot(x, y, linestyle="--")` | creates line plot of the points $(x_i, \, y_i)$, with the given [linestyle](https://matplotlib.org/gallery/lines_bars_and_markers/line_styles_reference.html) | # | `matplotlib.pyplot.scatter(x,y)` | adds the points $(x_i, \, y_i)$ to the plot | # | `lines(x,y)` | adds line segments through the $(x_i, \, y_i)$ to the plot | # | `matplotlib.pyplot.plot(x, b*x + a)` | adds the line with intercept `a`, slope `b` to the plot, given `x` | # | [`matplotlib.pyplot.hist(x, bins=b, col="blue")`](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html) | blue histogram of the values in `x`, with `b` bins | # | [`_, (ax1, ax2) = matplotlib.pyplot.subplots(1, 2)`](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.subplots.html) | tells `matplotlib` we want 2 side-by-side plots (a 1 $\times$ 2 array of plots)
where the axes assigned are `ax1` and `ax2` | # ## B.6 Programming # # | Command | What it does | # |---------|--------------| # | [`x = numpy.pi`](https://docs.scipy.org/doc/numpy-1.15.0/reference/constants.html) | sets $\mathtt{x}$ equal to $\pi$ | # | `x>3 and x<5` | Is $\mathtt{x > 3}$ and $\mathtt{x < 5}$? ($\mathtt{True/False}$) | # | `x>3 or x<5` | Is $\mathtt{x > 3}$ or $\mathtt{x < 5}$? ($\mathtt{True/False}$) | # | `if n>3: x = x + 1` | adds 1 to $\mathtt{x}$ if $\mathtt{n > 3}$ | # | `x = x+1 if n==0 else x + 2` | adds 1 to $\mathtt{x}$ if $\mathtt{n = 0}$, else adds 2 | # ## B.7 Summary statistics # # | Command | What it does | # |---------|--------------| # | [`numpy.mean(v)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.mean.html#numpy.mean) | sample mean of array `v` | # | [`numpy.var(v, ddof=1)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.var.html#numpy.var) | sample variance of vector `v` | # | [`numpy.std(v, ddof=1)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.std.html) | sample standard deviation of vector `v` | # | [`numpy.median(v)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.median.html) | sample median of vector `v` | # | [`numpy.percentile(v, [0, 25, 50, 75, 100])`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.percentile.html) | min, 1^st quartile, median, 3^rd quartile, max of `v` | # | `numpy.percentile(v, p)` | p^th sample quantile of vector `v` | # | [`numpy.cov(v, w, ddof=1)[0,1]`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.cov.html) | sample covariance of vectors `v` and `w` | # | [`numpy.corrcoef(v, w)[0,1]`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.corrcoef.html) | sample correlation of vectors `v` and `w` | # ## B.8 Distributions # # * [Discrete distributions in `scipy.stats`](https://docs.scipy.org/doc/scipy/reference/stats.html#discrete-distributions) # * [Continuous distributions in `scipy.stats`](https://docs.scipy.org/doc/scipy/reference/stats.html#discrete-distributions) # * [Multivariate distributions in `scipy.stats`](https://docs.scipy.org/doc/scipy/reference/stats.html#multivariate-distributions) # * Please see the [Statistical functions page for `scipy.stats`](https://docs.scipy.org/doc/scipy/reference/stats.html) for a full list of the probability distributions and statistical functions provided. # # | Command | What it does | # |---------|--------------| # | [`help(`_distribution_`)`](https://docs.python.org/3/library/functions.html#help) | shows documentation on the _distribution_ named | # | [`scipy.stats.binom.pmf(k, n, p)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binom.html#scipy.stats.binom) | PMF $P(X = k)$ for $X \sim Bin(n, p)$ | # | `scipy.stats.binom.cdf(x, n, p)` | CDF $P(X \leq x)$ for $X \sim Bin(n, p)$ | # | `scipy.stats.binom.ppf(q, n, p)` | quantile $P(X \leq x) \geq q$ for $X \sim Bin(n, p)$ | # | `scipy.stats.binom.rvs(n, p, size=r)` | vector of `r` i.i.d. $Bin(n, p)$ r.v.s | # | [`scipy.stats.geom.pmf(k, p)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.geom.html#scipy.stats.geom) | PMF $P(X=k) = (1-p)^{k-1}\,p$ | # | [`scipy.stats.hypergeom.pmf(k, M, n, N)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.hypergeom.html#scipy.stats.hypergeom) | PMF $P(X=k) = \frac{\binom{n}{k}\binom{M-n}{N-k}}{\binom{M}{N}} $ | # | [`scipy.stats.nbinom.pmf(k, n, p)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.nbinom.html#scipy.stats.nbinom) | PMF $P(X=k) = \binom{k+n-1}{n-1} \, p^n \, (1-p)^k$ | # | [`scipy.stats.poisson.pmf(k, mu)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.poisson.html#scipy.stats.poisson) | PMF $P(X=k) = e^{-\mu}\,\frac{\mu^k}{k!}$ | # | [`scipy.stats.beta.pdf(x,a,b)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.beta.html#scipy.stats.beta) | PDF $f(x) = \frac{\Gamma(a+b)\,x^{a-1}\,(1-x)^{b-1}}{\Gamma(a)\,\Gamma(b)}$ | # | [`scipy.stats.cauchy.pdf(x)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.cauchy.html#scipy.stats.cauchy) | PDF $f(x) \frac{1}{\pi \, (1 + x^2)}$ | # | [`scipy.stats.chi2.pdf(x, df)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chi2.html#scipy.stats.chi2) | PDF $f(x)$ for $X \sim \chi^2_{df}$ | # | [`scipy.stats.expon.pdf(x, scale=1/b)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.expon.html#scipy.stats.expon) | PDF $f(x)$ for $X \sim Expo(b)$ | # | [`scipy.stats.gamma.pdf(x, a, scale=1/b)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gamma.html#scipy.stats.gamma) | PDF $f(x)$ for $X \sim Gamma(a, b)$ | # | [`scipy.stats.lognorm.pdf(x, loc=m, scale=s)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html#scipy.stats.lognorm) | PDF $f(x)$ for $X \sim \mathcal{LN}(m, s^2)$ | # | [`scipy.stats.norm.pdf(x, loc=m, scale=s)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html) | PDF $f(x)$ for $X \sim \mathcal{N}(m, s^2)$ | # | [`scipy.stats.t.pdf(x, df)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html) | PDF $f(x)$ for $X \sim t_{df}$ | # | [`scipy.stats.uniform.pdf(x, a, a+b)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.uniform.html) | PDF $f(x)$ for $X \sim Unif(a, b)$ | # # ---- # # The application programming interface (API for the [`scipy.stats`](https://docs.scipy.org/doc/scipy/reference/stats.html) module is fairly consistent for the distributions supported. [Discrete distributions](https://docs.scipy.org/doc/scipy/reference/stats.html#discrete-distributions) have `pmf`, `cdf`, and `rvs` functions for probability mass, cumulative distribution, and random value generation. Similarly for [continuous distributions](https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions), the functions for `pdf`, `cdf`, and `rvs` correspond to probability density, cumulative distribution, and random value generation. Note that for Normal distributions, the mean and **standard deviation** need to be provided as the parameters, rather than mean and _variance_. # # Among the [multivariate distributions](https://docs.scipy.org/doc/scipy/reference/stats.html#multivariate-distributions) supported, [`scipy.stats.multinomial`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.multinomial.html#scipy.stats.multinomial) has `pmf` and `rvs` for probability mass function and random value generation. [`scipy.stats.multivariate_normal`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.multivariate_normal.html#scipy.stats.multivariate_normal) has `pdf`, `cdf`, and `rvs` for probabilty density, cumulative distribution, and random value generation functions. # ---- # # © Blitzstein, Joseph K.; Hwang, Jessica. Introduction to Probability (Chapman & Hall/CRC Texts in Statistical Science).