import matplotlib.pyplot as plt
import mpl_toolkits.mplot3d.art3d as art3d
import numpy as np
In this chapter, we will deal exclusively with joint distributions, one of the most important topics of the whole course. Joint distributions are used for formulating all kinds of probability models.
The joint probability mass function of two discrete random variables is defined as
$$ P_{XY}(x, y) = P(X = x, Y=y) $$It is convenient to define the ranges of $X$ and $Y$, $R_X = \{x_1, x_2, ...\}$ and $R_Y = \{y_1, y_2, ...\}$; the range of the joint distribution is then a subset of their Cartesian product
$$ R_{XY}\subseteq R_X \times R_Y = \{(x_i, y_j)\mid x_i\in R_X, y_j \in R_Y\} $$
The most fundamental property of any joint PMF is that it sums to $1$ over the whole range:
$$ \sum_{(x_i,y_j)\in R_{XY}}P_{XY}(x_i,y_j)=1 $$Let's consider the probability mass function table below.
\begin{array}{|c|c|c|c|} \hline & Y = 0 & Y = 1 & Y= 2 \\ \hline X = 0 & 1/6 & 1/4 & 1/8 \\ \hline X = 1 & 1/8 & 1/6 & 1/6 \\ \hline \end{array}The code below computes its marginal PMFs by summing over rows and columns.
from fractions import Fraction as frac
# marginal PMF of Y: sum each column of the joint PMF table
pY_0 = frac(1,6) + frac(1,8)
pY_1 = frac(1,4) + frac(1,6)
pY_2 = frac(1,8) + frac(1,6)
# marginal PMF of X: sum each row of the joint PMF table
pX_0 = frac(1,6) + frac(1,4) + frac(1,8)
pX_1 = frac(1,8) + frac(1,6) + frac(1,6)
print('Marginal PMF of pY are {0}, {1}, {2}.'.format(pY_0,pY_1,pY_2))
print('Marginal PMF of pX are {0}, {1}.'.format(pX_0,pX_1))
Marginal PMF of pY are 7/24, 5/12, 7/24. Marginal PMF of pX are 13/24, 11/24.
The reason we call them marginal probabilities is that they are written in the margins of the table.
\begin{array}{|c|c|c|c|c|} \hline & Y = 0 & Y = 1 & Y= 2 & P_X(x)\\ \hline X = 0 & 1/6 & 1/4 & 1/8 & 13/24 \\ \hline X = 1 & 1/8 & 1/6 & 1/6 & 11/24\\ \hline P_Y(y) & 7/24 & 5/12 & 7/24 & \\ \hline \end{array}If $X$ and $Y$ were independent, every conditional probability would equal the corresponding marginal probability; for instance
$$ P(X=0| Y = 1)= \frac{1/4}{1/4+1/6} =3/5\\ P_X(X=0)=13/24 $$They are not equal, so $X$ and $Y$ are not independent.
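The same comparison can be made with exact arithmetic, reusing the `Fraction` objects from the cell above (a minimal sketch; `p_x0_given_y1` is just an illustrative name):
# exact check: conditional probability P(X=0 | Y=1) versus the marginal P_X(X=0)
p_x0_given_y1 = frac(1, 4) / (frac(1, 4) + frac(1, 6))
print(p_x0_given_y1, pX_0, p_x0_given_y1 == pX_0)   # 3/5 13/24 False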
The relationship between the joint, marginal, and conditional PMFs is
$$ P(X|Y) = \frac{P(X,Y)}{P_Y(Y)} $$i.e.
$$ \text{Conditional PMF} = \frac{\text{Joint PMF}}{\text{Marginal PMF}} $$The joint CDF of two random variables $X$ and $Y$ is defined as
$$ F_{XY}(x,y)=P(X\leq x, Y\leq y) $$where $0\leq F_{XY}(x,y) \leq 1$.
For instance, $F_{XY}(1, 2) = P(X\leq 1, Y\leq 2)$ is the probability of the shaded area in the plot below (the axes are limited to $(-5,\ 6)$).
x = np.linspace(-6, 1)
y = 2*np.ones(len(x))
fig, ax = plt.subplots(figsize = (8, 8))
ax.plot([1, -5], [2, 2], color = 'b')                # horizontal boundary y = 2
ax.scatter(1, 2, s = 80, zorder = 3, color = 'red')  # the point (1, 2)
ax.plot([1, 1], [2, -5], color = 'b')                # vertical boundary x = 1
ax.axis([-5, 6, -5, 6])
# scatter some random points to suggest realizations of (X, Y)
ax.scatter(np.random.uniform(low = -5, high = 6, size = 50),
np.random.uniform(low = -5, high = 6, size = 50))
ax.fill_between(x, y, -5, color = 'red', alpha =.2)  # shade the region {X <= 1, Y <= 2}
ax.text(1, 2.1, '$(1, 2)$', size = 15)
ax.grid()
Marginal CDFs $F_X(x)$ and $F_Y(y)$ are obtained from the joint CDF by letting the other variable go to infinity:
$$ F_X(x) = P(X\leq x, Y\leq \infty)\\ F_Y(y) = P(X\leq \infty, Y\leq y) $$If $A$ is a random event, the conditional PMF of $X$ given $A$ is defined as
$$ P_{X|A}(X = x_i) = \frac{P(X=x_i,A)}{P(A)} $$Consider the joint PMF below.
\begin{array}{|c|c|c|c|c|c|} \hline & X = -2 & X = -1 & X = 0 & X = 1 & X = 2 \\ \hline Y = 2 & 0 & 0 & 1/13 & 0 & 0 \\ \hline Y = 1 & 0 & 1/13 & 1/13 & 1/13 & 0 \\ \hline Y = 0 & 1/13 & 1/13 & 1/13 & 1/13 & 1/13 \\ \hline Y = -1 & 0 & 1/13 & 1/13 & 1/13 & 0 \\ \hline Y = -2 & 0 & 0 & 1/13 & 0 & 0 \\ \hline \end{array}Its support is the set $G=\{(x, y)\mid x, y \in \mathbb{Z},\ |x|+|y| \leq 2\}$, which contains $13$ points, each with probability $1/13$.
# marginal PMF of Y: sum each row of the joint PMF table
pY_2 = frac(1,13)
pY_1 = frac(1,13)*3
pY_0 = frac(1,13)*5
pY_m1 = frac(1,13)*3
pY_m2 = frac(1,13)
# marginal PMF of X: sum each column of the joint PMF table
pX_2 = frac(1,13)
pX_1 = frac(1,13)*3
pX_0 = frac(1,13)*5
pX_m1 = frac(1,13)*3
pX_m2 = frac(1,13)
print('Marginal PMF of pY are {0}, {1}, {2}, {3}, {4}.'.format(pY_2,pY_1,pY_0,pY_m1,pY_m2))
print('Marginal PMF of pX are {0}, {1}, {2}, {3}, {4}.'.format(pX_2,pX_1,pX_0,pX_m1,pX_m2))
Marginal PMF of pY are 1/13, 3/13, 5/13, 3/13, 1/13. Marginal PMF of pX are 1/13, 3/13, 5/13, 3/13, 1/13.
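As a sanity check, we can also enumerate the support $G$ in Python and recompute the same marginals by summing the joint PMF. This is a small sketch; the dictionary representation and the names `support`, `joint`, `pX`, `pY` are just one convenient choice:
# enumerate the support G = {(x, y) : |x| + |y| <= 2} and assign probability 1/13 to each point
support = [(x, y) for x in range(-2, 3) for y in range(-2, 3) if abs(x) + abs(y) <= 2]
joint = {point: frac(1, 13) for point in support}
# marginals: sum the joint PMF over the other variable
pX = {x: sum(p for (xi, yi), p in joint.items() if xi == x) for x in range(-2, 3)}
pY = {y: sum(p for (xi, yi), p in joint.items() if yi == y) for y in range(-2, 3)}
print(len(support))   # 13
print(pX)             # Fraction values 1/13, 3/13, 5/13, 3/13, 1/13
print(pY)             # the same values, by symmetry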
We add marginals to the table
\begin{array}{|c|c|c|c|c|c|c|} \hline & X = -2 & X = -1 & X = 0 & X = 1 & X = 2 & P_Y(y) \\ \hline Y = 2 & 0 & 0 & 1/13 & 0 & 0 & 1/13\\ \hline Y = 1 & 0 & 1/13 & 1/13 & 1/13 & 0 & 3/13 \\ \hline Y = 0 & 1/13 & 1/13 & 1/13 & 1/13 & 1/13 & 5/13 \\ \hline Y = -1 & 0 & 1/13 & 1/13 & 1/13 & 0 & 3/13 \\ \hline Y = -2 & 0 & 0 & 1/13 & 0 & 0 & 1/13\\ \hline P_X(x) &1/13 &3/13 & 5/13 & 3/13 & 1/13 & \\ \hline \end{array}It shows that given $Y=1$, $X$ is uniformly distributed over $\{-1,0,1\}$.
Are $X$ and $Y$ independent? No: for instance, $P(X=0| Y=1) = 1/3 \neq P_X(X = 0) = 5/13$.
If the random event $A$ is replaced by a discrete random variable $Y$, the conditional PMFs are defined as
$$ \begin{array}{l} P_{X | Y}\left(x_{i} | y_{j}\right)=\frac{P_{X Y}\left(x_{i}, y_{j}\right)}{P_{Y}\left(y_{j}\right)} \\ P_{Y | X}\left(y_{j} | x_{i}\right)=\frac{P_{X Y}\left(x_{i}, y_{j}\right)}{P_{X}\left(x_{i}\right)} \end{array} $$where $x_i$ and $y_j$ are realizations of $X$ and $Y$.
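For instance, reusing the hypothetical `joint` dictionary from the sketch above together with `pY_1` from the earlier cell, the conditional PMF of $X$ given $Y=1$ can be computed directly (again just a sketch):
# conditional PMF of X given Y = 1: divide the joint probabilities by the marginal P_Y(1) = 3/13
pX_given_Y1 = {x: joint.get((x, 1), frac(0)) / pY_1 for x in range(-2, 3)}
print(pX_given_Y1)   # 0, 1/3, 1/3, 1/3, 0 -- uniform over {-1, 0, 1}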
The expectation can be conditioned on a random event or on a realization of a random variable.
$$\begin{align} E[X | A]&=\sum_{x_{i}\in R_{X}}x_{i} P_{X | A}\left(x_{i}|A\right) \\ E[X | Y=y_{j}]&=\sum_{x_{i} \in R_{X}} x_{i} P_{X | Y}\left(x_{i} | Y=y_{j}\right) \end{align}$$Using the PMF example from the last section, let's compute $E[X \mid -1<Y<2]$.
To calculate the conditional expectation, we must use the conditional probabilities as weights.
First, note that $P(-1<Y<2)=P(Y=0)+P(Y=1)=\frac{8}{13}$, so the conditional PMF divides each relevant joint probability by $\frac{8}{13}$; weighting each $x_i$ accordingly gives
$$ E[X\mid -1<Y<2] = -2\cdot\frac{1/13}{8/13}-1\cdot\frac{2/13}{8/13}+0\cdot\frac{2/13}{8/13}+ 1\cdot\frac{2/13}{8/13} + 2\cdot\frac{1/13}{8/13}=0 $$If you paid attention to the conditional expectation expression
$$ E[X | Y=y_{j}]=\sum_{x_{i} \in R_{X}} x_{i} P_{X | Y}\left(x_{i} | Y=y_{j}\right) $$you would find that it is actually a function of $y_j$. Viewed as a function of the random variable $Y$, $Z = E[X|Y]$ is itself a random variable.
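Before moving on, here is a quick exact check of $E[X \mid -1<Y<2]=0$, once more reusing the hypothetical `joint` dictionary from the earlier sketch:
# restrict the joint PMF to the event -1 < Y < 2, then take the probability-weighted average of x
event = {(x, y): p for (x, y), p in joint.items() if -1 < y < 2}
pA = sum(event.values())                                     # P(-1 < Y < 2) = 8/13
print(pA, sum(x * p / pA for (x, y), p in event.items()))    # 8/13 0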
Consider the joint PMF below.
$$ \begin{array}{|c|c|c|c|} \hline & X = 0 & X = 1 & P_Y(y) \\ \hline Y = 0 & 1/5 & 2/5 & 3/5\\ \hline Y = 1 & 2/5 & 0 & 2/5 \\ \hline P_X(x) &3/5 & 2/5 & \\ \hline \end{array} $$Remember that $Z = E[X|Y]$ is a function of $Y$. To calculate the conditional expectations, we again use conditional probabilities as weights.
$$ E[X|Y = 0] = 0 \left(\frac{\frac{1}{5}}{\frac{1}{5}+\frac{2}{5}}\right)+1\left(\frac{\frac{2}{5}}{\frac{1}{5}+\frac{2}{5}}\right) =\frac{2}{3}\\ E[X|Y = 1] = 0 $$Because $E[X|Y]$ is itself a random variable, it must have an expectation as well
$$ E[Z] = E[E[X|Y]] = P_Y(Y = 0)E[X|Y = 0]+ P_Y(Y = 1)E[X|Y = 1] = \frac{3}{5}\cdot\frac{2}{3}+\frac{2}{5}\cdot0=\frac{2}{5} $$In fact, $E[Z] = E[E[X|Y]] = E[X]$ must always hold; this is the law of iterated expectations.
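We can verify the law of iterated expectations for this table with exact arithmetic (a minimal sketch; the dictionary `pXY` simply encodes the table above):
# law of iterated expectations check for the 2x2 table above
pXY = {(0, 0): frac(1, 5), (1, 0): frac(2, 5), (0, 1): frac(2, 5), (1, 1): frac(0)}
pY0 = pXY[(0, 0)] + pXY[(1, 0)]                              # P(Y=0) = 3/5
pY1 = pXY[(0, 1)] + pXY[(1, 1)]                              # P(Y=1) = 2/5
EX_given_Y0 = sum(x * pXY[(x, 0)] / pY0 for x in (0, 1))     # E[X|Y=0] = 2/3
EX_given_Y1 = sum(x * pXY[(x, 1)] / pY1 for x in (0, 1))     # E[X|Y=1] = 0
EZ = pY0 * EX_given_Y0 + pY1 * EX_given_Y1
EX = sum(x * p for (x, y), p in pXY.items())
print(EZ, EX, EZ == EX)   # 2/5 2/5 True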
If $X$ and $Y$ are independent, the rules of conditional expectation are fairly straightforward, because conditioning on $Y$ does not provide any extra information: for instance, $E[X|Y] = E[X]$ and $E[g(X)|Y] = E[g(X)]$.
The joint PDF of $X$ and $Y$ is a non-negative function $f_{XY}$, mapping $\mathbb{R}^2$ to $\mathbb{R}$, such that for any region $A\subseteq \mathbb{R}^2$
$$ P((X, Y) \in A)=\iint_{A} f_{X Y}(x, y) d x d y $$with the normalization $\iint_{\mathbb{R}^2} f_{XY}(x, y)\, dx\, dy = 1$.
However, we are particularly interested in the case where $A$ is a rectangle,
$$ P(a\leq X \leq b,\ c\leq Y \leq d) =\int_c^d\int_a^b f_{X Y}(x, y) d x d y $$For a small square of side $\delta$,
$$ P(a\leq X \leq a+\delta,\ c\leq Y \leq c+\delta )\approx f_{XY}(a,c)\delta^2 $$Let's consider an example other than the normal distribution.
$$ f_{X Y}(x, y)=\left\{\begin{array}{ll} x+c y^{2} & 0 \leq x \leq 1,\quad 0 \leq y \leq 1 \\ 0 & \text { otherwise } \end{array}\right. $$To find $c$, use the normalization property $\iint_{\mathbb{R}^2} f_{X Y}(x, y) d x d y =1$, which here reduces to an integral over the unit square:
\begin{align} \int^1_0\int^1_0(x+cy^2)dxdy &= 1\\ \int^1_0\left[\frac{x^2}{2}+cxy^2\right]^1_0dy &= 1\\ \int^1_0\left[\frac{1}{2}+cy^2\right]dy &= 1\\ \left[\frac{y}{2}+c\frac{y^3}{3}\right]^1_0&=1\\ \frac{1}{2}+\frac{c}{3}&=1\\ c&=\frac{3}{2}\\ \end{align}Plugging in $c$, we can compute, for example, $P\left(0\leq X\leq \frac{1}{2},\ 0\leq Y\leq \frac{1}{2}\right)$ by double integration
\begin{align} \int^{1/2}_{0}\int^{1/2}_0\left(x+\frac{3}{2}y^2\right)dxdy &= \int_0^{1/2}\left[\frac{x^2}{2}+\frac{3}{2}y^2x\right]_0^{1/2}dy \\ &=\int_0^{1/2}\left[\frac{1}{8}+\frac{3}{4}y^2\right]dy\\ &=\left[\frac{y}{8}+\frac{y^3}{4}\right]_0^{1/2}\\ &=\frac{1}{16}+\frac{1}{32}=\frac{3}{32} \end{align}The joint distribution is depicted below; the volume between the curved surface and the $xy$ plane is $1$.
x, y = np.linspace(0, 1), np.linspace(0, 1)
X, Y = np.meshgrid(x, y)
Z = X + 3/2*Y**2     # joint PDF f_XY(x, y) = x + 3/2 * y^2 on the unit square
fig = plt.figure(figsize = (8, 8))
ax = fig.add_subplot(projection='3d')    # replaces the deprecated fig.gca(projection='3d')
ax.plot_surface(X, Y, Z, cmap = 'coolwarm')
ax.contourf(X, Y, Z, zdir='z', offset=0, cmap='coolwarm')   # filled contours projected onto the xy plane
plt.show()
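The constant $c$ and the probability $3/32$ computed above can also be checked symbolically. This is a sketch assuming SymPy is available (it is not imported elsewhere in this chapter); `xs`, `ys`, `c_val`, and `prob` are illustrative names:
import sympy as sp
# symbolic check of the normalising constant c and of P(0 <= X <= 1/2, 0 <= Y <= 1/2)
xs, ys, c = sp.symbols('x y c')
c_val = sp.solve(sp.Eq(sp.integrate(xs + c*ys**2, (xs, 0, 1), (ys, 0, 1)), 1), c)[0]
prob = sp.integrate(xs + c_val*ys**2, (xs, 0, sp.Rational(1, 2)), (ys, 0, sp.Rational(1, 2)))
print(c_val, prob)   # 3/2 3/32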
The marginal PDFs of $X$ and $Y$ are
$$ f_{X}(x)=\int_{-\infty}^{\infty} f_{X Y}(x, y) d y,\quad \text { for all } x \\ f_{Y}(y)=\int_{-\infty}^{\infty} f_{X Y}(x, y) d x,\quad \text { for all } y $$Let's use the same example as in the last section to find $f_X(x)$ and $f_Y(y)$.
$$ f_{X}(x)=\int_{0}^{1}\left(x+\frac{3}{2}y^2\right) d y =x+\frac{1}{2}\\ f_{Y}(y)=\int_{0}^{1}\left(x+\frac{3}{2}y^2\right) d x =\frac{3}{2} y^{2}+\frac{1}{2} $$The joint CDF and the joint PDF are related as follows:
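Continuing the SymPy sketch above, the two marginals can be obtained by integrating out the other variable:
# symbolic marginals of f_XY(x, y) = x + 3/2 * y^2 on the unit square
f_xy = xs + sp.Rational(3, 2)*ys**2
print(sp.integrate(f_xy, (ys, 0, 1)))   # f_X(x) = x + 1/2
print(sp.integrate(f_xy, (xs, 0, 1)))   # f_Y(y) = 3*y**2/2 + 1/2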
$$ F_{X Y}(x, y)=\int_{-\infty}^{y} \int_{-\infty}^{x} f_{X Y}(u, v) d u d v \\ f_{X Y}(x, y)=\frac{\partial^{2}}{\partial x \partial y} F_{X Y}(x, y) $$For the same PDF as above, let's find the CDF.
$$ f_{X Y}(x, y)=\left\{\begin{array}{ll} x+\frac{3}{2} y^{2} & 0 \leq x \leq 1,\quad 0 \leq y \leq 1 \\ 0 & \text { otherwise } \end{array}\right. $$For $0\leq x\leq 1$ and $0\leq y\leq 1$,
$$ F_{XY}(x, y)=\int_{0}^{y}\int_{0}^{x}\left(u+\frac{3}{2}v^2\right) du\, dv = \int_{0}^{y}\left(\frac{x^2}{2}+\frac{3}{2}x v^2\right)dv = \frac{x^2 y}{2}+\frac{x y^3}{2} $$which satisfies $F_{XY}(1,1)=1$ and $\frac{\partial^{2}}{\partial x \partial y}F_{XY}(x,y)=x+\frac{3}{2}y^2$. Now consider the conditional PDF of $X$ given that $X\in A$
\begin{align} P(x\leq X \leq x+\delta|X \in A)\approx f_{X|X\in A}(x)\cdot \delta &= \frac{P(x\leq X \leq x+\delta,X \in A)}{P(A)}\\ &=\frac{P(x\leq X \leq x+\delta)}{P(A)}\\ &\approx\frac{f_X(x)\delta}{P(A)} \end{align}We have shown that
$$ f_{X|X\in A}(x) = \frac{f_X(x)}{P(A)},\qquad x\in A $$(and $f_{X|X\in A}(x)=0$ for $x\notin A$). You can imagine $P(A)$ as a scaling factor that normalizes the conditional PDF so that it integrates to $1$.
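As a concrete example with the marginal $f_X(x)=x+\frac{1}{2}$ found earlier, take $A=\{X\leq \frac{1}{2}\}$:
$$ P(A)=\int_0^{1/2}\left(x+\frac{1}{2}\right)dx=\frac{3}{8},\qquad f_{X|X\in A}(x)=\frac{x+\frac{1}{2}}{3/8}=\frac{8x+4}{3},\quad 0\leq x\leq \frac{1}{2} $$which indeed integrates to $1$ over $[0,\tfrac{1}{2}]$.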
For two jointly continuous random variables $X$ and $Y$, we can define the conditional PDF of $X$ given $Y=y$ (wherever $f_Y(y)>0$) as
$$ f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)} $$with the conditional CDF and conditional expectation built from it in the usual way. The intuition behind this expression is
$$ P(x\leq X \leq x+\delta| y\leq Y\leq y+\epsilon)\approx \frac{f_{XY}(x, y)\delta\epsilon}{f_Y(y)\epsilon}=f_{X|Y}(x|y)\delta $$Conditional probability must satisfy the basic rules of probability as well,
$$ \int_{-\infty}^\infty f_{X|Y}(x|y)dx = 1 $$because
$$ \frac{\int_{-\infty}^\infty f_{XY}(x, y)dx}{f_Y(y)} = \frac{f_Y(y)}{f_Y(y)}= 1 $$Rearranging the definition of the conditional PDF, we obtain the multiplication rule
$$ f_{XY}(x, y)=f_{X|Y}(x|y)f_Y(y) $$If continuous variables $X$ and $Y$ are independent, then knowing either of them does not provide information about the other. That is
$$ f_{X|Y}(x|y) = f_X(x),\qquad \text{or} \qquad f_{Y|X}(y|x) = f_Y(y) $$Thus the multiplication rule for independent random variables becomes
$$ f_{XY}(x, y)=f_X(x)f_Y(y) $$Other rules derived from this are
\begin{align} E[XY]&= E[X]E[Y]\\ \text{Var}(X+Y)&=\text{Var}(X)+\text{Var}(Y)\\ E[g(X)h(Y)]&=E[g(X)]E[h(Y)] \end{align}
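These rules are easy to check by Monte Carlo with independently sampled $X$ and $Y$ (a sketch; the normal and uniform distributions used here are arbitrary choices):
# Monte Carlo sanity check of the independence rules
rng = np.random.default_rng(0)
X_s = rng.normal(loc=1.0, scale=2.0, size=1_000_000)
Y_s = rng.uniform(low=0.0, high=3.0, size=1_000_000)
print(np.mean(X_s*Y_s), np.mean(X_s)*np.mean(Y_s))           # E[XY] vs E[X]E[Y], approximately equal
print(np.var(X_s + Y_s), np.var(X_s) + np.var(Y_s))          # Var(X+Y) vs Var(X)+Var(Y)
print(np.mean(np.exp(X_s)*np.sin(Y_s)),
      np.mean(np.exp(X_s))*np.mean(np.sin(Y_s)))             # E[g(X)h(Y)] vs E[g(X)]E[h(Y)]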