using LinearAlgebra
A matrix $A$ is a Markov matrix if:
all of its entries are nonnegative ($\ge 0$), and
each of its columns sums to 1.
Typically, a Markov matrix's entries represent transition probabilities from one state to another.
For example, consider the $2 \times 2$ Markov matrix:
A = [0.9 0.2
     0.1 0.8]
2×2 Matrix{Float64}:
 0.9  0.2
 0.1  0.8
Let us suppose that this represents the fraction of people switching majors each year between math and English literature.
Let $$ x = \begin{pmatrix} m \\ e \end{pmatrix} $$
represent the number of math majors $m$ and English majors $e$. Suppose that each year, 10% of math majors and 20% of English majors switch majors. After one year, the new number of math and English majors is:
$$ m' = 0.9 m + 0.2 e \\ e' = 0.1 m + 0.8 e $$
But this is equivalent to a matrix multiplication! i.e. the vector $x'$ of majors after one year is
$$ x' = A x \, $$
Note that the two Markov properties are critical: we never have negative numbers of majors (or negative probabilities), and the probabilities must sum to 1 (the net number of majors is not changing: we're not including new students or people that graduate in this silly model).
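We can check this equivalence numerically. The starting populations below (100 math majors, 50 English majors) are made up for illustration:

```julia
using LinearAlgebra

A = [0.9 0.2
     0.1 0.8]

# hypothetical starting populations, made up for illustration:
m, e = 100.0, 50.0
x = [m, e]

# the update rule written out by hand...
m′ = 0.9m + 0.2e
e′ = 0.1m + 0.8e

# ...matches the matrix-vector product A*x (up to roundoff):
A*x ≈ [m′, e′]
```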
There are two key questions about Markov matrices that can be answered by analysis of their eigenvalues:
Is there a steady state?
Does the system tend toward a steady state?
The answers are YES for any Markov matrix $A$, and YES for any positive Markov matrix (a Markov matrix with entries $> 0$, not just $\ge 0$). For any Markov matrix, all of the eigenvalues satisfy $|\lambda| \le 1$, but if there are zero entries in the matrix we may have multiple eigenvalues with $|\lambda|=1$ (though this doesn't happen often in practical Markov problems).
eigvals(A)
2-element Vector{Float64}:
 0.7
 1.0
Let's try just multiplying a "random" vector by it many times and see whether the result converges to a steady state:
A^100 * [17, 4]
2-element Vector{Float64}:
 14.000000000000089
  7.000000000000044
Yes, it seems to be giving a vector that is not changing, which should be a multiple $c x_0$ of a steady-state eigenvector:
cx₀ = A^1000 * [17, 4]
2-element Vector{Float64}:
 14.000000000000874
  7.000000000000437
Let's check that this is an eigenvector of $A$ with eigenvalue $\lambda=1$:
A * cx₀
2-element Vector{Float64}:
 14.000000000000874
  7.000000000000437
To see why, the key idea is to write the columns-sum-to-one property of Markov matrices in linear-algebra terms. It is equivalent to the statement:
$$ \underbrace{\begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \end{pmatrix}}_{o^T} A = o^T $$
since this is just the operation that sums all of the rows of $A$. Equivalently, if we transpose both sides:
$$ A^T o = o $$
i.e. $o$ is an eigenvector of $A^T$ (called a left eigenvector of $A$) with eigenvalue $\lambda = 1$.
But since $A$ and $A^T$ have the same eigenvalues (they have the same characteristic polynomial $\det (A - \lambda I) = \det (A^T - \lambda I)$, because transposes don't change determinants), this means that $A$ also has an eigenvalue 1, but with a (generally) different eigenvector.
o = [1,1]
o'A
1×2 adjoint(::Vector{Float64}) with eltype Float64:
 1.0  1.0
A' * o
2-element Vector{Float64}:
 1.0
 1.0
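As a quick numerical sanity check (not a proof), we can verify that $A$ and $A^T$ do indeed share the same eigenvalues:

```julia
using LinearAlgebra

A = [0.9 0.2
     0.1 0.8]

# A and Aᵀ have the same characteristic polynomial, hence the same
# eigenvalues (sorted here, to be safe about the ordering LAPACK returns):
sort(eigvals(A)) ≈ sort(eigvals(A'))
```

(Their eigen*vectors* differ, however: we saw $A^T o = o$ above, whereas the $\lambda = 1$ eigenvector of $A$ is found below.)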
An eigenvector of $A$ with eigenvalue $1$ must lie in the nullspace $N(A - I)$, so let's look at $A - I$:
A - 1*I
2×2 Matrix{Float64}:
 -0.1   0.2
  0.1  -0.2
By inspection, $A - I$ is singular here: the second column is -2 times the first. So, $x_0 = (2,1)$ is a basis for its nullspace, and is a steady state:
(A - I) * [2,1]
2-element Vector{Float64}:
 5.551115123125783e-17
 5.551115123125783e-17
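We could also have found this basis numerically with the `nullspace` function from LinearAlgebra. It returns an orthonormal basis, so we rescale its output to match our hand computation:

```julia
using LinearAlgebra

A = [0.9 0.2
     0.1 0.8]

x₀ = nullspace(A - I)[:, 1]   # unit-length basis vector for N(A - I)
x₀ = x₀ / x₀[2]               # rescale so that the second entry is 1
# x₀ ≈ [2, 1], matching the hand computation
```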
Let's check if some arbitrary starting vector $(3,0)$ tends towards this steady state:
# The following code prints a sequence of Aⁿx values
# for n=0,1,2,… nicely formatted with LaTeX.
x = [3, 0]
pmatrix(x) = string("\\begin{pmatrix} ", round(x[1],digits=3), "\\\\", round(x[2],digits=3), "\\end{pmatrix}")
buf = IOBuffer()
println(buf, "\$")
for k = 1:6
print(buf, pmatrix(x), " \\longrightarrow ")
for i = 1:4
x = A*x
print(buf, pmatrix(x), " \\longrightarrow ")
end
println(buf, "\\\\")
end
println(buf, "\$")
display("text/latex", String(take!(buf)))
Yes! In fact, it tends to exactly $(2,1)$, because the other eigenvalue is $< 1$ (and hence that eigenvector component decays exponentially fast).
An interesting property is that the sum of the vector components is conserved when we multiply by a Markov matrix. Given a vector $x$, $o^T x$ is the sum of its components. But $o^T A = o^T$, so:
$$ o^T A x = o^T x = o^T A^n x $$
for any $n$! This is why $(3,0)$ must tend to $(2,1)$, and not to any other multiple of $(2,1)$, because both of them sum to 3. (The "number of majors" is conserved in this problem.)
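We can watch this conservation law numerically: the sums $o^T A^n x$ stay fixed at $o^T x = 3$ (up to roundoff):

```julia
using LinearAlgebra

A = [0.9 0.2
     0.1 0.8]

x = [3, 0]
sums = [sum(A^n * x) for n = 0:5]   # oᵀAⁿx for n = 0, 1, …, 5
# every entry of sums ≈ 3 = oᵀx
```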
Why are all $|\lambda| \le 1$ for a Markov matrix?
The key fact is that the product AB of two Markov matrices A and B is also Markov. Reasons:
If $A$ and $B$ have nonnegative entries, $AB$ does as well: matrix multiplication uses only $\times$ and $+$, and can't introduce a minus sign.
If $o^T A = o^T$ and $o^T B = o^T$ (both have columns summing to 1), then $o^T AB = o^T B = o^T$: the columns of $AB$ sum to 1.
For example, $A^n$ is a Markov matrix for any $n$ if $A$ is Markov.
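For instance, we can multiply $A$ by a second Markov matrix $B$ (made up for illustration) and check both Markov properties of the product:

```julia
using LinearAlgebra

A = [0.9 0.2
     0.1 0.8]

# a second 2×2 Markov matrix, made up for illustration:
B = [0.5 0.3
     0.5 0.7]

AB = A * B
all(≥(0), AB), sum(AB, dims=1)   # nonnegative entries, and columns summing to 1
```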
Now, if there were an eigenvalue $|\lambda| > 1$, the matrix $A^n$ would have to blow up exponentially as $n\to \infty$ (since the matrix times that eigenvector, or any vector with a nonzero component of that eigenvector, would blow up). But since $A^n$ is Markov, all of its entries must be between 0 and 1. It can't blow up! So we must have all $|\lambda| \le 1$.
A^100
2×2 Matrix{Float64}:
 0.666667  0.666667
 0.333333  0.333333
(In fact, $A^n$ is pretty boring for large $n$: it just takes in any vector and redistributes it to the steady state.)
Another way of thinking about $A^{100}$ is $$ A^{100} = A^{100} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} A^{100} \begin{pmatrix} 1 \\ 0 \end{pmatrix} & A^{100} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \end{pmatrix} $$ i.e. it multiplies $A^{100}$ by each column of the identity matrix (= different possible "starting populations"). Because of this, each column of $A^{100}$ tends towards an eigenvector with the biggest $|\lambda|$.
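We can check this column-by-column claim: each column of $A^{100}$ should be (very nearly) the steady-state eigenvector $(2,1)$ rescaled so that its entries sum to 1, i.e. $(2/3, 1/3)$:

```julia
using LinearAlgebra

A = [0.9 0.2
     0.1 0.8]

# both columns of A¹⁰⁰ ≈ the sum-1 steady-state eigenvector (2/3, 1/3),
# since the other eigenvector's component decays like 0.7ⁿ:
A^100 ≈ [2/3 2/3
         1/3 1/3]
```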
We have just shown that there is at least one eigenvalue $\lambda = 1$, and that all eigenvalues satisfy $|\lambda| \le 1$. But can there be more than one independent eigenvector with $\lambda = 1$?
Yes! For example, the identity matrix $I$ is a Markov matrix, and all of its eigenvectors have eigenvalue $1$. Since $Ix = x$ for any $x$, every vector is a steady state for $I$!
But this does not usually happen for interesting Markov matrices coming from real problems. In fact, there is a theorem:
If all of the entries of a Markov matrix are positive ($> 0$, not just $\ge 0$), then the steady state is unique: $\lambda = 1$ has only one independent eigenvector, and all of the other eigenvalues satisfy $|\lambda| < 1$.
I'm not going to prove this in 18.06, however.
If you have a Markov matrix with zero entries, then there might be more than one eigenvalue with $|\lambda| = 1$, but these additional solutions might be oscillating solutions rather than steady states.
For example, consider the permutation matrix $$ P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} $$ that simply swaps the first and second entries of any 2-component vector.
If $x = (1,0)$, then $P^n x$ will oscillate forever, never reaching a steady state! It simply oscillates between $(1,0)$ (for even $n$) and $(0,1)$ (for odd $n$):
P = [0 1
     1 0]
2×2 Matrix{Int64}:
 0  1
 1  0
[P^n * [1,0] for n = 0:5]
6-element Vector{Vector{Int64}}:
 [1, 0]
 [0, 1]
 [1, 0]
 [0, 1]
 [1, 0]
 [0, 1]
But this is a Markov matrix, so all $|\lambda|$ are $\le 1$:
eigvals(P)
2-element Vector{Float64}:
 -1.0
  1.0
The problem is that the $\lambda = -1$ eigenvalue corresponds to an oscillating solution:
$$ P^n \begin{pmatrix} 1 \\ -1 \end{pmatrix} = (-1)^n \begin{pmatrix} 1 \\ -1 \end{pmatrix} $$
for the eigenvector $(1,-1)$.
The steady state still exists, corresponding to the eigenvector $(1,1)$:
$$ P^n \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} $$
X = eigvecs(P) # the eigenvectors
2×2 Matrix{Float64}:
 -0.707107  0.707107
  0.707107  0.707107
X ./ X[1,:]' # normalize the first row to be 1, to resemble our hand solutions
2×2 Matrix{Float64}:
  1.0  1.0
 -1.0  1.0
Since $(1,0) = [(1,1) + (1,-1)]/2$, we have:
$$ P^n \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{2} \left[ \begin{pmatrix} 1 \\ 1 \end{pmatrix} + (-1)^n \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right] $$
which alternates between $(1,0)$ and $(0,1)$.
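We can verify this formula by comparing both sides for several values of $n$:

```julia
P = [0 1
     1 0]

# Pⁿ(1,0) should equal [(1,1) + (−1)ⁿ(1,−1)]/2 for every n:
all(P^n * [1, 0] == ([1, 1] + (-1)^n * [1, -1]) / 2 for n = 0:5)
```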
Let's generate a random $5 \times 5$ Markov matrix:
M = rand(5,5) # random entries in [0,1]
5×5 Matrix{Float64}:
 0.410618  0.306837  0.410031  0.707623   0.290909
 0.307687  0.22414   0.676996  0.0455438  0.904309
 0.999213  0.714056  0.357485  0.913338   0.715352
 0.647026  0.995701  0.789245  0.577309   0.391341
 0.73899   0.967951  0.914835  0.565266   0.447786
sum(M,dims=1) # not Markov yet
1×5 Matrix{Float64}:
 3.10353  3.20869  3.14859  2.80908  2.7497
M = M ./ sum(M,dims=1)
5×5 Matrix{Float64}:
 0.132307   0.0956271  0.130227  0.251906   0.105797
 0.0991408  0.0698541  0.215016  0.0162131  0.328876
 0.32196    0.222539   0.113538  0.325138   0.260157
 0.20848    0.310314   0.250666  0.205515   0.142322
 0.238113   0.301666   0.290554  0.201228   0.162849
sum(M,dims=1)
1×5 Matrix{Float64}:
 1.0  1.0  1.0  1.0  1.0
eigvals(M)
5-element Vector{ComplexF64}:
   -0.17776058595953462 + 0.0im
    -0.1352931760939033 + 0.0im
 -0.0014416561523480091 - 0.07745841517651891im
 -0.0014416561523480091 + 0.07745841517651891im
     1.0000000000000004 + 0.0im
abs.(eigvals(M))
5-element Vector{Float64}:
 0.17776058595953462
 0.1352931760939033
 0.07747183006822272
 0.07747183006822272
 1.0000000000000004
x = rand(5)
x = x / sum(x) # normalize x to have sum = 1
5-element Vector{Float64}:
 0.05662283043686728
 0.13759834468209436
 0.3955727782747076
 0.09136601599437516
 0.3188400306119557
M^100 * x
5-element Vector{Float64}:
 0.14590618751218656
 0.15842031877820567
 0.2419467537564156
 0.2186167846876927
 0.23510995526549555
sum(x), sum(M^100 * x) # still = 1
(1.0, 0.9999999999999962)
λ, X = eigen(M)
X[:,end] / sum(X[:,end]) # eigenvector for λ=1, normalized to sum=1
5-element Vector{ComplexF64}:
  0.1459061875121874 + 0.0im
 0.15842031877820623 + 0.0im
 0.24194675375641644 + 0.0im
 0.21861678468769355 + 0.0im
 0.23510995526549633 + 0.0im
Again, $M^n x$ is approaching a steady-state ($\lambda = 1$) eigenvector of $M$ as $n$ grows large.