versioninfo()
Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (x86_64-apple-darwin22.4.0)
  CPU: 8 × Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 2 on 8 virtual cores
using Pkg
Pkg.activate(pwd())
Pkg.instantiate()
Pkg.status()
Activating new project at `~/Dropbox/class/M1399.000200/2023/M1300_000200-2023fall/lectures/07-gelu`
No Changes to `~/Dropbox/class/M1399.000200/2023/M1300_000200-2023fall/lectures/07-gelu/Project.toml`
No Changes to `~/Dropbox/class/M1399.000200/2023/M1300_000200-2023fall/lectures/07-gelu/Manifest.toml`
Status `~/Dropbox/class/M1399.000200/2023/M1300_000200-2023fall/lectures/07-gelu/Project.toml` (empty project)
For simplicity we consider a square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$.
History: the method is named after Carl Friedrich Gauss (1777–1855), although it stems from the notes of Isaac Newton. In fact, GE was known to Chinese mathematicians as early as 179 AD.
A toy example.
A = [2.0 -4.0 2.0; 4.0 -9.0 7.0; 2.0 1.0 3.0]
3×3 Matrix{Float64}:
 2.0  -4.0  2.0
 4.0  -9.0  7.0
 2.0   1.0  3.0
b = [6.0, 20.0, 14.0]
3-element Vector{Float64}:
  6.0
 20.0
 14.0
# Julia way to solve linear equation
# equivalent to `solve(A, b)` in R
A \ b
3-element Vector{Float64}:
 2.000000000000001
 1.0
 2.9999999999999996
What happens when we call A \ b to solve a linear equation?
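One way to see for ourselves, assuming an interactive session (where the InteractiveUtils macros are loaded), is to ask which method the call dispatches to:

# inspect the method dispatched for a dense square matrix and a vector;
# the answer leads to the LU decomposition discussed below
@which A \ b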
Define the elementary operator matrix $\mathbf{E}_{jk}(c) = \mathbf{I} + c \mathbf{e}_j \mathbf{e}_k^T$ for $j \ne k$, where $\mathbf{e}_j$ is the $j$-th standard basis vector. $\mathbf{E}_{jk}(c)$ is unit triangular, full rank, with inverse $\mathbf{E}_{jk}^{-1}(c) = \mathbf{E}_{jk}(-c)$.
$\mathbf{E}_{jk}(c)$ left-multiplied to an $n \times m$ matrix $\mathbf{X}$ effectively replaces its $j$-th row $\mathbf{x}_{j\cdot}$ by $c \mathbf{x}_{k \cdot} + \mathbf{x}_{j \cdot}$
+ $2m$ flops ($m$ multiplications and $m$ additions; why?).
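As a quick illustration, we can materialize $\mathbf{E}_{jk}(c) = \mathbf{I} + c \mathbf{e}_j \mathbf{e}_k^T$ and watch it act on rows; the helper name ejk below is ours, and in practice these operators are never formed explicitly:

using LinearAlgebra

# form E_jk(c) = I + c * e_j * e_k' explicitly (illustration only)
function ejk(n, j, k, c)
    E = Matrix{Float64}(I, n, n)
    E[j, k] = c
    return E
end

ejk(3, 2, 1, -2.0) * A  # replaces row 2 of A by -2 * (row 1) + (row 2)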
Column 1:
E21 = [1.0 0.0 0.0; -2.0 1.0 0.0; 0.0 0.0 1.0]
3×3 Matrix{Float64}:
  1.0  0.0  0.0
 -2.0  1.0  0.0
  0.0  0.0  1.0
# zero (2, 1) entry
E21 * A # Step 1, A'
3×3 Matrix{Float64}:
 2.0  -4.0  2.0
 0.0  -1.0  3.0
 2.0   1.0  3.0
E31 = [1.0 0.0 0.0; 0.0 1.0 0.0; -1.0 0.0 1.0]
3×3 Matrix{Float64}:
  1.0  0.0  0.0
  0.0  1.0  0.0
 -1.0  0.0  1.0
# zero (3, 1) entry
E31 * E21 * A # Step 2, A''
3×3 Matrix{Float64}:
 2.0  -4.0  2.0
 0.0  -1.0  3.0
 0.0   5.0  1.0
Column 2:
E32 = [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 5.0 1.0]
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  5.0  1.0
# zero (3, 2) entry
E32 * E31 * E21 * A # Step 3, A'''
3×3 Matrix{Float64}:
 2.0  -4.0   2.0
 0.0  -1.0   3.0
 0.0   0.0  16.0
M1 = E31 * E21 # inv(L2 * L1)
3×3 Matrix{Float64}:
  1.0  0.0  0.0
 -2.0  1.0  0.0
 -1.0  0.0  1.0
M2 = E32 # inv(L3)
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  5.0  1.0
inv(M1) # L2 * L1. inv(A) gives the inverse of A, but use with caution (see below)
3×3 Matrix{Float64}:
 1.0  0.0  0.0
 2.0  1.0  0.0
 1.0  0.0  1.0
inv(M2) # L3
3×3 Matrix{Float64}:
 1.0   0.0  0.0
 0.0   1.0  0.0
 0.0  -5.0  1.0
Gaussian elimination does $\mathbf{M}_{n-1} \cdots \mathbf{M}_1 \mathbf{A} = \mathbf{U}$.
Let $\mathbf{L} = \mathbf{M}_1^{-1} \cdots \mathbf{M}_{n-1}^{-1}$. Then $\mathbf{A} = \mathbf{L} \mathbf{U}$, where $\mathbf{L}$ is unit lower triangular and $\mathbf{U}$ is upper triangular.
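A quick numerical check of this identity on the toy example (not part of the derivation):

L = inv(M1) * inv(M2)    # unit lower triangular
U = E32 * E31 * E21 * A  # upper triangular, from the three elimination steps
L * U ≈ A                # true: this is the LU decomposition of A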
+ LU decomposition exists if the principal sub-matrix $\mathbf{A}[1:k, 1:k]$ is non-singular for $k=1,\ldots,n-1$.
+ If $\mathbf{A}$ is non-singular, then the LU decomposition is unique and
$$
\det(\mathbf{A}) = \det(\mathbf{L}) \det(\mathbf{U}) = \prod_{k=1}^n u_{kk}.
$$
+ The whole LU algorithm is done in place, i.e., $\mathbf{A}$ is overwritten by $\mathbf{L}$ and $\mathbf{U}$ (a self-contained runnable version follows below):

for k=1:n-1
    for i=k+1:n
        A[i, k] /= A[k, k]                # store multiplier l_ik in place
        for j=k+1:n
            A[i, j] -= A[i, k] * A[k, j]  # update the trailing submatrix
        end
    end
end
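The sketch assumes n and A are already in scope; a self-contained runnable version (the name lu_nopivot! is ours):

# in-place LU without pivoting: after the call, the upper triangle of A
# holds U and the strict lower triangle holds L (unit diagonal implicit)
function lu_nopivot!(A)
    n = size(A, 1)
    for k = 1:n-1
        for i = k+1:n
            A[i, k] /= A[k, k]
            for j = k+1:n
                A[i, j] -= A[i, k] * A[k, j]
            end
        end
    end
    return A
end

lu_nopivot!(copy(A))  # upper triangle matches E32 * E31 * E21 * A above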
The LU decomposition costs $2(n-1)^2 + 2(n-2)^2 + \cdots + 2 \cdot 1^2 \approx \frac 23 n^3$ flops.
Actual implementations can differ: outer product LU (kij loop), block outer product LU (higher level-3 fraction), Crout's algorithm (jki loop).
Given LU decomposition $\mathbf{A} = \mathbf{L} \mathbf{U}$, solving $(\mathbf{L} \mathbf{U}) \mathbf{x} = \mathbf{b}$ for one right hand side costs $2n^2$ flops:
$$ \mathbf{L} \mathbf{y} = \mathbf{b} $$
$$ \mathbf{U} \mathbf{x} = \mathbf{y} $$
Therefore to solve $\mathbf{A} \mathbf{x} = \mathbf{b}$ via LU, we need a total of $\frac 23 n^3 + 2n^2$ flops.
If there are multiple right hand sides, LU only needs to be done once.
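A minimal sketch of the two triangular solves, reusing the L and U formed in the check above (wrapping the factors in LowerTriangular/UpperTriangular dispatches to the $O(n^2)$ substitution routines):

using LinearAlgebra

y = LowerTriangular(L) \ b  # forward substitution, ~n^2 flops
x = UpperTriangular(U) \ y  # back substitution, ~n^2 flops
x ≈ A \ b                   # true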
For matrix inversion, there are $n$ right hand sides $\mathbf{e}_1, \ldots, \mathbf{e}_n$. Taking advantage of 0s reduces $2n^3$ flops to $\frac 43 n^3$ flops. So matrix inversion via LU costs $\frac 23 n^3 + \frac 43 n^3 = 2n^3$ flops.
No inversion mentality: whenever we see a matrix inverse, we should think of solving linear equations rather than forming inv(A). We do not compute a matrix inverse unless it is truly needed (e.g., when the inverse itself is the quantity of interest, as in computing standard errors).
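A small illustration of the mentality on the toy system (on large matrices the flop and accuracy gap becomes substantial):

x1 = A \ b       # LU + two triangular solves: (2/3) n^3 + 2 n^2 flops
x2 = inv(A) * b  # forms the full inverse first: ~2 n^3 flops, less accurate
x1 ≈ x2          # same answer here, but the first is the right habit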
Is there a solution to $\mathbf{A} \mathbf{x} = \mathbf{b}$ for an arbitrary $\mathbf{b}$? Does GE/LU work for $\mathbf{A}$?
Example:
$$
\begin{split}
0.001x_1 + x_2 &= 1 \\
x_1 + x_2 &= 2
\end{split}
$$
with solution
$$
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1.001001 \\ 0.998999 \end{bmatrix} \approx \begin{bmatrix} 1.0 \\ 1.0 \end{bmatrix}.
$$
Eliminating $x_1$ with the tiny pivot $0.001$ under limited-precision arithmetic (so that $1 - 1000$ and $2 - 1000$ both round to $-1000$) gives
$$
\begin{split}
0.001x_1 + x_2 &= 1 \\
-1000 x_2 &= -1000
\end{split},
$$
yielding
$$
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0.0 \\ 1.0 \end{bmatrix}!
$$
Interchanging the two rows first and eliminating with pivot $1$ instead gives
$$
\begin{split}
x_1 + x_2 &= 2 \\
1.0 x_2 &= 1.0
\end{split},
$$
yielding
$$
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1.0 \\ 1.0 \end{bmatrix}.
$$
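Julia's backslash pivots, so the tiny-pivot system above poses no problem (a quick check; the names A2 and b2 are ours):

A2 = [0.001 1.0; 1.0 1.0]
b2 = [1.0, 2.0]
A2 \ b2  # ≈ [1.001001, 0.998999], the accurate solution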
Partial pivoting: before zeroing the $k$-th column, move the row with $\max_{i=k}^n |a_{ik}|$ into the $k$-th row (called GEPP).
LU with partial pivoting yields
$$
\mathbf{P} \mathbf{A} = \mathbf{L} \mathbf{U},
$$
where $\mathbf{P}$ is a permutation matrix, $\mathbf{L}$ is unit lower triangular with $|\ell_{ij}| \le 1$, and $\mathbf{U}$ is upper triangular.
Complete pivoting: both rows and columns are interchanged so that the entry of largest magnitude in the remaining submatrix A[k:n, k:n] is permuted to the $(k,k)$-th entry. This yields the decomposition $\mathbf{P} \mathbf{A} \mathbf{Q} = \mathbf{L} \mathbf{U}$, where $|\ell_{ij}| \le 1$.
LAPACK: ?GETRF does $\mathbf{P} \mathbf{A} = \mathbf{L} \mathbf{U}$ (LU decomposition with partial pivoting) in place.
R: solve() implicitly performs LU decomposition (wrapper of the LAPACK routine DGESV). solve() allows specifying a single or multiple right hand sides; if none, it computes the matrix inverse. The Matrix package contains an lu() function that outputs L, U, and pivot.
Julia:
+ LinearAlgebra.lu.
+ LinearAlgebra.lu!: in-place version; the input matrix gets overwritten with L and U.
+ LAPACK.getrf! can be called directly.
+ Related LAPACK wrappers: getrs, gesv, gesvx, trtri (inverse of a triangular matrix), trtrs.

A
3×3 Matrix{Float64}:
 2.0  -4.0  2.0
 4.0  -9.0  7.0
 2.0   1.0  3.0
using LinearAlgebra
# second argument indicates partial pivoting (default) or not
alu = lu(A)
typeof(alu)
LU{Float64, Matrix{Float64}, Vector{Int64}}
fieldnames(typeof(alu))
(:factors, :ipiv, :info)
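The factors field stores L and U packed into a single matrix, the same layout LAPACK's GETRF returns; a quick check by calling the wrapper directly (it overwrites its argument, hence the copy):

F, ipiv, info = LAPACK.getrf!(copy(A))
F ≈ alu.factors   # packed factors: strict lower triangle L, upper triangle U
ipiv == alu.ipiv  # same pivot sequence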
alu.L
3×3 Matrix{Float64}:
 1.0  0.0        0.0
 0.5  1.0        0.0
 0.5  0.0909091  1.0
alu.U
3×3 Matrix{Float64}:
 4.0  -9.0   7.0
 0.0   5.5  -0.5
 0.0   0.0  -1.45455
alu.p
3-element Vector{Int64}:
 2
 3
 1
alu.P
3×3 Matrix{Float64}:
 0.0  1.0  0.0
 0.0  0.0  1.0
 1.0  0.0  0.0
alu.L * alu.U
3×3 Matrix{Float64}:
 4.0  -9.0  7.0
 2.0   1.0  3.0
 2.0  -4.0  2.0
A[alu.p, :]
3×3 Matrix{Float64}:
 4.0  -9.0  7.0
 2.0   1.0  3.0
 2.0  -4.0  2.0
# this is doing two triangular solves, 2n^2 flops
alu \ b
3-element Vector{Float64}:
 2.000000000000001
 1.0
 2.9999999999999996
det(A) # this does LU!
-32.0
det(alu) # this is cheap
-32.0
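It is cheap because only the diagonal of the stored U is needed, up to the sign of the permutation; here the permutation (2, 3, 1) is even, so no sign flip:

prod(diag(alu.U))  # ≈ -32.0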
inv(A) # this does LU!
3×3 Matrix{Float64}:
  1.0625  -0.4375  0.3125
 -0.0625  -0.0625  0.1875
 -0.6875   0.3125  0.0625
inv(alu) # this is cheap
3×3 Matrix{Float64}:
  1.0625  -0.4375  0.3125
 -0.0625  -0.0625  0.1875
 -0.6875   0.3125  0.0625
Sections II.5.2 and II.5.3, Computational Statistics by James Gentle (2010).
Chapter 2, Applied Numerical Linear Algebra by James W. Demmel (1997).
Chapter 3, Matrix Computations by Gene Golub and Charles Van Loan (2013).