Let's make up a matrix
n=5
5
A = [ 3 21 8 19 6
5 -4 11 14 13
√5 -2 2.5 0 0
0 0 0 0 1
2 99 5 π π]
5×5 Array{Float64,2}:
 3.0      21.0   8.0  19.0      6.0
 5.0      -4.0  11.0  14.0     13.0
 2.23607  -2.0   2.5   0.0      0.0
 0.0       0.0   0.0   0.0      1.0
 2.0      99.0   5.0   3.14159  3.14159
m1 = A[2:5,1] / A[1,1]
4-element Array{Float64,1}:
 1.66667
 0.745356
 0.0
 0.666667
E1 = eye(5)
E1[2:5,1] = -m1
4-element Array{Float64,1}:
 -1.66667
 -0.745356
 -0.0
 -0.666667
A2 = E1 * A
5×5 Array{Float64,2}:
 3.0   21.0      8.0       19.0       6.0
 0.0  -39.0     -2.33333  -17.6667    3.0
 0.0  -17.6525  -3.46285  -14.1618   -4.47214
 0.0    0.0      0.0        0.0       1.0
 0.0   85.0     -0.333333  -9.52507  -0.858407
m2 = A2[3:5,2]/A2[2,2]
3-element Array{Float64,1}:
  0.452628
 -0.0
 -2.17949
E2 = eye(5)
E2[3:5,2] = -m2
3-element Array{Float64,1}:
 -0.452628
  0.0
  2.17949
E2
5×5 Array{Float64,2}:
 1.0   0.0       0.0  0.0  0.0
 0.0   1.0       0.0  0.0  0.0
 0.0  -0.452628  1.0  0.0  0.0
 0.0   0.0       0.0  1.0  0.0
 0.0   2.17949   0.0  0.0  1.0
A3 = E2 * A2
5×5 Array{Float64,2}:
 3.0   21.0          8.0       19.0       6.0
 0.0  -39.0         -2.33333  -17.6667    3.0
 0.0    0.0         -2.40672   -6.16534  -5.83002
 0.0    0.0          0.0        0.0       1.0
 0.0    1.42109e-14 -5.4188   -48.0293    5.68005
m3 = A3[4:5,3]/A3[3,3]
2-element Array{Float64,1}:
 -0.0
  2.25153
E3 = eye(5)
E3[4:5,3] = -m3
2-element Array{Float64,1}:
  0.0
 -2.25153
A4 = E3 * A3
5×5 Array{Float64,2}:
 3.0   21.0          8.0       19.0      6.0
 0.0  -39.0         -2.33333  -17.6667   3.0
 0.0    0.0         -2.40672   -6.16534 -5.83002
 0.0    0.0          0.0        0.0      1.0
 0.0    1.42109e-14  0.0      -34.1479  18.8065
P = eye(5)
P[[4,5],:] = P[[5,4],:]
2×5 Array{Float64,2}:
 0.0  0.0  0.0  0.0  1.0
 0.0  0.0  0.0  1.0  0.0
A4P = P * A4
5×5 Array{Float64,2}:
 3.0   21.0          8.0       19.0      6.0
 0.0  -39.0         -2.33333  -17.6667   3.0
 0.0    0.0         -2.40672   -6.16534 -5.83002
 0.0    1.42109e-14  0.0      -34.1479  18.8065
 0.0    0.0          0.0        0.0      1.0
E1
5×5 Array{Float64,2}:
  1.0       0.0  0.0  0.0  0.0
 -1.66667   1.0  0.0  0.0  0.0
 -0.745356  0.0  1.0  0.0  0.0
 -0.0       0.0  0.0  1.0  0.0
 -0.666667  0.0  0.0  0.0  1.0
inv(E1)
5×5 Array{Float64,2}:
 1.0       0.0  0.0  0.0  0.0
 1.66667   1.0  0.0  0.0  0.0
 0.745356  0.0  1.0  0.0  0.0
 0.0       0.0  0.0  1.0  0.0
 0.666667  0.0  0.0  0.0  1.0
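Putting the 5×5 example together: the three elimination matrices followed by the row swap reduce $A$ to the (numerically) upper-triangular matrix A4P. As a quick sanity check, we can compose the steps as matrix products (a sketch; ≈ tolerates floating-point roundoff):
P * (E3 * (E2 * (E1 * A))) ≈ A4P   # true: the composed steps reproduce A4P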
Let's look more closely at the process of Gaussian elimination in matrix form, using the matrix from lecture 1.
A = [1 3 1
1 1 -1
3 11 6]
3×3 Array{Int64,2}:
 1   3   1
 1   1  -1
 3  11   6
Gaussian elimination produces the matrix $U$, which we can compute in Julia as in lecture 1:
# LU factorization (Gaussian elimination) of the matrix A,
# passing the option Val{false} (which will go away in future Julia versions) to prevent row re-ordering
L, U = lu(A, Val{false})
U # just show U
3×3 Array{Float64,2}:
 1.0   3.0   1.0
 0.0  -2.0  -2.0
 0.0   0.0   1.0
Now, let's go through Gaussian elimination in matrix form, by expressing the elimination steps as matrix multiplications. In Gaussian elimination, we make linear combination of rows to cancel elements below the pivot, and we now know that this corresponds to multiplying on the left by some elimination matrix $E$.
The first step is to eliminate in the first column of $A$. The pivot is the 1 in the upper-left-hand corner. For this $A$, we need to: leave the first row alone, subtract 1× the first row from the second row, and subtract 3× the first row from the third row.
This corresponds to multiplying $A$ on the left by the matrix E1. As above (in the "row × matrix" picture), the three rows of E1 correspond exactly to the three row operations listed above:
E1 = [ 1 0 0
-1 1 0
-3 0 1]
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -3  0  1
Int.(inv(E1)) ## What does this mean?
3×3 Array{Int64,2}:
 1  0  0
 1  1  0
 3  0  1
Let's check, on a random matrix B, that multiplying by inv(E1) reverses the row operations performed by E1 (B is generated by rand, so the numbers will differ on each run):
B = rand(3,3);   # a random 3×3 matrix (the semicolon suppresses the output)
E1 * B
3×3 Array{Float64,2}:
  0.486247   0.762176   0.269746
 -0.325667  -0.127095  -0.254637
 -0.549413  -1.57389   -0.559246
inv(E1) * (E1 * B)   # recovers B
3×3 Array{Float64,2}:
 0.486247  0.762176  0.269746
 0.160581  0.635081  0.0151095
 0.909329  0.712643  0.249992
We would love for you to see the above not as a mechanical observation, but to interpret it as reversing the row operations above.
E1*A
3×3 Array{Int64,2}:
 1   3   1
 0  -2  -2
 0   2   3
As desired, this introduced zeros below the diagonal in the first column. Now we need to eliminate the 2 below the diagonal in the second column of E1*A. Our new pivot is $-2$ (in the second row), and we just add the second row of E1*A to the third row to make the new third row.
This corresponds to multiplying on the left by the matrix E2, which leaves the first two rows alone and makes the new third row by adding the second and third rows:
E2 = [1 0 0
0 1 0
0 1 1]
3×3 Array{Int64,2}:
 1  0  0
 0  1  0
 0  1  1
E2*E1*A
3×3 Array{Int64,2}:
 1   3   1
 0  -2  -2
 0   0   1
As expected, this is upper triangular, and in fact the same as the U matrix returned by the Julia lu function above:
E2*E1*A == U
true
Thus, we have arrived at the formula: $$ \underbrace{E_2 E_1}_E A = U $$ Notice that we multiplied $A$ by the elimination matrices from right to left in the order of the steps: it is $E_2 E_1 A$, not $E_1 E_2 A$. Because matrix multiplication is generally not commutative, $E_2 E_1$ and $E_1 E_2$ give different matrices:
E2*E1
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -4  1  1
E1*E2
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -3  1  1
E1
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -3  0  1
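We can confirm this non-commutativity directly in code with a one-line check on the matrices above:
E2*E1 == E1*E2   # false: the order of the elimination steps matters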
Notice, furthermore, that the matrices $E_1$ and $E_2$ are both lower-triangular matrices with ones on the diagonal. This is a consequence of the structure of Gaussian elimination (assuming no row re-ordering): we always add multiples of the pivot row to rows below it, never above it.
The product of lower-triangular matrices is always lower-triangular too. (In homework, you will explore a similar property for upper-triangular matrices.) Consequently, the product $E = E_2 E_1$ is lower-triangular, and Gaussian elimination can be viewed as yielding $EA=U$, where $E$ is lower-triangular and $U$ is upper-triangular.
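This claim is easy to check numerically. Here is a minimal sketch (the 4×4 size and the names La and Lb are made up for illustration) that builds two random lower-triangular matrices with unit diagonals and tests their product with Julia's istril function:
La = eye(4); La[2:4,1] = randn(3)   # unit lower-triangular
Lb = eye(4); Lb[3:4,2] = randn(2)   # another unit lower-triangular matrix
istril(La * Lb)   # true: the product is again lower-triangular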
We can write $EA=U$ as $A= E^{-1} U$, where $E^{-1}$ is the inverse of the matrix $E$. We will have more to say about matrix inverses later in 18.06, but for now we just need to know that it is the matrix that reverses the steps of Gaussian elimination, taking us back from $U$ to $A$. Computing matrix inverses is laborious in general, but in this particular case it is easy. We just need to reverse the steps one by one starting with the last elimination step and working back to the first one.
Hence, we need to reverse (invert) $E_2$ first on $U$, and then reverse (invert) $E_1$: $A = E_1^{-1} E_2^{-1} U$. But reversing an individual elimination step like $E_2$ is easy: we just flip the signs below the diagonal, so that wherever we added the pivot row we subtract and vice-versa. That is: $$ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix} $$ (The last elimination step was adding the second row to the third row, so we reverse it by subtracting the second row from the third row of $U$.)
Julia can compute matrix inverses for us with the inv function. (It doesn't know the trick of flipping the sign, which only works for very special matrices, but it can compute the inverse the "hard way" so quickly (for such a small matrix) that it doesn't matter.) Of course, this gives the same result:
inv(E2)
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 0.0   1.0  0.0
 0.0  -1.0  1.0
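As a quick check of the sign-flipping rule (a sketch; E2flip is a made-up name), we can build the inverse by hand and compare it to Julia's answer:
E2flip = [1 0 0
          0 1 0
          0 -1 1]
E2flip == inv(E2)   # true: flipping the signs below the diagonal inverts E2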
Similarly for $E_1$:
inv(E1)
3×3 Array{Float64,2}:
 1.0  0.0  0.0
 1.0  1.0  0.0
 3.0  0.0  1.0
If we didn't make any mistakes, then $E_1^{-1} E_2^{-1} U$ should give $A$, and it does:
inv(E1)*inv(E2)*U == A
true
We use the letter "L" for the inverse elimination matrix $L = E^{-1} = E_1^{-1} E_2^{-1}$. Since the inverses of each elimination matrix were lower-triangular (with flipped signs), their product $L$ is also lower-triangular:
(Think of both a mechanical reason and a conceptual reason why products of lower-triangular matrices are lower-triangular.)
L = inv(E1)*inv(E2)
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 1.0   1.0  0.0
 3.0  -1.0  1.0
As mentioned above, this is the same as the inverse of $E = E_2 E_1$:
inv(E2*E1)
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 1.0   1.0  0.0
 3.0  -1.0  1.0
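The same rule can be verified in one line (≈ tolerates floating-point roundoff):
inv(E2*E1) ≈ inv(E1)*inv(E2)   # true: the inverse of a product reverses the order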
The final result, therefore, is that Gaussian elimination (without row swaps) can be viewed as a factorization of the original matrix $A$
$$
A = LU
$$
into a product of lower- and upper-triangular matrices. (Furthermore, although we didn't comment on this above, $L$ always has 1's along its diagonal.) This factorization is called the LU factorization of $A$. (It's why we used the lu function in Julia above.) When a computer performs Gaussian elimination, what it computes are the $L$ and $U$ factors.
What this accomplishes is to break a complicated matrix $A$ into much simpler pieces $L$ and $U$. It may not seem at first that $L$ and $U$ are that much simpler than $A$, but they are: lots of operations that are very difficult with $A$, like solving equations or computing the determinant, become easy once you know $L$ and $U$.
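For example, here is a minimal sketch of how the factors make solving $Ax=b$ easy (the right-hand side b is made up for illustration): we solve $Lc=b$ by forward-substitution, then $Ux=c$ by back-substitution, and the diagonal of $U$ hands us the determinant for free:
b = [1, 2, 3]    # an arbitrary right-hand side
c = L \ b        # forward-substitution: solve Lc = b
x = U \ c        # back-substitution: solve Ux = c
A * x ≈ b        # true: x solves the original system
prod(diag(U))    # det(A) = product of the pivots on U's diagonal (here -2.0)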
To summarize the pattern:
- E1 and E2 have the negative multipliers below the diagonal.
- inv(E1) and inv(E2) have the multipliers below the diagonal.
- L has the multipliers below the diagonal.
- inv(L) does not generally have multipliers anywhere.
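As a quick illustration, the three multipliers from our elimination steps can be read directly off of L:
L[2,1], L[3,1], L[3,2]   # the multipliers: 1.0, 3.0, and -1.0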
L
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 1.0   1.0  0.0
 3.0  -1.0  1.0
inv(L)
3×3 Array{Float64,2}:
  1.0  0.0  0.0
 -1.0  1.0  0.0
 -4.0  1.0  1.0
E1
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -3  0  1
inv(E1)
3×3 Array{Float64,2}:
 1.0  0.0  0.0
 1.0  1.0  0.0
 3.0  0.0  1.0
E2
3×3 Array{Int64,2}:
 1  0  0
 0  1  0
 0  1  1
inv(E2)
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 0.0   1.0  0.0
 0.0  -1.0  1.0
E1 # differs from I in column 1
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -3  0  1
E2 # differs from I in column 2
3×3 Array{Int64,2}:
 1  0  0
 0  1  0
 0  1  1
L  # differs from I in more than one column
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 1.0   1.0  0.0
 3.0  -1.0  1.0
E2 * E1  # = inv(L): its entries below the diagonal are not simply the multipliers
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -4  1  1