Let's make up a matrix
n=5
5
A = [ 3 21 8 19 6
5 -4 11 14 13
√5 -2 2.5 0 0
0 0 0 0 1
2 99 5 π π]
5×5 Array{Float64,2}:
 3.0      21.0   8.0  19.0      6.0
 5.0      -4.0  11.0  14.0     13.0
 2.23607  -2.0   2.5   0.0      0.0
 0.0       0.0   0.0   0.0      1.0
 2.0      99.0   5.0   3.14159  3.14159
m1 = A[2:5,1] / A[1,1]
4-element Array{Float64,1}:
 1.66667
 0.745356
 0.0
 0.666667
E1 = eye(5)
E1[2:5,1] = -m1
4-element Array{Float64,1}:
 -1.66667
 -0.745356
 -0.0
 -0.666667
A2 = E1 * A
5×5 Array{Float64,2}:
 3.0   21.0      8.0       19.0       6.0
 0.0  -39.0     -2.33333  -17.6667    3.0
 0.0  -17.6525  -3.46285  -14.1618   -4.47214
 0.0    0.0      0.0        0.0       1.0
 0.0   85.0     -0.333333  -9.52507  -0.858407
m2 = A2[3:5,2]/A2[2,2]
3-element Array{Float64,1}:
  0.452628
 -0.0
 -2.17949
E2 = eye(5)
E2[3:5,2] = -m2
3-element Array{Float64,1}:
 -0.452628
  0.0
  2.17949
E2
5×5 Array{Float64,2}:
 1.0   0.0       0.0  0.0  0.0
 0.0   1.0       0.0  0.0  0.0
 0.0  -0.452628  1.0  0.0  0.0
 0.0   0.0       0.0  1.0  0.0
 0.0   2.17949   0.0  0.0  1.0
A3 = E2 * A2
5×5 Array{Float64,2}:
 3.0   21.0          8.0       19.0       6.0
 0.0  -39.0         -2.33333  -17.6667    3.0
 0.0    0.0         -2.40672   -6.16534  -5.83002
 0.0    0.0          0.0        0.0       1.0
 0.0    1.42109e-14 -5.4188   -48.0293    5.68005
m3 = A3[4:5,3]/A3[3,3]
2-element Array{Float64,1}:
 -0.0
  2.25153
E3 = eye(5)
E3[4:5,3] = -m3
2-element Array{Float64,1}:
  0.0
 -2.25153
A4 = E3 * A3
5×5 Array{Float64,2}:
 3.0   21.0          8.0       19.0      6.0
 0.0  -39.0         -2.33333  -17.6667   3.0
 0.0    0.0         -2.40672   -6.16534 -5.83002
 0.0    0.0          0.0        0.0      1.0
 0.0    1.42109e-14  0.0      -34.1479  18.8065
P = eye(5)
P[[4,5],:] = P[[5,4],:]
2×5 Array{Float64,2}:
 0.0  0.0  0.0  0.0  1.0
 0.0  0.0  0.0  1.0  0.0
A4P = P * A4
5×5 Array{Float64,2}:
 3.0   21.0          8.0       19.0      6.0
 0.0  -39.0         -2.33333  -17.6667   3.0
 0.0    0.0         -2.40672   -6.16534 -5.83002
 0.0    1.42109e-14  0.0      -34.1479  18.8065
 0.0    0.0          0.0        0.0      1.0
E1
5×5 Array{Float64,2}:
  1.0       0.0  0.0  0.0  0.0
 -1.66667   1.0  0.0  0.0  0.0
 -0.745356  0.0  1.0  0.0  0.0
 -0.0       0.0  0.0  1.0  0.0
 -0.666667  0.0  0.0  0.0  1.0
inv(E1)
5×5 Array{Float64,2}:
 1.0       0.0  0.0  0.0  0.0
 1.66667   1.0  0.0  0.0  0.0
 0.745356  0.0  1.0  0.0  0.0
 0.0       0.0  0.0  1.0  0.0
 0.666667  0.0  0.0  0.0  1.0
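Putting the 5×5 example together: the three elimination matrices followed by the row swap reduce $A$ to the (numerically) upper-triangular matrix A4P. As a quick sanity check, we can compose the steps as matrix products (a sketch; ≈ tolerates floating-point roundoff):
P * (E3 * (E2 * (E1 * A))) ≈ A4P   # true: the composed steps reproduce A4P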
Let's look more closely at the process of Gaussian elimination in matrix form, using the matrix from lecture 1.
A = [1 3 1
1 1 -1
3 11 6]
3×3 Array{Int64,2}:
 1   3   1
 1   1  -1
 3  11   6
Gaussian elimination produces the matrix $U$, which we can compute in Julia as in lecture 1:
# LU factorization (Gaussian elimination) of the matrix A,
# passing the option Val{false} (which will go away in future Julia versions) to prevent row re-ordering
L, U = lu(A, Val{false})
U # just show U
3×3 Array{Float64,2}:
 1.0   3.0   1.0
 0.0  -2.0  -2.0
 0.0   0.0   1.0
Now, let's go through Gaussian elimination in matrix form, by expressing the elimination steps as matrix multiplications. In Gaussian elimination, we make linear combination of rows to cancel elements below the pivot, and we now know that this corresponds to multiplying on the left by some elimination matrix $E$.
The first step is to eliminate in the first column of $A$. The pivot is the 1 in the upper-left-hand corner. For this $A$, we need to: leave the first row alone, subtract 1× the first row from the second row, and subtract 3× the first row from the third row.
This corresponds to multiplying $A$ on the left by the matrix E1. As above (in the "row × matrix" picture), the three rows of E1 correspond exactly to the three row operations listed above:
E1 = [ 1 0 0
-1 1 0
-3 0 1]
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -3  0  1
Int.(inv(E1)) ## What does this mean?
3×3 Array{Int64,2}:
 1  0  0
 1  1  0
 3  0  1
Let's check, on a random matrix B, that multiplying by inv(E1) reverses the row operations performed by E1 (B is generated by rand, so the numbers will differ on each run):
B = rand(3,3);   # a random 3×3 matrix (the semicolon suppresses the output)
E1 * B
3×3 Array{Float64,2}:
  0.486247   0.762176   0.269746
 -0.325667  -0.127095  -0.254637
 -0.549413  -1.57389   -0.559246
inv(E1) * (E1 * B)   # recovers B
3×3 Array{Float64,2}:
 0.486247  0.762176  0.269746
 0.160581  0.635081  0.0151095
 0.909329  0.712643  0.249992
We would love for you to see the above not as a mechanical observation, but to interpret it as reversing the row operations above.
E1*A
3×3 Array{Int64,2}:
 1   3   1
 0  -2  -2
 0   2   3
As desired, this introduced zeros below the diagonal in the first column. Now we need to eliminate the 2 below the diagonal in the second column of E1*A. Our new pivot is $-2$ (in the second row), and we just add the second row of E1*A to the third row to make the new third row.
This corresponds to multiplying on the left by the matrix E2, which leaves the first two rows alone and makes the new third row by adding the second and third rows:
E2 = [1 0 0
0 1 0
0 1 1]
3×3 Array{Int64,2}:
 1  0  0
 0  1  0
 0  1  1
E2*E1*A
3×3 Array{Int64,2}:
 1   3   1
 0  -2  -2
 0   0   1
As expected, this is upper triangular, and in fact the same as the U matrix returned by the Julia lu function above:
E2*E1*A == U
true
Thus, we have arrived at the formula: $$ \underbrace{E_2 E_1}_E A = U $$ Notice that we multiplied $A$ by the elimination matrices from right to left in the order of the steps: it is $E_2 E_1 A$, not $E_1 E_2 A$. Because matrix multiplication is generally not commutative, $E_2 E_1$ and $E_1 E_2$ give different matrices:
E2*E1
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -4  1  1
E1*E2
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -3  1  1
E1
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -3  0  1
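We can confirm this non-commutativity directly in code with a one-line check on the matrices above:
E2*E1 == E1*E2   # false: the order of the elimination steps matters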
Notice, furthermore, that the matrices $E_1$ and $E_2$ are both lower-triangular matrices with ones on the diagonal. This is a consequence of the structure of Gaussian elimination (assuming no row re-ordering): we always add multiples of the pivot row to rows below it, never above it.
The product of lower-triangular matrices is always lower-triangular too. (In homework, you will explore a similar property for upper-triangular matrices.) Consequently, the product $E = E_2 E_1$ is lower-triangular, and Gaussian elimination can be viewed as yielding $EA=U$, where $E$ is lower-triangular and $U$ is upper-triangular.
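This claim is easy to check numerically. Here is a minimal sketch (the 4×4 size and the names La and Lb are made up for illustration) that builds two random lower-triangular matrices with unit diagonals and tests their product with Julia's istril function:
La = eye(4); La[2:4,1] = randn(3)   # unit lower-triangular
Lb = eye(4); Lb[3:4,2] = randn(2)   # another unit lower-triangular matrix
istril(La * Lb)   # true: the product is again lower-triangular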
We can write $EA=U$ as $A= E^{-1} U$, where $E^{-1}$ is the inverse of the matrix $E$. We will have more to say about matrix inverses later in 18.06, but for now we just need to know that it is the matrix that reverses the steps of Gaussian elimination, taking us back from $U$ to $A$. Computing matrix inverses is laborious in general, but in this particular case it is easy. We just need to reverse the steps one by one starting with the last elimination step and working back to the first one.
Hence, we need to reverse (invert) $E_2$ first on $U$, and then reverse (invert) $E_1$: $A = E_1^{-1} E_2^{-1} U$. But reversing an individual elimination step like $E_2$ is easy: we just flip the signs below the diagonal, so that wherever we added the pivot row we subtract and vice-versa. That is: $$ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix} $$ (The last elimination step was adding the second row to the third row, so we reverse it by subtracting the second row from the third row of $U$.)
Julia can compute matrix inverses for us with the inv function. (It doesn't know the trick of flipping the sign, which only works for very special matrices, but it can compute the inverse the "hard way" so quickly (for such a small matrix) that it doesn't matter.) Of course, this gives the same result:
inv(E2)
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 0.0   1.0  0.0
 0.0  -1.0  1.0
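As a quick check of the sign-flipping rule (a sketch; E2flip is a made-up name), we can build the inverse by hand and compare it to Julia's answer:
E2flip = [1 0 0
          0 1 0
          0 -1 1]
E2flip == inv(E2)   # true: flipping the signs below the diagonal inverts E2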
Similarly for $E_1$:
inv(E1)
3×3 Array{Float64,2}:
 1.0  0.0  0.0
 1.0  1.0  0.0
 3.0  0.0  1.0
If we didn't make any mistakes, then $E_1^{-1} E_2^{-1} U$ should give $A$, and it does:
inv(E1)*inv(E2)*U == A
true
We use the letter "L" for the inverse elimination matrix $L = E^{-1} = E_1^{-1} E_2^{-1}$. Since the inverses of each elimination matrix were lower-triangular (with flipped signs), their product $L$ is also lower-triangular:
(Think of both a mechanical reason and a conceptual reason why products of lower-triangular matrices are lower-triangular.)
L = inv(E1)*inv(E2)
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 1.0   1.0  0.0
 3.0  -1.0  1.0
As mentioned above, this is the same as the inverse of $E = E_2 E_1$:
inv(E2*E1)
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 1.0   1.0  0.0
 3.0  -1.0  1.0
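The same rule can be verified in one line (≈ tolerates floating-point roundoff):
inv(E2*E1) ≈ inv(E1)*inv(E2)   # true: the inverse of a product reverses the order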
The final result, therefore, is that Gaussian elimination (without row swaps) can be viewed as a factorization of the original matrix $A$
$$
A = LU
$$
into a product of lower- and upper-triangular matrices. (Furthermore, although we didn't comment on this above, $L$ always has 1's along its diagonal.) This factorization is called the LU factorization of $A$. (It's why we used the lu function in Julia above.) When a computer performs Gaussian elimination, what it computes are the $L$ and $U$ factors.
What this accomplishes is to break a complicated matrix $A$ into much simpler pieces $L$ and $U$. It may not seem at first that $L$ and $U$ are that much simpler than $A$, but they are: lots of operations that are very difficult with $A$, like solving equations or computing the determinant, become easy once you know $L$ and $U$.
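For example, here is a minimal sketch of how the factors make solving $Ax=b$ easy (the right-hand side b is made up for illustration): we solve $Lc=b$ by forward-substitution, then $Ux=c$ by back-substitution, and the diagonal of $U$ hands us the determinant for free:
b = [1, 2, 3]    # an arbitrary right-hand side
c = L \ b        # forward-substitution: solve Lc = b
x = U \ c        # back-substitution: solve Ux = c
A * x ≈ b        # true: x solves the original system
prod(diag(U))    # det(A) = product of the pivots on U's diagonal (here -2.0)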
To summarize the pattern:
- E1 and E2 have the negative multipliers below the diagonal.
- inv(E1) and inv(E2) have the multipliers below the diagonal.
- L has the multipliers below the diagonal.
- inv(L) does not generally have multipliers anywhere.
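As a quick illustration, the three multipliers from our elimination steps can be read directly off of L:
L[2,1], L[3,1], L[3,2]   # the multipliers: 1.0, 3.0, and -1.0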
L
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 1.0   1.0  0.0
 3.0  -1.0  1.0
inv(L)
3×3 Array{Float64,2}:
  1.0  0.0  0.0
 -1.0  1.0  0.0
 -4.0  1.0  1.0
E1
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -3  0  1
inv(E1)
3×3 Array{Float64,2}:
 1.0  0.0  0.0
 1.0  1.0  0.0
 3.0  0.0  1.0
E2
3×3 Array{Int64,2}:
 1  0  0
 0  1  0
 0  1  1
inv(E2)
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 0.0   1.0  0.0
 0.0  -1.0  1.0
E1 # differs from I in column 1
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -3  0  1
E2 # differs from I in column 2
3×3 Array{Int64,2}:
 1  0  0
 0  1  0
 0  1  1
L  # differs from I in more than one column
3×3 Array{Float64,2}:
 1.0   0.0  0.0
 1.0   1.0  0.0
 3.0  -1.0  1.0
E2 * E1  # = inv(L): its entries below the diagonal are not simply the multipliers
3×3 Array{Int64,2}:
  1  0  0
 -1  1  0
 -4  1  1