The solution is $$ \mathbf{x} = \mathbf{A}^{-1} \mathbf{b}. $$ We want to know how the solution changes with a small perturbation of the input $\mathbf{b}$ (or $\mathbf{A}$).
Then $$ \mathbf{x}_{\text{new}} = \mathbf{A}^{-1} (\mathbf{b} + \Delta \mathbf{b}) = \mathbf{x} + \Delta \mathbf{x}. $$ Thus $$ \|\Delta \mathbf{x}\| = \|\mathbf{A}^{-1} \Delta \mathbf{b}\| \le \|\mathbf{A}^{-1}\| \|\Delta \mathbf{b}\|. $$ Because $\mathbf{b} = \mathbf{A} \mathbf{x}$, $$ \frac{1}{\|\mathbf{x}\|} \le \|\mathbf{A}\| \frac{1}{\|\mathbf{b}\|}. $$ This results $$ \frac{ \|\Delta \mathbf{x}\|}{\|\mathbf{x}\|} \le \|\mathbf{A}\|\|\mathbf{A}^{-1}\| \frac{\|\Delta \mathbf{b}\|}{\|\mathbf{b}\|}. $$
$\kappa(\mathbf{A}) = \|\mathbf{A}\| \|\mathbf{A}^{-1}\|$ is called the condition number for linear equation. It depends on the matrix norm being used.
Large condition number means "bad".
Some facts:
The last fact says that the condition number of $\mathbf{A}^T \mathbf{A}$ can be much larger than that of $\mathbf{A}$.
The smallest singular value $\sigma_n$ indicates the distance to the trouble.
Condition number for the least squares problem is more complicated. Roughly speaking,
Consider the simple case
Forming normal equation yields a singular Gramian matrix $$ \begin{eqnarray*} \mathbf{X}^T \mathbf{X} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \end{eqnarray*} $$ if executed with a precision of 6 decimal digits.
The Longley (1967) macroeconomic data set is a famous test example for numerical software in early dates.
using Requests
longleyurl = get("http://hua-zhou.github.io/teaching/biostatm280-2018spring/slides/13-cond/longley.txt")
longley = readdlm(IOBuffer(longleyurl.data))
16×7 Array{Float64,2}: 60323.0 83.0 234289.0 2356.0 1590.0 107608.0 1947.0 61122.0 88.5 259426.0 2325.0 1456.0 108632.0 1948.0 60171.0 88.2 258054.0 3682.0 1616.0 109773.0 1949.0 61187.0 89.5 284599.0 3351.0 1650.0 110929.0 1950.0 63221.0 96.2 328975.0 2099.0 3099.0 112075.0 1951.0 63639.0 98.1 346999.0 1932.0 3594.0 113270.0 1952.0 64989.0 99.0 365385.0 1870.0 3547.0 115094.0 1953.0 63761.0 100.0 363112.0 3578.0 3350.0 116219.0 1954.0 66019.0 101.2 397469.0 2904.0 3048.0 117388.0 1955.0 67857.0 104.6 419180.0 2822.0 2857.0 118734.0 1956.0 68169.0 108.4 442769.0 2936.0 2798.0 120445.0 1957.0 66513.0 110.8 444546.0 4681.0 2637.0 121950.0 1958.0 68655.0 112.6 482704.0 3813.0 2552.0 123366.0 1959.0 69564.0 114.2 502601.0 3931.0 2514.0 125368.0 1960.0 69331.0 115.7 518173.0 4806.0 2572.0 127852.0 1961.0 70551.0 116.9 554894.0 4007.0 2827.0 130081.0 1962.0
using StatPlots
gr()
corrplot(longley,
label = ["Employ" "Price" "GNP" "Jobless" "Military" "PopSize" "Year"])
WARNING: Keyword argument match_dimensions not supported with Plots.GRBackend(). Choose from: Set(Symbol[:top_margin, :group, :background_color, :yforeground_color_text, :yguidefontcolor, :seriesalpha, :legendfontcolor, :seriescolor, :ztick_direction, :zlims, :overwrite_figure, :xguidefonthalign, :normalize, :linestyle, :xflip, :fillcolor, :ygrid, :background_color_inside, :zguidefonthalign, :bins, :yscale, :xtickfontcolor, :xguide, :fillalpha, :tick_direction, :yguidefontsize, :legendfontfamily, :foreground_color, :xtickfonthalign, :x, :ytickfontrotation, :legend, :discrete_values, :ytick_direction, :xguidefontrotation, :ribbon, :tickfontrotation, :xdiscrete_values, :legendtitle, :xgridstyle, :orientation, :gridstyle, :markersize, :camera, :xforeground_color_grid, :quiver, :zticks, :markerstrokecolor, :ztickfontrotation, :ztickfonthalign, :legendfonthalign, :xtickfontsize, :levels, :zgridstyle, :foreground_color_border, :zguidefontvalign, :marker_z, :markerstrokealpha, :markeralpha, :tickfontvalign, :zguidefontcolor, :ygridlinewidth, :zlink, :zscale, :smooth, :xticks, :zguidefontsize, :y, :margin, :ytickfontcolor, :yforeground_color_border, :zguidefontfamily, :zgridalpha, :yguidefontvalign, :yguidefonthalign, :ztickfontcolor, :html_output_format, :tickfontcolor, :titlefontrotation, :legendfontvalign, :tickfontsize, :z, :yforeground_color_axis, :xtickfontrotation, :xerror, :contour_labels, :xguidefontcolor, :primary, :guidefonthalign, :aspect_ratio, :link, :yguide, :guidefontvalign, :yguidefontfamily, :layout, :polar, :right_margin, :xlink, :series_annotations, :inset_subplots, :ytickfontsize, :tickfontfamily, :xgrid, :ygridalpha, :xtick_direction, :colorbar, :zflip, :ticks, :legendfontrotation, :linealpha, :arrow, :xtickfontvalign, :zgrid, :bar_width, :zguide, :zforeground_color_text, :weights, :xgridalpha, :ygridstyle, :fill_z, :ztickfontfamily, :markershape, :background_color_subplot, :xguidefontvalign, :markerstrokewidth, :xguidefontfamily, :gridlinewidth, :foreground_color_subplot, :xgridlinewidth, :foreground_color_text, :titlefonthalign, :yerror, :zgridlinewidth, :grid, :xguidefontsize, :xforeground_color_axis, :background_color_outside, :titlefontcolor, :line_z, :size, :projection, :zguidefontrotation, :ydiscrete_values, :seriestype, :yflip, :fillrange, :ztickfontvalign, :xlims, :xforeground_color_border, :markercolor, :ylink, :yforeground_color_grid, :color_palette, :lims, :xscale, :left_margin, :annotations, :window_title, :foreground_color_axis, :yguidefontrotation, :guidefontsize, :zdiscrete_values, :tickfonthalign, :bottom_margin, :framestyle, :scale, :zforeground_color_border, :background_color_legend, :linecolor, :foreground_color_legend, :title, :subplot_index, :flip, :titlefontvalign, :foreground_color_grid, :linewidth, :ztickfontsize, :gridalpha, :guidefontfamily, :ylims, :xtickfontfamily, :ytickfontvalign, :ytickfontfamily, :xforeground_color_text, :show, :guidefontrotation, :legendfontsize, :subplot, :label, :ytickfonthalign, :guide, :guidefontcolor, :titlefontsize, :titlefontfamily, :zforeground_color_axis, :zforeground_color_grid, :yticks])
# response: Employment
y = longley[:, 1]
# predictor matrix
X = [ones(length(y)) longley[:, 2:end]]
16×7 Array{Float64,2}: 1.0 83.0 234289.0 2356.0 1590.0 107608.0 1947.0 1.0 88.5 259426.0 2325.0 1456.0 108632.0 1948.0 1.0 88.2 258054.0 3682.0 1616.0 109773.0 1949.0 1.0 89.5 284599.0 3351.0 1650.0 110929.0 1950.0 1.0 96.2 328975.0 2099.0 3099.0 112075.0 1951.0 1.0 98.1 346999.0 1932.0 3594.0 113270.0 1952.0 1.0 99.0 365385.0 1870.0 3547.0 115094.0 1953.0 1.0 100.0 363112.0 3578.0 3350.0 116219.0 1954.0 1.0 101.2 397469.0 2904.0 3048.0 117388.0 1955.0 1.0 104.6 419180.0 2822.0 2857.0 118734.0 1956.0 1.0 108.4 442769.0 2936.0 2798.0 120445.0 1957.0 1.0 110.8 444546.0 4681.0 2637.0 121950.0 1958.0 1.0 112.6 482704.0 3813.0 2552.0 123366.0 1959.0 1.0 114.2 502601.0 3931.0 2514.0 125368.0 1960.0 1.0 115.7 518173.0 4806.0 2572.0 127852.0 1961.0 1.0 116.9 554894.0 4007.0 2827.0 130081.0 1962.0
# Julia function for obtaining condition number
cond(X)
4.859257015454868e9
# what's the implementation?
@which cond(X)
# we see the smallest singular value (aka trouble number) is very small
xsvals = svdvals(X)
7-element Array{Float64,1}: 1.66367e6 83899.6 3407.2 1582.64 41.6936 3.64809 0.000342371
# condition number of the design matrix
xcond = maximum(xsvals) / minimum(xsvals)
4.859257015454868e9
# X is full rank from SVD
xrksvd = rank(X)
7
@which rank(X)
# least squares from QR
X \ y
7-element Array{Float64,1}: -3.48226e6 15.0619 -0.0358192 -2.02023 -1.03323 -0.0511041 1829.15
# Gram matrix
G = X'X
7×7 Array{Float64,2}: 16.0 1626.9 6.20318e6 … 1.87878e6 31272.0 1626.9 1.67172e5 6.46701e8 1.9214e8 3.18054e6 6.20318e6 6.46701e8 2.55315e12 7.3868e11 1.21312e10 51093.0 5.28908e6 2.06505e10 6.06649e9 9.99059e7 41707.0 4.29317e6 1.66329e10 4.92386e9 8.15371e7 1.87878e6 1.9214e8 7.3868e11 … 2.2134e11 3.67258e9 31272.0 3.18054e6 1.21312e10 3.67258e9 6.11215e7
# rank of Gram matrix from SVD
# rank deficient!
rank(G)
6
svdvals(G)
7-element Array{Float64,1}: 2.76779e12 7.03914e9 1.1609e7 2.50476e6 1738.36 13.3086 1.16099e-7
# rank of Gram matrix from cholesky
# full!
gchol = cholfact(G, :L, Val{true})
rank(gchol)
7
# least squares solution from Cholesky matches those from QR
gchol \ (X'y)
7-element Array{Float64,1}: -3.48226e6 15.0619 -0.0358192 -2.02023 -1.03323 -0.0511041 1829.15
Xsp = convert(Matrix{Float32}, X)
ysp = convert(Vector{Float32}, y)
# least squares by QR: dramatically different from Float64 solution
Xsp \ ysp
7-element Array{Float32,1}: 0.0237241 -52.9953 0.0710731 -0.42347 -0.572563 -0.414199 48.4177
# least squares by Cholesky: failed
G = Xsp'Xsp
gchol = cholfact(G, :L, Val{true})
gchol \ (Xsp'ysp)
Base.LinAlg.RankDeficientException(1) Stacktrace: [1] chkfullrank at ./linalg/cholesky.jl:548 [inlined] [2] A_ldiv_B!(::Base.LinAlg.CholeskyPivoted{Float32,Array{Float32,2}}, ::Array{Float32,1}) at ./linalg/cholesky.jl:458 [3] \(::Base.LinAlg.CholeskyPivoted{Float32,Array{Float32,2}}, ::Array{Float32,1}) at ./linalg/factorization.jl:48
rank(Xsp)
6
# rank of Gram matrix by Cholesky
rank(gchol)
6
# rank of Gram matrix by SVD
rank(G)
4
using StatsBase
# center and standardize matrix along dimension 1
Xcs = zscore(X[:, 2:end], 1)
Xcs = [ones(length(y)) Xcs]
16×7 Array{Float64,2}: 1.0 -1.7311 -1.54343 -0.896035 -1.46093 -1.41114 -1.57532 1.0 -1.22144 -1.29053 -0.929209 -1.65348 -1.26393 -1.36527 1.0 -1.24924 -1.30434 0.52296 -1.42357 -1.0999 -1.15523 1.0 -1.12878 -1.03727 0.168746 -1.37471 -0.933713 -0.945189 1.0 -0.50792 -0.590809 -1.17106 0.707427 -0.768965 -0.735147 1.0 -0.331857 -0.409472 -1.34977 1.41872 -0.597174 -0.525105 1.0 -0.248458 -0.224493 -1.41612 1.35118 -0.334958 -0.315063 1.0 -0.155793 -0.247361 0.411666 1.0681 -0.173229 -0.105021 1.0 -0.0445951 0.0983004 -0.309603 0.634143 -0.00517531 0.105021 1.0 0.270466 0.316732 -0.397353 0.359686 0.188324 0.315063 1.0 0.622593 0.554058 -0.275358 0.274906 0.434295 0.525105 1.0 0.84499 0.571936 1.59202 0.0435575 0.650652 0.735147 1.0 1.01179 0.955839 0.663147 -0.0785831 0.854214 0.945189 1.0 1.16005 1.15602 0.789423 -0.133187 1.14202 1.15523 1.0 1.29905 1.31269 1.72579 -0.0498441 1.49912 1.36527 1.0 1.41025 1.68213 0.870753 0.316578 1.81955 1.57532
cond(Xcs)
110.54415344231359
rank(Xcs)
7
rank(Xcs'Xcs)
7
Chapter 6 of Numerical Analysis for Statisticians of Kenneth Lange (2010).
Section 2.6 of Matrix Computation by Gene Golub and Charles Van Loan (2013).
versioninfo()
Julia Version 0.6.2 Commit d386e40c17 (2017-12-13 18:08 UTC) Platform Info: OS: macOS (x86_64-apple-darwin14.5.0) CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz WORD_SIZE: 64 BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell) LAPACK: libopenblas64_ LIBM: libopenlibm LLVM: libLLVM-3.9.1 (ORCJIT, skylake)