Foundations of Computational Economics #23

Foundations of Computational Economics #23¶

by Fedor Iskhakov, ANU

More on Newton-Raphson method¶

https://youtu.be/DKG_bJhVbsM

Description: Failures of Newton method, domain of attraction. Multivariate Newton for optimization problems.

Another look at Newton method¶

iterative solver
Newton step $ x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} $
fast (quadratic) convergence

Today:

potential problems
multivariate version

In [2]:

def newton(fun,grad,x0,tol=1e-6,maxiter=100,callback=None):
    '''Newton method for solving equation f(x)=0
    with given tolerance and number of iterations.
    Callback function is invoked at each iteration if given.
    '''
    for i in range(maxiter):
        x1 = x0 - fun(x0)/grad(x0)
        err = abs(x1-x0)
        if callback != None: callback(iter=i,err=err,x0=x0,x1=x1,fun=fun)
        if err<tol: break
        x0 = x1
    else:
        raise RuntimeError('Failed to converge in %d iterations'%maxiter)
    return x1

In [3]:

f = lambda x: -4*x**3+5*x+1
g = lambda x: -12*x**2+5
x = newton(f,g,x0=-2.5,maxiter=7)
print('Solution is x=%1.3f, f(x)=%1.12f' % (x,f(x)))

Solution is x=-1.000, f(x)=0.000000000000

In [4]:

# make nice plot
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = [12, 8]

def newton_pic(f,g,x0,    # function, gradient, initial point
               a=0,b=1,   # bounds for the picture
               **kwargs):  # additional parameters passed to solver
    '''Illustrate the Newton method iterations'''
    xd = np.linspace(a,b,1000)       # x grid
    plt.plot(xd,f(xd),c='red')       # plot the function
    plt.plot([a,b],[0,0],c='black')  # plot zero line
    ylim=[f(a),min(f(b),0)]
    def plot_step(**kwargs):
        plot_step.counter += 1
        x0,x1 = kwargs['x0'],kwargs['x1']
        f0 = kwargs['fun'](x0)
        plt.plot([x0,x0],[0,f0],c='green')
        plt.plot([x0,x1],[f0,0],c='green')
    plot_step.counter = 0
    try:
        newton(f,g,x0,callback=plot_step,**kwargs)
        print('Converged in %d steps'%plot_step.counter)
    except RuntimeError:  # catch the Runtime convergence error
        print('Failed to converge in %d iterations!'%plot_step.counter)
    plt.xlim((a,b))
    plt.show()

In [5]:

newton_pic(f=f,g=g,x0=-2.5,a=-3,b=-.5)

Converged in 7 steps

When does Newton fail?¶

Multiple solutions — be aware
Failures

diversion (domain of attraction)
cycles
function domain
differentiability

Reduced performance — may be slower in special cases

Multiple solutions¶

In [5]:

x0 = -2.5  # -2.5 -0.595 0.46
newton_pic(f=f,g=g,x0=-2.5,a=-3,b=1.5)

Converged in 7 steps

Diversion¶

In [6]:

f = lambda x: np.arctan(x)
g = lambda x: 1/(1+x**2)
newton_pic(f=f,g=g,x0=1.25,a=-20,b=20)
newton_pic(f=f,g=g,x0=1.5,a=-20,b=20,maxiter=8)

Converged in 6 steps

Failed to converge in 8 iterations!

Cycles¶

In [7]:

f = lambda x: -4*x**3+5*x+1 # function
g = lambda x: -12*x**2+5    # derivative
x0 = -0.5689842546416416
newton_pic(f,g,x0,a=-1.5,b=1.5,maxiter=15)

Failed to converge in 15 iterations!

In [8]:

f = lambda x: -4*x**3+5*x+1 # function
g = lambda x: -12*x**2+5    # derivative
h = lambda x: -24*x         # second derivative
# find the unfortunate looping x0
ns = lambda x0: x0 - f(x0)/g(x0) # Newton step
ds = lambda x0: f(x0)*h(x0)/g(x0)**2 # derivative of ns
f2 = lambda x0: ns(ns(x0))-x0
g2 = lambda x0: ds(ns(x0))*ds(x0)-1

x0 = newton(f2,g2,x0=-0.56,tol=1e-16)  # find the cycling starting point
print('To cycle start with x0 = %1.16f'%x0)

newton_pic(f,g,x0,a=-1.5,b=1.5,maxiter=15)

To cycle start with x0 = -0.5689842546416416
Failed to converge in 15 iterations!

Function domain and differentiability¶

undefined value of function or the derivative during Newton steps

In [9]:

f = lambda x: np.log(x)
g = lambda x: 1/x
x0 = 2.9
newton_pic(f,g,x0,a=0.001,b=3)

/usr/local/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: RuntimeWarning: invalid value encountered in log
  """Entry point for launching an IPython kernel.

Failed to converge in 100 iterations!

Suboptimal performance¶

In [10]:

f = lambda x: x**9  # VERY SPECIAL CASE
g = lambda x: 9*x**8
x0 = 1.0
newton_pic(f,g,x0,a=-1.5,b=1.5)
def print_err(**kwargs):
    print('{:4d}:  x = {:17.14f}  err = {:8.6e}'.format(kwargs['iter'],kwargs['x1'],kwargs['err']))
newton(f,g,x0,callback=print_err)

Converged in 100 steps

   0:  x =  0.88888888888889  err = 1.111111e-01
   1:  x =  0.79012345679012  err = 9.876543e-02
   2:  x =  0.70233196159122  err = 8.779150e-02
   3:  x =  0.62429507696997  err = 7.803688e-02
   4:  x =  0.55492895730664  err = 6.936612e-02
   5:  x =  0.49327018427257  err = 6.165877e-02
   6:  x =  0.43846238602006  err = 5.480780e-02
   7:  x =  0.38974434312895  err = 4.871804e-02
   8:  x =  0.34643941611462  err = 4.330493e-02
   9:  x =  0.30794614765744  err = 3.849327e-02
  10:  x =  0.27372990902883  err = 3.421624e-02
  11:  x =  0.24331547469230  err = 3.041443e-02
  12:  x =  0.21628042194871  err = 2.703505e-02
  13:  x =  0.19224926395441  err = 2.403116e-02
  14:  x =  0.17088823462614  err = 2.136103e-02
  15:  x =  0.15190065300101  err = 1.898758e-02
  16:  x =  0.13502280266757  err = 1.687785e-02
  17:  x =  0.12002026903784  err = 1.500253e-02
  18:  x =  0.10668468358919  err = 1.333559e-02
  19:  x =  0.09483082985706  err = 1.185385e-02
  20:  x =  0.08429407098405  err = 1.053676e-02
  21:  x =  0.07492806309693  err = 9.366008e-03
  22:  x =  0.06660272275283  err = 8.325340e-03
  23:  x =  0.05920242022474  err = 7.400303e-03
  24:  x =  0.05262437353310  err = 6.578047e-03
  25:  x =  0.04677722091831  err = 5.847153e-03
  26:  x =  0.04157975192739  err = 5.197469e-03
  27:  x =  0.03695977949101  err = 4.619972e-03
  28:  x =  0.03285313732534  err = 4.106642e-03
  29:  x =  0.02920278873364  err = 3.650349e-03
  30:  x =  0.02595803442990  err = 3.244754e-03
  31:  x =  0.02307380838213  err = 2.884226e-03
  32:  x =  0.02051005189523  err = 2.563756e-03
  33:  x =  0.01823115724020  err = 2.278895e-03
  34:  x =  0.01620547310240  err = 2.025684e-03
  35:  x =  0.01440486497991  err = 1.800608e-03
  36:  x =  0.01280432442659  err = 1.600541e-03
  37:  x =  0.01138162171253  err = 1.422703e-03
  38:  x =  0.01011699707780  err = 1.264625e-03
  39:  x =  0.00899288629138  err = 1.124111e-03
  40:  x =  0.00799367670345  err = 9.992096e-04
  41:  x =  0.00710549040306  err = 8.881863e-04
  42:  x =  0.00631599146939  err = 7.894989e-04
  43:  x =  0.00561421463946  err = 7.017768e-04
  44:  x =  0.00499041301285  err = 6.238016e-04
  45:  x =  0.00443592267809  err = 5.544903e-04
  46:  x =  0.00394304238052  err = 4.928803e-04
  47:  x =  0.00350492656047  err = 4.381158e-04
  48:  x =  0.00311549027597  err = 3.894363e-04
  49:  x =  0.00276932468975  err = 3.461656e-04
  50:  x =  0.00246162194645  err = 3.077027e-04
  51:  x =  0.00218810839684  err = 2.735135e-04
  52:  x =  0.00194498524164  err = 2.431232e-04
  53:  x =  0.00172887577034  err = 2.161095e-04
  54:  x =  0.00153677846253  err = 1.920973e-04
  55:  x =  0.00136602530002  err = 1.707532e-04
  56:  x =  0.00121424471113  err = 1.517806e-04
  57:  x =  0.00107932863212  err = 1.349161e-04
  58:  x =  0.00095940322855  err = 1.199254e-04
  59:  x =  0.00085280286982  err = 1.066004e-04
  60:  x =  0.00075804699540  err = 9.475587e-05
  61:  x =  0.00067381955146  err = 8.422744e-05
  62:  x =  0.00059895071241  err = 7.486884e-05
  63:  x =  0.00053240063326  err = 6.655008e-05
  64:  x =  0.00047324500734  err = 5.915563e-05
  65:  x =  0.00042066222875  err = 5.258278e-05
  66:  x =  0.00037392198111  err = 4.674025e-05
  67:  x =  0.00033237509432  err = 4.154689e-05
  68:  x =  0.00029544452828  err = 3.693057e-05
  69:  x =  0.00026261735847  err = 3.282717e-05
  70:  x =  0.00023343765198  err = 2.917971e-05
  71:  x =  0.00020750013509  err = 2.593752e-05
  72:  x =  0.00018444456452  err = 2.305557e-05
  73:  x =  0.00016395072402  err = 2.049384e-05
  74:  x =  0.00014573397691  err = 1.821675e-05
  75:  x =  0.00012954131281  err = 1.619266e-05
  76:  x =  0.00011514783361  err = 1.439348e-05
  77:  x =  0.00010235362987  err = 1.279420e-05
  78:  x =  0.00009098100433  err = 1.137263e-05
  79:  x =  0.00008087200385  err = 1.010900e-05
  80:  x =  0.00007188622564  err = 8.985778e-06
  81:  x =  0.00006389886724  err = 7.987358e-06
  82:  x =  0.00005679899310  err = 7.099874e-06
  83:  x =  0.00005048799387  err = 6.310999e-06
  84:  x =  0.00004487821677  err = 5.609777e-06
  85:  x =  0.00003989174824  err = 4.986469e-06
  86:  x =  0.00003545933177  err = 4.432416e-06
  87:  x =  0.00003151940602  err = 3.939926e-06
  88:  x =  0.00002801724979  err = 3.502156e-06
  89:  x =  0.00002490422204  err = 3.113028e-06
  90:  x =  0.00002213708626  err = 2.767136e-06
  91:  x =  0.00001967741001  err = 2.459676e-06
  92:  x =  0.00001749103112  err = 2.186379e-06
  93:  x =  0.00001554758321  err = 1.943448e-06
  94:  x =  0.00001382007397  err = 1.727509e-06
  95:  x =  0.00001228451019  err = 1.535564e-06
  96:  x =  0.00001091956462  err = 1.364946e-06
  97:  x =  0.00000970627966  err = 1.213285e-06
  98:  x =  0.00000862780414  err = 1.078476e-06
  99:  x =  0.00000766915924  err = 9.586449e-07

Out[10]:

7.669159237266011e-06

Conclusions for the use of Newton-Raphson method in economics¶

many of the problematic cases rarely happen!
the problems that are real in economic applications:
1. multiple solutions
2. function domain constraints
3. domain of attraction
good idea to combine slow robust solver (to get into the domain of attraction) with Newton method (for fast convergence)
- successive approximations + Newton method is a classic approach in estimation of dynamic programming models (NFXP estimator, Rust 1987)

Newton fractals¶

visualizing the domain of attraction for particular problems

$$ x^3 - 1 = 0 $$

solutions $ x=1 $ and two more in the complex number plan $ x=1 $
draw the number of iterations to converge to any solution

https://www.youtube.com/watch?v=MWD2A0Vg2V0

Multivariate Newton-Raphson¶

Consider a system of $ n $ non-linear equations with $ n $ variables, given by

$$ G(x) = 0, $$

$ x \in \mathbb{R}^n $ and $ G(x) $ is $ n $-valued function.
Jacobian matrix is $ n $-by-$ n $ matrix of partial derivatives $ \nabla G(x) $
Newton step:

$$ x_{i+1} = x_{i} - \big( \nabla G(x_i) \big)^{-1} G(x_i) $$

Maximization of multivariate function¶

classic example for multivariate Newton is solution of F.O.C. in multivariate optimization problems
consider $ F(x) \rightarrow \max $, where $ x \in \mathbb{R}^n $
let $ G(x) $ be the gradient of $ F(x) $, which is $ n $-valued function (derivatives with respect to each variable).

$$ G(x) = \nabla F(x) = 0 $$

Newton step in this case $ x_{i+1} = x_{i} - \big( \nabla^2 F(x_i) \big)^{-1} \nabla F(x_i) $
$ \nabla^2 F(x) $ is $ n $-by-$ n $ matrix of second derivatives is Hessian matrix

Example¶

$$ F(x,y) = 2.575 - 2 \cos(x) \cos(y+\pi) - 0.575 \cos(1.25 \pi - 2x) $$

In [12]:

F = lambda x, y: 2.575 - 2*np.cos(x)*np.cos(y+np.pi) - 0.575*np.cos(1.25*np.pi - 2*x)

def contour_plot(fun,levels=20,bound=1,npoints=100,ax=None):
    '''Make a contour plot for illustrations'''
    xx = np.linspace(-bound, bound, npoints)
    yy = np.linspace(-bound, bound, npoints)
    X,Y = np.meshgrid(xx, yy)
    Z = fun(X, Y)
    if ax==None:
        fig, ax = plt.subplots(figsize=(10,8))
    cnt = ax.contour(X,Y,Z, vmin=Z.min(), vmax=Z.max(),levels=np.linspace(Z.min(),Z.max(),levels))
    ax.set_aspect('equal', 'box')
    return cnt

In [13]:

contour_plot(F)
contour_plot(F,bound=np.pi)

Out[13]:

<matplotlib.contour.QuadContourSet at 0x7f9780dc4390>

Jacobian and Hessian¶

$$ F(x,y) = 2.575 - 2 \cos(x) \cos(y+\pi) - 0.575 \cos(1.25 \pi - 2x) $$$$ G(x,y) = \begin{pmatrix} 2 \sin(x) \cos(y+\pi) - 1.15 \sin(1.25\pi - 2x) \\ 2 \cos(x) \sin(y+\pi) \\ \end{pmatrix} $$$$ H(x,y) = \begin{pmatrix} 2 \cos(x) \cos(y+\pi) + 2.3 \cos(1.25\pi - 2x) & -2 \sin(x) \sin(y+\pi) \\ -2 \sin(x) \sin(y+\pi) & 2 \cos(x) \cos(y+\pi) \\ \end{pmatrix} $$

In [14]:

G = lambda x, y: [2*np.sin(x)*np.cos(y+np.pi) - 2*0.575 * np.sin(1.25*np.pi - 2*x),
                  2*np.cos(x)*np.sin(y+np.pi)]
H = lambda x, y: [[2*np.cos(x)*np.cos(y+np.pi) - 2*0.575 * np.sin(1.25*np.pi - 2*x),
                   -2*np.sin(x)*np.sin(y+np.pi)],
                  [-2*np.sin(x)*np.sin(y+np.pi),
                    2*np.cos(x)*np.cos(y+np.pi)]]

In [15]:

fig, axs = plt.subplots(1, 2)
contour_plot(lambda x,y: G(x,y)[0],ax=axs[0],bound=np.pi)
contour_plot(lambda x,y: G(x,y)[1],ax=axs[1],bound=np.pi)

Out[15]:

<matplotlib.contour.QuadContourSet at 0x7f9780ec43d0>

In [16]:

fig, axs = plt.subplots(2, 2)
contour_plot(lambda x,y: H(x,y)[0][0],ax=axs[0,0],bound=np.pi)
contour_plot(lambda x,y: H(x,y)[0][1],ax=axs[0,1],bound=np.pi)
contour_plot(lambda x,y: H(x,y)[1][0],ax=axs[1,0],bound=np.pi)
contour_plot(lambda x,y: H(x,y)[1][1],ax=axs[1,1],bound=np.pi)

Out[16]:

<matplotlib.contour.QuadContourSet at 0x7f97581cdcd0>

In [17]:

def newton2(fun,grad,x0,tol=1e-6,maxiter=100,callback=None):
    '''Newton method for solving equation f(x)=0, x is vector of 2 elements,
    with given tolerance and number of iterations.
    Callback function is invoked at each iteration if given.
    '''
    # conversion to array function of array argument
    npfun = lambda x: np.asarray(fun(x[0],x[1]))
    npgrad = lambda x: np.asarray(grad(x[0],x[1]))
    for i in range(maxiter):
        x1 = x0 - np.linalg.inv(npgrad(x0)) @ npfun(x0)  # matrix version
        err = np.amax(np.abs(x1-x0))  # vector version
        if callback != None: callback(iter=i,err=err,x0=x0,x1=x1,fun=fun)
        if err<tol: break
        x0 = x1
    else:
        raise RuntimeError('Failed to converge in %d iterations'%maxiter)
    return x1

In [32]:

def plot_step(**kwargs):
    plot_step.counter += 1
    x0,x1 = kwargs['x0'],kwargs['x1']
    b = max(np.amax(np.abs(x0)),np.amax(np.abs(x1)))+1
    if plot_step.counter == 1 or b>plot_step.bound:
        plot_step.bound=b  # save the bound for later calls
        if plot_step.counter > 1:
            # remove old conrours if resdrawing
            for c in plot_step.contours.collections:
                c.remove()
        plot_step.contours = contour_plot(F,bound=b,ax=ax)
    ax.plot([x0[0],x1[0]],[x0[1],x1[1]],c='r')
    if plot_step.counter == 1:
        ax.scatter(x0[0],x0[1],c='r',edgecolors='r')
plot_step.counter = 0

In [33]:

x0 = np.array([-1.8,-.2])
# x0 = np.array([-.8,-.2])
# x0 = np.array([-.2,1.8])
# x0 = np.array([-.2,-.2])
# x0 = np.array([-.25,1.5])
fig, ax = plt.subplots(1,1)
xs = newton2(G,H,x0,callback=plot_step)
plt.show()
print('Converged in %d steps'%plot_step.counter)
print('x*    = (%1.5f,%1.5f)'%tuple(xs))
print('G(x*) = (%1.5e,%1.5e)'%tuple(G(*tuple(xs))))

Converged in 14 steps
x*    = (-0.56967,-3.14159)
G(x*) = (9.68660e-08,0.00000e+00)

Further learning resources¶

Oscar Veliz videos on Newton method and domains of attraction https://www.youtube.com/watch?v=E24zUEKqgwQ https://www.youtube.com/watch?v=zyXRo8Qjj0A https://www.youtube.com/watch?v=MWD2A0Vg2V0
NFXP estimator https://en.wikipedia.org/wiki/Dynamic_discrete_choice#Estimation_with_nested_fixed_point_algorithm