from scipy import optimize
import cvxopt # pip install cvxopt
import matplotlib.pyplot as plt
import numpy as np
import sympy
sympy.init_printing()
Minimize the area of a cylinder with unit volume. Suitable variables are the radius r and height h of the cylinder, the objective function is $f([r, h]) = 2\pi r^2 + 2\pi r h$, and the equality constraint is $g([r, h]) = \pi r^2 h - 1 = 0$. Using the constraint to eliminate $h = 1/(\pi r^2)$ reduces the problem to the univariate objective $f(r) = 2\pi r^2 + 2/r$:
def f(r):
    return 2 * np.pi * r**2 + 2 / r
r_min = optimize.brent(f, brack=(0.1, 4))
r_min
f(r_min)
optimize.minimize_scalar(f, bracket=(0.1, 4))
     fun: 5.535810445932086
 message: '\nOptimization terminated successfully;\nThe returned value satisfies the termination criteria\n(using xtol = 1.48e-08 )'
    nfev: 19
     nit: 15
 success: True
       x: 0.5419260772557135
As in the univariate case, Newton's method can be viewed as a local quadratic approximation of the function which, when minimized, gives an iteration scheme. In the multivariate case, the iteration formula is

$$ x_{k+1} = x_k - H_f^{-1}(x_k)\,\nabla f(x_k) $$

x1, x2 = sympy.symbols("x_1, x_2")
f_sym = (x1-1)**4 + 5 * (x2-1)**2 - 2*x1*x2
fprime_sym = [f_sym.diff(x_) for x_ in (x1, x2)]
# Gradient
sympy.Matrix(fprime_sym)
fhess_sym = [[f_sym.diff(x1_, x2_)
for x1_ in (x1, x2)]
for x2_ in (x1, x2)]
sympy.Matrix(fhess_sym)
f_lmbda = sympy.lambdify((x1, x2), f_sym, 'numpy')
fprime_lmbda = sympy.lambdify((x1, x2),
fprime_sym, 'numpy')
fhess_lmbda = sympy.lambdify((x1, x2),
fhess_sym, 'numpy')
def func_XY_to_X_Y(f):
    """Wrap f(x, y) so that it accepts a single array X = [x, y]."""
    return lambda X: np.array(f(X[0], X[1]))
f = func_XY_to_X_Y(f_lmbda)
fprime = func_XY_to_X_Y(fprime_lmbda)
fhess = func_XY_to_X_Y(fhess_lmbda)
x_opt = optimize.fmin_ncg(f, (0, 0), fprime=fprime,
fhess=fhess)
Optimization terminated successfully.
         Current function value: -3.867223
         Iterations: 8
         Function evaluations: 10
         Gradient evaluations: 10
         Hessian evaluations: 8
x_opt
array([1.88292613, 1.37658523])
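fmin_ncg wraps the Newton step in a line search; the bare iteration formula above can also be applied directly. A minimal sketch of the pure Newton update $x_{k+1} = x_k - H_f^{-1}(x_k)\nabla f(x_k)$, with a hand-coded gradient and Hessian for the same objective:

```python
import numpy as np

def grad(x):
    # gradient of (x1 - 1)**4 + 5*(x2 - 1)**2 - 2*x1*x2
    return np.array([4 * (x[0] - 1)**3 - 2 * x[1],
                     10 * (x[1] - 1) - 2 * x[0]])

def hess(x):
    # Hessian of the same function
    return np.array([[12 * (x[0] - 1)**2, -2.0],
                     [-2.0, 10.0]])

x = np.array([0.0, 0.0])
for _ in range(50):
    step = np.linalg.solve(hess(x), grad(x))  # solve H step = grad
    x = x - step
    if np.linalg.norm(step) < 1e-10:
        break

print(x)  # ≈ [1.8829, 1.3766], matching fmin_ncg above
```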
For such cases, several methods exist to numerically estimate the gradient, the Hessian, or both. Methods that approximate the Hessian are known as quasi-Newton methods, and there are also iterative methods that avoid the Hessian entirely. Two popular methods are the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method and the conjugate gradient method.
# BFGS with an analytically supplied gradient
x_opt = optimize.fmin_bfgs(f, (0, 0), fprime=fprime)
Optimization terminated successfully.
         Current function value: -3.867223
         Iterations: 9
         Function evaluations: 13
         Gradient evaluations: 13
# without fprime, the gradient is estimated by finite differences
# (note the larger number of function evaluations below)
x_opt = optimize.fmin_bfgs(f, (0, 0))
Optimization terminated successfully.
         Current function value: -3.867223
         Iterations: 9
         Function evaluations: 39
         Gradient evaluations: 13
x_opt
array([1.88292644, 1.37658595])
In general, the BFGS method is a good first approach to try, in particular if neither the gradient nor the Hessian is known. If both the gradient and the Hessian are known, then Newton's method generally converges fastest.
The multivariate optimization methods discussed so far all converge to a local minimum. For problems with many local minima, the solver can easily get stuck in a local minimum even though a global minimum exists.
def f(X):
    x, y = X
    return (4 * np.sin(np.pi * x) + 6 * np.sin(np.pi * y) +
            (x - 1)**2 + (y - 1)**2)
# coarse grid search for a good initial point
x_start = optimize.brute(f, (slice(-3, 5, 0.5),
slice(-3, 5, 0.5)), finish=None)
x_start
array([1.5, 1.5])
f(x_start)
x_opt = optimize.fmin_bfgs(f, x_start)
Optimization terminated successfully.
         Current function value: -9.520229
         Iterations: 4
         Function evaluations: 21
         Gradient evaluations: 7
x_opt
array([1.47586906, 1.48365787])
result = optimize.minimize(f, x_start, method='BFGS')
x_opt = result.x
x_opt
array([1.47586906, 1.48365787])
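The two-stage strategy above (coarse grid search, then local refinement) can be wrapped in a small helper; a sketch (the helper name is our own):

```python
import numpy as np
from scipy import optimize

def f(X):
    x, y = X
    return (4 * np.sin(np.pi * x) + 6 * np.sin(np.pi * y) +
            (x - 1)**2 + (y - 1)**2)

def global_then_local(f, ranges):
    """Coarse brute-force grid search followed by local BFGS refinement."""
    x0 = optimize.brute(f, ranges, finish=None)  # best point on the grid
    return optimize.minimize(f, x0, method='BFGS')

res = global_then_local(f, (slice(-3, 5, 0.5), slice(-3, 5, 0.5)))
print(res.x, res.fun)  # ≈ [1.4759, 1.4837], -9.520229 as above
```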
We omit nonlinear least squares problems here.
def f(X):
    x, y = X
    return (x - 1)**2 + (y - 1)**2
bnd_x1, bnd_x2 = (2, 3), (0, 2)
x_cons_opt = optimize.minimize(f, [1, 1],
method='L-BFGS-B',
bounds=[bnd_x1, bnd_x2]).x
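The unconstrained minimum at (1, 1) is infeasible under these bounds, so the solution lies on the boundary of the box. Rerunning the call and inspecting the result (a sketch):

```python
import numpy as np
from scipy import optimize

def f(X):
    x, y = X
    return (x - 1)**2 + (y - 1)**2

x_cons_opt = optimize.minimize(f, [1, 1], method='L-BFGS-B',
                               bounds=[(2, 3), (0, 2)]).x
print(x_cons_opt)  # ≈ [2., 1.]: x sits at its lower bound, y is interior
```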
Sequential least squares programming (SLSQP) can handle general nonlinear constraints. As an example, maximize
$$ f(x) = x_0 x_1 x_2 $$
subject to the equality constraint
$$ g(x) = 2(x_0 x_1 + x_1 x_2 + x_2 x_0) - 1 = 0 $$
(the volume of a box with unit surface area).

def f(X):
    # negate so that minimizing f maximizes the volume
    return -X[0] * X[1] * X[2]

def g(X):
    return 2 * (X[0] * X[1] + X[1] * X[2] + X[2] * X[0]) - 1
constraint = dict(type='eq', fun=g)
result = optimize.minimize(f, [0.5, 1, 1.5],
method='SLSQP',
constraints=[constraint])
result
     fun: -0.06804136862287297
     jac: array([-0.16666925, -0.16666542, -0.16666526])
 message: 'Optimization terminated successfully'
    nfev: 77
     nit: 18
    njev: 18
  status: 0
 success: True
       x: array([0.40824188, 0.40825127, 0.40825165])
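The optimum agrees with the analytical answer: with unit surface area, the volume-maximizing box is a cube with side $1/\sqrt{6} \approx 0.40825$. A quick check of this claim:

```python
import numpy as np

side = 1 / np.sqrt(6)
print(side)            # ≈ 0.40824829, matching x above
print(6 * side**2)     # surface area 2*(s*s)*3 = 1
print(-side**3)        # objective value ≈ -0.068041, matching fun above
```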
def f(X):
    return (X[0] - 1)**2 + (X[1] - 1)**2

def g(X):
    return X[1] - 1.75 - (X[0] - 0.75)**4
constraints = [dict(type='ineq', fun=g)]
x_cons_opt = optimize.minimize(f, (0, 0),
method='SLSQP',
constraints=constraints)
x_cons_opt
     fun: 0.566916419814267
     jac: array([-0.06284686,  1.50456505])
 message: 'Optimization terminated successfully'
    nfev: 25
     nit: 8
    njev: 8
  status: 0
 success: True
       x: array([0.96857656, 1.75228252])
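At the reported solution the inequality constraint is active, i.e. $g(x^*) \approx 0$ and the optimum sits on the constraint boundary. A sketch verifying this by rerunning the problem:

```python
import numpy as np
from scipy import optimize

def f(X):
    return (X[0] - 1)**2 + (X[1] - 1)**2

def g(X):
    return X[1] - 1.75 - (X[0] - 0.75)**4

res = optimize.minimize(f, (0, 0), method='SLSQP',
                        constraints=[dict(type='ineq', fun=g)])
print(res.x)     # ≈ [0.9686, 1.7523]
print(g(res.x))  # ≈ 0: the constraint is active at the optimum
```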
Linear programming problems can be solved with the simplex method or with interior-point methods. The standard form is

$$ \min_x\; c^T x \quad \text{s.t.}\quad Ax \leq b, $$

which is the form accepted by cvxopt.solvers.lp (bounds such as $x \geq 0$ are not imposed automatically; they would be added as extra inequality rows).

c = np.array([-1.0, 2.0, -3.0])
A = np.array([[ 1.0, 1.0, 0.0],
[-1.0, 3.0, 0.0],
[ 0.0, -1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
A_ = cvxopt.matrix(A)
b_ = cvxopt.matrix(b)
c_ = cvxopt.matrix(c)
sol = cvxopt.solvers.lp(c_, A_, b_)
Optimal solution found.
sol
{'x': <3x1 matrix, tc='d'>, 'y': <0x1 matrix, tc='d'>, 's': <3x1 matrix, tc='d'>, 'z': <3x1 matrix, tc='d'>, 'status': 'optimal', 'gap': 0.0, 'relative gap': 0.0, 'primal objective': -10.0, 'dual objective': -10.0, 'primal infeasibility': 0.0, 'primal slack': -0.0, 'dual slack': 0.0, 'dual infeasibility': 1.4835979218054372e-16, 'residual as primal infeasibility certificate': None, 'residual as dual infeasibility certificate': None, 'iterations': 0}
x = np.array(sol['x'])
x
array([[0.25], [0.75], [3.75]])
sol['primal objective']
-10.0
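The same LP can be cross-checked with scipy.optimize.linprog. Note that linprog applies $x \geq 0$ bounds by default, and this LP's optimal set is an edge rather than a single vertex, so the returned point may differ from cvxopt's; the optimal value is -10 either way. A sketch:

```python
import numpy as np
from scipy import optimize

c = np.array([-1.0, 2.0, -3.0])
A = np.array([[ 1.0, 1.0, 0.0],
              [-1.0, 3.0, 0.0],
              [ 0.0, -1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

res = optimize.linprog(c, A_ub=A, b_ub=b)  # default bounds: x >= 0
print(res.fun)  # -10.0, matching cvxopt's primal objective
```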