from scipy import optimize
import cvxopt # pip install cvxopt
import matplotlib.pyplot as plt
import numpy as np
import sympy
sympy.init_printing()
Minimize the area of a cylinder with unit volume. Suitable variables are the radius r and height h of the cylinder, the objective function is $f([r, h]) = 2\pi r^2 + 2\pi r h$, and the equality constraint is $g([r, h]) = \pi r^2 h - 1 = 0$. Using the constraint to eliminate $h = 1/(\pi r^2)$ reduces the problem to the univariate objective $f(r) = 2\pi r^2 + 2/r$:
def f(r):
    return 2 * np.pi * r**2 + 2 / r
r_min = optimize.brent(f, brack=(0.1, 4))
r_min
f(r_min)
optimize.minimize_scalar(f, bracket=(0.1, 4))
     fun: 5.535810445932086
 message: '\nOptimization terminated successfully;\nThe returned value satisfies the termination criteria\n(using xtol = 1.48e-08 )'
    nfev: 19
     nit: 15
 success: True
       x: 0.5419260772557135
As in the univariate case, Newton's method can be viewed as a local quadratic approximation of the function which, when minimized, gives an iteration scheme. In the multivariate case, the iteration formula is

$$ x_{k+1} = x_k - H_f^{-1}(x_k)\,\nabla f(x_k) $$

x1, x2 = sympy.symbols("x_1, x_2")
f_sym = (x1-1)**4 + 5 * (x2-1)**2 - 2*x1*x2
fprime_sym = [f_sym.diff(x_) for x_ in (x1, x2)]
# Gradient
sympy.Matrix(fprime_sym)
fhess_sym = [[f_sym.diff(x1_, x2_)
for x1_ in (x1, x2)]
for x2_ in (x1, x2)]
sympy.Matrix(fhess_sym)
f_lmbda = sympy.lambdify((x1, x2), f_sym, 'numpy')
fprime_lmbda = sympy.lambdify((x1, x2),
fprime_sym, 'numpy')
fhess_lmbda = sympy.lambdify((x1, x2),
fhess_sym, 'numpy')
def func_XY_to_X_Y(f):
    """Wrap f(x, y) so that it accepts a single array X = [x, y]."""
    return lambda X: np.array(f(X[0], X[1]))
f = func_XY_to_X_Y(f_lmbda)
fprime = func_XY_to_X_Y(fprime_lmbda)
fhess = func_XY_to_X_Y(fhess_lmbda)
x_opt = optimize.fmin_ncg(f, (0, 0), fprime=fprime,
fhess=fhess)
Optimization terminated successfully.
         Current function value: -3.867223
         Iterations: 8
         Function evaluations: 10
         Gradient evaluations: 10
         Hessian evaluations: 8
x_opt
array([1.88292613, 1.37658523])
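fmin_ncg wraps the Newton step in a line search; the bare iteration formula above can also be applied directly. A minimal sketch of the pure Newton update $x_{k+1} = x_k - H_f^{-1}(x_k)\nabla f(x_k)$, with a hand-coded gradient and Hessian for the same objective:

```python
import numpy as np

def grad(x):
    # gradient of (x1 - 1)**4 + 5*(x2 - 1)**2 - 2*x1*x2
    return np.array([4 * (x[0] - 1)**3 - 2 * x[1],
                     10 * (x[1] - 1) - 2 * x[0]])

def hess(x):
    # Hessian of the same function
    return np.array([[12 * (x[0] - 1)**2, -2.0],
                     [-2.0, 10.0]])

x = np.array([0.0, 0.0])
for _ in range(50):
    step = np.linalg.solve(hess(x), grad(x))  # solve H step = grad
    x = x - step
    if np.linalg.norm(step) < 1e-10:
        break

print(x)  # ≈ [1.8829, 1.3766], matching fmin_ncg above
```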
For such cases, several methods exist to numerically estimate the gradient, the Hessian, or both. Methods that approximate the Hessian are known as quasi-Newton methods, and there are also iterative methods that avoid the Hessian entirely. Two popular methods are the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method and the conjugate gradient method.
# BFGS with an analytically supplied gradient
x_opt = optimize.fmin_bfgs(f, (0, 0), fprime=fprime)
Optimization terminated successfully.
         Current function value: -3.867223
         Iterations: 9
         Function evaluations: 13
         Gradient evaluations: 13
# without fprime, the gradient is estimated by finite differences
# (note the larger number of function evaluations below)
x_opt = optimize.fmin_bfgs(f, (0, 0))
Optimization terminated successfully.
         Current function value: -3.867223
         Iterations: 9
         Function evaluations: 39
         Gradient evaluations: 13
x_opt
array([1.88292644, 1.37658595])
In general, the BFGS method is a good first approach to try, in particular if neither the gradient nor the Hessian is known. If both the gradient and the Hessian are known, then Newton's method generally converges fastest.
The multivariate optimization methods discussed so far all converge to a local minimum. For problems with many local minima, the solver can easily get stuck in a local minimum even though a global minimum exists.
def f(X):
    x, y = X
    return (4 * np.sin(np.pi * x) + 6 * np.sin(np.pi * y) +
            (x - 1)**2 + (y - 1)**2)
# coarse grid search for a good initial point
x_start = optimize.brute(f, (slice(-3, 5, 0.5),
slice(-3, 5, 0.5)), finish=None)
x_start
array([1.5, 1.5])
f(x_start)
x_opt = optimize.fmin_bfgs(f, x_start)
Optimization terminated successfully.
         Current function value: -9.520229
         Iterations: 4
         Function evaluations: 21
         Gradient evaluations: 7
x_opt
array([1.47586906, 1.48365787])
result = optimize.minimize(f, x_start, method='BFGS')
x_opt = result.x
x_opt
array([1.47586906, 1.48365787])
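The two-stage strategy above (coarse grid search, then local refinement) can be wrapped in a small helper; a sketch (the helper name is our own):

```python
import numpy as np
from scipy import optimize

def f(X):
    x, y = X
    return (4 * np.sin(np.pi * x) + 6 * np.sin(np.pi * y) +
            (x - 1)**2 + (y - 1)**2)

def global_then_local(f, ranges):
    """Coarse brute-force grid search followed by local BFGS refinement."""
    x0 = optimize.brute(f, ranges, finish=None)  # best point on the grid
    return optimize.minimize(f, x0, method='BFGS')

res = global_then_local(f, (slice(-3, 5, 0.5), slice(-3, 5, 0.5)))
print(res.x, res.fun)  # ≈ [1.4759, 1.4837], -9.520229 as above
```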
We omit nonlinear least squares problems here.
def f(X):
    x, y = X
    return (x - 1)**2 + (y - 1)**2
bnd_x1, bnd_x2 = (2, 3), (0, 2)
x_cons_opt = optimize.minimize(f, [1, 1],
method='L-BFGS-B',
bounds=[bnd_x1, bnd_x2]).x
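The unconstrained minimum at (1, 1) is infeasible under these bounds, so the solution lies on the boundary of the box. Rerunning the call and inspecting the result (a sketch):

```python
import numpy as np
from scipy import optimize

def f(X):
    x, y = X
    return (x - 1)**2 + (y - 1)**2

x_cons_opt = optimize.minimize(f, [1, 1], method='L-BFGS-B',
                               bounds=[(2, 3), (0, 2)]).x
print(x_cons_opt)  # ≈ [2., 1.]: x sits at its lower bound, y is interior
```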
Sequential least squares programming (SLSQP) can handle general nonlinear constraints. As an example, maximize
$$ f(x) = x_0 x_1 x_2 $$
subject to the equality constraint
$$ g(x) = 2(x_0 x_1 + x_1 x_2 + x_2 x_0) - 1 = 0 $$
(the volume of a box with unit surface area).

def f(X):
    # negate so that minimizing f maximizes the volume
    return -X[0] * X[1] * X[2]

def g(X):
    return 2 * (X[0] * X[1] + X[1] * X[2] + X[2] * X[0]) - 1
constraint = dict(type='eq', fun=g)
result = optimize.minimize(f, [0.5, 1, 1.5],
method='SLSQP',
constraints=[constraint])
result
     fun: -0.06804136862287297
     jac: array([-0.16666925, -0.16666542, -0.16666526])
 message: 'Optimization terminated successfully'
    nfev: 77
     nit: 18
    njev: 18
  status: 0
 success: True
       x: array([0.40824188, 0.40825127, 0.40825165])
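The optimum agrees with the analytical answer: with unit surface area, the volume-maximizing box is a cube with side $1/\sqrt{6} \approx 0.40825$. A quick check of this claim:

```python
import numpy as np

side = 1 / np.sqrt(6)
print(side)            # ≈ 0.40824829, matching x above
print(6 * side**2)     # surface area 2*(s*s)*3 = 1
print(-side**3)        # objective value ≈ -0.068041, matching fun above
```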
def f(X):
    return (X[0] - 1)**2 + (X[1] - 1)**2

def g(X):
    return X[1] - 1.75 - (X[0] - 0.75)**4
constraints = [dict(type='ineq', fun=g)]
x_cons_opt = optimize.minimize(f, (0, 0),
method='SLSQP',
constraints=constraints)
x_cons_opt
     fun: 0.566916419814267
     jac: array([-0.06284686,  1.50456505])
 message: 'Optimization terminated successfully'
    nfev: 25
     nit: 8
    njev: 8
  status: 0
 success: True
       x: array([0.96857656, 1.75228252])
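At the reported solution the inequality constraint is active, i.e. $g(x^*) \approx 0$ and the optimum sits on the constraint boundary. A sketch verifying this by rerunning the problem:

```python
import numpy as np
from scipy import optimize

def f(X):
    return (X[0] - 1)**2 + (X[1] - 1)**2

def g(X):
    return X[1] - 1.75 - (X[0] - 0.75)**4

res = optimize.minimize(f, (0, 0), method='SLSQP',
                        constraints=[dict(type='ineq', fun=g)])
print(res.x)     # ≈ [0.9686, 1.7523]
print(g(res.x))  # ≈ 0: the constraint is active at the optimum
```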
Linear programming problems can be solved with the simplex method or with interior-point methods. The standard form is

$$ \min_x\; c^T x \quad \text{s.t.}\quad Ax \leq b, $$

which is the form accepted by cvxopt.solvers.lp (bounds such as $x \geq 0$ are not imposed automatically; they would be added as extra inequality rows).

c = np.array([-1.0, 2.0, -3.0])
A = np.array([[ 1.0, 1.0, 0.0],
[-1.0, 3.0, 0.0],
[ 0.0, -1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
A_ = cvxopt.matrix(A)
b_ = cvxopt.matrix(b)
c_ = cvxopt.matrix(c)
sol = cvxopt.solvers.lp(c_, A_, b_)
Optimal solution found.
sol
{'x': <3x1 matrix, tc='d'>, 'y': <0x1 matrix, tc='d'>, 's': <3x1 matrix, tc='d'>, 'z': <3x1 matrix, tc='d'>, 'status': 'optimal', 'gap': 0.0, 'relative gap': 0.0, 'primal objective': -10.0, 'dual objective': -10.0, 'primal infeasibility': 0.0, 'primal slack': -0.0, 'dual slack': 0.0, 'dual infeasibility': 1.4835979218054372e-16, 'residual as primal infeasibility certificate': None, 'residual as dual infeasibility certificate': None, 'iterations': 0}
x = np.array(sol['x'])
x
array([[0.25], [0.75], [3.75]])
sol['primal objective']
-10.0
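The same LP can be cross-checked with scipy.optimize.linprog. Note that linprog applies $x \geq 0$ bounds by default, and this LP's optimal set is an edge rather than a single vertex, so the returned point may differ from cvxopt's; the optimal value is -10 either way. A sketch:

```python
import numpy as np
from scipy import optimize

c = np.array([-1.0, 2.0, -3.0])
A = np.array([[ 1.0, 1.0, 0.0],
              [-1.0, 3.0, 0.0],
              [ 0.0, -1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

res = optimize.linprog(c, A_ub=A, b_ub=b)  # default bounds: x >= 0
print(res.fun)  # -10.0, matching cvxopt's primal objective
```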