In [1]:

```
import addutils.toc ; addutils.toc.js(ipy_notebook=True)
```

Out[1]:

In [2]:

```
import bokeh.plotting as bk
from addutils import css_notebook
css_notebook()
```

Out[2]:

In [3]:

```
bk.output_notebook()
```

The scale of an optimization problem is pretty much set by the dimensionality of the problem, i.e. the number of scalar variables on which the search is performed.

Convex functions can be optimized easily with gradient-descend algorithms. Non-convex functions optimization problems are much more difficult, expecially for higher dimensionality problems. This second class of optimization problems are not analyzed in this brief tutorial but are part of a specialised course.

Convex function

Non Convex Function

In a smooth problem the gradient is defined everywhere: this helps in speeding-up the optimization algorithms.

Smooth function

Non Smooth Function

The `scipy.optimize`

module provides useful algorithms for function minimization (scalar or multi-dimensional), curve fitting and root finding.

Let's find the minimum of this simple smooth non-convex function:

In [4]:

```
import numpy as np
```

In [5]:

```
from scipy import optimize
import numpy as np
def f(x):
return x**2 + 10*np.sin(x)
x = np.arange(-10, 10, 0.1)
fig = bk.figure()
fig.plot_height = 600
fig.plot_width = 600
fig.line(x, f(x))
bk.show(fig)
```

This function has a global minimum around -1.3 and a local minimum around 3.8.

The general and efficient way to find a minimum for this function is to conduct a gradient descent starting from a given initial point. The BFGS (quasi-Newton method of Broyden, Fletcher, Goldfarb, and Shanno) algorithm is a good way of doing this:

In [6]:

```
optimize.fmin_bfgs(f, 0)
```

Out[6]:

A possible issue with this approach is that, if the function has local minima, the algorithm may find these local minima instead of the global minimum. `scipy.optimize.anneal()`

provides an alternative, using simulated annealing. More efficient algorithms for different classes of global optimization problems exist, but this is out of the scope of scipy. Some useful packages for global optimization are OpenOpt, PyGMO and PyEvolve.

To find a root, i.e. a point where `f(x) = 0`

, of the function f above we can use for example `scipy.optimize.fsolve()`

:

In [7]:

```
root = optimize.fsolve(f, 1)
root
```

Out[7]:

Note that only one root is found. Inspecting the plot of f reveals that there is a second root around -2.5. We find the exact value of it by adjusting our initial guess:

In [8]:

```
root2 = optimize.fsolve(f, -2.5)
root2
```

Out[8]:

Suppose we have data sampled from `f`

with some noise:

In [9]:

```
xdata = np.linspace(-10, 10, num=20)
ydata = f(xdata) + np.random.randn(xdata.size)
```

Now if we know the functional form of the function from which the samples were drawn (`x^2 + sin(x)`

in this case) but not the amplitudes of the terms, we can find those by least squares curve fitting. First we have to define the function to fit:

In [10]:

```
def f2(x, a, b):
return a*x**2 + b*np.sin(x)
```

Then we can use `scipy.optimize.curve_fit()`

to find `a`

and `b`

:

In [11]:

```
guess = [2, 2]
params, params_covariance = optimize.curve_fit(f2, xdata, ydata, guess)
grid = (-10, 10, 0.1)
xmin_global = optimize.brute(f, (grid,))
xmin_local = optimize.fminbound(f, 0, 10)
fig = bk.figure(title=None)
fig.line(x, f(x), color='blue', line_width=2, legend="f(x)")
fig.line(x, f2(x, *params), color='red', line_width=2, line_dash=[5], legend="Curve fit result")
xmins = np.array([xmin_global[0], xmin_local])
fig.circle(xmins, f(xmins), size=9, color='green', legend="Minima")
roots = np.array([root[0], root2[0]])
fig.triangle(roots, f(roots), size=9, color="black", legend="Roots")
fig.xaxis[0].axis_label = 'x'
fig.yaxis[0].axis_label = 'f(x)'
bk.show(fig)
```

Box bounds correspond to limiting each of the individual parameters of the optimization. Note that some problems that are not originally written as box bounds can be rewritten as such be a change of variables.

`scipy.optimize.fminbound()`

for 1D-optimization`scipy.optimize.fmin_l_bfgs_b()`

a quasi-Newton method with bound constraints

In [12]:

```
import warnings
warnings.filterwarnings('ignore', category=UnicodeWarning)
```

In [13]:

```
%matplotlib inline
import matplotlib.pyplot as plt
x, y = np.mgrid[-2.9:5.8:.05, -2.5:5:.05]
x = x.T
y = y.T
for i in (1, 2):
# Create 2 figure: only the second one will have the optimization
# path
plt.figure(i, figsize=(3, 2.5))
plt.clf()
plt.axes([0, 0, 1, 1])
contours = plt.contour(np.sqrt((x - 3)**2 + (y - 2)**2),
extent=[-3, 6, -2.5, 5], cmap=plt.cm.gnuplot)
plt.clabel(contours, inline=1, fmt='%1.1f', fontsize=14)
plt.plot([-1.5, -1.5, 1.5, 1.5, -1.5],
[-1.5, 1.5, 1.5, -1.5, -1.5], 'k', linewidth=2)
plt.fill_between([ -1.5, 1.5],
[ -1.5, -1.5],
[ 1.5, 1.5],
color='.8')
plt.axvline(0, color='k')
plt.axhline(0, color='k')
plt.text(-.9, 4.4, '$x_2$', size=20)
plt.text(5.6, -.6, '$x_1$', size=20)
plt.axis('equal')
plt.axis('off')
# And now plot the optimization path
accumulator = list()
def f(x):
# Store the list of function calls
accumulator.append(x)
return np.sqrt((x[0] - 3)**2 + (x[1] - 2)**2)
# We don't use the gradient, as with the gradient, L-BFGS is too fast,
# and finds the optimum without showing us a pretty path
def f_prime(x):
r = np.sqrt((x[0] - 3)**2 + (x[0] - 2)**2)
return np.array(((x[0] - 3)/r, (x[0] - 2)/r))
optimize.fmin_l_bfgs_b(f, np.array([0, 0]), approx_grad=1,
bounds=((-1.5, 1.5), (-1.5, 1.5)))
accumulated = np.array(accumulator)
plt.plot(accumulated[:, 0], accumulated[:, 1])
```

Out[13]:

Visit www.add-for.com for more tutorials and updates.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.