%matplotlib inline
#format the book
import book_format
book_format.set_style()
In the last chapter we discussed the difficulties that nonlinear systems pose. This nonlinearity can appear in two places. It can be in our measurements, such as a radar that is measuring the slant range to an object. Slant range requires you to take a square root to compute the x,y coordinates:
$$x=\sqrt{\text{slant}^2 - \text{altitude}^2}$$The nonlinearity can also occur in the process model - we may be tracking a ball traveling through the air, where the effects of air drag lead to nonlinear behavior. The standard Kalman filter performs poorly or not at all with these sorts of problems.
In the last chapter I showed you a plot like this. I have altered the equation somewhat to emphasize the effects of nonlinearity.
from kf_book.book_plots import set_figsize, figsize
import matplotlib.pyplot as plt
from kf_book.nonlinear_plots import plot_nonlinear_func
from numpy.random import normal
import numpy as np
# create 500,000 samples with mean 0, std 1
gaussian = (0., 1.)
data = normal(loc=gaussian[0], scale=gaussian[1], size=500000)
def f(x):
return (np.cos(4*(x/2 + 0.7))) - 1.3*x
plot_nonlinear_func(data, f)
I generated this by taking 500,000 samples from the input, passing it through the nonlinear transform, and building a histogram of the result. We call these points sigma points. From the output histogram we can compute a mean and standard deviation which would give us an updated, albeit approximated Gaussian.
Let me show you a scatter plot of the data before and after being passed through f(x)
.
N = 30000
plt.subplot(121)
plt.scatter(data[:N], range(N), alpha=.2, s=1)
plt.title('Input')
plt.subplot(122)
plt.title('Output')
plt.scatter(f(data[:N]), range(N), alpha=.2, s=1);
The data itself appears to be Gaussian, which it is. By that I mean it looks like white noise scattered around the mean zero. In contrast g(data)
has a defined structure. There are two bands, with a significant number of points in between. On the outside of the bands there are scattered points, but with many more on the negative side.
It has perhaps occurred to you that this sampling process constitutes a solution to our problem. Suppose for every update we generated 500,000 points, passed them through the function, and then computed the mean and variance of the result. This is called a Monte Carlo approach, and it used by some Kalman filter designs, such as the Ensemble filter and particle filter. Sampling requires no specialized knowledge, and does not require a closed form solution. No matter how nonlinear or poorly behaved the function is, as long as we sample with enough sigma points we will build an accurate output distribution.
"Enough points" is the rub. The graph above was created with 500,000 sigma points, and the output is still not smooth. What's worse, this is only for 1 dimension. The number of points required increases by the power of the number of dimensions. If you only needed 500 points for 1 dimension, you'd need 500 squared, or 250,000 points for two dimensions, 500 cubed, or 125,000,000 points for three dimensions, and so on. So while this approach does work, it is very computationally expensive. Ensemble filters and particle filters use clever techniques to significantly reduce this dimensionality, but the computational burdens are still very large. The unscented Kalman filter uses sigma points but drastically reduces the amount of computation by using a deterministic method to choose the points.
Let's look at the problem in terms of a 2D covariance ellipse. I choose 2D merely because it is easy to plot; this extends to any number of dimensions. Assuming some arbitrary nonlinear function, we will take random points from the first covariance ellipse, pass them through the nonlinear function, and plot their new position. Then we can compute the mean and covariance of the transformed points, and use that as our estimate of the mean and probability distribution.
import kf_book.ukf_internal as ukf_internal
ukf_internal.show_2d_transform()
On the left we show an ellipse depicting the $1\sigma$ distribution of two state variables. The arrows show how several randomly sampled points might be transformed by some arbitrary nonlinear function to a new distribution. The ellipse on the right is drawn semi-transparently to indicate that it is an estimate of the mean and variance of this collection of points.
Let's write a function which passes 10,000 points randomly drawn from the Gaussian
$$\mu = \begin{bmatrix}0\\0\end{bmatrix}, \Sigma=\begin{bmatrix}32&15\\15&40\end{bmatrix}$$through the nonlinear system:
$$\begin{cases}\begin{aligned}\bar x&=x+y\\ \bar y&= 0.1x^2 + y^2\end{aligned} \end{cases}$$import numpy as np
from numpy.random import multivariate_normal
from kf_book.nonlinear_plots import plot_monte_carlo_mean
def f_nonlinear_xy(x, y):
return np.array([x + y, .1*x**2 + y*y])
mean = (0., 0.)
p = np.array([[32., 15.], [15., 40.]])
# Compute linearized mean
mean_fx = f_nonlinear_xy(*mean)
#generate random points
xs, ys = multivariate_normal(mean=mean, cov=p, size=10000).T
plot_monte_carlo_mean(xs, ys, f_nonlinear_xy, mean_fx, 'Linearized Mean');
Difference in mean x=-0.055, y=43.117