We consider a one-dimensional cart position tracking problem, see Faragher (2012).
The hidden states are the position $z_t$ and velocity $\dot z_t$. We can apply an external acceleration/braking force $u_t$. (Noisy) observations are represented by $x_t$.
The equations of motion are given by
$$\begin{align*} \begin{bmatrix} z_t \\ \dot z_t \end{bmatrix} &= \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} z_{t-1} \\ \dot z_{t-1} \end{bmatrix} + \begin{bmatrix} \Delta t^2/2 \\ \Delta t \end{bmatrix} u_t + w_t\,, \qquad w_t \sim \mathcal{N}(0,\Sigma_z) \\ x_t &= \begin{bmatrix} z_t \\ \dot z_t \end{bmatrix} + v_t\,, \qquad v_t \sim \mathcal{N}(0,\Sigma_x) \end{align*}$$
with process noise $w_t$ and measurement noise $v_t$.
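To make the generative process concrete, here is a minimal sketch of a single simulation step (Distributions.jl is assumed; in the notebook itself, the helper generateNoisyMeasurements included below plays this role):
using Distributions
# One step of the cart dynamics; z holds [position; velocity].
# Sketch only: simulate_cart_step is our own illustrative name, not a lesson helper.
function simulate_cart_step(z, u, A, b, Σz, Σx)
    z_next = A * z + b * u + rand(MvNormal(zeros(2), Σz))  # noisy state transition
    x_next = z_next + rand(MvNormal(zeros(2), Σx))         # noisy measurement of the state
    return z_next, x_next
end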
In this lesson, we consider models where the sequence order of observations matters.
Consider the ordered observation sequence $x^T \triangleq \left(x_1,x_2,\ldots,x_T\right)$. We wish to develop a generative model $p(x^T)$ that 'explains' the time series $x^T$.
The chain rule (a.k.a. the general product rule) provides the factorization
$$p(x^T) = p(x_1) \prod_{t=2}^{T} p(x_t\,|\,x^{t-1})\,,$$
which is true for any model $p(x^T)$.
A $K$-th order Markov model
$$p(x^T) = p(x^K) \prod_{t=K+1}^{T} p(x_t\,|\,x_{t-1},\ldots,x_{t-K})$$
limits the dependencies to the past $K$ samples.
In a hidden Markov model, the discrete hidden states $z_t$ evolve by a first-order Markov chain with transition probabilities $p(z_t\,|\,z_{t-1})$, which is usually accompanied by an initial state distribution $p(z_{1k}=1) = \pi_k$.
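To illustrate, a minimal sketch of ancestral sampling from a two-state hidden Markov model with Gaussian emissions (all parameter values below are made up for illustration; Distributions.jl is assumed):
using Distributions
# A hypothetical 2-state HMM with Gaussian emissions; all numbers are made up.
function sample_hmm(n)
    π0 = [0.7, 0.3]          # initial state distribution p(z_{1k}=1) = π_k
    P  = [0.9 0.1; 0.2 0.8]  # transition matrix; row i holds p(z_t | z_{t-1}=i)
    μ  = [-1.0, 1.0]         # emission means: x_t | (z_t = k) ~ N(μ[k], 1)
    z  = rand(Categorical(π0))             # sample the initial hidden state
    x  = Vector{Float64}(undef, n)
    for t in 1:n
        x[t] = rand(Normal(μ[z], 1.0))     # emit an observation from the current state
        z    = rand(Categorical(P[z, :]))  # transition to the next hidden state
    end
    return x
end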
As we have seen, inference tasks in linear Gaussian state space models can be analytically solved.
However, these derivations quickly become cumbersome and prone to errors.
Alternatively, we could specify the generative model in a (Forney-style) factor graph and use automated message passing to infer the posterior over the hidden variables. Some examples follow.
Filtering, a.k.a. state estimation: estimating the state $z_t$ at time step $t$, based on past and current observations $x^t$.
For a scalar linear Gaussian state-space model $z_t = a z_{t-1} + \mathcal{N}(0,\sigma_z^2)$ with observations $x_t = c z_t + \mathcal{N}(0,\sigma_x^2)$, the filtering posterior follows from
$$p(z_t\,|\,x^t) \propto p(x_t\,|\,z_t) \int p(z_t\,|\,z_{t-1})\, p(z_{t-1}\,|\,x^{t-1})\, \mathrm{d}z_{t-1}\,,$$
where we assumed that the previous state estimate is given by $$\begin{align*} p(z_{t-1}\,|\,x^{t-1}) = \mathcal{N}(z_{t-1} \,|\, \mu_{t-1}, \sigma_{t-1}^2) \qquad \text{(prior)} \end{align*}$$
The derivation makes use of the Gaussian rescaling identity (for $c > 0$)
$$\mathcal{N}(x\,|\,cz,\sigma^2) = \frac{1}{c}\mathcal{N}\left(z \,\middle|\,\frac{x}{c},\left(\frac{\sigma}{c}\right)^2\right)\,.$$
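A quick numeric sanity check of this identity (the specific numbers are arbitrary; Distributions.jl is assumed):
using Distributions
# Check N(x | c·z, σ²) == (1/c)·N(z | x/c, (σ/c)²) at an arbitrary point
c, σ, z, x = 2.0, 0.5, 1.3, 2.9
lhs = pdf(Normal(c * z, σ), x)
rhs = (1 / c) * pdf(Normal(x / c, σ / c), z)
@assert isapprox(lhs, rhs)  # both sides evaluate to the same density value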
The resulting posterior is again Gaussian, $p(z_t\,|\,x^t) = \mathcal{N}(z_t \,|\, \mu_t, \sigma_t^2)$, where the posterior mean $\mu_t$ and posterior variance $\sigma^2_t$ can be evaluated as follows: $$\begin{align*} \rho_t^2 &= a^2 \sigma_{t-1}^2 + \sigma_z^2 \qquad &&\text{(predicted variance)}\\ K_t &= \frac{c \rho_t^2}{c^2 \rho_t^2 + \sigma_x^2} \qquad &&\text{(Kalman gain)} \\ \mu_t &= a \mu_{t-1} + K_t \cdot \left( x_t - c a \mu_{t-1}\right) \qquad &&\text{(posterior mean)}\\ \sigma_t^2 &= \left( 1 - c\cdot K_t \right) \rho_t^2 \qquad &&\text{(posterior variance)} \end{align*}$$
Kalman filtering consists of evaluating these four update equations for each new observation $x_t$. This is a very efficient recursive algorithm to estimate the state $z_t$ from all observations up to time $t$.
The above derivation also evaluates the "instant" evidence $$p(x_t|x^{t-1}) = \mathcal{N}\left(x_t \,|\, ca \mu_{t-1}, \sigma_x^2 + c^2(\sigma_z^2+a^2\sigma_{t-1}^2) \right)$$
Note that, for observed $x^t$, the evidence $p(x_t\,|\,x^{t-1})$ is a scalar that scores how well the model predicts the current observation $x_t$ from the past observations $x^{t-1}$.
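As a sanity check on these equations, here is a minimal sketch of one recursive update that transcribes the four update equations plus the instant evidence (the function kalman_step_1d and its signature are our own, not from the lesson's scripts; Distributions.jl is assumed for the evidence evaluation):
using Distributions
# One scalar Kalman filter step for z_t = a·z_{t-1} + N(0,σz²), x_t = c·z_t + N(0,σx²).
function kalman_step_1d(μ_prev, σ²_prev, xt; a, c, σz², σx²)
    ρ²t = a^2 * σ²_prev + σz²                      # predicted variance
    Kt  = c * ρ²t / (c^2 * ρ²t + σx²)              # Kalman gain
    μt  = a * μ_prev + Kt * (xt - c * a * μ_prev)  # posterior mean
    σ²t = (1 - c * Kt) * ρ²t                       # posterior variance
    # instant evidence p(x_t | x^{t-1}) = N(x_t | c·a·μ_{t-1}, σx² + c²·ρ²t)
    evidence = pdf(Normal(c * a * μ_prev, sqrt(σx² + c^2 * ρ²t)), xt)
    return μt, σ²t, evidence
end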
Exam guide: the above derivation of the Kalman filter is too long and error-prone to reproduce at an exam. You should be able to follow the derivation in detail, but you will not be asked to reproduce it in full without some guidance. The complexity of the derivation underlines why inference should be automated by a toolbox (like RxInfer).
For the multivariate case, with state transition matrix $A$, process noise covariance $\Gamma$, observation matrix $C$, and observation noise covariance $\Sigma$, the Kalman filter update equations for the posterior $p(z_t |x^t) = \mathcal{N}\left(z_t \bigm| \mu_t, V_t \right)$ are given by (see Bishop, pg. 639) $$\begin{align*} P_t &= A V_{t-1} A^T + \Gamma \qquad &&\text{(predicted variance)}\\ K_t &= P_t C^T \cdot \left(C P_t C^T + \Sigma \right)^{-1} \qquad &&\text{(Kalman gain)} \\ \mu_t &= A \mu_{t-1} + K_t\cdot\left(x_t - C A \mu_{t-1} \right) \qquad &&\text{(posterior state mean)}\\ V_t &= \left(I-K_t C \right) P_{t} \qquad &&\text{(posterior state variance)} \end{align*}$$
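The Julia loop further below implements exactly these equations for the cart model, specialized to $C = I$ and with the control input $b u_t$ added to the predicted mean. For reference, a hedged sketch of the general update step (kalman_step_mv is our own illustrative helper, not part of cart_tracking_helpers.jl):
using LinearAlgebra
# One multivariate Kalman filter step in the notation above (no control input).
function kalman_step_mv(μ_prev, V_prev, x; A, C, Γ, Σ)
    P = A * V_prev * A' + Γ                    # predicted variance
    K = (P * C') / (C * P * C' + Σ)            # Kalman gain (right division avoids inv)
    μ = A * μ_prev + K * (x - C * A * μ_prev)  # posterior state mean
    V = (I - K * C) * P                        # posterior state variance
    return μ, V
end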
using Pkg; Pkg.activate("../."); Pkg.instantiate();
using IJulia; try IJulia.clear_output(); catch _ end
using RxInfer, LinearAlgebra, Plots
include("scripts/cart_tracking_helpers.jl")
# Specify the model parameters
Δt = 1.0 # assume the time steps to be equal in size
A = [1.0 Δt;
0.0 1.0]
b = [0.5*Δt^2; Δt]
Σz = convert(Matrix,Diagonal([0.2*Δt; 0.1*Δt])) # process noise covariance
Σx = convert(Matrix, Diagonal([1.0; 2.0]));  # observation noise covariance
# Generate noisy observations
n = 10 # perform 10 timesteps
z_start = [10.0; 2.0] # initial state
u = 0.2 * ones(n) # constant input u
noisy_x = generateNoisyMeasurements(z_start, u, A, b, Σz, Σx);
m_z = noisy_x[1] # initial predictive mean
V_z = A * (1e8*Diagonal(I,2) * A') + Σz # initial predictive covariance
for t = 2:n
global m_z, V_z, m_pred_z, V_pred_z
#predict
m_pred_z = A * m_z + b * u[t] # predictive mean
V_pred_z = A * V_z * A' + Σz # predictive covariance
#update
gain = V_pred_z * inv(V_pred_z + Σx) # Kalman gain
m_z = m_pred_z + gain * (noisy_x[t] - m_pred_z) # posterior mean update
V_z = (Diagonal(I,2)-gain)*V_pred_z # posterior covariance update
end
println("Prediction: ",MvNormalMeanCovariance(m_pred_z,V_pred_z))
println("Measurement: ", MvNormalMeanCovariance(noisy_x[n],Σx))
println("Posterior: ", MvNormalMeanCovariance(m_z,V_z))
plotCartPrediction(m_pred_z[1], V_pred_z[1], m_z[1], V_z[1], noisy_x[n][1], Σx[1][1])
Prediction: MvNormalMeanCovariance( μ: [41.547677453253606, 4.359437556520038] Σ: [1.29587873356633 0.39215729555837747; 0.39215729552375644 0.3415636711872436] )
Measurement: MvNormalMeanCovariance( μ: [40.28467983904122, 2.4288277653174806] Σ: [1.0 0.0; 0.0 2.0] )
Posterior: MvNormalMeanCovariance( μ: [40.70601642826498, 3.9366915877624713] Σ: [0.5516100295248194 0.1501897218006391; 0.15018972178737977 0.24143326064425966] )
@model function cart_tracking(n, A, b, Σz, Σx, z_prev_m_0, z_prev_v_0, u)
# We create constvar references for better efficiency
cA = constvar(A)
cB = constvar(b)
cΣz = constvar(Σz)
cΣx = constvar(Σx)
znodes = Vector{Any}(undef, n)
# `z` is a sequence of hidden states
z = randomvar(n)
# `x` is a sequence of "clamped" observations
x = datavar(Vector{Float64}, n)
z_prior ~ MvNormalMeanCovariance(z_prev_m_0, z_prev_v_0)
z_prev = z_prior
for i in 1:n
znodes[i],z[i] ~ MvNormalMeanCovariance(cA * z_prev + cB*u[i], cΣz)
x[i] ~ MvNormalMeanCovariance(z[i], cΣx)
z_prev = z[i]
end
return z, x, znodes
end
Now that we've built the model, we can perform Kalman filtering by inserting measurement data into the model and performing message passing.
import RxInfer.ReactiveMP: messageout, getinterface, materialize!
import RxInfer.Rocket: getrecent
z_prev_m_0 = noisy_x[1]
z_prev_v_0 = A * (1e8*Diagonal(I,2) * A') + Σz ;
result = inference(model=cart_tracking(n, A, b, Σz, Σx, z_prev_m_0, z_prev_v_0, u), data=(x=noisy_x,), free_energy=true);
μz_posterior, Σz_posterior = mean.(result.posteriors[:z])[end], cov.(result.posteriors[:z])[end];
prediction_z_1 = messageout(getinterface(result.returnval[end][end], :out))
prediction = materialize!(getrecent(prediction_z_1));
println("Prediction: ",MvNormalMeanCovariance(mean(prediction), cov(prediction)))
println("Measurement: ", MvNormalMeanCovariance(noisy_x[n], Σx))
println("Posterior: ", MvNormalMeanCovariance(μz_posterior, Σz_posterior))
plotCartPrediction(mean(prediction)[1], cov(prediction)[1], μz_posterior[1], Σz_posterior[1], noisy_x[n][1], Σx[1][1])
Prediction: MvNormalMeanCovariance( μ: [41.508790206410616, 4.3605961539085225] Σ: [1.2934227334046857 0.3916229823498387; 0.3916229823498387 0.3414332606222485] )
Measurement: MvNormalMeanCovariance( μ: [40.28467983904122, 2.4288277653174806] Σ: [1.0 0.0; 0.0 2.0] )
Posterior: MvNormalMeanCovariance( μ: [40.6890959466951, 3.9436176815844357] Σ: [0.551150997075792 0.1501469959500068; 0.1501469959500068 0.24141815274489328] )