where $k=1,2,\ldots,n$ is the time step counter; $z_k$ is an unobserved state sequence; $x_k$ is an observed sequence; $w_k \sim \mathcal{N}(0,\Sigma_w)$ and $v_k \sim \mathcal{N}(0,\Sigma_v)$ are (unobserved) state and observation noise sequences respectively; $z_0 \sim \mathcal{N}(0,\Sigma_0)$ is the initial state and $A$, $C$, $\Sigma_v$,$\Sigma_w$ and $\Sigma_0$ are known parameters. The Forney-style factor graph (FFG) for one time step is depicted here:
(a) Rewrite the state-space equations as a set of conditional probability distributions. $$\begin{align*} p(z_k|z_{k-1},A,\Sigma_w) &= \ldots \\ p(x_k|z_k,C,\Sigma_v) &= \ldots \\ p(z_0|\Sigma_0) &= \ldots \end{align*}$$ (b) Define $z^n \triangleq (z_0,z_1,\ldots,z_n)$, $x^n \triangleq (x_1,\ldots,x_n)$ and $\theta=\{A,C,\Sigma_w,\Sigma_v\}$. Now write out the generative model $p(x^n,z^n|\theta)$ as a product of factors. (c) We are interested in estimating $z_k$ from a given estimate for $z_{k-1}$ and the current observation $x_k$, i.e., we are interested in computing $p(z_k|z_{k-1},x_k,\theta)$. Can $p(z_k|z_{k-1},x_k,\theta)$ be expressed as a Gaussian distribution? Explain why or why not in one sentence. (d) Copy the graph onto your exam paper and draw the message passing schedule for computing $p(z_k|z_{k-1},x_k,\theta)$ by drawing arrows in the factor graph. Indicate the order of the messages by assigning numbers to the arrows. (e) Now assume that our belief about parameter $\Sigma_v$ is instead given by a distribution $p(\Sigma_v)$ (rather than a known value). Adapt the factor graph drawing of the previous answer to reflects our belief about $\Sigma_v$.