
Bayesian Machine Learning
 [1] (#) (a) Briefly explain the relation between machine learning and Bayes rule.
(b) How are Maximum a Posteriori (MAP) and Maximum Likelihood (ML) estimation related to Bayes rule and machine learning?
 [2] (#) What are the four stages of the Bayesian design approach?
 [3] (##) The Bayes estimate is a summary of a posterior distribution by a delta distribution on its mean, i.e.,
$$
\hat \theta_{bayes} = \int \theta \, p\left( \theta \,|\, D \right)
\,\mathrm{d}{\theta}
$$
Prove that the Bayes estimate minimizes the expected mean-squared error, i.e., prove that
$$
\hat \theta_{bayes} = \arg\min_{\hat \theta} \int_\theta (\hat \theta - \theta)^2 \, p \left( \theta \,|\, D \right) \,\mathrm{d}{\theta}
$$
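A numerical sanity check can help build intuition before attempting the proof. The sketch below is not a proof. It discretizes a hypothetical posterior (an unnormalized Beta(3, 5)-shaped density, chosen only for illustration and not taken from the exercise), evaluates the expected squared error for a range of candidate estimates, and compares the minimizer with the posterior mean. The names `theta`, `weights` and `candidates` are assumptions of this sketch.

```python
import numpy as np

# Hypothetical posterior on a grid: an unnormalized Beta(3, 5)-shaped density.
theta = np.linspace(0.0, 1.0, 2001)
density = theta**2 * (1.0 - theta)**4
weights = density / density.sum()  # discrete stand-in for p(theta | D) d(theta)

# Posterior mean, i.e. the Bayes estimate as defined above.
posterior_mean = float(np.sum(theta * weights))


def expected_squared_error(theta_hat):
    """Discrete approximation of the integral of (theta_hat - theta)^2 p(theta | D)."""
    return float(np.sum((theta_hat - theta) ** 2 * weights))


# Brute-force search over candidate estimates.
candidates = np.linspace(0.0, 1.0, 1001)
risks = np.array([expected_squared_error(c) for c in candidates])
best_candidate = float(candidates[np.argmin(risks)])

print("posterior mean:", posterior_mean)
print("minimizer of expected squared error:", best_candidate)
```

The two printed values should agree up to the resolution of the candidate grid.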
 [4] (###) We make $N$ IID observations $D=\{x_1, \dots, x_N\}$ and assume the following model
$$
x_k = A + \epsilon_k
$$
where $\epsilon_k \sim \mathcal{N}(0,\sigma^2)$ with known $\sigma^2=1$. We are interested in deriving an estimator for $A$.
(a) Make a reasonable assumption for a prior on $A$ and derive a Bayesian (posterior) estimate.
(b) (##) Derive the Maximum Likelihood estimate for $A$.
(c) Derive the MAP estimate for $A$.
(d) Now assume that we do not know the variance of the noise term. Describe the procedure for Bayesian estimation of both $A$ and $\sigma^2$ (there is no need to work out closed-form estimates).
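Derived answers to parts (a) to (c) can be checked numerically with a grid approximation. The sketch below simulates data from the model with a made-up true value `A_true`, assumes a Gaussian prior $A \sim \mathcal{N}(0, 10^2)$ purely for illustration (the exercise leaves the prior open), and reads the ML, MAP and posterior-mean estimates off the grid. It does not cover part (d).

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data from x_k = A + eps_k with sigma^2 = 1 (A_true is a made-up value).
A_true, N = 2.0, 20
x = A_true + rng.normal(0.0, 1.0, size=N)

# Assumed prior, for illustration only: A ~ N(0, 10^2).
prior_mean, prior_var = 0.0, 100.0

# Log-likelihood and log-prior on a grid of candidate values for A
# (additive constants are dropped since they do not affect the estimates).
A_grid = np.linspace(-5.0, 5.0, 10001)
log_lik = -0.5 * np.sum((x[:, None] - A_grid[None, :]) ** 2, axis=0)
log_prior = -0.5 * (A_grid - prior_mean) ** 2 / prior_var
log_post = log_lik + log_prior

# Normalize the posterior over the grid.
post = np.exp(log_post - log_post.max())
post /= post.sum()

A_ml = float(A_grid[np.argmax(log_lik)])    # maximizer of the likelihood
A_map = float(A_grid[np.argmax(log_post)])  # maximizer of the posterior
A_bayes = float(np.sum(A_grid * post))      # posterior mean

print("ML:", A_ml, "MAP:", A_map, "posterior mean:", A_bayes)
```

Changing `prior_var` or `N` and rerunning shows how strongly the assumed prior pulls the MAP and posterior-mean estimates away from the ML estimate.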
 [5] (##) We consider the coin toss example from the notebook and use a conjugate prior for a Bernoulli likelihood function.
(a) Derive the Maximum Likelihood estimate.
(b) Derive the MAP estimate.
(c) Do these two estimates ever coincide (and if so, under what circumstances)?
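As with the previous exercise, a grid search offers a quick way to check derived expressions. The sketch below assumes made-up counts (7 heads in 10 tosses) and a Beta(2, 2) prior. Neither comes from the exercise, and the variable names are hypothetical.

```python
import numpy as np

# Made-up data and prior hyperparameters, for illustration only.
n_heads, n_tosses = 7, 10
alpha, beta = 2.0, 2.0

# Grid over the coin bias mu, excluding the endpoints to avoid log(0).
mu = np.linspace(0.001, 0.999, 999)

# Bernoulli log-likelihood and Beta log-prior, up to additive constants.
log_lik = n_heads * np.log(mu) + (n_tosses - n_heads) * np.log(1.0 - mu)
log_prior = (alpha - 1.0) * np.log(mu) + (beta - 1.0) * np.log(1.0 - mu)

mu_ml = float(mu[np.argmax(log_lik)])               # maximizer of the likelihood
mu_map = float(mu[np.argmax(log_lik + log_prior)])  # maximizer of the posterior

print("ML:", mu_ml, "MAP:", mu_map)
```

Varying `alpha`, `beta` and the counts, then rerunning, is one way to explore part (c) numerically before arguing it analytically.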
 [6] (###) We are given a single observation $x_0$ from a uniform distribution $\mathrm{Unif}[0,1/\theta]$, where $\theta > 0$.
(a) Show that $\mathbb{E}[g(x_0)] = \theta$ if and only if $\int_0^{1/\theta} g(u) \,\mathrm{d}u = 1$.
(b) Show that there is no function $g$ that satisfies the condition for all $\theta > 0$.
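The quantities in part (a) can be explored numerically. The sketch below picks a hypothetical function $g(u) = 2u$ and a made-up value $\theta = 0.5$, estimates $\mathbb{E}[g(x_0)]$ by Monte Carlo sampling, and compares it with $\theta \int_0^{1/\theta} g(u) \,\mathrm{d}u$ computed with a midpoint rule. It illustrates the relation for one choice of $g$ and $\theta$ only and does not replace the proof.

```python
import numpy as np


def g(u):
    # Hypothetical choice of g, used only for illustration.
    return 2.0 * u


rng = np.random.default_rng(1)
theta = 0.5  # made-up parameter value

# Monte Carlo estimate of E[g(x_0)] with x_0 ~ Unif[0, 1/theta].
samples = rng.uniform(0.0, 1.0 / theta, size=200_000)
monte_carlo = float(np.mean(g(samples)))

# theta times the integral of g over [0, 1/theta], via a midpoint rule.
n_bins = 100_000
width = (1.0 / theta) / n_bins
midpoints = (np.arange(n_bins) + 0.5) * width
quadrature = float(theta * np.sum(g(midpoints)) * width)

print("Monte Carlo:", monte_carlo, "quadrature:", quadrature)
```

Rerunning with other values of `theta` shows how the integral of a fixed `g` changes with the upper limit $1/\theta$, which is the tension that part (b) asks about.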