
$\newcommand{\vect}[1]{\boldsymbol{\mathbf{#1}}}$ The cost function is defined by: $$J(\vect x_u) = \sum _{i=1} ^n c_{ui} \left( p_{ui} - \vect x_u \vect y_i \right)^2 + \lambda \| \vect x_u \| _2 ^2$$ where $\vect x_u \in \mathbb R^{1 \times f}$ and $\vect y_i \in \mathbb R^{f \times 1}$.

The first term can be rewritten as $$\sum _{i=1} ^n c_{ui} \left( \vect p_u - \vect x_u \vect Y \right)_i \left( \vect p_u - \vect x_u \vect Y \right)_i$$ where $\vect p_u \in \mathbb R^{1 \times n}$ and $\vect Y \in \mathbb R^{f \times n}$, so that $\left( \vect p_u - \vect x_u \vect Y \right) \in \mathbb R^{1 \times n}$.

Using the definition of the dot product we can then rewrite the first term as $$\left( \vect p_u - \vect x_u \vect Y \right) \left[ \begin{array}{c} c_{u1}\left( \vect p_u - \vect x_u \vect Y \right)_1 \\ \vdots \\ c_{un}\left( \vect p_u - \vect x_u \vect Y \right)_n \end{array} \right]$$

Finally, by the definition of matrix multiplication, the first term becomes $$\left( \vect p_u - \vect x_u \vect Y \right) \vect C_u \left( \vect p_u - \vect x_u \vect Y \right)^T$$ where $\vect C_u = \left[ \begin{array}{ccc} c_{u1} & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & c_{un} \end{array} \right] \in \mathbb R^{n \times n}$ is the diagonal matrix of the weights $c_{ui}$.
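The equivalence of the weighted sum and the matrix form can be checked numerically. The following is a minimal sketch, assuming NumPy; the sizes `f` and `n`, the random data, and all variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
f, n = 3, 5                                   # hypothetical sizes: factors, items

x_u = rng.normal(size=(1, f))                 # row vector, shape (1, f)
Y = rng.normal(size=(f, n))                   # item factors, shape (f, n)
p_u = rng.integers(0, 2, size=(1, n)).astype(float)  # preferences, shape (1, n)
c_u = 1.0 + rng.random(n)                     # weights c_ui (assumed positive)
C_u = np.diag(c_u)                            # diagonal matrix, shape (n, n)

r = p_u - x_u @ Y                             # residual row vector, shape (1, n)

# First term written as the weighted sum over items.
sum_form = float(np.sum(c_u * r[0] ** 2))

# First term written as the quadratic form r C_u r^T.
matrix_form = float((r @ C_u @ r.T)[0, 0])

print(np.isclose(sum_form, matrix_form))
```

Both expressions evaluate the same scalar, so the check should hold for any choice of the inputs, not only the random ones used here.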

The second term can be rewritten as $$\lambda \| \vect x_u \| _2 ^2 = \lambda\sum_{k=1}^f x_{uk}^2 = \lambda \vect x_u \vect x_u^T$$ where the sum runs over the $f$ components of $\vect x_u$.

So we have shown that $$J(\vect x_u) = \left( \vect p_u - \vect x_u \vect Y \right) \vect C_u \left( \vect p_u - \vect x_u \vect Y \right)^T
+ \lambda \vect x_u \vect x_u^T$$

To find $\hat{\vect x}_u$ we expand $J(\vect x_u)$, differentiate it w.r.t. $\vect x_u$, and set the derivative equal to zero. Because $\vect C_u$ is symmetric, the two cross terms of the expansion are the same scalar and combine into one: $$J(\vect x_u) = \vect p_u \vect C_u \vect p_u^T - 2\vect p_u \vect C_u \vect Y^T \vect x_u^T
+ \vect x_u \vect Y \vect C_u \vect Y^T \vect x_u^T
+ \lambda \vect x_u \vect x_u^T$$

$$\frac \partial {\partial\vect x_u} J(\vect x_u) =
-2 \vect p_u \vect C_u \vect Y^T + 2 \vect x_u \vect Y \vect C_u \vect Y^T
+ 2 \lambda \vect x_u = 0$$
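The gradient expression can be sanity-checked against central finite differences of the cost function as originally defined (the weighted sum plus the regularization term). This is a sketch under stated assumptions: NumPy is assumed, and the helper names `cost` and `grad`, the sizes, and the random data are hypothetical.

```python
import numpy as np

def cost(x_u, Y, p_u, c_u, lam):
    """Cost J(x_u): weighted squared residuals plus L2 regularization."""
    r = p_u - x_u @ Y                          # shape (1, n)
    return float(np.sum(c_u * r ** 2) + lam * np.sum(x_u ** 2))

def grad(x_u, Y, p_u, c_u, lam):
    """Analytic gradient of J as a row vector of shape (1, f)."""
    C_u = np.diag(c_u)
    return (-2 * p_u @ C_u @ Y.T
            + 2 * x_u @ Y @ C_u @ Y.T
            + 2 * lam * x_u)

rng = np.random.default_rng(1)
f, n, lam = 3, 6, 0.1                          # hypothetical sizes and lambda
x_u = rng.normal(size=(1, f))
Y = rng.normal(size=(f, n))
p_u = rng.integers(0, 2, size=(1, n)).astype(float)
c_u = 1.0 + rng.random(n)

analytic = grad(x_u, Y, p_u, c_u, lam)

# Central differences, one coordinate of x_u at a time.
eps = 1e-6
numeric = np.zeros((1, f))
for k in range(f):
    step = np.zeros((1, f))
    step[0, k] = eps
    numeric[0, k] = (cost(x_u + step, Y, p_u, c_u, lam)
                     - cost(x_u - step, Y, p_u, c_u, lam)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-5))
```

Since $J$ is quadratic in $\vect x_u$, the central difference has no truncation error, so any mismatch beyond floating-point rounding would point to a sign or transpose mistake in the derivation.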

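Collecting the terms in $\vect x_u$ gives $\vect x_u \left( \vect Y \vect C_u \vect Y^T + \lambda I \right) = \vect p_u \vect C_u \vect Y^T$, where $I$ is the $f \times f$ identity matrix. The matrix in parentheses is positive definite, and hence invertible, whenever $\lambda > 0$ and all $c_{ui} \ge 0$, so $$\hat{\vect x}_u = \vect p_u \vect C_u \vect Y^T \left( \vect Y \vect C_u \vect Y^T
+ \lambda I\right)^{-1}$$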
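The closed-form solution might be computed as follows. This is a minimal sketch, assuming NumPy; the function name `solve_user_factor` and the test data are hypothetical. It solves the linear system with `np.linalg.solve` rather than forming the inverse explicitly, which is a common numerical choice and not something the derivation itself requires.

```python
import numpy as np

def solve_user_factor(Y, p_u, c_u, lam):
    """Return x_hat (shape (1, f)) minimizing J for one user."""
    f = Y.shape[0]
    C_u = np.diag(c_u)
    A = Y @ C_u @ Y.T + lam * np.eye(f)        # symmetric, shape (f, f)
    b = p_u @ C_u @ Y.T                        # shape (1, f)
    # x A = b is equivalent to A x^T = b^T because A is symmetric.
    return np.linalg.solve(A, b.T).T

rng = np.random.default_rng(2)
f, n, lam = 4, 8, 0.5                          # hypothetical sizes and lambda
Y = rng.normal(size=(f, n))
p_u = rng.integers(0, 2, size=(1, n)).astype(float)
c_u = 1.0 + rng.random(n)

x_hat = solve_user_factor(Y, p_u, c_u, lam)

# The gradient from the derivation should vanish at x_hat.
C_u = np.diag(c_u)
gradient = (-2 * p_u @ C_u @ Y.T
            + 2 * x_hat @ Y @ C_u @ Y.T
            + 2 * lam * x_hat)
print(np.allclose(gradient, 0.0, atol=1e-8))
```

For large $n$, building the dense $n \times n$ matrix `C_u` is wasteful; multiplying the columns of `Y` by `c_u` directly (for example `(Y * c_u) @ Y.T`) gives the same product without it.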