高斯混合模型\begin{align*} \\& P \left( y | \theta \right) = \sum_{k=1}^{K} \alpha_{k} \phi \left( y | \theta_{k} \right) \end{align*}
其中,$\alpha_{k}$是系数,$\alpha_{k} \geq 0 $,$\sum_{k=1}^{K} \alpha_{k} = 1$; $\phi \left( y | \theta_{k} \right)$是高斯分布密度,$\theta_{k} = \left( \mu_{k} , \sigma_{k}^{2} \right)$,\begin{align*} \\& \phi \left( y | \theta_{k} \right) = \dfrac{1}{\sqrt{2 \pi} \sigma_{k}} \exp \left( - \dfrac{\left( y - \mu_{k} \right)^2}{2 \sigma_{k}^{2}} \right)\end{align*} 称为第$k$个分模型。
假设观测数据$\left( y_{1}, y_{2}, \cdots, y_{N} \right)$由高斯混合模型\begin{align*} \\& P \left( y | \theta \right) = \sum_{k=1}^{K} \alpha_{k} \phi \left( y | \theta_{k} \right) \end{align*}
生成,其中,$\theta = \left( \alpha_{1}, \alpha_{2}, \cdots, \alpha_{K}; \theta_{1}, \theta_{2}, \cdots, \theta_{K}\right)$是模型参数。
隐变量$\gamma_{jk}$是0-1变量,表示观测数据$y_{j}$来自第$k$个分模型\begin{align*} \\& \gamma_{jk} = \begin{cases} 1,第j个观测数据来自第k个分模型\\ 0,否则\end{cases} \quad \quad \quad \quad \quad \left( j = 1, 2, \cdots, N; k = 1, 2, \cdots, K \right)\end{align*}
完全数据\begin{align*} \\& \left( y_{j}, \gamma_{j1}, \gamma_{j2}, \cdots, \gamma_{jk}\right) \quad j = 1,2, \cdots, N\end{align*}
完全数据似然函数\begin{align*} \\& P \left( y, \gamma | \theta \right) = \prod_{j=1}^{N} P \left( y_{j}, \gamma_{j1}, \gamma_{j2}, \cdots, \gamma_{jK} | \theta \right) \\ & = \prod_{k=1}^{K} \prod_{j=1}^{N} \left[ \alpha_{k} \phi \left( y_{j} | \theta_{k} \right)\right]^{\gamma_{jk}} \\ & = \prod_{k=1}^{K} \alpha_{k}^{n_{k}}\prod_{j=1}^{N} \left[ \phi \left( y_{j} | \theta_{k} \right)\right]^{\gamma_{jk}} \\& = \prod_{k=1}^{K} \alpha_{k}^{n_{k}}\prod_{j=1}^{N} \left[ \dfrac{1}{\sqrt{2 \pi} \sigma_{k}} \exp \left( - \dfrac{\left( y - \mu_{k} \right)^2}{2 \sigma_{k}^{2}} \right) \right]^{\gamma_{jk}} \end{align*}
式中,$n_{k} = \sum_{j=1}^{N} \gamma_{jk}$。
完全数据的对数似然函数\begin{align*} \\& \log P \left( y, \gamma | \theta \right) = \sum_{k=1}^{K} \left\{ \sum_{j=1}^{K} \gamma_{jk} \log \alpha_{k} + \sum_{j=1}^{K} \gamma_{jk}\left[ \log \left( \dfrac{1}{ \sqrt{2 \pi} } \right) - \log \sigma_{k} - \dfrac{1}{ 2 \sigma_{k}^{2} } \left( y_{j} - \mu_{k} \right)^{2} \right]\right\} \end{align*}
$Q\left( \theta, \theta^{\left( i \right)} \right)$函数 \begin{align*} \\& Q \left( \theta , \theta^{\left( i \right)} \right) = E \left[ \log P \left( y, \gamma | \theta \right) | y, \theta^{ \left( i \right) }\right] \\ & = E \left\{ \sum_{k=1}^{K} \left\{ \sum_{j=1}^{K} \gamma_{jk} \log \alpha_{k} + \sum_{j=1}^{K} \gamma_{jk}\left[ \log \left( \dfrac{1}{ \sqrt{2 \pi} } \right) - \log \sigma_{k} - \dfrac{1}{ 2 \sigma_{k}^{2} } \left( y_{j} - \mu_{k} \right)^{2} \right]\right\}\right\} \\ & = \sum_{k=1}^{K} \left\{ \sum_{j=1}^{K} E \left( \gamma_{jk} \right) \log \alpha_{k} + \sum_{j=1}^{K} E \left( \gamma_{jk} \right)\left[ \log \left( \dfrac{1}{ \sqrt{2 \pi} } \right) - \log \sigma_{k} - \dfrac{1}{ 2 \sigma_{k}^{2} } \left( y_{j} - \mu_{k} \right)^{2} \right]\right\} \\ & =\sum_{k=1}^{K} \left\{ \sum_{j=1}^{K} \hat{\gamma}_{jk} \log \alpha_{k} + \sum_{j=1}^{K} \hat{\gamma}_{jk}\left[ \log \left( \dfrac{1}{ \sqrt{2 \pi} } \right) - \log \sigma_{k} - \dfrac{1}{ 2 \sigma_{k}^{2} } \left( y_{j} - \mu_{k} \right)^{2} \right]\right\} \end{align*}
其中,分模型$k$对观测数据$y_{j}$的响应度$\hat{\gamma}_{jk}$是在当前模型参数下第$j$个观测数据来自第$k$个分模型的概率。 \begin{align*} \\& \hat{\gamma}_{jk} = E \left( \gamma_{jk} | y, \theta \right) = P \left( \gamma_{jk} = 1 | y, \theta \right) \\ & = \dfrac{P \left( \gamma_{jk} = 1, y_{j} | \theta \right)}{ \sum_{k=1}^{K} P \left( \gamma_{jk} = 1, y_{j} | \theta \right)} \\ & = \dfrac{\alpha_{k} \phi \left( y | \theta_{k} \right) }{\sum_{k=1}^{K} \alpha_{k} \phi \left( y | \theta_{k} \right) } \quad \quad \quad \left( j = 1, 2, \cdots, N; k = 1, 2, \cdots, K \right) \end{align*}
求$Q\left( \theta, \theta^{\left( i \right)} \right)$函数对$\theta$的极大值
\begin{align*} \theta^{\left( i+1 \right)} = \arg \max Q\left(\theta, \theta^\left( i \right) \right) \end{align*}
得 \begin{align*} \\ & \hat{\mu}_{k} = \dfrac{\sum_{j=1}^{N} \hat{\gamma}_{jk} y_{j}}{\sum_{j=1}^{N} \hat{\gamma}_{jk}}, \quad k = 1, 2, \cdots, K
\\ & \hat{\sigma}_{k}^2 = \dfrac{\sum_{j=1}^{N} \hat{\gamma}_{jk} \left( y_{j} - \mu_{k}\right)^2}{\sum_{j=1}^{N} \hat{\gamma}_{jk}}, \quad k = 1, 2, \cdots, K
\\ & \hat{\alpha}_{k} = \dfrac{\sum_{j=1}^{N} \hat{\gamma}_{jk} }{N}, \quad k = 1, 2, \cdots, K\end{align*}
高斯混合模型参数估计得EM算法:
输入:观测数据$y_{1}, y_{2}, \cdots, y_{N}$,高斯混合模型;
输出:高斯混合模型参数
3. $M$步:计算新迭代的模型参数
\begin{align*} \\ & \hat{\mu}_{k} = \dfrac{\sum_{j=1}^{N} \hat{\gamma}_{jk} y_{j}}{\sum_{j=1}^{N} \hat{\gamma}_{jk}}, \quad k = 1, 2, \cdots, K
\\ & \hat{\sigma}_{k}^2 = \dfrac{\sum_{j=1}^{N} \hat{\gamma}_{jk} \left( y_{j} - \mu_{k}\right)^2}{\sum_{j=1}^{N} \hat{\gamma}_{jk}}, \quad k = 1, 2, \cdots, K
\\ & \hat{\alpha}_{k} = \dfrac{\sum_{j=1}^{N} \hat{\gamma}_{jk} }{N}, \quad k = 1, 2, \cdots, K\end{align*}
4. 重复2.步和3.步,直到收敛。