# Discrete Data and the Multinomial Distribution¶

### Preliminaries¶

• Goal
• Simple Bayesian and maximum likelihood-based density estimation for discretely valued data sets
• Materials
• Mandatory
• These lecture notes
• Optional
• Bishop pp. 67-70, 74-76, 93-94

### Discrete Data: the 1-of-K Coding Scheme¶

• Consider a coin-tossing experiment with outcomes $x \in\{0,1\}$ (tail and head) and let $0\leq \mu \leq 1$ represent the probability of heads. This model can be written as a Bernoulli distribution: $$p(x|\mu) = \mu^{x}(1-\mu)^{1-x}$$
• Note that the variable $x$ acts as a (binary) selector for the tail or head probabilities. Think of this as an 'if'-statement in programming.
• Now consider a $K$-sided coin (e.g., a six-faced die (pl.: dice)). How should we encode outcomes?
• Option 1: $x \in \{1,2,\ldots,K\}$.

• E.g., for $K=6$, if the die lands on the 3rd face $\,\Rightarrow x=3$.
• Option 2: $x=(x_1,\ldots,x_K)^T$ with binary selection variables $$x_k = \begin{cases} 1 & \text{if the die landed on the $k$th face}\\ 0 & \text{otherwise} \end{cases}$$

• E.g., for $K=6$, if the die lands on the 3rd face $\,\Rightarrow x=(0,0,1,0,0,0)^T$.
• This coding scheme is called a 1-of-K or one-hot coding scheme.
• It turns out that the one-hot coding scheme is mathematically more convenient!
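The one-hot coding scheme is straightforward to implement. A minimal sketch in Python (the function name `one_hot` is our own, not from the notes):

```python
import numpy as np

def one_hot(face, K):
    """Return the 1-of-K (one-hot) encoding of a die outcome.

    face : outcome in {1, ..., K}
    K    : number of faces
    """
    x = np.zeros(K, dtype=int)
    x[face - 1] = 1
    return x

# A six-sided die landing on the 3rd face:
print(one_hot(3, 6))  # [0 0 1 0 0 0]
```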

### The Categorical Distribution¶

• Consider a $K$-sided die. We use a one-hot coding scheme. Assume the probabilities $p(x_k=1) = \mu_k$ with $\sum_k \mu_k = 1$. The data generating distribution is then (note the similarity to the Bernoulli distribution)
$$p(x|\mu) = \mu_1^{x_1} \mu_2^{x_2} \cdots \mu_K^{x_K}=\prod_{k=1}^K \mu_k^{x_k} \tag{B-2.26}$$
• This generalized Bernoulli distribution is called the categorical distribution (or sometimes the 'multi-noulli' distribution).
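Equation B-2.26 translates directly into code, since $x$ selects exactly one factor $\mu_k$ from the product. A small sketch (variable names are illustrative):

```python
import numpy as np

def categorical_pmf(x, mu):
    """Evaluate p(x|mu) = prod_k mu_k^{x_k} for a one-hot vector x (B-2.26)."""
    return float(np.prod(mu ** x))

mu = np.array([0.1, 0.1, 0.5, 0.1, 0.1, 0.1])  # a loaded six-sided die
x = np.array([0, 0, 1, 0, 0, 0])               # the die landed on the 3rd face
print(categorical_pmf(x, mu))  # 0.5, i.e. the selected mu_3
```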

### Bayesian Density Estimation for a Loaded Die¶

• Now let's proceed with Bayesian density estimation, i.e., let's learn the parameters for model $p(x|\theta)$ for an observed data set $D=\{x_1,\ldots,x_N\}$ of $N$ IID rolls of a $K$-sided die, with
$$x_{nk} = \begin{cases} 1 & \text{if the $n$th throw landed on the $k$th face}\\ 0 & \text{otherwise} \end{cases}$$

##### Model specification¶
• The data generating distribution is $$p(D|\mu) = \prod_n \prod_k \mu_k^{x_{nk}} = \prod_k \mu_k^{\sum_n x_{nk}} = \prod_k \mu_k^{m_k} \tag{B-2.29}$$ where $m_k= \sum_n x_{nk}$ is the total number of throws that landed on the $k$th face. Note that $\sum_k m_k = N$.
• This distribution depends on the observations only through the quantities $\{m_k\}$.
• We need a prior for the parameters $\mu = (\mu_1,\mu_2,\ldots,\mu_K)$. In the binary coin toss example, we used a beta distribution that was conjugate with the binomial and forced us to choose prior pseudo-counts.

• The generalization of the beta prior to the $K$ parameters $\{\mu_k\}$ is the Dirichlet distribution: $$p(\mu|\alpha) = \mathrm{Dir}(\mu|\alpha) = \frac{\Gamma\left(\sum_k \alpha_k\right)}{\Gamma(\alpha_1)\cdots \Gamma(\alpha_K)} \prod_{k=1}^K \mu_k^{\alpha_k-1}$$ where $\Gamma(\cdot)$ is the Gamma function. The Gamma function can be interpreted as a generalization of the factorial function (e.g., $3! = 3\cdot 2 \cdot 1$) to the real numbers $\mathbb{R}$.

• As before for the Beta distribution in the coin toss experiment, you can interpret $\alpha_k$ as the prior number of (pseudo-)observations that the die landed on the $k$th face.
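To get a feel for the Dirichlet prior, we can draw samples of $\mu$ from it. A sketch with hypothetical pseudo-counts (here we pretend to have seen each face twice before observing any data):

```python
import numpy as np

# Hypothetical prior pseudo-counts: two pseudo-observations per face.
alpha = np.array([2., 2., 2., 2., 2., 2.])

rng = np.random.default_rng(0)
mu_samples = rng.dirichlet(alpha, size=5)  # 5 draws of mu from Dir(mu|alpha)

# Every draw lies on the simplex: mu_k >= 0 and sum_k mu_k = 1.
print(np.allclose(mu_samples.sum(axis=1), 1.0))  # True
```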

##### Inference for $\{\mu_k\}$¶
• The posterior for $\{\mu_k\}$ can be obtained through Bayes rule:
\begin{align*} p(\mu|D,\alpha) &\propto p(D|\mu) \cdot p(\mu|\alpha) \\ &\propto \prod_k \mu_k^{m_k} \cdot \prod_k \mu_k^{\alpha_k-1} \\ &= \prod_k \mu_k^{\alpha_k + m_k -1}\\ &\propto \mathrm{Dir}\left(\mu\,|\,\alpha + m \right) \tag{B-2.41} \\ &= \frac{\Gamma\left(\sum_k (\alpha_k + m_k) \right)}{\Gamma(\alpha_1+m_1) \Gamma(\alpha_2+m_2) \cdots \Gamma(\alpha_K + m_K)} \prod_{k=1}^K \mu_k^{\alpha_k + m_k -1} \end{align*}

where $m = (m_1,m_2,\ldots,m_K)^T$.

• Again, we recognize the $\alpha_k$'s as prior pseudo-counts, and the Dirichlet distribution turns out to be a conjugate prior to the categorical/multinomial:
\begin{align*} \underbrace{\text{Dirichlet}}_{\text{posterior}} &\propto \underbrace{\text{categorical}}_{\text{likelihood}} \cdot \underbrace{\text{Dirichlet}}_{\text{prior}} \end{align*}
• This is actually a generalization of the conjugate relation that we found for the binary coin toss:
\begin{align*} \underbrace{\text{Beta}}_{\text{posterior}} &\propto \underbrace{\text{binomial}}_{\text{likelihood}} \cdot \underbrace{\text{Beta}}_{\text{prior}} \end{align*}
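Thanks to conjugacy, the posterior update in B-2.41 amounts to adding observed counts to the prior pseudo-counts. A sketch with hypothetical data:

```python
import numpy as np

# Hypothetical observed counts m_k from N = 20 rolls of a six-sided die.
m = np.array([2, 3, 8, 3, 2, 2])
alpha = np.ones(6)  # uniform Dirichlet prior, alpha_k = 1

# Conjugacy (B-2.41): the posterior is again a Dirichlet, with counts added.
alpha_post = alpha + m
print(alpha_post)  # [3. 4. 9. 4. 3. 3.]
```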
##### Prediction of next toss for the loaded die¶
• Let's apply what we have learned about the loaded die to compute the probability that we throw the $k$th face at the next toss.
\begin{align*} p(x_{\bullet,k}=1|D) &= \int p(x_{\bullet,k}=1|\mu)\,p(\mu|D) \,\mathrm{d}\mu \\ &= \int \mu_k \times \mathrm{Dir}(\mu|\,\alpha+m) \,\mathrm{d}\mu \\ &= \mathrm{E}\left[ \mu_k \right] \\ &= \frac{m_k + \alpha_k }{ N+ \sum_k \alpha_k} \end{align*}
• In the above derivation, we noticed that the data generating distribution for $N$ die tosses $D=\{x_1,\ldots,x_N\}$ only depends on the data frequencies $m_k$: $$p(D|\mu) = \prod_n \underbrace{\prod_k \mu_k^{x_{nk}}}_{\text{categorical dist.}} = \prod_k \mu_k^{\sum_n x_{nk}} = \prod_k \mu_k^{m_k} \tag{B-2.29}$$
• A related distribution is the distribution over data frequency observations $D_m=\{m_1,\ldots,m_K\}$, which is called the multinomial distribution, $$p(D_m|\mu) =\frac{N!}{m_1! m_2!\ldots m_K!} \,\prod_k \mu_k^{m_k}\,.$$
• When used as a likelihood function for $\mu$, it makes no difference whether you use $p(D|\mu)$ or $p(D_m|\mu)$. Why?
• Verify for yourself that (Exercise):
• the categorical distribution is a special case of the multinomial for $N=1$.
• the Bernoulli is a special case of the categorical distribution for $K=2$.
• the binomial is a special case of the multinomial for $K=2$.
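The posterior predictive formula derived above, $p(x_{\bullet,k}=1|D) = (m_k + \alpha_k)/(N + \sum_k \alpha_k)$, can be computed directly. The counts and prior below are hypothetical:

```python
import numpy as np

def predict_next(m, alpha):
    """Posterior predictive p(x_k = 1 | D) = (m_k + alpha_k) / (N + sum_k alpha_k)."""
    return (m + alpha) / (m.sum() + alpha.sum())

m = np.array([2, 3, 8, 3, 2, 2])   # hypothetical counts from N = 20 rolls
alpha = np.ones(6)                 # uniform prior pseudo-counts
p = predict_next(m, alpha)
print(p)        # e.g. p_3 = (8 + 1) / (20 + 6) = 9/26
print(p.sum())  # the predictive probabilities sum to 1
```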

### Maximum Likelihood Estimation for the Multinomial¶

##### Maximum likelihood as a special case of Bayesian estimation¶
• We can get the maximum likelihood estimate $\hat{\mu}_k$ for $\mu_k$ based on $N$ throws of a $K$-sided die from the Bayesian framework by using a uniform prior for $\mu$ and taking the mode of the posterior for $\mu$: \begin{align*} \hat{\mu}_k &= \arg\max_{\mu_k} p(D|\mu) \\ &= \arg\max_{\mu_k} p(D|\mu)\cdot \mathrm{Uniform}(\mu) \\ &= \arg\max_{\mu_k} p(D|\mu) \cdot \left.\mathrm{Dir}(\mu|\alpha)\right|_{\alpha=(1,1,\ldots,1)} \\ &= \arg\max_{\mu_k} \left.p(\mu|D,\alpha)\right|_{\alpha=(1,1,\ldots,1)} \\ &= \arg\max_{\mu_k} \left.\mathrm{Dir}\left( \mu | m + \alpha \right)\right|_{\alpha=(1,1,\ldots,1)} \\ &= \frac{m_k}{\sum_k m_k} = \frac{m_k}{N} \end{align*} where we used the fact that the mode of the Dirichlet distribution $\mathrm{Dir}(\mu|\alpha)$ is obtained at $\mu_k = (\alpha_k-1)/(\sum_k\alpha_k - K)$.
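We can check numerically that the mode of the posterior under a uniform prior equals the sample proportion $m_k/N$. A sketch with hypothetical counts:

```python
import numpy as np

m = np.array([2, 3, 8, 3, 2, 2])  # hypothetical counts, N = 20
alpha = np.ones(6)                # uniform prior, alpha_k = 1

# Mode of the posterior Dir(mu | alpha + m): (alpha_k + m_k - 1) / (sum - K)
a_post = alpha + m
mode = (a_post - 1) / (a_post.sum() - len(a_post))

print(np.allclose(mode, m / m.sum()))  # True: MAP under a uniform prior is m_k / N
```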
##### Maximum likelihood estimation by optimizing a constrained log-likelihood¶
• Of course, we shouldn't have to go through the full Bayesian framework to get the maximum likelihood estimate. Alternatively, we can find the maximum of the likelihood directly.

• The log-likelihood for the multinomial distribution is given by

\begin{align*} \mathrm{L}(\mu) &\triangleq \log p(D_m|\mu) \propto \log \prod_k \mu_k^{m_k} = \sum_k m_k \log \mu_k \end{align*}

• When doing ML estimation, we must obey the constraint $\sum_k \mu_k = 1$, which can be accomplished by a Lagrange multiplier (see Bishop App.E). The augmented log-likelihood with Lagrange multiplier is then
$$\mathrm{L}^\prime(\mu) = \sum_k m_k \log \mu_k + \lambda \cdot \left(1 - \sum_k \mu_k \right)$$
• Setting the derivative to zero yields the sample proportion for $\mu_k$: $$\begin{equation*} \nabla_{\mu_k} \mathrm{L}^\prime = \frac{m_k } {\hat\mu_k } - \lambda \overset{!}{=} 0 \; \Rightarrow \; \hat\mu_k = \frac{m_k }{\lambda} = \frac{m_k}{N} \end{equation*}$$ where we get $\lambda = N$ from the constraint $$\begin{equation*} \sum_k \hat \mu_k = \sum_k \frac{m_k} {\lambda} = \frac{N}{\lambda} \overset{!}{=} 1 \end{equation*}$$
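As a sanity check, we can verify numerically that the sample proportion maximizes the log-likelihood $\sum_k m_k \log \mu_k$ over the simplex, by comparing it against randomly drawn candidates (a sketch with hypothetical counts):

```python
import numpy as np

def log_lik(mu, m):
    """Multinomial log-likelihood (up to a constant): sum_k m_k log mu_k."""
    return float(np.sum(m * np.log(mu)))

m = np.array([2, 3, 8, 3, 2, 2])  # hypothetical counts, N = 20
mu_hat = m / m.sum()              # sample proportion m_k / N

# Perturbed points on the simplex should never beat the ML estimate.
rng = np.random.default_rng(1)
for _ in range(100):
    mu = rng.dirichlet(np.ones(6))
    assert log_lik(mu, m) <= log_lik(mu_hat, m)
print("mu_hat maximizes the log-likelihood among the sampled points")
```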

### Recap Maximum Likelihood Estimation for Gaussian and Multinomial Distributions¶

Given $N$ IID observations $D=\{x_1,\dotsc,x_N\}$.

• For a multivariate Gaussian model $p(x_n|\theta) = \mathcal{N}(x_n|\mu,\Sigma)$, we obtain ML estimates
\begin{align} \hat \mu &= \frac{1}{N} \sum_n x_n \tag{sample mean} \\ \hat \Sigma &= \frac{1}{N} \sum_n (x_n-\hat\mu)(x_n - \hat \mu)^T \tag{sample covariance} \end{align}
• For discrete outcomes modeled by a 1-of-K categorical distribution we find
\begin{align} \hat\mu_k = \frac{1}{N} \sum_n x_{nk} \quad \left(= \frac{m_k}{N} \right) \tag{sample proportion} \end{align}
• Note the similarity for the means between discrete and continuous data.
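The recap can be sketched in a few lines of NumPy; the data below are synthetic and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

# Gaussian case: ML estimates are the sample mean and sample covariance (1/N).
X = rng.normal(loc=[1.0, -1.0], scale=1.0, size=(1000, 2))
mu_hat = X.mean(axis=0)                     # sample mean
centered = X - mu_hat
Sigma_hat = centered.T @ centered / len(X)  # sample covariance (divide by N, not N-1)

# Discrete case: the same averaging recipe on one-hot data gives m_k / N.
X_onehot = np.eye(6)[rng.integers(0, 6, size=1000)]  # 1000 one-hot die rolls
mu_k_hat = X_onehot.mean(axis=0)                     # sample proportion
```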