Let \(y_i\), \(i=1,\dots,n\) denote observations.
A general mixture of \(K\) distributions from the same
parametric family is given by:
$$y_i \sim \sum_{k=1}^{K}\pi_k p(\cdot|\theta_k)$$
with \(\sum_{k=1}^{K}\pi_k=1\) and \(\pi_k\geq 0\), \(k=1,\dots,K\).
The exact number of components does not have to be known a priori
when using an SFM MCMC approach. Rather, an upper bound is specified for the
number of components and the weights of superfluous components are shrunk
towards zero during estimation. Following Malsiner-Walli et al. (2016),
a symmetric Dirichlet prior is used for the mixture weights:
$$\pi_k \sim \text{Dirichlet}(e_0,\dots,e_0),$$
where a Gamma hyperprior is used on the concentration parameter \(e_0\):
$$e_0 \sim \text{Gamma}\left(a_0, A_0\right).$$
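To illustrate the shrinkage this prior induces, the following sketch (not the package's sampler; the values of \(K\), \(e_0\) and the 1% threshold are arbitrary choices) draws weight vectors from a symmetric Dirichlet and counts how many components receive non-negligible weight:

```python
# Sketch: a small concentration parameter e0 pushes most mixture weights
# towards zero, effectively emptying superfluous components.
import numpy as np

rng = np.random.default_rng(0)
K = 10          # upper bound on the number of components
e0 = 0.01       # small concentration favours sparse weight vectors

pi = rng.dirichlet(np.full(K, e0), size=1000)   # 1000 prior draws
active = (pi > 0.01).sum(axis=1)                # components with weight > 1%

print("average number of 'active' components:", active.mean())
```

With \(e_0\) this small, most prior draws concentrate essentially all weight on one or two components, which is the mechanism that deactivates superfluous components during estimation.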
Mixture of Normal distributions
Normal components take the form:
$$p(y_i|\mu_k,\sigma_k) = \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left(-\frac{1}{2}\left(\frac{y_i - \mu_k}{\sigma_k}\right)^2\right).$$
Independent conjugate priors are used for \(\mu_k\) and \(\sigma^2_k\)
(see for instance Malsiner-Walli et al. 2016):
$$\mu_k \sim \text{Normal}( \text{b}_0, \text{B}_0),$$
$$\sigma^{-2}_k \sim \text{Gamma}( \text{c}_0, \text{C}_0),$$
$$C_0 \sim \text{Gamma}( \text{g}_0, \text{G}_0).$$
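For concreteness, here is a minimal sketch of the resulting mixture density (the component weights, means and standard deviations below are illustrative values, not defaults of the package):

```python
# Sketch: evaluate the density of a K-component Normal mixture on a grid.
import numpy as np
from scipy.stats import norm

def normal_mixture_pdf(y, pi, mu, sigma):
    """Density of sum_k pi_k * N(mu_k, sigma_k^2) evaluated at y."""
    y = np.asarray(y)[..., None]                 # broadcast over components
    return (pi * norm.pdf(y, loc=mu, scale=sigma)).sum(axis=-1)

pi = np.array([0.5, 0.5])
mu = np.array([-2.0, 2.0])
sigma = np.array([1.0, 1.0])

grid = np.linspace(-6, 6, 1001)
dens = normal_mixture_pdf(grid, pi, mu, sigma)

# Riemann sum of the density over a wide grid, close to 1
print(dens.sum() * (grid[1] - grid[0]))
```

With well-separated means this mixture is bimodal, which is exactly the kind of shape that mode inference targets.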
Mixture of skew-Normal distributions
We use the skew-Normal of Azzalini (1985), which takes the form:
$$p(y_i|\xi_k,\omega_k,\alpha_k) = \frac{1}{\omega_k\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{y_i - \xi_k}{\omega_k}\right)^2\right)\left(1 + \text{erf}\left(\alpha_k \frac{y_i - \xi_k}{\omega_k\sqrt{2}}\right)\right),$$
where \(\xi_k\) is a location parameter, \(\omega_k\) a scale parameter and \(\alpha_k\)
the shape parameter introducing skewness. For Bayesian estimation, we adopt the approach of
Frühwirth-Schnatter and Pyne (2010) and use the following reparameterised random-effect model:
$$z_i \sim TN_{[0,\infty)}(0, 1),$$
$$y_i|(S_i = k) = \xi_k + \psi_k z_i + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma^2_k),$$
where \(S_i\) is the latent variable allocating observation \(i\) to a component, and the parameters of the skew-Normal are recovered with
$$\alpha_k = \frac{\psi_k}{\sigma_k}, \qquad \omega^2_k = \sigma^2_k + \psi^2_k.$$
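These recovery formulas, \(\alpha_k = \psi_k/\sigma_k\) and \(\omega^2_k = \sigma^2_k + \psi^2_k\), can be checked by simulating from the random-effect representation; a sketch with arbitrary parameter values, using scipy's skewnorm as the reference:

```python
# Sketch: simulate y = xi + psi*z + eps with z ~ N(0,1) truncated to
# [0, inf) and eps ~ N(0, sigma^2), then compare moments with the
# skew-Normal implied by alpha = psi/sigma and omega^2 = sigma^2 + psi^2.
import numpy as np
from scipy.stats import skewnorm

rng = np.random.default_rng(1)
xi, psi, sigma = 0.0, 2.0, 1.0
n = 200_000

z = np.abs(rng.standard_normal(n))              # draw from TN_[0,inf)(0, 1)
y = xi + psi * z + sigma * rng.standard_normal(n)

alpha = psi / sigma
omega = np.sqrt(sigma**2 + psi**2)

print(y.mean(), skewnorm.mean(a=alpha, loc=xi, scale=omega))
print(y.std(), skewnorm.std(a=alpha, loc=xi, scale=omega))
```

The simulated mean and standard deviation match the analytical moments of the implied skew-Normal up to Monte Carlo error.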
By defining a regressor \(x_i = (1, z_i)'\), the skew-Normal mixture can be treated as a
random-effects model and sampled using standard techniques. We therefore use priors similar to
the Normal mixture model:
$$(\xi_k, \psi_k)' \sim \text{Normal}(\text{b}_0, \text{B}_0),$$
$$\sigma^{-2}_k \sim \text{Gamma}(\text{c}_0, \text{C}_0),$$
$$\text{C}_0 \sim \text{Gamma}( \text{g}_0, \text{G}_0).$$
We set \(\text{b}_0 = (\text{median}(y), 0)'\) and \(\text{B}_0 = \text{diag}(D_\xi, D_\psi)\) with \(D_\xi = D_\psi = 1\).
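As a numerical sanity check, the erf form of the skew-Normal density given at the start of this subsection agrees with scipy.stats.skewnorm (the parameter values below are arbitrary):

```python
# Sketch: the erf form of the skew-Normal density equals scipy's
# skewnorm.pdf, i.e. 2/omega * phi((y-xi)/omega) * Phi(alpha*(y-xi)/omega).
import numpy as np
from scipy.special import erf
from scipy.stats import skewnorm

def skewnorm_pdf_erf(y, xi, omega, alpha):
    z = (y - xi) / omega
    return (np.exp(-0.5 * z**2) / (omega * np.sqrt(2 * np.pi))
            * (1 + erf(alpha * z / np.sqrt(2))))

y = np.linspace(-5, 5, 11)
ours = skewnorm_pdf_erf(y, xi=0.5, omega=1.3, alpha=4.0)
ref = skewnorm.pdf(y, a=4.0, loc=0.5, scale=1.3)

print(np.max(np.abs(ours - ref)))   # agreement to floating-point precision
```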
Mixture of Poisson distributions
Poisson components take the form:
$$p(y_i|\lambda_k) = \frac{1}{y_i!} \, \lambda^{y_i}_k \,\exp(-\lambda_k).$$
The prior for \(\lambda_k\) follows Viallefont et al. (2002):
$$\lambda_k \sim \text{Gamma}(\text{l}_0,\text{L}_0).$$
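A minimal sketch of the resulting Poisson mixture pmf (the weights and rates below are illustrative values chosen to produce two well-separated modes):

```python
# Sketch: pmf of a two-component Poisson mixture evaluated on its support.
import numpy as np
from scipy.stats import poisson

def poisson_mixture_pmf(y, pi, lam):
    y = np.asarray(y)[..., None]                 # broadcast over components
    return (pi * poisson.pmf(y, lam)).sum(axis=-1)

pi = np.array([0.4, 0.6])
lam = np.array([2.0, 15.0])

y = np.arange(0, 40)
pmf = poisson_mixture_pmf(y, pi, lam)

print(pmf.sum())   # close to 1 over a wide enough support
```

With rates 2 and 15 the mixture has modes near each component mean, separated by a trough of low probability.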
Mixture of shifted-Poisson distributions
Shifted-Poisson components take the form
$$p(y_i |\lambda_k, \kappa_k) = \frac{1}{(y_i - \kappa_k)!} \,
\lambda^{y_i - \kappa_k}_k \,\exp(-\lambda_k),$$
where \(\kappa_k\) is a location or shift parameter with a uniform prior; see Cross et al. (2024).
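A sketch of the shifted-Poisson pmf, i.e. the distribution of \(\kappa_k\) plus a Poisson(\(\lambda_k\)) variable (the values of \(\lambda\) and \(\kappa\) below are illustrative):

```python
# Sketch: pmf of a shifted Poisson; all mass lies at or above kappa,
# with the mode near kappa + lambda.
import numpy as np
from scipy.stats import poisson

def shifted_poisson_pmf(y, lam, kappa):
    y = np.asarray(y)
    return np.where(y >= kappa, poisson.pmf(y - kappa, lam), 0.0)

y = np.arange(0, 30)
pmf = shifted_poisson_pmf(y, lam=3.5, kappa=5)

print(pmf.sum(), pmf.argmax())
```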