Discover clusters in multidimensional data using a multivariate normal mixture model with a determinantal point process prior.
dppmix_mvnorm(
X,
hparams = NULL,
store = NULL,
control = NULL,
fixed = NULL,
verbose = TRUE
)
X: N x J data matrix of N observations and J features
hparams: a list of hyperparameter values: delta, a0, b0, theta, sigma_prop_mu
store: a vector of character strings specifying additional variables of interest; a value of NA indicates that samples of all parameters in the model will be stored
control: a list of control parameters: niter, burnin, thin
fixed: a list of fixed parameter values
verbose: whether to emit verbose messages
Returns a dppmix_mcmc object containing posterior samples of the parameters.
A determinantal point process (DPP) prior is a repulsive prior. Compared to mixture models using independent priors, a DPP mixture model will often discover a parsimonious set of mixture components (clusters).
Model fitting is done by sampling parameters from the posterior distribution using a reversible jump Markov chain Monte Carlo sampling approach.
Given \(X = [x_i]\), where each \(x_i\) is a J-dimensional real vector, we seek the posterior distribution of the latent variable \(z = [z_i]\), where each \(z_i\) is an integer representing cluster membership.
$$ x_i \mid z_i = k \sim \mathrm{Normal}(\mu_k, \Sigma_k) $$
$$ z_i \sim \mathrm{Categorical}(w) $$
$$ w \sim \mathrm{Dirichlet}([\delta, \ldots, \delta]) $$
$$ \mu_k \sim \mathrm{DPP}(C) $$
where \(C\) is the covariance function, which decays with the squared distance between two points:
$$ C(x_1, x_2) = \exp\left( - \sum_d \frac{ (x_{1d} - x_{2d})^2 }{ \theta^2 } \right) $$
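The repulsion induced by this covariance function can be illustrated with a small base R sketch. It assumes the usual DPP property that the prior density of a set of points is proportional to the determinant of their covariance matrix. The helper `dpp_kernel` and the values of `theta` and the means are hypothetical and chosen only for illustration; they are not part of the package.

```r
# Hypothetical helper (not a package function): evaluate C for all pairs of rows.
dpp_kernel <- function(mu, theta) {
  # mu: K x J matrix of points; theta: length-scale hyperparameter
  d2 <- as.matrix(dist(mu))^2   # K x K matrix of squared Euclidean distances
  exp(-d2 / theta^2)
}

theta <- 1                              # illustrative value only
mu_close <- rbind(c(0, 0), c(0.1, 0))   # two nearly identical means
mu_far   <- rbind(c(0, 0), c(3, 0))     # two well-separated means

# A larger determinant corresponds to a higher prior density (up to a constant).
det_close <- det(dpp_kernel(mu_close, theta))
det_far   <- det(dpp_kernel(mu_far, theta))
det_far > det_close
```

Nearly coincident means make the covariance matrix close to singular, so its determinant approaches zero; this is the sense in which the prior discourages redundant components.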
We also define \(\Sigma_k = E_k \Lambda_k E_k^\top\), where \(E_k\) is an orthonormal matrix whose columns are eigenvectors. We further assume that \(E_k = E\) is fixed across all cluster components, so that \(E\) can be estimated as the eigenvectors of the covariance matrix of the data matrix \(X\). Finally, we put a prior on the diagonal entries of the diagonal matrix \(\Lambda_k\):
$$ \lambda_{kd}^{-1} \sim Gamma( a_0, b_0 ) $$
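The construction of \(\Sigma_k\) can be sketched in base R as follows. This is not the package's internal code: the toy data, the values of `a0` and `b0`, and the shape/rate reading of the Gamma distribution are all assumptions made for illustration.

```r
set.seed(1)
X <- matrix(rnorm(200), ncol = 2)   # toy N x J data matrix (illustrative)
J <- ncol(X)

# Shared eigenvectors E, estimated from the sample covariance of X.
E <- eigen(cov(X), symmetric = TRUE)$vectors

# Draw the eigenvalues of one component: 1 / lambda ~ Gamma(a0, b0).
# Assumption: a0 is the shape and b0 the rate; the values are arbitrary.
a0 <- 2
b0 <- 1
lambda_k <- 1 / rgamma(J, shape = a0, rate = b0)

# Assemble the component covariance Sigma_k = E Lambda_k E^T.
Sigma_k <- E %*% diag(lambda_k, nrow = J) %*% t(E)
```

Because \(E\) is orthonormal, the eigenvalues of `Sigma_k` are exactly the sampled `lambda_k`, so each component differs only in how strongly it is stretched along the shared axes.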
Hence, the hyperparameters of the model include delta, a0, b0, and theta, as well as the sampling hyperparameter sigma_prop_mu, which controls the spread of the Gaussian proposal distribution for the random-walk Metropolis-Hastings update of the \(\mu\) parameter.
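A generic random-walk Metropolis-Hastings update with a Gaussian proposal can be sketched as below, to show the role a proposal spread such as sigma_prop_mu plays. The function `rw_mh_step`, the target `log_post`, and the treatment of the spread as a standard deviation are assumptions for illustration; the package's actual update targets the model posterior and may differ in detail.

```r
# Hypothetical helper: one random-walk Metropolis-Hastings update.
rw_mh_step <- function(mu, log_post, sigma_prop) {
  # Symmetric Gaussian proposal centered at the current value.
  proposal <- mu + rnorm(length(mu), mean = 0, sd = sigma_prop)
  # With a symmetric proposal, the acceptance ratio is the posterior ratio.
  log_ratio <- log_post(proposal) - log_post(mu)
  if (log(runif(1)) < log_ratio) proposal else mu
}

set.seed(1)
# Placeholder target: independent normals centered at (1, -1).
log_post <- function(m) sum(dnorm(m, mean = c(1, -1), sd = 1, log = TRUE))

mu <- c(0, 0)
draws <- matrix(NA_real_, nrow = 5000, ncol = 2)
for (t in seq_len(nrow(draws))) {
  mu <- rw_mh_step(mu, log_post, sigma_prop = 0.8)
  draws[t, ] <- mu
}
colMeans(draws)   # should be near the target mean
```

A small spread yields high acceptance but slow exploration, while a large spread yields frequent rejections, so this value typically needs tuning to the scale of the data.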
The parameters (and their dimensions) in the model include: K, z (N x 1), w (K x 1), lambda (K x J), mu (K x J), Sigma (J x J x K).
If any parameter is fixed, then K must be fixed as well.
Yanxun Xu, Peter Mueller, Donatello Telesca. Bayesian Inference for Latent Biologic Structure with Determinantal Point Processes. Biometrics. 2016;72(3):955-64.
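library(dppmix)
set.seed(1)
# simulate two clusters with three observations each
ns <- c(3, 3)
means <- list(c(-6, -3), c(0, 4))
d <- rmvnorm_clusters(ns, means)
# fit the DPP mixture model by MCMC
mcmc <- dppmix_mvnorm(d$X, verbose = FALSE)
# derive estimates from the posterior samples
res <- estimate(mcmc)
# cross-tabulate true and estimated cluster labels
table(d$cl, res$z)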