COMIX (version 1.0.2)

summarizeChain: This function provides post-hoc estimates of the model parameters.

Description

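This function computes post-hoc point estimates of the model parameters from a fitted chain: posterior means for the continuous parameters and posterior modes for the cluster labels.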

Usage

summarizeChain(res)

Value

A named list:

  • xi0: an ncol(res$data$Y) \(\times\) res$prior$K matrix storing the posterior mean of the group location parameter.

  • psi: an ncol(res$data$Y) \(\times\) res$prior$K matrix storing the posterior mean of the multivariate skew normal kernels' skewness parameter (in the parameterization used in the sampler).

  • alpha: an ncol(res$data$Y) \(\times\) res$prior$K matrix storing the posterior mean of the multivariate skew normal kernels' skewness parameter.

  • W: a length(unique(res$data$C)) \(\times\) res$prior$K matrix storing the posterior mean of the mixture weights for each sample and cluster.

  • xi: a length(unique(res$data$C)) \(\times\) ncol(res$data$Y) \(\times\) res$prior$K array storing the posterior mean of the multivariate skew normal kernels' location parameter for each sample and cluster.

  • Sigma: an ncol(res$data$Y) \(\times\) ncol(res$data$Y) \(\times\) res$prior$K array storing the posterior mean of the scaling matrix of the multivariate skew normal kernels for each cluster.

  • G: an ncol(res$data$Y) \(\times\) ncol(res$data$Y) \(\times\) res$prior$K array storing the posterior mean of the scaling matrix of the multivariate skew normal kernels for each cluster (in the parameterization used in the sampler).

  • E: an ncol(res$data$Y) \(\times\) ncol(res$data$Y) \(\times\) res$prior$K array storing, for each cluster, the posterior mean of the covariance matrix of the multivariate normal distribution from which the sample-specific location parameters are drawn.

  • meanvec: a length(unique(res$data$C)) \(\times\) ncol(res$data$Y) \(\times\) res$prior$K array storing the posterior mean of the multivariate skew normal kernels' mean parameter for each sample and cluster.

  • meanvec0: an ncol(res$data$Y) \(\times\) res$prior$K matrix storing the posterior mean of the group mean parameter.

  • t: a vector of length nrow(res$data$Y). Each element is the mode of the posterior distribution of the cluster label for the corresponding observation.

  • eta: scalar, the mean of the posterior distribution of the estimated Dirichlet Process Mixture concentration parameter.
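The entries above are either posterior means (averages over saved iterations) or, for t, posterior modes (the most frequent label per observation). The following base-R sketch illustrates both kinds of summary on simulated draws. The objects W_draws and t_draws and the helper post_mode are hypothetical stand-ins for a saved chain; this is not the internal code of summarizeChain, and the storage layout of a real COMIX object may differ.

```r
# Illustrative only: simulated draws stand in for a saved MCMC chain.
set.seed(42)
nsave <- 50   # hypothetical number of saved iterations
J <- 2        # number of samples
K <- 3        # number of clusters
n <- 6        # number of observations

# Hypothetical weight draws: a J x K x nsave array whose rows sum to one.
W_draws <- array(rgamma(J * K * nsave, shape = 2), dim = c(J, K, nsave))
for (s in seq_len(nsave)) {
  W_draws[, , s] <- W_draws[, , s] / rowSums(W_draws[, , s])
}

# Posterior mean: average over the iteration dimension, giving a J x K matrix.
W_hat <- apply(W_draws, c(1, 2), mean)

# Hypothetical label draws: an nsave x n matrix of cluster labels.
t_draws <- matrix(sample(seq_len(K), nsave * n, replace = TRUE), nrow = nsave)

# Posterior mode: the most frequently drawn label for each observation.
post_mode <- function(z) as.integer(names(which.max(table(z))))
t_hat <- apply(t_draws, 2, post_mode)
```

Because every draw of the weights has rows that sum to one, the averaged matrix W_hat does too, which is a quick sanity check on a summary of this kind.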

Arguments

res

An object of class COMIX.

Examples

library(COMIX)
# Number of observations for each sample (row) and cluster (column):
njk <- 
  matrix(
    c(
      150, 300,
      250, 200
    ),
    nrow = 2,
    byrow = TRUE
  )

# Dimension of data:
p <- 3

# Scale and skew parameters for first cluster:
Sigma1 <- matrix(0.5, nrow = p, ncol = p) + diag(0.5, nrow = p)
alpha1 <- rep(0, p)
alpha1[1] <- -5
# location parameter for first cluster in first sample:
xi11 <- rep(0, p)
# location parameter for first cluster in second sample (aligned with first):
xi21 <- rep(0, p)

# Scale and skew parameters for second cluster:
Sigma2 <- matrix(-1/3, nrow = p, ncol = p) + diag(1 + 1/3, nrow = p)
alpha2 <- rep(0, p)
alpha2[2] <- 5
# location parameter for second cluster in first sample:
xi12 <- rep(3, p)
# location parameter for second cluster in second sample (misaligned with first):
xi22 <- rep(4, p)

# Sample data:
set.seed(1)
Y <- 
  rbind(
    sn::rmsn(njk[1, 1], xi = xi11, Omega = Sigma1, alpha = alpha1),
    sn::rmsn(njk[1, 2], xi = xi12, Omega = Sigma2, alpha = alpha2),
    sn::rmsn(njk[2, 1], xi = xi21, Omega = Sigma1, alpha = alpha1),
    sn::rmsn(njk[2, 2], xi = xi22, Omega = Sigma2, alpha = alpha2)
  )

C <- c(rep(1, rowSums(njk)[1]), rep(2, rowSums(njk)[2]))

prior <- list(zeta = 1, K = 10)
pmc <- list(naprt = 5, nburn = 200, nsave = 200) # Reasonable usage
pmc <- list(naprt = 5, nburn = 2, nsave = 5) # Minimal usage for documentation (overrides the line above)
# Fit the model:
res <- comix(Y, C, pmc = pmc, prior = prior)

# Relabel to resolve potential label switching issues:
res_relab <- relabelChain(res)

# Generate calibrated data:
cal <- calibrateNoDist(res_relab)

# Compare raw and calibrated data: (see plot in vignette)
# par(mfrow=c(1, 2))
# plot(Y, col = C, xlim = range(Y[,1]), ylim = range(Y[,2]) )

# Get posterior estimates for the model parameters:
res_summary <- summarizeChain(res_relab)
# Check, for instance, the cluster assignment labels:
table(res_summary$t)
# Compare with the true number of observations per cluster:
colSums(njk)

# Or examine the skewness parameter for the non-trivial clusters:
res_summary$alpha[ , unique(res_summary$t)]
# And compare those to
cbind(alpha1, alpha2)

# (see vignette for a more detailed example)
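The returned list distinguishes the location parameter (xi) from the mean parameter (meanvec). For a multivariate skew normal in Azzalini's parameterization, the one used by sn::rmsn in the example, the two differ by a shift that depends on the scale matrix and the skewness vector. The sketch below computes that mean for the first simulated cluster using the standard formula. The helper sn_mean is hypothetical, and whether summarizeChain applies this formula draw by draw is an assumption that this page does not confirm.

```r
# Mean of a multivariate skew normal with location xi, scale matrix Omega,
# and skewness (slant) vector alpha, in Azzalini's parameterization.
sn_mean <- function(xi, Omega, alpha) {
  omega <- sqrt(diag(Omega))      # marginal scale factors
  Omega_bar <- cov2cor(Omega)     # correlation matrix implied by Omega
  delta <- as.vector(Omega_bar %*% alpha) /
    sqrt(1 + as.numeric(t(alpha) %*% Omega_bar %*% alpha))
  xi + omega * delta * sqrt(2 / pi)
}

# True parameters of the first cluster in the first sample (from the example).
p <- 3
Sigma1 <- matrix(0.5, nrow = p, ncol = p) + diag(0.5, nrow = p)
alpha1 <- c(-5, 0, 0)
xi11 <- rep(0, p)

# The mean is shifted away from the location in the direction of the skew.
m1 <- sn_mean(xi11, Sigma1, alpha1)
m1
```

With alpha set to zero the shift vanishes and the mean equals the location, so for a fit with a long enough chain, comparing res_summary$meanvec against res_summary$xi gives a rough sense of how much skewness each cluster carries.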
