
mixAK (version 4.2)

NMixMCMC: MCMC estimation of (multivariate) normal mixtures with possibly censored data.

Description

This function runs MCMC for a model in which the unknown density is specified as a normal mixture with either a fixed or an unknown number of components. With a prespecified number of components, MCMC is implemented through Gibbs sampling (see Diebolt and Robert, 1994) and the dimension of the data can be arbitrary. With an unknown number of components, currently only the univariate case is implemented, using reversible jump MCMC (Richardson and Green, 1997). Further, the data are allowed to be censored, in which case an additional Gibbs step is used within the MCMC algorithm.

Usage

NMixMCMC(y0, y1, censor, x_w, scale, prior,
         init, init2, RJMCMC,
         nMCMC = c(burn = 10, keep = 10, thin = 1, info = 10),
         PED, keep.chains = TRUE, onlyInit = FALSE, dens.zero = 1e-300,
         parallel = FALSE, cltype)

## S3 method for class 'NMixMCMC': print(x, dic, ...)

## S3 method for class 'NMixMCMClist': print(x, ped, dic, ...)

Arguments

y0
numeric vector of length $n$ or $n\times p$ matrix with observed data. It contains exactly observed, right-censored, left-censored data and lower limits for interval-censored data.
y1
numeric vector of length $n$ or $n\times p$ matrix with upper limits for interval-censored data. Elements corresponding to exactly observed, right-censored or left-censored data are ignored and can be filled arbitrarily (e.g., by NA).
censor
numeric vector of length $n$ or $n\times p$ matrix with censoring indicators. The following values indicate:

0: right-censored observation,
1: exactly observed value,
2: left-censored observation,
3: interval-censored observation.

If it is not supplied then it is assumed that all values are observed exactly. A sketch of how y0, y1 and censor fit together is given after this argument list.

x_w
optional vector providing a categorical covariate that may influence the mixture weights. Internally, it is converted into a factor.

Added in version 4.0 (03/2015).

scale
a list specifying how to scale the data before running MCMC. It should have two components: shift, giving the shift vector $\boldsymbol{m}$, and scale, giving the diagonal of the scale matrix $\boldsymbol{S}$ (see Details and the sketch after this argument list).

If there is no censoring and the argument scale is missing, then the data are scaled to have zero mean and unit variance in each margin.

prior
a list with the parameters of the prior distribution. For some of its components, the program can assign default values, so the user does not have to specify them if he/she wishes to use the defaults. Among others, the list contains priorK (type of prior for the number of mixture components) and Kmax (maximal number of mixture components); see the example below.
init
a list with the initial values for the MCMC. All initial values can be determined by the program if they are not specified.
init2
a list with the initial values for the second chain needed to estimate the penalized expected deviance of Plummer (2008). The list init2 has the same structure as the list init. All initial values in init2 can be determined by the program if they are not specified.
RJMCMC
a list with the parameters needed to run reversible jump MCMC for mixtures with a varying number of components. It does not have to be specified if the number of components is fixed. Most of the parameters can be determined by the program if they are not specified.
nMCMC
numeric vector of length 4 giving the parameters of the MCMC simulation. Its components may be named (the ordering is then unimportant) as: burn (length of the burn-in), keep (number of scans kept after the burn-in), thin (thinning interval), info (interval at which progress information is printed). In total, $(M_{burn} + M_{keep}) \cdot M_{thin}$ MCMC scans are performed (see also the sketch after this argument list).
PED
a logical value which indicates whether the penalized expected deviance (see Plummer, 2008 for more details) is to be computed (which requires two parallel chains). If not specified, PED is set to TRUE for models with a fixed number of mixture components.
keep.chains
logical. If FALSE, only summary statistics are returned in the resulting object. This might be useful in the model searching step to save some memory.
onlyInit
logical. If TRUE then the function only determines parameters of the prior distribution, initial values, values of scale and parameters for the reversible jump MCMC.
dens.zero
a small value used instead of zero when computing deviance related quantities.
x
an object of class NMixMCMC or NMixMCMClist to be printed.
dic
logical which indicates whether DIC should be printed. By default, DIC is printed only for models with a fixed number of mixture components.
ped
logical which indicates whether PED should be printed. By default, PED is printed only for models with a fixed number of mixture components.
parallel
a logical value which indicates whether parallel computation (based on the parallel package) should be used when running the two chains needed for the PED calculation.
cltype
optional argument applicable if parallel is TRUE. If cltype is given, it is passed as the type argument into the call to makeCluster.
...
additional arguments passed to the default print method.
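
As an illustration of how several of the above arguments fit together, the following sketch (not taken from the package documentation; data and settings are purely hypothetical) sets up univariate censored data using the 0/1/2/3 coding of censor, a hand-made scale list and an nMCMC vector:

## Hypothetical univariate data
## censor: 1 = exactly observed, 0 = right-censored, 3 = interval-censored
y0     <- c(2.3, 1.7, 0.9, 3.1, 2.8)   # observed values / lower limits
y1     <- c(NA,  NA,  1.4, NA,  NA )   # upper limits (used only when censor == 3)
censor <- c(1,   0,   3,   1,   1  )   # censoring indicators

## Scaling by the sample mean and standard deviation of y0 (for illustration only;
## assumes the components of scale are named shift and scale, see above)
Scale <- list(shift = mean(y0), scale = sd(y0))

## Length of the MCMC: (burn + keep) * thin = (500 + 1000) * 2 = 3000 scans in total
nMCMC <- c(burn = 500, keep = 1000, thin = 2, info = 500)

## A possible call (commented out; prior hyperparameters are left at their defaults):
## fit0 <- NMixMCMC(y0 = y0, y1 = y1, censor = censor, scale = Scale,
##                  prior = list(priorK = "fixed", Kmax = 2), nMCMC = nMCMC)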

Value

  • An object of class NMixMCMC or class NMixMCMClist. Object of class NMixMCMC is returned if PED is FALSE. Object of class NMixMCMClist is returned if PED is TRUE.
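
For instance, when PED is computed (the default for a fixed number of mixture components), the value is of class NMixMCMClist and the individual chains are themselves NMixMCMC objects; the first chain is extracted as fit[[1]] in the examples below (the second chain is assumed, analogously, to be fit[[2]]). A minimal sketch under this assumption:

## fit <- NMixMCMC(y0 = y, prior = Prior, nMCMC = nMCMC)   # PED computed => NMixMCMClist
## print(fit)              # basic posterior summary
## chain1 <- fit[[1]]      # first chain (class NMixMCMC), used for further post-processing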

Details

See the accompanying paper (Komárek, 2009). In the rest of the help file, the same notation is used as in the paper, namely, $n$ denotes the number of observations, $p$ is the dimension of the data, $K$ is the number of mixture components, $w_1,\dots,w_K$ are mixture weights, $\boldsymbol{\mu}_1,\dots,\boldsymbol{\mu}_K$ are mixture means, $\boldsymbol{\Sigma}_1,\dots,\boldsymbol{\Sigma}_K$ are mixture variance-covariance matrices, and $\boldsymbol{Q}_1,\dots,\boldsymbol{Q}_K$ are their inverses.

For the data $\boldsymbol{y}_1,\dots,\boldsymbol{y}_n$ the following density $g_y(\boldsymbol{y})$ is assumed: $$g_y(\boldsymbol{y}) = |\boldsymbol{S}|^{-1} \sum_{j=1}^K w_j \varphi\bigl(\boldsymbol{S}^{-1}(\boldsymbol{y} - \boldsymbol{m})\,\big|\,\boldsymbol{\mu}_j,\,\boldsymbol{\Sigma}_j\bigr),$$ where $\varphi(\cdot\,|\,\boldsymbol{\mu},\,\boldsymbol{\Sigma})$ denotes the density of the (multivariate) normal distribution with mean $\boldsymbol{\mu}$ and variance-covariance matrix $\boldsymbol{\Sigma}$. Finally, $\boldsymbol{S}$ is a pre-specified diagonal scale matrix and $\boldsymbol{m}$ is a pre-specified shift vector. Sometimes, a (considerable) improvement of the MCMC algorithm is achieved by setting $\boldsymbol{m}$ to the sample means of the components of $\boldsymbol{y}$ and the diagonal of $\boldsymbol{S}$ to the sample standard deviations of $\boldsymbol{y}$.
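
To make the role of $\boldsymbol{m}$ and $\boldsymbol{S}$ concrete, the following univariate sketch (base R only, with purely hypothetical mixture parameters) evaluates $g_y(y)$ for a two-component mixture, with the shift set to the sample mean and the scale to the sample standard deviation of the data:

set.seed(1)
y <- rnorm(100, mean = 5, sd = 2)      # some univariate data
m <- mean(y)                           # shift m
S <- sd(y)                             # scale (diagonal of S in the univariate case)

w     <- c(0.4, 0.6)                   # mixture weights w_1, w_2
mu    <- c(-1, 1)                      # mixture means mu_1, mu_2 (on the shifted/scaled scale)
sigma <- c(0.5, 0.8)                   # mixture standard deviations sqrt(Sigma_1), sqrt(Sigma_2)

## g_y(y) = S^{-1} * sum_j w_j * phi((y - m) / S | mu_j, Sigma_j)
gy <- function(yy) {
  z <- (yy - m) / S
  (1 / S) * (w[1] * dnorm(z, mu[1], sigma[1]) + w[2] * dnorm(z, mu[2], sigma[2]))
}
curve(gy(x), from = m - 4 * S, to = m + 4 * S, xlab = "y", ylab = "g(y)")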

References

Cappé, O., Robert, C. P., and Rydén, T. (2003). Reversible jump, birth-and-death and more general continuous time Markov chain Monte Carlo samplers. Journal of the Royal Statistical Society, Series B, 65(3), 679--700.

Celeux, G., Forbes, F., Robert, C. P., and Titterington, D. M. (2006). Deviance information criteria for missing data models. Bayesian Analysis, 1(4), 651--674.

Diebolt, J. and Robert, C. P. (1994). Estimation of finite mixture distributions through Bayesian sampling. Journal of the Royal Statistical Society, Series B, 56(2), 363--375.

Jasra, A., Holmes, C. C., and Stephens, D. A. (2005). Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modelling. Statistical Science, 20(1), 50--67.

Komárek, A. (2009). A new R package for Bayesian estimation of multivariate normal mixtures allowing for selection of the number of components and interval-censored data. Computational Statistics and Data Analysis, 53(12), 3932--3947.

Plummer, M. (2008). Penalized loss functions for Bayesian model comparison. Biostatistics, 9(3), 523--539.

Richardson, S. and Green, P. J. (1997). On Bayesian analysis of mixtures with unknown number of components (with Discussion). Journal of the Royal Statistical Society, Series B, 59(4), 731--792.

Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and van der Linde, A. (2002). Bayesian measures of model complexity and fit (with Discussion). Journal of the Royal Statistical Society, Series B, 64(4), 583--639.

See Also

NMixPredDensMarg, NMixPredDensJoint2.

Examples

## See also additional material available in 
## YOUR_R_DIR/library/mixAK/doc/
## or YOUR_R_DIR/site-library/mixAK/doc/
## - files Galaxy.R, Faithful.R, Tandmob.R and
## http://www.karlin.mff.cuni.cz/~komarek/software/mixAK/Galaxy.pdf
## http://www.karlin.mff.cuni.cz/~komarek/software/mixAK/Faithful.pdf
## http://www.karlin.mff.cuni.cz/~komarek/software/mixAK/Tandmob.pdf
##

## ==============================================
## Simple analysis of Anderson's iris data
## ==============================================
library("colorspace")

data(iris, package="datasets")
summary(iris)
VARS <- names(iris)[1:4]
#COLS <- rainbow_hcl(3, start = 60, end = 240)
COLS <- c("red", "darkblue", "darkgreen")
names(COLS) <- levels(iris[, "Species"])

### Prior distribution and the length of MCMC
Prior <- list(priorK = "fixed", Kmax = 3)
nMCMC <- c(burn=5000, keep=10000, thin=5, info=1000)

### Run MCMC
set.seed(20091230)
fit <- NMixMCMC(y0 = iris[, VARS], prior = Prior, nMCMC = nMCMC)

### Basic posterior summary
print(fit)

### Univariate marginal posterior predictive densities
### based on chain #1
pdens1 <- NMixPredDensMarg(fit[[1]], lgrid=150)
plot(pdens1)
plot(pdens1, main=VARS, xlab=VARS)

### Bivariate (for each pair of margins) predictive densities
### based on chain #1
pdens2a <- NMixPredDensJoint2(fit[[1]])
plot(pdens2a)

plot(pdens2a, xylab=VARS)
plot(pdens2a, xylab=VARS, contour=TRUE)

### Determine the grid to compute bivariate densities
grid <- list(Sepal.Length=seq(3.5, 8.5, length=75),
             Sepal.Width=seq(1.8, 4.5, length=75),
             Petal.Length=seq(0, 7, length=75),
             Petal.Width=seq(-0.2, 3, length=75))
pdens2b <- NMixPredDensJoint2(fit[[1]], grid=grid)
plot(pdens2b, xylab=VARS)

### Plot with contours
ICOL <- rev(heat_hcl(20, c=c(80, 30), l=c(30, 90), power=c(1/5, 2)))
oldPar <- par(mfrow=c(2, 3), bty="n")
for (i in 1:3){
  for (j in (i+1):4){
    NAME <- paste(i, "-", j, sep="")
    MAIN <- paste(VARS[i], "x", VARS[j])
    image(pdens2b$x[[i]], pdens2b$x[[j]], pdens2b$dens[[NAME]], col=ICOL,
          xlab=VARS[i], ylab=VARS[j], main=MAIN)
    contour(pdens2b$x[[i]], pdens2b$x[[j]], pdens2b$dens[[NAME]], add=TRUE, col="brown4")
  }  
}  

### Plot with data
for (i in 1:3){
  for (j in (i+1):4){
    NAME <- paste(i, "-", j, sep="")
    MAIN <- paste(VARS[i], "x", VARS[j])
    image(pdens2b$x[[i]], pdens2b$x[[j]], pdens2b$dens[[NAME]], col=ICOL,
          xlab=VARS[i], ylab=VARS[j], main=MAIN)
    for (spec in levels(iris[, "Species"])){
      Data <- subset(iris, Species==spec)
      points(Data[,i], Data[,j], pch=16, col=COLS[spec])
    }  
  }  
}  

### Set the graphical parameters back to their original values
par(oldPar)

### Clustering based on posterior summary statistics of component allocations
### or on the posterior distribution of component allocations
### (these are two equivalent estimators of probabilities of belonging
###  to each mixture components for each observation)
p1 <- fit[[1]]$poster.comp.prob_u
p2 <- fit[[1]]$poster.comp.prob_b
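
### Additional illustration (not part of the original example):
### derive a hard classification by assigning each observation to its most
### probable mixture component and cross-tabulate it with the true species
maxComp <- apply(p1, 1, which.max)
table(maxComp, iris[, "Species"])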

### Clustering based on posterior summary statistics of mixture weight, means, variances
p3 <- NMixPlugDA(fit[[1]], iris[, VARS])
p3 <- p3[, paste("prob", 1:3, sep="")]

  ### Observations from "setosa" species (all would be allocated in component 1)
apply(p1[1:50,], 2, quantile, prob=seq(0, 1, by=0.1))
apply(p2[1:50,], 2, quantile, prob=seq(0, 1, by=0.1))
apply(p3[1:50,], 2, quantile, prob=seq(0, 1, by=0.1))

  ### Observations from "versicolor" species (almost all would be allocated in component 2)
apply(p1[51:100,], 2, quantile, prob=seq(0, 1, by=0.1))
apply(p2[51:100,], 2, quantile, prob=seq(0, 1, by=0.1))
apply(p3[51:100,], 2, quantile, prob=seq(0, 1, by=0.1))

  ### Observations from "virginica" species (all would be allocated in component 3)
apply(p1[101:150,], 2, quantile, prob=seq(0, 1, by=0.1))
apply(p2[101:150,], 2, quantile, prob=seq(0, 1, by=0.1))
apply(p3[101:150,], 2, quantile, prob=seq(0, 1, by=0.1))
