MixNRMI2cens: Normalized Random Measures Mixture of Type II for censored data

Description

Bayesian nonparametric density estimation for censored data, based on mixtures driven by normalized random measures for both locations and scales.

Usage

MixNRMI2cens(xleft, xright, probs = c(0.025, 0.5, 0.975), Alpha = 1,
  Kappa = 0, Gama = 0.4, distr.k = 1, distr.py0 = 1,
  distr.pz0 = 2, mu.pz0 = 3, sigma.pz0 = sqrt(10), delta = 4,
  kappa = 2, Delta = 2, Meps = 0.01, Nx = 150, Nit = 1500,
  Pbi = 0.1, epsilon = NULL, printtime = TRUE, extras = TRUE)

Arguments

xleft

Numeric vector. Lower limit of interval censoring. For exact data, the same as xright.

xright

Numeric vector. Upper limit of interval censoring. For exact data, the same as xleft.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See details.

Kappa

Numeric positive constant. See details.

Gama

Numeric constant. $0 \leq Gama \leq 1$. See details.

distr.k

Integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

distr.py0

Integer number identifying the centering measure for locations: 1 = Normal; 2 = Gamma; 3 = Beta.

distr.pz0

Integer number identifying the centering measure for scales: 2 = Gamma; 5 = Lognormal; 6 = Half Cauchy; 7 = Half Normal; 8 = Half Student-t; 9 = Uniform; 10 = Truncated Normal.

mu.pz0

Numeric constant. Prior mean of the centering measure for scales.

sigma.pz0

Numeric constant. Prior standard deviation of the centering measure for scales.

delta

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the scales.

kappa

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the location parameters.

Delta

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U.

Meps

Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply a larger number of jumps.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means, sigmas, weights and Js.
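
As a minimal sketch of how the xleft and xright arguments described above fit together, the following base R code builds both vectors from a mix of exact and interval-censored observations. The data values and the helper names (exact, lower, upper, n_exact, n_kept) are hypothetical and only serve the illustration; the last lines work out how many iterations remain after burn-in under the default Nit and Pbi.

```r
# Hypothetical observations: three exact values and two interval-censored ones.
exact <- c(4.2, 5.1, 6.3)
lower <- c(3.0, 5.5)   # lower limits of the censored observations
upper <- c(4.0, 7.0)   # upper limits of the censored observations

# Exact observations repeat the same value in both vectors.
xleft  <- c(exact, lower)
xright <- c(exact, upper)

# Basic sanity checks before fitting.
stopifnot(length(xleft) == length(xright), all(xleft <= xright))

# Number of exact (uncensored) observations.
n_exact <- sum(xleft == xright)

# Iterations kept after burn-in under the defaults Nit = 1500 and Pbi = 0.1.
Nit <- 1500
Pbi <- 0.1
n_kept <- Nit * (1 - Pbi)
```

Vectors built this way would then be passed as the first two arguments, as in MixNRMI2cens(xleft, xright).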

Value

The function returns a list with the following components:

xx

Numeric vector. Evaluation grid.

qx

Numeric array. Matrix of dimension $\texttt{Nx} \times (\texttt{length(probs)} + 1)$ with the posterior mean and the quantiles requested in probs.

cpo

Numeric vector of length(xleft) with the conditional predictive ordinates.

R

Numeric vector of length(Nit*(1-Pbi)) with the number of mixture components (clusters).

U

Numeric vector of length(Nit*(1-Pbi)) with the values of the latent variable U.

Allocs

List of length(Nit*(1-Pbi)) with the clustering allocations.

means

List of length(Nit*(1-Pbi)) with the cluster means (locations). Only if extras = TRUE.

sigmas

Numeric vector of length(Nit*(1-Pbi)) with the cluster standard deviations. Only if extras = TRUE.

weights

List of length(Nit*(1-Pbi)) with the mixture weights. Only if extras = TRUE.

Js

List of length(Nit*(1-Pbi)) with the unnormalized weights (jump sizes). Only if extras = TRUE.

Integer constant. Number of jumps of the continuous component of the unnormalized process.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

procTime

Numeric vector with execution time provided by proc.time function.

distr.k

Integer corresponding to the kernel chosen for the mixture.

data

Data used for the fit.

NRMI_params

A named list with the parameters of the NRMI process.
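
The returned components can be summarised with a few lines of base R. The sketch below is only illustrative: it uses a small hypothetical list in place of a real fit (since fitting is slow), and it assumes the cpo and R components described above. The sum of log conditional predictive ordinates is the usual log pseudo-marginal likelihood (LPML) summary; the names lpml, post_R and mode_R are not part of the package.

```r
# Hypothetical stand-in for a fitted object; real values come from MixNRMI2cens().
out <- list(
  cpo = c(0.21, 0.35, 0.18, 0.40),
  R   = c(2, 3, 3, 2, 3, 4, 3, 3)
)

# Log pseudo-marginal likelihood: sum of the log conditional predictive ordinates.
lpml <- sum(log(out$cpo))

# Estimated posterior distribution of the number of clusters.
post_R <- table(out$R) / length(out$R)

# Most frequently visited number of clusters.
mode_R <- as.integer(names(post_R)[which.max(post_R)])
```

Higher LPML values generally indicate better predictive fit, so the quantity can be used to compare fits obtained under different kernels or prior settings.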

Warning

The function is computationally intensive. Be patient.

Details

This generic function fits a normalized random measure with independent increments (NRMI) mixture model for density estimation (James et al. 2009). Specifically, the model assumes a normalized generalized gamma (NGG) prior for both the locations (means) and the standard deviations of the mixture kernel, leading to a fully nonparametric mixture model.

The details of the model are:

$$X_i|Y_i,Z_i \sim k(\cdot|Y_i,Z_i)$$

$$(Y_i,Z_i)|P \sim P, \quad i=1,\dots,n$$

$$P \sim \textrm{NGG}(\texttt{Alpha}, \texttt{Kappa}, \texttt{Gama}; P_0)$$

where the $X_i$'s are the observed data, the $(Y_i,Z_i)$'s are bivariate latent (location and scale) vectors, $k$ is a parametric kernel parameterized in terms of mean and standard deviation, and (Alpha, Kappa, Gama; P_0) are the parameters of the NGG prior. The centering measure P_0 is bivariate with independent components, that is, $P_0(Y,Z) = P_0(Y) \, P_0(Z)$. The parameters of P_0(Y) are assigned vague hyper prior distributions and (mu.pz0, sigma.pz0) are the hyper-parameters of P_0(Z).

In particular, NGG(Alpha, 1, 0; P_0) defines a Dirichlet process; NGG(1, Kappa, 1/2; P_0) defines a normalized inverse Gaussian process; and NGG(1, 0, Gama; P_0) defines a normalized stable process.

The evaluation grid ranges from min(x) - epsilon to max(x) + epsilon. By default epsilon = sd(x)/4.
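
The evaluation grid rule stated above can be reproduced in base R. This is a sketch under assumptions: the data vector x is hypothetical, and equal spacing of the Nx grid points is assumed here rather than taken from the package source.

```r
# Hypothetical data vector.
x <- c(4.2, 5.1, 6.3, 3.5, 6.2)

# Default number of grid points and default grid extension.
Nx <- 150
epsilon <- sd(x) / 4

# Grid from min(x) - epsilon to max(x) + epsilon (equal spacing is an assumption).
xx <- seq(min(x) - epsilon, max(x) + epsilon, length.out = Nx)
```

Passing a larger epsilon widens the range over which the density estimate is evaluated, which can help when the tails of the estimate are of interest.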

References

1.- Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.

2.- James, L.F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measure with independent increments. Scand. J. Statist 36, 76-97.

3.- Kon Kam King, G., Arbel, J. and Prünster, I. (2016). Species Sensitivity Distribution revisited: a Bayesian nonparametric approach. In preparation.

Examples

# NOT RUN {
### Example 1
# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixNRMI2cens(x, x)
# Plotting density estimate + 95% credible interval
attach(out)
m <- ncol(qx)
ymax <- max(qx[, m])
par(mfrow = c(1, 1))
hist(x, probability = TRUE, breaks = 20, col = grey(.9), ylim = c(0, ymax))
lines(xx, qx[, 1], lwd = 2)
lines(xx, qx[, 2], lty = 3, col = 4)
lines(xx, qx[, m], lty = 3, col = 4)
detach()
# }
# NOT RUN {
### Example 2
# Data
data(salinity)
# Fitting the model under special specifications
out <- MixNRMI2cens(
  xleft = salinity$left, xright = salinity$right, Nit = 5000, distr.pz0 = 10,
  mu.pz0 = 1, sigma.pz0 = 2
)
# Plotting density estimate + 95% credible interval
attach(out)
m <- ncol(qx)
ymax <- max(qx[, m])
par(mfrow = c(1, 1))
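# Posterior mean is column 1 of the qx matrix; ylim leaves room for the upper band
plot(xx, qx[, 1], lwd = 2, type = "l", ylim = c(0, ymax), ylab = "Density", xlab = "Data")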
lines(xx, qx[, 2], lty = 3, col = 4)
lines(xx, qx[, m], lty = 3, col = 4)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
detach()
# }
