MixNRMI1cens: Normalized Random Measures Mixture of Type I for censored data

Description

Bayesian nonparametric density estimation for censored data, based on mixture models driven by normalized random measures placed on the kernel locations.

Usage

MixNRMI1cens(
  xleft,
  xright,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.p0 = "normal",
  asigma = 0.5,
  bsigma = 0.5,
  delta_S = 3,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE
)

Value

The function returns a list with the following components:

xx: Numeric vector. Evaluation grid.
qx: Numeric array. Matrix of dimension $\texttt{Nx} \times (\texttt{length(probs)} + 1)$ with the posterior mean and the desired quantiles input in probs.
cpo: Numeric vector of length(x) with conditional predictive ordinates.
R: Numeric vector of length(Nit*(1-Pbi)) with the number of mixture components (clusters).
S: Numeric vector of length(Nit*(1-Pbi)) with the values of common standard deviation sigma.
U: Numeric vector of length(Nit*(1-Pbi)) with the values of the latent variable U.
Allocs: List of length(Nit*(1-Pbi)) with the clustering allocations.
means: List of length(Nit*(1-Pbi)) with the cluster means (locations). Only if extras = TRUE.
weights: List of length(Nit*(1-Pbi)) with the mixture weights. Only if extras = TRUE.
Js: List of length(Nit*(1-Pbi)) with the unnormalized weights (jump sizes). Only if extras = TRUE.
Nm: Integer constant. Number of jumps of the continuous component of the unnormalized process.
Nx: Integer constant. Number of grid points for the evaluation of the density estimate.
Nit: Integer constant. Number of MCMC iterations.
Pbi: Numeric constant. Burn-in period proportion of Nit.
procTime: Numeric vector with execution time provided by proc.time function.
distr.k: Integer corresponding to the kernel chosen for the mixture.
data: Data used for the fit.
NRMI_params: A named list with the parameters of the NRMI process.
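
As a quick check of the sizes described above, the following sketch computes the number of retained MCMC draws and the dimension of qx from the default settings in the Usage section. It only does the arithmetic implied by the component descriptions and does not call MixNRMI1cens; the names n_kept and qx_dim are introduced here for illustration.

```r
# Default settings taken from the Usage section
Nit   <- 1500
Pbi   <- 0.1
Nx    <- 150
probs <- c(0.025, 0.5, 0.975)

# Number of retained draws, as described for R, S, U and Allocs
n_kept <- Nit * (1 - Pbi)

# Dimension of qx: one column for the posterior mean plus one per quantile
qx_dim <- c(Nx, length(probs) + 1)

print(n_kept)
print(qx_dim)
```

With the defaults, the per-iteration components would therefore hold 1350 entries and qx would have 150 rows and 4 columns.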

Arguments

xleft: Numeric vector. Lower limit of interval censoring. For exact data, the same as xright.
xright: Numeric vector. Upper limit of interval censoring. For exact data, the same as xleft.
probs: Numeric vector. Desired quantiles of the density estimates.
Alpha: Numeric constant. Total mass of the centering measure. See details.
Kappa: Numeric positive constant. See details.
Gama: Numeric constant. $0\leq \texttt{Gama} \leq 1$. See details.
distr.k: The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.
distr.p0: The distribution name for the centering measure. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure: 1 = Normal; 2 = Gamma; 3 = Beta.
asigma: Numeric positive constant. Shape parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.
bsigma: Numeric positive constant. Rate parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.
delta_S: Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling sigma.
delta_U: Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U.
Meps: Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply a larger number of jumps.
Nx: Integer constant. Number of grid points for the evaluation of the density estimate.
Nit: Integer constant. Number of MCMC iterations.
Pbi: Numeric constant. Burn-in period proportion of Nit.
epsilon: Numeric constant. Extension to the evaluation grid range. See details.
printtime: Logical. If TRUE, prints out the execution time.
extras: Logical. If TRUE, gives additional objects: means, weights and Js.
adaptive: Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).
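
The xleft and xright arguments encode exact and interval-censored observations in a single pair of vectors. A minimal sketch of how such vectors might be assembled is given below; the data values and the helper names (exact_values, interval_lo, interval_hi, is_exact) are made up for illustration and are not part of the package.

```r
# Hypothetical observations (made-up values, for illustration only)
exact_values <- c(1.2, 3.4, 2.8)  # observed exactly
interval_lo  <- c(0.5, 2.0)       # lower limits of interval-censored observations
interval_hi  <- c(1.5, 4.0)       # upper limits of interval-censored observations

# For exact data the two limits coincide, as stated for xleft and xright
xleft  <- c(exact_values, interval_lo)
xright <- c(exact_values, interval_hi)

# Basic consistency checks before fitting
stopifnot(length(xleft) == length(xright), all(xleft <= xright))

# Flag which observations are exact
is_exact <- xleft == xright
print(is_exact)
```

Vectors built this way could then be passed as MixNRMI1cens(xleft = xleft, xright = xright).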

Warning

The function is computationally intensive. Be patient.

Author

Barrios, E., Kon Kam King, G. and Nieto-Barajas, L.E.

Details

This generic function fits a normalized random measure with independent increments (NRMI) mixture model for density estimation (James et al. 2009) with censored data. Specifically, the model assumes a normalized generalized gamma (NGG) prior for the locations (means) of the mixture kernel and a parametric prior for the common smoothing parameter sigma, leading to a semiparametric mixture model.

This function coincides with MixNRMI1 when the lower (xleft) and upper (xright) censoring limits correspond to the same exact value.

The details of the model are:

$$X_i|Y_i,\sigma \sim k(\cdot |Y_i,\sigma)$$
$$Y_i|P \sim P,\quad i=1,\dots,n$$
$$P \sim \textrm{NGG}(\texttt{Alpha}, \texttt{Kappa}, \texttt{Gama}; P_0)$$
$$\sigma \sim \textrm{Gamma}(\texttt{asigma}, \texttt{bsigma})$$

where $X_i$'s are the observed data, $Y_i$'s are latent (location) variables, sigma is the smoothing parameter, k is a parametric kernel parameterized in terms of mean and standard deviation, (Alpha, Kappa, Gama; P_0) are the parameters of the NGG prior with P_0 being the centering measure whose parameters are assigned vague hyper prior distributions, and (asigma, bsigma) are the hyper-parameters of the gamma prior on the smoothing parameter sigma.

In particular: NGG(Alpha, 1, 0; P_0) defines a Dirichlet process; NGG(1, Kappa, 1/2; P_0) defines a normalized inverse Gaussian process; and NGG(1, 0, Gama; P_0) defines a normalized stable process.
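
To make the kernel line of the model concrete for censored data, the sketch below evaluates the contribution of each observation given a location Y and a common sigma, for a normal kernel. It rests on an assumption not spelled out in this page: an interval-censored observation is taken to contribute the kernel probability mass of its interval, which is the usual treatment of interval censoring. The function name kernel_contribution is hypothetical and is not part of the package.

```r
# Kernel contribution of each observation given a location Y and common sigma,
# for a normal kernel (distr.k = "normal").
# ASSUMPTION: a censored observation contributes the kernel mass of its
# interval; an exact observation contributes the kernel density.
kernel_contribution <- function(xleft, xright, Y, sigma) {
  exact <- xleft == xright
  out <- numeric(length(xleft))
  # Exact observations: density of the kernel at the observed value
  out[exact] <- dnorm(xleft[exact], mean = Y, sd = sigma)
  # Censored observations: probability that the kernel falls in the interval
  out[!exact] <- pnorm(xright[!exact], mean = Y, sd = sigma) -
    pnorm(xleft[!exact], mean = Y, sd = sigma)
  out
}

# One exact observation at 0 and one observation censored to [-1, 1]
contrib <- kernel_contribution(xleft = c(0, -1), xright = c(0, 1), Y = 0, sigma = 1)
print(contrib)
```

When xleft equals xright for every observation, only the density branch is used, which is consistent with the statement that the function coincides with MixNRMI1 for exact data.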

The evaluation grid ranges from min(x) - epsilon to max(x) + epsilon. By default epsilon=sd(x)/4.
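
The grid rule above can be sketched in a few lines of base R. This example assumes an equally spaced grid of Nx points and uses a made-up data vector x; how the function derives the range from xleft and xright when observations are censored is not stated here, so the sketch should be read as illustrative only.

```r
# Hypothetical data vector (made-up values, for illustration only)
x  <- c(4.1, 4.5, 5.0, 6.2, 6.8, 7.1)
Nx <- 150

# Default extension of the grid range
epsilon <- sd(x) / 4

# ASSUMPTION: Nx equally spaced points from min(x) - epsilon to max(x) + epsilon
xx <- seq(min(x) - epsilon, max(x) + epsilon, length.out = Nx)

print(range(xx))
print(length(xx))
```

Passing a larger epsilon widens the range over which the density estimate is evaluated without changing the number of grid points.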

References

1.- Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.

2.- James, L.F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measure with independent increments. Scand. J. Statist 36, 76-97.

3.- Kon Kam King, G., Arbel, J. and Prünster, I. (2016). Species Sensitivity Distribution revisited: a Bayesian nonparametric approach. In preparation.

Examples



### Example 1
if (FALSE) {
# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixNRMI1cens(x, x)
# Plotting density estimate + 95% credible interval
plot(out)
}

### Example 2
if (FALSE) {
# Data
data(salinity)
# Fitting the model with 5000 MCMC iterations (other settings at their defaults)
out <- MixNRMI1cens(xleft = salinity$left, xright = salinity$right, Nit = 5000)
# Plotting density estimate + 95% credible interval
plot(out)
# Plotting number of clusters, accessing R directly instead of attaching out
par(mfrow = c(2, 1))
plot(out$R, type = "l", main = "Trace of R")
hist(out$R, breaks = min(out$R - 0.5):max(out$R + 0.5), probability = TRUE)
}
