hzip: Fit a Hierarchical Zero-Inflated Poisson (HZIP) Model

Description

hzip() fits a longitudinal/clustered zero-inflated Poisson model with subject-level random effects by maximizing a marginal likelihood approximated by Gauss–Hermite quadrature. The model uses a two-part Formula, y ~ zero part | count part, where the zero-inflation probability and the count intensity (Poisson mean) are linked to (possibly different) sets of covariates. Initial values are obtained from pscl::zeroinfl(..., dist = "poisson", link = "cloglog").

Usage

hzip(
  formula,
  data,
  hessian = TRUE,
  method = "BFGS",
  Q = 15,
  lower = -Inf,
  upper = Inf,
  control = NULL,
  ...
)

Value

An object of class "HZIP", a list with elements:

call: The matched call.
formula: The model Formula.
coefficients_zero: Estimated coefficients for the zero-inflation part.
coefficients_count: Estimated coefficients for the count part.
scale_zero: Estimated scale (zero part).
scale_count: Estimated scale (count part).
loglik: Optimized objective value returned by optim. (Note: depending on lvero, this may be the negative log-likelihood.)
convergence: optim convergence code.
n: Number of observations or subjects (see Note).
m: Cluster sizes per subject (vector ordered by Ind).
ep: Approximate standard errors (square roots of the diagonal of the inverse Hessian).
iter: Number of optim iterations.
method: Optimization method.
optim: Raw optim output.
data: The input data.

Arguments

formula: A two-part Formula of the form y ~ w_zero + ... | x_count + ..., where the part of the right-hand side before the bar specifies covariates for the zero-inflation component and the part after the bar specifies covariates for the Poisson mean.
data: A data.frame containing all variables used in formula and a subject identifier named Ind (one row per observation).
hessian: Logical; if TRUE (default) the observed Hessian at the optimum is returned and used to compute standard-error estimates.
method: Character string passed to optim (default "BFGS").
Q: Integer; number of Gauss–Hermite nodes for quadrature (default 15). Larger values improve accuracy at higher computational cost.
lower: Lower bounds on the variables for the "L-BFGS-B" method, or the lower end of the search interval for method "Brent" (passed to optim; default -Inf).
upper: Upper bounds on the variables for the "L-BFGS-B" method, or the upper end of the search interval for method "Brent" (passed to optim; default Inf).
control: Optional list passed to optim's control= argument (e.g., list(maxit = 500)).
...: Further arguments passed to optim.
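
The optimizer-related arguments (method, lower, upper, hessian, control) are forwarded to optim. As a rough illustration of what that implies, and not a reproduction of hzip's internal code, the sketch below minimizes a negative Poisson log-likelihood on simulated data and derives standard errors from the inverse Hessian, in the way described for the ep element. The simulated data, the function negloglik, and the starting value are hypothetical.

```r
# Hypothetical toy problem: estimate theta = log(mu) for i.i.d. Poisson counts.
set.seed(1)
y <- rpois(200, lambda = 3)

# optim() minimizes, so the objective is the NEGATIVE log-likelihood.
negloglik <- function(theta) -sum(dpois(y, lambda = exp(theta), log = TRUE))

fit <- optim(
  par     = 0,
  fn      = negloglik,
  method  = "L-BFGS-B",          # box constraints are honored by this method
  lower   = -5,
  upper   = 5,
  hessian = TRUE,                # request the numerical Hessian at the optimum
  control = list(maxit = 500)
)

# Standard errors: square roots of the diagonal of the inverse Hessian.
se <- sqrt(diag(solve(fit$hessian)))

fit$convergence  # 0 indicates successful convergence
exp(fit$par)     # should be close to mean(y), the Poisson MLE
se
```

Because optim minimizes, an objective value stored from its output is a negative log-likelihood unless the sign is flipped afterwards, which is the ambiguity flagged for the loglik element.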

Details

Let $y_{ij}$ denote the count response for subject $i$ at occasion $j$. The HZIP model assumes $$P(y_{ij}=0 \mid u_i) = \pi_{ij}(u_i) + \{1-\pi_{ij}(u_i)\}\exp\{-\mu_{ij}(u_i)\},$$ $$P(y_{ij}=k \mid u_i) = \{1-\pi_{ij}(u_i)\}\frac{\mu_{ij}(u_i)^k e^{-\mu_{ij}(u_i)}}{k!},\quad k\ge 1,$$ with linear predictors for the count and zero parts (links typically log for the Poisson mean and cloglog for the zero-inflation). Subject-specific random effects $u_i$ induce within-subject dependence; the marginal likelihood is approximated by Gauss–Hermite quadrature with Q nodes.
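
To make the quadrature step concrete, here is a minimal base-R sketch of the marginal likelihood contribution of one subject. It is illustrative only and rests on assumptions not stated above: a single random intercept $u_i \sim N(0, \sigma^2)$ that enters the count part on the log scale, no random effect in the zero part, and nodes computed by the Golub-Welsch eigenvalue method. The function names gauss_hermite, dzip and marginal_lik are hypothetical and are not part of the package.

```r
# Gauss-Hermite nodes and weights for the weight function exp(-x^2),
# computed with the Golub-Welsch eigenvalue method (base R only).
gauss_hermite <- function(Q) {
  i <- seq_len(Q - 1)
  J <- matrix(0, Q, Q)
  J[cbind(i, i + 1)] <- sqrt(i / 2)   # symmetric tridiagonal Jacobi matrix
  J[cbind(i + 1, i)] <- sqrt(i / 2)
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = sqrt(pi) * e$vectors[1, ]^2)
}

# Conditional ZIP probability P(y | u) for zero-inflation probability p
# and Poisson mean mu, matching the two displayed formulas.
dzip <- function(y, p, mu) {
  ifelse(y == 0, p + (1 - p) * exp(-mu), (1 - p) * dpois(y, mu))
}

# Marginal likelihood of one subject's responses y (ASSUMED model, see lead-in):
# integrate prod_j P(y_ij | u) over u ~ N(0, sigma^2) with Q nodes.
marginal_lik <- function(y, eta_zero, eta_count, sigma, Q = 15) {
  gh <- gauss_hermite(Q)
  p_zero <- 1 - exp(-exp(eta_zero))          # inverse cloglog link
  cond <- vapply(gh$nodes, function(x) {
    u <- sqrt(2) * sigma * x                 # change of variables u = sqrt(2) * sigma * x
    prod(dzip(y, p_zero, exp(eta_count + u)))
  }, numeric(1))
  sum(gh$weights * cond) / sqrt(pi)
}

# Usage with made-up values for one subject observed on four occasions.
y_i <- c(0, 0, 2, 1)
marginal_lik(y_i, eta_zero = rep(-0.5, 4), eta_count = rep(0.3, 4), sigma = 0.8)
```

Raising Q refines the approximation at the cost of more evaluations of the conditional likelihood per subject, which is the accuracy/cost trade-off noted for the Q argument.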

References

Min, Y., & Agresti, A. (2005). Random effect models for repeated measures of zero-inflated count data. Statistical Modelling, 5(1), 1–19.

Jackman, S. (2020). pscl: Classes and Methods for R Developed in the Political Science Computational Laboratory. R package version 1.5.5.

Zeileis, A., & Croissant, Y. (2010). Extended model formulas in R: Multiple parts and multiple responses. Journal of Statistical Software, 34(1), 1–13. (Formula)

Examples

# \donttest{
fit.salamander <- hzip(y ~ mined | mined + spp, data = salamanders)
summary(fit.salamander)
# }
