Fits a zero-inflated negative binomial distribution by full maximum likelihood estimation.
zinegbinomial(zero = "size",
type.fitted = c("mean", "munb", "pobs0", "pstr0",
"onempstr0"),
mds.min = 1e-3, nsimEIM = 500, cutoff.prob = 0.999,
eps.trig = 1e-7, max.support = 4000, max.chunk.MB = 30,
lpstr0 = "logit", lmunb = "loge", lsize = "loge",
imethod = 1, ipstr0 = NULL, imunb = NULL,
iprobs.y = NULL, isize = NULL,
gprobs.y = (0:9)/10,
gsize.mux = exp(c(-30, -20, -15, -10, -6:3)))
zinegbinomialff(lmunb = "loge", lsize = "loge", lonempstr0 = "logit",
type.fitted = c("mean", "munb", "pobs0", "pstr0",
"onempstr0"), imunb = NULL, isize = NULL,
ionempstr0 = NULL, zero = c("size", "onempstr0"), imethod = 1,
iprobs.y = NULL, cutoff.prob = 0.999,
eps.trig = 1e-7, max.support = 4000, max.chunk.MB = 30,
gprobs.y = (0:9)/10, gsize.mux = exp((-12:6)/2),
mds.min = 1e-3, nsimEIM = 500)
Link functions for the parameters \(\phi\), the mean and \(k\); see negbinomial for details, and Links for more choices. For the zero-deflated model see below.
See CommonVGAMffArguments and fittedvlm for more information.
Optional initial values for \(\phi\), \(k\) and \(\mu\). By default, an initial value for each is computed internally. If a vector is supplied then recycling is used.
Corresponding arguments for the other parameterization. See details below.
An integer with value 1 or 2 or 3 which specifies the initialization method for the mean parameter. If failure to converge occurs, try another value. See CommonVGAMffArguments for more information.
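Specifies which linear/additive predictors are to be modelled as intercept-only. If numeric, the values can be such that their absolute values are either 1 or 2 or 3; parameter names such as "size" can also be given, as in the usage above. The default shown in the usage is "size" (the \(k\) parameter) for zinegbinomial(), and "size" and "onempstr0" for zinegbinomialff(), for each response. See CommonVGAMffArguments for more information.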
See CommonVGAMffArguments for information.
See negbinomial and/or posnegbinomial for details.
See negbinomial for details.
These arguments relate to grid searching in the initialization process. See negbinomial and/or posnegbinomial for details.
An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.
This model can be difficult to fit to data, and this family function is fragile. The model is especially difficult to fit reliably when the estimated \(k\) parameter is very large (so the model approaches a zero-inflated Poisson distribution) or much less than 1 (and gets more difficult as it approaches 0). Numerical problems can also occur when, e.g., the probability of a zero is actually less than, rather than more than, the nominal probability of zero (zero-deflation). Similarly, numerical problems can occur if there is little or no 0-inflation, or when the sample size is small. Half-stepping is not uncommon.
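The difficult regimes described above can be illustrated with base R alone. The following is a minimal sketch with hypothetical values (mu, obs0 and the chosen sizes are assumptions for illustration only); it does not use VGAM.

```r
# Illustrative values only; none of these numbers come from the package.
y  <- 0:60
mu <- 5

# Very large k: the negative binomial is numerically close to the Poisson,
# so a ZINB fit approaches a zero-inflated Poisson fit.
max(abs(dnbinom(y, size = 1e6, mu = mu) - dpois(y, lambda = mu)))

# k much less than 1: the NB component itself puts most of its mass at zero,
# which makes it hard to separate from the structural zeros.
dnbinom(0, size = 0.05, mu = mu)

# Zero-deflation: if the observed proportion of zeros (obs0, hypothetical) is
# below the NB probability of zero (p0), the implied phi is negative, which
# lies outside the range of the logit link.
p0   <- dnbinom(0, size = 2, mu = mu)
obs0 <- 0.05
phi_implied <- (obs0 - p0) / (1 - p0)
phi_implied < 0
```

The last expression follows from solving \(P(Y=0) = \phi + (1-\phi) p_0\) for \(\phi\).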
Successful convergence is sensitive to the initial values; therefore, if failure to converge occurs, try using combinations of the arguments stepsize (in vglm.control), imethod, imunb, ipstr0, isize, and/or zero if there are explanatory variables. Otherwise try fitting an ordinary negbinomial model or a zipoisson model. This VGAM family function can be computationally expensive and can run slowly; setting trace = TRUE is useful for monitoring convergence.
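The troubleshooting advice above might be applied along the following lines. This is a sketch under stated assumptions, not a recommended recipe: the data are simulated with base R, the tuning values (imethod = 2, isize = 2, stepsize = 0.5, maxit = 50) are arbitrary illustrations, and the fit is only attempted if VGAM is installed.

```r
# Simulated data: structural zeros with probability 0.3, otherwise NB counts.
# All numeric settings here are hypothetical.
set.seed(1)
n <- 500
zdata <- data.frame(x2 = runif(n))
structural0 <- rbinom(n, size = 1, prob = 0.3) == 1
zdata$y1 <- ifelse(structural0, 0,
                   rnbinom(n, size = 2, mu = exp(1 + zdata$x2)))

if (requireNamespace("VGAM", quietly = TRUE)) {
  library(VGAM)
  # Try another initialization method and starting size, damp the steps,
  # and print the iteration history; extra arguments go to vglm.control().
  fit <- tryCatch(
    vglm(y1 ~ x2, zinegbinomial(imethod = 2, isize = 2), data = zdata,
         stepsize = 0.5, maxit = 50, trace = TRUE),
    error = function(e) NULL)
  # Fallback mentioned in the text: an ordinary negative binomial model.
  if (is.null(fit))
    fit <- vglm(y1 ~ x2, negbinomial, data = zdata, trace = TRUE)
}
```

Whether any particular combination helps depends on the data; the trace output is the main guide.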
These functions are based on
$$P(Y=0) = \phi + (1-\phi) (k/(k+\mu))^k,$$
and for \(y=1,2,\ldots\),
$$P(Y=y) = (1-\phi) \, dnbinom(y, \mu, k).$$
The parameter \(\phi\) satisfies \(0 < \phi < 1\).
The mean of \(Y\) is \((1-\phi) \mu\)
(returned as the fitted values).
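As a quick numerical check of the two formulas and the stated mean, the probability function can be written directly in base R. The helper name dzinb_sketch and the parameter values are hypothetical; this is not the package's own implementation.

```r
# ZINB probability function, transcribed from the formulas above.
# dzinb_sketch is a hypothetical helper, not a VGAM function.
dzinb_sketch <- function(y, phi, mu, k) {
  nb <- dnbinom(y, size = k, mu = mu)          # parent NB probabilities
  ifelse(y == 0, phi + (1 - phi) * nb, (1 - phi) * nb)
}

phi <- 0.3; mu <- 4; k <- 2                    # illustrative values
y <- 0:2000                                    # long enough that the tail is negligible
p <- dzinb_sketch(y, phi, mu, k)

sum(p)                                         # should be very close to 1
sum(y * p)                                     # should be very close to (1 - phi) * mu
c(p[1], phi + (1 - phi) * (k / (k + mu))^k)    # the two forms of P(Y = 0) agree
```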
By default, the three linear/additive predictors for zinegbinomial() are \((logit(\phi), \log(\mu), \log(k))^T\).
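To make the default links concrete, the sketch below maps a hypothetical vector of linear predictor values back to the parameters using the inverse logit and exponential functions from base R. The values in eta are invented for illustration.

```r
# Hypothetical linear predictors for one observation, in the default
# zinegbinomial() order: (logit(phi), log(mu), log(k)).
eta <- c(-0.5, 1.2, 0.3)

phi <- plogis(eta[1])   # inverse logit gives the structural-zero probability
mu  <- exp(eta[2])      # inverse log gives the NB mean
k   <- exp(eta[3])      # inverse log gives the NB size parameter

c(phi = phi, mu = mu, k = k)
(1 - phi) * mu          # mean of Y, i.e. the default fitted value
```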
See negbinomial, another VGAM family function, for the formula of the probability density function and other details of the negative binomial distribution.
Independent multiple responses are handled. If so then arguments ipstr0 and isize may be vectors with length equal to the number of responses.
The VGAM family function zinegbinomialff() has a few changes compared to zinegbinomial(). These are: (i) the order of the linear/additive predictors is switched so the NB mean comes first; (ii) onempstr0 is now 1 minus the probability of a structural 0, i.e., the probability of the parent (NB) component, i.e., onempstr0 is 1-pstr0; (iii) argument zero has a new default so that the onempstr0 is intercept-only by default. Now zinegbinomialff() is generally recommended over zinegbinomial(). Both functions implement Fisher scoring and can handle multiple responses.
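The reparameterization in (ii) can be checked with a few lines of base R. The values below are hypothetical; the point is only that the mean is unchanged and that, under the logit link, logit(1 - p) = -logit(p), so a coefficient for onempstr0 has the opposite sign to the corresponding coefficient for pstr0.

```r
# Hypothetical parameter values, for illustration only.
pstr0     <- plogis(-0.5)   # probability of a structural zero
onempstr0 <- 1 - pstr0      # probability of the parent (NB) component
munb      <- exp(1.2)       # NB mean

# The logit of onempstr0 is the negative of the logit of pstr0.
all.equal(qlogis(onempstr0), -qlogis(pstr0))

# The mean of Y is the same under either parameterization.
all.equal(onempstr0 * munb, (1 - pstr0) * munb)
```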
# NOT RUN {
# Example 1
ndata <- data.frame(x2 = runif(nn <- 1000))
ndata <- transform(ndata,
                   pstr0 = logit(-0.5 + 1 * x2, inverse = TRUE),
                   munb  = exp( 3 + 1 * x2),
                   size  = exp( 0 + 2 * x2))
ndata <- transform(ndata,
                   y1 = rzinegbin(nn, mu = munb, size = size, pstr0 = pstr0))
with(ndata, table(y1)["0"] / sum(table(y1)))
nfit <- vglm(y1 ~ x2, zinegbinomial(zero = NULL), data = ndata)
coef(nfit, matrix = TRUE)
summary(nfit)
head(cbind(fitted(nfit), with(ndata, (1 - pstr0) * munb)))
round(vcov(nfit), 3)

# Example 2: RR-ZINB could also be called a COZIVGLM-ZINB-2
ndata <- data.frame(x2 = runif(nn <- 2000))
ndata <- transform(ndata, x3 = runif(nn))
ndata <- transform(ndata, eta1 = 3 + 1 * x2 + 2 * x3)
ndata <- transform(ndata,
                   pstr0 = logit(-1.5 + 0.5 * eta1, inverse = TRUE),
                   munb  = exp(eta1),
                   size  = exp(4))
ndata <- transform(ndata,
                   y1 = rzinegbin(nn, pstr0 = pstr0, mu = munb, size = size))
with(ndata, table(y1)["0"] / sum(table(y1)))
rrzinb <- rrvglm(y1 ~ x2 + x3, zinegbinomial(zero = NULL), data = ndata,
                 Index.corner = 2, str0 = 3, trace = TRUE)
coef(rrzinb, matrix = TRUE)
Coef(rrzinb)
# }