# zipoisson

##### Zero-Inflated Poisson Distribution Family Function

Fits a zero-inflated or zero-deflated Poisson distribution by full maximum likelihood estimation.

- Keywords
- models, regression

##### Usage

```
zipoisson(lpstr0 = "logitlink", llambda = "loglink", type.fitted =
c("mean", "lambda", "pobs0", "pstr0", "onempstr0"), ipstr0 =
NULL, ilambda = NULL, gpstr0 = NULL, imethod = 1,
ishrinkage = 0.95, probs.y = 0.35, parallel = FALSE, zero = NULL)
zipoissonff(llambda = "loglink", lonempstr0 = "logitlink", type.fitted =
c("mean", "lambda", "pobs0", "pstr0", "onempstr0"),
ilambda = NULL, ionempstr0 = NULL, gonempstr0 = NULL,
imethod = 1, ishrinkage = 0.95, probs.y = 0.35, zero =
"onempstr0")
```

##### Arguments

- lpstr0, llambda
Link function for the parameter \(\phi\) and the usual \(\lambda\) parameter. See

`Links`

for more choices; see`CommonVGAMffArguments`

for more information. For the zero-*deflated*model see below.- ipstr0, ilambda
Optional initial values for \(\phi\), whose values must lie between 0 and 1. Optional initial values for \(\lambda\), whose values must be positive. The defaults are to compute an initial value internally for each. If a vector then recycling is used.

- lonempstr0, ionempstr0
Corresponding arguments for the other parameterization. See details below.

- type.fitted
Character. The type of fitted value to be returned. The first choice (the expected value) is the default. The estimated probability of an observed 0 is an alternative, else the estimated probability of a structural 0, or one minus the estimated probability of a structural 0. See

`CommonVGAMffArguments`

and`fittedvlm`

for more information.- imethod
An integer with value

`1`

or`2`

which specifies the initialization method for \(\lambda\). If failure to converge occurs try another value and/or else specify a value for`ishrinkage`

and/or else specify a value for`ipstr0`

. See`CommonVGAMffArguments`

for more information.- ishrinkage
How much shrinkage is used when initializing \(\lambda\). The value must be between 0 and 1 inclusive, and a value of 0 means the individual response values are used, and a value of 1 means the median or mean is used. This argument is used in conjunction with

`imethod`

. See`CommonVGAMffArguments`

for more information.- zero
Specifies which linear/additive predictors are to be modelled as intercept-only. If given, the value can be either 1 or 2, and the default is none of them. Setting

`zero = 1`

makes \(\phi\) a single parameter. See`CommonVGAMffArguments`

for more information.- gpstr0, gonempstr0, probs.y
Details at

`CommonVGAMffArguments`

.- parallel
Details at

`CommonVGAMffArguments`

, but unlikely to be practically used actually.

##### Details

These models are a mixture of a Poisson distribution and the value 0;
it has value 0 with probability \(\phi\) else is
Poisson(\(\lambda\)) distributed.
Thus there are two sources for zero values, and \(\phi\)
is the probability of a *structural zero*.
The model for `zipoisson()`

can be written
$$P(Y = 0) = \phi + (1-\phi) \exp(-\lambda),$$
and for \(y=1,2,\ldots\),
$$P(Y = y) = (1-\phi) \exp(-\lambda) \lambda^y / y!.$$
Here, the parameter \(\phi\) satisfies
\(0 < \phi < 1\).
The mean of \(Y\) is \((1-\phi) \lambda\) and these
are returned as the fitted values,
by default.
The variance of \(Y\) is \((1-\phi) \lambda (1 + \phi \lambda)\).
By default, the two linear/additive predictors
of `zipoisson()`

are \((logit(\phi), \log(\lambda))^T\).

The VGAM family function `zipoissonff()`

has a few
changes compared to `zipoisson()`

.
These are:
(i) the order of the linear/additive predictors is switched so the
Poisson mean comes first;
(ii) `onempstr0`

is now 1 minus the probability of a structural 0,
i.e., the probability of the parent (Poisson) component,
i.e., `onempstr0`

is `1-pstr0`

;
(iii) argument `zero`

has a new default so that the `onempstr0`

is intercept-only by default.
Now `zipoissonff()`

is generally recommended over
`zipoisson()`

(and definitely recommended over `yip88`

).
Both functions implement Fisher scoring and can handle
multiple responses.

##### Value

An object of class `"vglmff"`

(see `vglmff-class`

).
The object is used by modelling functions such as `vglm`

,
`rrvglm`

and `vgam`

.

##### Note

This family function can be used to estimate the 0-*deflated* model,
hence `pstr0`

is not to be interpreted as a probability.
One should set, e.g., `lpstr0 = "identitylink"`

.
Likewise, the functions in `Zipois`

can handle the zero-deflated Poisson distribution too.
Although the iterations
might fall outside the parameter space, the `validparams`

slot
should keep them inside.
A (somewhat) similar alternative for
zero-deflation is to try the zero-altered Poisson model
(see `zapoisson`

).

The use of this VGAM family function with `rrvglm`

can result in a so-called COZIGAM or COZIGLM.
That is, a reduced-rank zero-inflated Poisson model (RR-ZIP)
is a constrained zero-inflated generalized linear model.
See COZIGAM.
A RR-ZINB model can also be fitted easily;
see `zinegbinomial`

.
Jargon-wise, a COZIGLM might be better described as a
COZIVGLM-ZIP.

##### Warning

Numerical problems can occur, e.g., when the probability of
zero is actually less than, not more than, the nominal
probability of zero.
For example, in the Angers and Biswas (2003) data below,
replacing 182 by 1 results in nonconvergence.
Half-stepping is not uncommon.
If failure to converge occurs, try using combinations of
`imethod`

,
`ishrinkage`

,
`ipstr0`

, and/or
`zipoisson(zero = 1)`

if there are explanatory variables.
The default for `zipoissonff()`

is to model the
structural zero probability as an intercept-only.

##### References

Thas, O. and Rayner, J. C. W. (2005)
Smooth tests for the zero-inflated Poisson distribution.
*Biometrics*,
**61**, 808--815.

Data: Angers, J-F. and Biswas, A. (2003)
A Bayesian analysis of zero-inflated generalized Poisson model.
*Computational Statistics & Data Analysis*,
**42**, 37--46.

Cameron, A. C. and Trivedi, P. K. (1998)
*Regression Analysis of Count Data*.
Cambridge University Press: Cambridge.

Yee, T. W. (2014)
Reduced-rank vector generalized linear models with two linear predictors.
*Computational Statistics and Data Analysis*,
**71**, 889--902.

##### See Also

`genapoisson`

,
`zapoisson`

,
`Zipois`

,
`yip88`

,
`rrvglm`

,
`zipebcom`

,
`rpois`

,
`simulate.vlm`

,
`hdeff.vglm`

.

##### Examples

```
# NOT RUN {
# Example 1: simulated ZIP data
zdata <- data.frame(x2 = runif(nn <- 1000))
zdata <- transform(zdata, pstr01 = logitlink(-0.5 + 1*x2, inverse = TRUE),
pstr02 = logitlink( 0.5 - 1*x2, inverse = TRUE),
Ps01 = logitlink(-0.5 , inverse = TRUE),
Ps02 = logitlink( 0.5 , inverse = TRUE),
lambda1 = loglink(-0.5 + 2*x2, inverse = TRUE),
lambda2 = loglink( 0.5 + 2*x2, inverse = TRUE))
zdata <- transform(zdata, y1 = rzipois(nn, lambda = lambda1, pstr0 = Ps01),
y2 = rzipois(nn, lambda = lambda2, pstr0 = Ps02))
with(zdata, table(y1)) # Eyeball the data
with(zdata, table(y2))
fit1 <- vglm(y1 ~ x2, zipoisson(zero = 1), data = zdata, crit = "coef")
fit2 <- vglm(y2 ~ x2, zipoisson(zero = 1), data = zdata, crit = "coef")
coef(fit1, matrix = TRUE) # These should agree with the above values
coef(fit2, matrix = TRUE) # These should agree with the above values
# Fit all two simultaneously, using a different parameterization:
fit12 <- vglm(cbind(y1, y2) ~ x2, zipoissonff, data = zdata, crit = "coef")
coef(fit12, matrix = TRUE) # These should agree with the above values
# For the first observation compute the probability that y1 is
# due to a structural zero.
(fitted(fit1, type = "pstr0") / fitted(fit1, type = "pobs0"))[1]
# Example 2: McKendrick (1926). Data from 223 Indian village households
cholera <- data.frame(ncases = 0:4, # Number of cholera cases,
wfreq = c(168, 32, 16, 6, 1)) # Frequencies
fit <- vglm(ncases ~ 1, zipoisson, wei = wfreq, cholera, trace = TRUE)
coef(fit, matrix = TRUE)
with(cholera, cbind(actual = wfreq,
fitted = round(dzipois(ncases, lambda = Coef(fit)[2],
pstr0 = Coef(fit)[1]) *
sum(wfreq), digits = 2)))
# Example 3: data from Angers and Biswas (2003)
abdata <- data.frame(y = 0:7, w = c(182, 41, 12, 2, 2, 0, 0, 1))
abdata <- subset(abdata, w > 0)
fit <- vglm(y ~ 1, zipoisson(lpstr0 = probitlink, ipstr0 = 0.8),
data = abdata, weight = w, trace = TRUE)
fitted(fit, type = "pobs0") # Estimate of P(Y = 0)
coef(fit, matrix = TRUE)
Coef(fit) # Estimate of pstr0 and lambda
fitted(fit)
with(abdata, weighted.mean(y, w)) # Compare this with fitted(fit)
summary(fit)
# Example 4: zero-deflated model for intercept-only data
zdata <- transform(zdata, lambda3 = loglink(0.0, inverse = TRUE))
zdata <- transform(zdata, deflat.limit = -1 / expm1(lambda3)) # Boundary
# The 'pstr0' parameter is negative and in parameter space:
zdata <- transform(zdata, usepstr0 = deflat.limit / 2) # Not too near the boundary
zdata <- transform(zdata, y3 = rzipois(nn, lambda3, pstr0 = usepstr0))
head(zdata)
with(zdata, table(y3)) # A lot of deflation
fit3 <- vglm(y3 ~ 1, zipoisson(zero = -1, lpstr0 = "identitylink"),
data = zdata, trace = TRUE, crit = "coef")
coef(fit3, matrix = TRUE)
# Check how accurate it was:
zdata[1, "usepstr0"] # Answer
coef(fit3)[1] # Estimate
Coef(fit3)
vcov(fit3) # Is positive-definite
# Example 5: This RR-ZIP is known as a COZIGAM or COZIVGLM-ZIP
set.seed(123)
rrzip <- rrvglm(Alopacce ~ sm.bs(WaterCon, df = 3), zipoisson(zero = NULL),
data = hspider, trace = TRUE, Index.corner = 2)
coef(rrzip, matrix = TRUE)
Coef(rrzip)
summary(rrzip)
# }
# NOT RUN {
plotvgam(rrzip, lcol = "blue")
# }
```

*Documentation reproduced from package VGAM, version 1.1-1, License: GPL-3*