geometric: Geometric (Truncated and Untruncated) Distributions

Description

Maximum likelihood estimation for the geometric and truncated geometric distributions.

Usage

geometric(link = "logitlink", expected = TRUE, imethod = 1,
          iprob = NULL, zero = NULL)
truncgeometric(upper.limit = Inf,
               link = "logitlink", expected = TRUE, imethod = 1,
               iprob = NULL, zero = NULL)

Arguments

link

Parameter link function applied to the probability parameter \(p\), which lies in the unit interval. See Links for more choices.

expected

Logical. Fisher scoring is used if expected = TRUE, else Newton-Raphson.

iprob, imethod, zero

See CommonVGAMffArguments for details.

upper.limit

Numeric. Upper values. As a vector, it is recycled across responses first. The default value means both family functions should give the same result.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

Details

A random variable \(Y\) has a 1-parameter geometric distribution if \(P(Y=y) = p (1-p)^y\) for \(y=0,1,2,\ldots\). Here, \(p\) is the probability of success, and \(Y\) is the number of (independent) trials that are fails until a success occurs. Thus the response \(Y\) should be a non-negative integer. The mean of \(Y\) is \(E(Y) = (1-p)/p\) and its variance is \(Var(Y) = (1-p)/p^2\). The geometric distribution is a special case of the negative binomial distribution (see negbinomial). The geometric distribution is also a special case of the Borel distribution, which is a Lagrangian distribution. If \(Y\) has a geometric distribution with parameter \(p\) then \(Y+1\) has a positive-geometric distribution with the same parameter. Multiple responses are permitted.

For truncgeometric(), the (upper) truncated geometric distribution can have response integer values from 0 to upper.limit. It has density prob * (1 - prob)^y / [1-(1-prob)^(1+upper.limit)].

For a generalized truncated geometric distribution with integer values \(L\) to \(U\), say, subtract \(L\) from the response and feed in \(U-L\) as the upper limit.

References

Forbes, C., Evans, M., Hastings, N. and Peacock, B. (2011) Statistical Distributions, Hoboken, NJ, USA: John Wiley and Sons, Fourth edition.

Examples

Run this code

# NOT RUN {
gdata <- data.frame(x2 = runif(nn <- 1000) - 0.5)
gdata <- transform(gdata, x3 = runif(nn) - 0.5,
                          x4 = runif(nn) - 0.5)
gdata <- transform(gdata, eta  = -1.0 - 1.0 * x2 + 2.0 * x3)
gdata <- transform(gdata, prob = logitlink(eta, inverse = TRUE))
gdata <- transform(gdata, y1 = rgeom(nn, prob))
with(gdata, table(y1))
fit1 <- vglm(y1 ~ x2 + x3 + x4, geometric, data = gdata, trace = TRUE)
coef(fit1, matrix = TRUE)
summary(fit1)

# Truncated geometric (between 0 and upper.limit)
upper.limit <- 5
tdata <- subset(gdata, y1 <= upper.limit)
nrow(tdata)  # Less than nn
fit2 <- vglm(y1 ~ x2 + x3 + x4, truncgeometric(upper.limit),
             data = tdata, trace = TRUE)
coef(fit2, matrix = TRUE)

# Generalized truncated geometric (between lower.limit and upper.limit)
lower.limit <- 1
upper.limit <- 8
gtdata <- subset(gdata, lower.limit <= y1 & y1 <= upper.limit)
with(gtdata, table(y1))
nrow(gtdata)  # Less than nn
fit3 <- vglm(y1 - lower.limit ~ x2 + x3 + x4,
             truncgeometric(upper.limit - lower.limit),
             data = gtdata, trace = TRUE)
coef(fit3, matrix = TRUE)
# }

Run the code above in your browser using DataLab