aod (version 1.3.1)

quasibin: Quasi-Likelihood Model for Proportions

Description

The function fits the generalized linear model “II” proposed by Williams (1982) accounting for overdispersion in clustered binomial data \((n, y)\).

Usage

quasibin(formula, data, link = c("logit", "cloglog"), phi = NULL, tol =  0.001)

Arguments

formula

A formula for the fixed effects. The left-hand side of the formula must be of the form cbind(y, n - y) where the modelled probability is y/n.

link

The link function for the mean \(p\): “logit” or “cloglog”.

data

A data frame containing the response (n and y) and explanatory variable(s).

phi

When phi is NULL (the default), the overdispersion parameter \(\phi\) is estimated from the data. Otherwise, its value is considered as fixed.

tol

A positive scalar (default to 0.001). The algorithm stops at iteration \(r + 1\) when the condition \(\chi{^2}[r+1] - \chi{^2}[r] <= tol\) is met by the \(\chi^2\) statistics .

Value

An object of formal class “glimQL”: see glimQL-class for details.

Details

For a given cluster \((n, y)\), the model is: $$y~|~\lambda \sim Binomial(n,~\lambda)$$ with \(\lambda\) a random variable of mean \(E[\lambda] = p\) and variance \(Var[\lambda] = \phi * p * (1 - p)\). The marginal mean and variance are: $$E[y] = p$$ $$Var[y] = p * (1 - p) * [1 + (n - 1) * \phi]$$ The overdispersion parameter \(\phi\) corresponds to the intra-cluster correlation coefficient, which is here restricted to be positive. The function uses the function glm and the parameterization: \(p = h(X b) = h(\eta)\), where \(h\) is the inverse of a given link function, \(X\) is a design-matrix, \(b\) is a vector of fixed effects and \(\eta = X b\) is the linear predictor. The estimate of \(b\) maximizes the quasi log-likelihood of the marginal model. The parameter \(\phi\) is estimated with the moment method or can be set to a constant (a regular glim is fitted when \(\phi\) is set to zero). The literature recommends to estimate \(\phi\) from the saturated model. Several explanatory variables are allowed in \(b\). None is allowed in \(\phi\).

References

Moore, D.F., 1987, Modelling the extraneous variance in the presence of extra-binomial variation. Appl. Statist. 36, 8-14. Williams, D.A., 1982, Extra-binomial variation in logistic linear models. Appl. Statist. 31, 144-148.

See Also

glm, geese in the contributed package geepack, glm.binomial.disp in the contributed package dispmod.

Examples

Run this code
# NOT RUN {
  data(orob2) 
  fm1 <- glm(cbind(y, n - y) ~ seed * root,
             family = binomial, data = orob2)
  fm2 <- quasibin(cbind(y, n - y) ~ seed * root,
                  data = orob2, phi = 0)
  fm3 <- quasibin(cbind(y, n - y) ~ seed * root,
                  data = orob2)
  rbind(fm1 = coef(fm1), fm2 = coef(fm2), fm3 = coef(fm3))
  # show the model
  fm3
  # dispersion parameter and goodness-of-fit statistic
  c(phi = fm3@phi, 
    X2 = sum(residuals(fm3, type = "pearson")^2))
  # model predictions
  predfm1 <- predict(fm1, type = "response", se = TRUE)
  predfm3 <- predict(fm3, type = "response", se = TRUE)
  New <- expand.grid(seed = levels(orob2$seed),
                     root = levels(orob2$root))
  predict(fm3, New, se = TRUE, type = "response")
  data.frame(orob2, p1 = predfm1$fit, 
                    se.p1 = predfm1$se.fit,
                    p3 = predfm3$fit,
                    se.p3 = predfm3$se.fit)
  fm4 <- quasibin(cbind(y, n - y) ~ seed + root,
                  data = orob2, phi = fm3@phi)
  # Pearson's chi-squared goodness-of-fit statistic
  # compare with fm3's X2
  sum(residuals(fm4, type = "pearson")^2)
  
# }

Run the code above in your browser using DataCamp Workspace