betabin: Beta-Binomial Model for Proportions

Description

Fits a beta-binomial generalized linear model accounting for overdispersion in clustered binomial data $(n, y)$.

Usage

betabin(formula, random, data, link = c("logit", "cloglog"), phi.ini = NULL,
          warnings = FALSE, na.action = na.omit, fixpar = list(),
          hessian = TRUE, control = list(maxit = 2000), ...)

Arguments

formula

A formula for the fixed effects b. The left-hand side of the formula must be of the form cbind(y, n - y) where the modelled probability is y/n.

random

A right-hand formula for the overdispersion parameter(s) $\phi$.

link

The link function for the mean $p$: logit or cloglog.

data

A data frame containing the response (n and y) and explanatory variable(s).

phi.ini

Initial values for the overdispersion parameter(s) $\phi$. Default to 0.1.

warnings

Logical to control the printing of warnings occurring during log-likelihood maximization. Default to FALSE (no printing).

na.action

A function name: which action should be taken in the case of missing value(s).

fixpar

A list with 2 components (scalars or vectors) of the same size, indicating which parameters are fixed (i.e., not optimized) in the global parameter vector $(b, \phi)$ and the corresponding fixed values. For example, fixpar = list(c(4, 5), c(0,

hessian

A logical. When set to FALSE, the hessian and the variances-covariances matrices of the parameters are not computed.

control

A list to control the optimization parameters. See optim. By default, set the maximum number of iterations to 2000.

...

Further arguments passed to optim.

Value

An object of formal class glimML: see glimML-class for details.

Details

For a given cluster $(n, y)$, the model is: $$y~|~\lambda \sim Binomial(n,~\lambda)$$ with $\lambda$ following a Beta distribution $Beta(a1,~a2)$. If $B$ denotes the beta function, then: $$P(\lambda) = \frac{\lambda^{a1~-~1} * (1~-~\lambda)^{a2 - 1}}{B(a1,~a2)}$$ $$E[\lambda] = \frac{a1}{a1 + a2}$$ $$Var[\lambda] = \frac{a1 * a2}{(a1 + a2 + 1) * (a1 + a2)^2}$$ The marginal beta-binomial distribution is: $$P(y) = \frac{C(n,~y) * B(a1 + y, a2 + n - y)}{B(a1,~a2)}$$ The function uses the parameterization $p = \frac{a1}{a1 + a2} = h(X b) = h(\eta)$ and $\phi = \frac{1}{a1 + a2 + 1}$, where $h$ is the inverse of the link function (logit or complementary log-log), $X$ is a design-matrix, $b$ is a vector of fixed effects, $\eta = X b$ is the linear predictor and $\phi$ is the overdispersion parameter (i.e., the intracluster correlation coefficient, which is here restricted to be positive). The marginal mean and variance are: $$E[y] = n * p$$ $$Var[y] = n * p * (1 - p) * [1 + (n - 1) * \phi]$$ The parameters $b$ and $\phi$ are estimated by maximizing the log-likelihood of the marginal model (using the function optim). Several explanatory variables are allowed in $b$, only one in $\phi$.

References

Crowder, M.J., 1978. Beta-binomial anova for proportions. Appl. Statist. 27, 34-37. Griffiths, D.A., 1973. Maximum likelihood estimation for the beta-binomial distribution and an application to the household distribution of the total number of cases of disease. Biometrics 29, 637-648. Prentice, R.L., 1986. Binary regression using an extended beta-binomial distribution, with discussion of correlation induced by covariate measurement errors. J.A.S.A. 81, 321-327. Williams, D.A., 1975. The analysis of binary responses from toxicological experiments involving reproduction and teratogenicity. Biometrics 31, 949-952.

Examples

Run this code

data(orob2)
  fm1 <- betabin(cbind(y, n - y) ~ seed, ~ 1, data = orob2)
  fm2 <- betabin(cbind(y, n - y) ~ seed + root, ~ 1, data = orob2)
  fm3 <- betabin(cbind(y, n - y) ~ seed * root, ~ 1, data = orob2)
  # show the model
  fm1; fm2; fm3
  # AIC
  AIC(fm1, fm2, fm3)
  summary(AIC(fm1, fm2, fm3), which = "AICc")
  # Wald test for root effect
  wald.test(b = coef(fm3), Sigma = vcov(fm3), Terms = 3:4)
  # likelihood ratio test for root effect
  anova(fm1, fm3)
  # model predictions
  New <- expand.grid(seed = levels(orob2$seed),
                     root = levels(orob2$root))
  data.frame(New, predict(fm3, New, se = TRUE, type = "response"))
  # Djallonke sheep data
  data(dja)
  betabin(cbind(y, n - y) ~ group, ~ 1, dja)
  # heterogeneous phi
  betabin(cbind(y, n - y) ~ group, ~ group, dja,
          control = list(maxit = 1000))
  # phi fixed to zero in group TREAT
   betabin(cbind(y, n - y) ~ group, ~ group, dja,
    fixpar = list(4, 0))
  # glim without overdispersion
  summary(glm(cbind(y, n - y) ~ group,
    family = binomial, data = dja))
  # phi fixed to zero in both groups
  betabin(cbind(y, n - y) ~ group, ~ group, dja,
    fixpar = list(c(3, 4), c(0, 0)))

Run the code above in your browser using DataLab