This function estimates overdispersed binomial logit models using the approach discussed by Williams (1982).
glm.binomial.disp(object, maxit = 30, verbose = TRUE)
object: an object of class "glm" providing a fitted binomial logistic regression model; see glm.
maxit: integer giving the maximum number of iterations for the model fitting procedure.
verbose: logical; if TRUE, information is printed during each step of the algorithm.
The function returns an object of class "glm" with the usual information and the added components:
dispersion: the estimated dispersion parameter.
the final weights used to fit the model.
Extra-binomial variation in logistic linear models is discussed, among others, in Collett (1991). Williams (1982) proposed a quasi-likelihood approach for handling overdispersion in logistic regression models.
Suppose we observe the number of successes \(y_i\) in \(m_i\) trials, for \(i = 1, \ldots, n\), such that
$$y_i \mid p_i \sim \mathrm{Binomial}(m_i, p_i)$$
$$p_i \sim \mathrm{Beta}(\gamma, \delta)$$
Under this model, each of the \(n\) binomial observations has a different probability of success \(p_i\), where \(p_i\) is a random draw from a Beta distribution. Thus,
$$E(p_i) = \frac{\gamma}{\gamma+\delta} = \theta$$
$$V(p_i) = \phi\theta(1-\theta)$$
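These two moments can be checked numerically. The sketch below relies on the standard Beta variance \(\gamma\delta / ((\gamma+\delta)^2(\gamma+\delta+1))\), which implies \(\phi = 1/(\gamma+\delta+1)\); the parameter values, seed, and sample size are arbitrary choices made for illustration only.

```r
# Illustrative check of E(p_i) and V(p_i) for a Beta(gamma, delta) draw.
# The values of gamma and delta below are arbitrary assumptions.
gamma <- 2
delta <- 3

theta <- gamma / (gamma + delta)      # mean of the Beta distribution
phi   <- 1 / (gamma + delta + 1)      # implied dispersion parameter

set.seed(42)
p <- rbeta(1e6, gamma, delta)         # simulated success probabilities

# Compare simulated moments with the closed-form expressions
c(simulated = mean(p), theoretical = theta)
c(simulated = var(p),  theoretical = phi * theta * (1 - theta))
```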
For the Beta distribution, \(\phi = 1/(\gamma+\delta+1)\). Assuming \(\gamma > 1\) and \(\delta > 1\), the Beta density is zero at the extreme values of zero and one, and \(\gamma+\delta+1 > 3\), so \(0 < \phi < 1/3\). From this, the unconditional mean and variance can be calculated:
$$E(y_i) = m_i \theta$$
$$V(y_i) = m_i \theta (1-\theta)(1+(m_i-1)\phi)$$
so unless \(m_i = 1\) or \(\phi = 0\), the unconditional variance of \(y_i\) is larger than the binomial variance.
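The variance expression can be recovered from the law of total variance. The following short derivation uses only the moments of \(p_i\) given earlier, together with \(E(p_i^2) = V(p_i) + \theta^2\):

$$V(y_i) = E[V(y_i \mid p_i)] + V[E(y_i \mid p_i)] = m_i E[p_i(1-p_i)] + m_i^2 V(p_i)$$
$$E[p_i(1-p_i)] = \theta - \phi\theta(1-\theta) - \theta^2 = \theta(1-\theta)(1-\phi)$$
$$V(y_i) = m_i\theta(1-\theta)(1-\phi) + m_i^2\phi\theta(1-\theta) = m_i\theta(1-\theta)(1+(m_i-1)\phi)$$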
Identical expressions for the mean and variance of \(y_i\) can be obtained if we assume that the \(m_i\) binary responses on the i-th unit are dependent, with common pairwise correlation \(\phi\). In this case, \(-1/(m_i - 1) < \phi \le 1\).
The method proposed by Williams uses an iterative algorithm for estimating the dispersion parameter \(\phi\) and hence the necessary weights \(1/(1 + \phi(m_i - 1))\) (for details see Williams, 1982).
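The iteration can be sketched in base R roughly as follows. This is an illustrative reading of the scheme, not the source of glm.binomial.disp: the simulated data, the variable names, the stopping rule, and the update for \(\phi\) (which equates the Pearson statistic to an approximate expectation built from the hat values) are all assumptions made for the example.

```r
# Simulate overdispersed (beta-binomial) data; all settings are hypothetical.
set.seed(1)
n <- 40
m <- sample(10:50, n, replace = TRUE)      # number of trials per unit
x <- rep(c(0, 1), each = n / 2)            # a single binary covariate
theta <- plogis(-0.5 + x)                  # mean success probability
phi_true <- 0.05
s <- 1 / phi_true - 1                      # gamma + delta, so V(p) = phi * theta * (1 - theta)
p <- rbeta(n, theta * s, (1 - theta) * s)
y <- rbinom(n, m, p)

# Start from the ordinary binomial fit (all weights equal to one).
w   <- rep(1, n)
phi <- 0
fit <- glm(cbind(y, m - y) ~ x, family = binomial, weights = w)

for (iter in seq_len(30)) {
  X2 <- sum(residuals(fit, type = "pearson")^2)   # weighted Pearson statistic
  h  <- hatvalues(fit)                            # leverages of the weighted fit
  # Assumed update: solve X2 = sum(w * (1 - h) * (1 + phi * (m - 1))) for phi
  phi_new <- (X2 - sum(w * (1 - h))) / sum(w * (m - 1) * (1 - h))
  converged <- abs(phi_new - phi) < 1e-8
  phi <- phi_new
  w   <- 1 / (1 + phi * (m - 1))                  # weights implied by phi
  fit <- glm(cbind(y, m - y) ~ x, family = binomial, weights = w)
  if (converged) break
}

phi                                               # estimated dispersion parameter
sum(residuals(fit, type = "pearson")^2)           # compare with fit$df.residual
```

At a fixed point of this update the weighted Pearson statistic equals the residual degrees of freedom, which is the condition the weights are chosen to satisfy.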
Collett, D. (1991), Modelling Binary Data, London: Chapman and Hall.
Williams, D. A. (1982), Extra-binomial variation in logistic linear models, Applied Statistics, 31, 144-148.
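library(dispmod)  # assumed package providing glm.binomial.disp() and orobanche
data(orobanche)
# Fit the ordinary binomial logit model
mod <- glm(cbind(germinated, seeds - germinated) ~ host * variety,
           data = orobanche, family = binomial(logit))
summary(mod)
# Refit allowing for overdispersion (Williams, 1982)
mod.disp <- glm.binomial.disp(mod)
summary(mod.disp)
# Estimated dispersion parameter
mod.disp$dispersion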