Family function for fitting generalized linear models to binomial responses
binomialff(link = "logitlink", multiple.responses = FALSE,
parallel = FALSE, zero = NULL, bred = FALSE, earg.link = FALSE)
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions such as
vglm
,
vgam
,
rrvglm
,
cqo
,
and cao
.
Link function;
see Links
and
CommonVGAMffArguments
for more information.
Multivariate response? If TRUE
, then the response is
interpreted
as weights
argument should have
If FALSE
and the response is a (2-column) matrix, then the
number of successes is given in the first column, and the second
column is the number of failures.
A logical or formula. Used only if multiple.responses
is TRUE
. This
argument allows for the parallelism assumption whereby the regression
coefficients for a variable is constrained to be equal over the parallel = TRUE
then the constraint is not applied to the
intercepts.
An integer-valued vector specifying which linear/additive predictors
are modelled as intercepts only. The values must be from the set
{1,2,...,CommonVGAMffArguments
for more information.
Details at CommonVGAMffArguments
.
Details at CommonVGAMffArguments
.
Setting bred = TRUE
should work for
multiple responses (multiple.responses = TRUE
) and
all VGAM link functions;
it has been tested for
logitlink
only (and it gives similar
results to brglm but not identical),
and further testing is required.
One result from fitting bias reduced binary regression
is that finite regression coefficients occur when
the data is separable (see example below).
Currently hdeff.vglm
does not work when
bred = TRUE
.
Thomas W. Yee
See the above note regarding bred
.
The maximum likelihood estimate will not exist if the data is
completely separable or quasi-completely separable.
See Chapter 10 of Altman et al. (2004) for more details,
and safeBinaryRegression
and hdeff.vglm
.
Yet to do: add a sepcheck = TRUE
, say, argument to
further detect this problem and give an appropriate warning.
This function is largely to
mimic binomial
,
however there are some differences.
When used with cqo
and cao
, it may be
preferable to use the clogloglink
link.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd ed. London: Chapman & Hall.
Altman, M. and Gill, J. and McDonald, M. P. (2004). Numerical Issues in Statistical Computing for the Social Scientist, Hoboken, NJ, USA: Wiley-Interscience.
Ridout, M. S. (1990). Non-convergence of Fisher's method of scoring---a simple example. GLIM Newsletter, 20(6).
hdeff.vglm
,
Links
,
wsdm
,
alogitlink
,
asinlink
,
N1binomial
,
rrvglm
,
cqo
,
cao
,
betabinomial
,
posbinomial
,
zibinomial
,
binom2.or
,
binom3.or
,
double.expbinomial
,
seq2binomial
,
amlbinomial
,
simplex
,
binomial
,
simulate.vlm
,
safeBinaryRegression,
residualsvglm
.
shunua <- hunua[sort.list(with(hunua, altitude)), ] # Sort by altitude
fit <- vglm(agaaus ~ poly(altitude, 2), binomialff(link = clogloglink),
data = shunua)
if (FALSE) {
plot(agaaus ~ jitter(altitude), shunua, ylab = "Pr(Agaaus = 1)",
main = "Presence/absence of Agathis australis", col = 4, las = 1)
with(shunua, lines(altitude, fitted(fit), col = "orange", lwd = 2)) }
# Fit two species simultaneously
fit2 <- vgam(cbind(agaaus, kniexc) ~ s(altitude),
binomialff(multiple.responses = TRUE), data = shunua)
if (FALSE) {
with(shunua, matplot(altitude, fitted(fit2), type = "l",
main = "Two species response curves", las = 1)) }
# Shows that Fisher scoring can sometime fail. See Ridout (1990).
ridout <- data.frame(v = c(1000, 100, 10), r = c(4, 3, 3), n = rep(5, 3))
(ridout <- transform(ridout, logv = log(v)))
# The iterations oscillates between two local solutions:
glm.fail <- glm(r / n ~ offset(logv) + 1, weight = n,
binomial(link = 'cloglog'), ridout, trace = TRUE)
coef(glm.fail)
# vglm()'s half-stepping ensures the MLE of -5.4007 is obtained:
vglm.ok <- vglm(cbind(r, n-r) ~ offset(logv) + 1,
binomialff(link = clogloglink), ridout, trace = TRUE)
coef(vglm.ok)
wsdm(vglm.ok)
# Separable data
set.seed(123)
threshold <- 0
bdata <- data.frame(x2 = sort(rnorm(nn <- 100)))
bdata <- transform(bdata, y1 = ifelse(x2 < threshold, 0, 1))
fit <- vglm(y1 ~ x2, binomialff(bred = TRUE),
data = bdata, criter = "coef", trace = TRUE)
coef(fit, matrix = TRUE) # Finite!!
summary(fit, wsdm = TRUE, hdiff = 0.0001)
if (FALSE) plot(depvar(fit) ~ x2, data = bdata, col = "blue", las = 1)
lines(fitted(fit) ~ x2, data = bdata, col = "orange")
abline(v = threshold, col = "gray", lty = "dashed")
Run the code above in your browser using DataLab