
detect_separation: a method for glm that tests for data separation and finds which parameters have infinite maximum likelihood estimates in generalized linear models with binomial responses

detect_separation is a method for glm that tests for the occurrence of complete or quasi-complete separation in datasets for binomial response generalized linear models, and finds which of the parameters will have infinite maximum likelihood estimates. detect_separation relies on the linear programming methods developed in Konis (2007).
detect_separation(x, y, weights = rep(1, nobs), start = NULL,
  etastart = NULL, mustart = NULL, offset = rep(0, nobs),
  family = gaussian(), control = list(), intercept = TRUE,
  singular.ok = TRUE)

detectSeparation(x, y, weights = rep(1, nobs), start = NULL,
  etastart = NULL, mustart = NULL, offset = rep(0, nobs),
  family = gaussian(), control = list(), intercept = TRUE,
  singular.ok = TRUE)
x
is a design matrix of dimension n * p.
y
is a vector of observations of length n.
weights
an optional vector of ‘prior weights’ to be used in the fitting process. Should be NULL or a numeric vector.
start
currently not used.
etastart
currently not used.
mustart
currently not used.
offset
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See model.offset.
family
a description of the error distribution and link function to be used in the model. For glm this can be a character string naming a family function, a family function or the result of a call to a family function. For glm.fit only the third option is supported. (See family for details of family functions.)
control
a list of parameters controlling separation detection. See detect_separation_control for details.
intercept
logical. Should an intercept be included in the null model?
singular.ok
logical. If FALSE, a singular model is an error.
...
arguments to be used to form the default 'control' argument if it is not supplied directly.
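The statement that several offset terms are summed can be checked with base R alone. The sketch below uses an invented toy data frame (the names d, o1 and o2 are hypothetical) and only model.frame() and model.offset() from the stats package; it does not call detect_separation.

```r
## Toy data (hypothetical values, for illustration only)
d <- data.frame(
  y  = c(0, 1, 0, 1, 1),
  x  = c(0.2, 1.5, -0.3, 0.8, 2.1),
  o1 = c(0.1, 0.2, 0.3, 0.4, 0.5),
  o2 = c(1, 1, 2, 2, 3)
)

## Two offset() terms specified in the formula
mf <- model.frame(y ~ x + offset(o1) + offset(o2), data = d)

## model.offset() returns a single vector, one value per case,
## equal to the sum of the offset terms
total_offset <- model.offset(mf)
total_offset
```

The result has one entry per case, which is the length an offset vector supplied directly is expected to have.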
For the definition of complete and quasi-complete separation, see Albert and Anderson (1984).
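As a minimal base R illustration of the two notions, the sketch below builds one completely separated and one quasi-completely separated toy data set with a single covariate. The data and the helper separation_type() are invented for this illustration; the helper only covers the one-covariate case in the increasing direction and is not how detect_separation works, which relies on linear programming.

```r
## Toy data (hypothetical): one covariate x and a binary response
x <- c(1, 2, 3, 4, 5, 6)
y_complete <- c(0, 0, 0, 1, 1, 1)  # every zero has x < 3.5, every one has x > 3.5

## Quasi-complete: the two groups touch at x = 3 but do not cross
x_quasi <- c(1, 2, 3, 3, 4, 5)
y_quasi <- c(0, 0, 0, 1, 1, 1)

## Hypothetical helper: compares the covariate ranges of the two groups
## (single covariate, increasing direction only)
separation_type <- function(x, y) {
  lo <- max(x[y == 0])  # largest covariate value among the zeros
  hi <- min(x[y == 1])  # smallest covariate value among the ones
  if (lo < hi) "complete" else if (lo == hi) "quasi-complete" else "none"
}

separation_type(x, y_complete)
separation_type(x_quasi, y_quasi)

## Under complete separation the likelihood keeps increasing as the slope
## grows, so glm() stops at a large finite value and warns about fitted
## probabilities that are numerically 0 or 1
fit <- suppressWarnings(glm(y_complete ~ x, family = binomial()))
slope <- unname(coef(fit)["x"])
slope
```

The large slope reported by glm() is an artefact of stopping the iteration; the maximum likelihood estimate itself is infinite.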
detect_separation is a wrapper to the separator function from the **safeBinaryRegression** R package that can be passed directly as a method to the glm function. See the examples.

The interface to separator was designed by Ioannis Kosmidis after correspondence with Kjell Konis, and a port of separator has been included in **brglm2** with the permission of Kjell Konis.

detectSeparation is an alias for detect_separation.
Konis K. (2007). *Linear Programming Algorithms for Detecting Separated Data in Binary Logistic Regression Models*. DPhil. University of Oxford. https://ora.ox.ac.uk/objects/uuid:8f9ee0d0-d78e-4101-9ab4-f9cbceed2a2a
Konis K. (2013). safeBinaryRegression: Safe Binary Regression. R package version 0.1-3. https://CRAN.R-project.org/package=safeBinaryRegression
# NOT RUN {
## brglm2 must be attached so that glm() can find the "detect_separation" method
library("brglm2")
## endometrial data from Heinze & Schemper (2002) (see ?endometrial)
data("endometrial", package = "brglm2")
endometrial_sep <- glm(HG ~ NV + PI + EH, data = endometrial,
                       family = binomial("logit"),
                       method = "detect_separation")
endometrial_sep
## The maximum likelihood estimate for NV is infinite
summary(update(endometrial_sep, method = "glm.fit"))
# }
# NOT RUN {
## Example inspired by unpublished microeconometrics lecture notes by
## Achim Zeileis https://eeecon.uibk.ac.at/~zeileis/
## The maximum likelihood estimate of southernyes is infinite
data("MurderRates", package = "AER")
murder_sep <- glm(I(executions > 0) ~ time + income +
                  noncauc + lfp + southern, data = MurderRates,
                  family = binomial(), method = "detect_separation")
murder_sep
## which is also evident from the large estimated standard error for southernyes
murder_glm <- update(murder_sep, method = "glm.fit")
summary(murder_glm)
## and is also revealed by the divergence of the southernyes column of the
## result from the more computationally intensive check
check_infinite_estimates(murder_glm)
## Mean bias reduction via adjusted scores results in finite estimates
update(murder_glm, method = "brglm_fit")
# }
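The idea behind the iteration-based check mentioned in the last example can be sketched in base R on toy data: refit the model with an increasing cap on the number of fitting iterations and watch the estimated standard errors grow. This is a simplified illustration under stated assumptions, not the implementation of check_infinite_estimates; the data and the helper se_at() are hypothetical.

```r
## Toy completely separated data (hypothetical)
x <- c(1, 2, 3, 4, 5, 6)
y <- c(0, 0, 0, 1, 1, 1)

## Hypothetical helper: estimated standard errors after at most `maxit`
## iterations of the glm() fitting algorithm
se_at <- function(maxit) {
  fit <- suppressWarnings(
    glm(y ~ x, family = binomial(), control = glm.control(maxit = maxit))
  )
  sqrt(diag(vcov(fit)))
}

## Ratio of each standard error to its value after a single iteration
iters <- c(4, 8, 16)
se_ratio <- sapply(iters, function(k) se_at(k) / se_at(1))
colnames(se_ratio) <- paste0("maxit=", iters)
se_ratio
```

Ratios that keep growing as more iterations are allowed point to an estimate that is diverging rather than converging to a finite value.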