Separation occurs in binomial response models when a combination of the predictor variables perfectly predicts a level of the response. In such a case, the estimates of the coefficients for these variables diverge to plus or minus infinity, and the numerical algorithms typically fail. To anticipate this problem, the fitting functions in spaMM try to check for separation by default, using the lpSolveAPI package, which can also detect some borderline cases (“quasi-separation”). If this package is not available, spaMM tries to use the e1071 package (in a way which will not detect quasi-separation). The check may take a long time, and is skipped if the “problem size” exceeds a threshold defined by spaMM.options(separation_max=<.>), in which case a message tells users by how much they should increase separation_max to force the check (the definition of the “problem size” differs between the two methods and may be complicated).
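The divergence described above can be seen in a small base-R sketch. The toy data and the use of glm() are illustrative assumptions and not part of spaMM. The value 50 passed to separation_max is arbitrary, and the spaMM call is attempted only if the package is installed.

```r
# Toy data (hypothetical): y is 0 for every negative x and 1 for every
# positive x, so x separates the two response levels completely.
x <- c(-3, -2, -1, 1, 2, 3)
y <- c(0, 0, 0, 1, 1, 1)

# glm() typically emits warnings on such data; they are silenced here
# only to keep the sketch quiet.
fit <- suppressWarnings(glm(y ~ x, family = binomial()))

# The slope estimate is large because the likelihood keeps increasing as
# the coefficient grows: there is no finite maximum-likelihood estimate.
slope <- unname(coef(fit)["x"])
print(slope)

# Raising the problem-size threshold so that spaMM's fitting functions do
# not skip the check. The value 50 is arbitrary; in practice, use the
# value suggested by the message that spaMM prints.
if (requireNamespace("spaMM", quietly = TRUE)) {
  spaMM::spaMM.options(separation_max = 50)
}
```

The reported slope depends on when the iterative fit stops, which is why it should not be interpreted as an estimate.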
is_separated is a convenient interface to procedures from the lpSolveAPI package. Users can call it explicitly to check bootstrap samples (see Example in anova).
is_separated(x, y, verbose = TRUE)
x: Design matrix for fixed effects.
y: Numeric response vector.
verbose: Whether to print some messages or not.
Returns a boolean; TRUE means there is (quasi-)separation.
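A minimal sketch of a direct call, assuming spaMM and lpSolveAPI are installed (the call is skipped otherwise). The two toy data sets are hypothetical: in the first, the sign of x determines y exactly; in the second, the two response levels are interleaved along x, so no linear combination of the intercept and x separates them.

```r
x <- c(-3, -2, -1, 1, 2, 3)
y_sep   <- c(0, 0, 0, 1, 1, 1)  # completely separated by the sign of x
y_mixed <- c(0, 1, 0, 1, 0, 1)  # response levels interleaved along x

# Design matrix for the fixed effects: an intercept column and x.
X <- model.matrix(~ x)

if (requireNamespace("spaMM", quietly = TRUE) &&
    requireNamespace("lpSolveAPI", quietly = TRUE)) {
  # verbose = FALSE suppresses the informational messages.
  sep_toy   <- spaMM::is_separated(X, y_sep, verbose = FALSE)
  sep_mixed <- spaMM::is_separated(X, y_mixed, verbose = FALSE)
  print(c(separated = sep_toy, interleaved = sep_mixed))
}
```

Passing the design matrix rather than a formula keeps the check independent of any particular fitting function, so the same matrix can be reused across several response vectors.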
Konis, K. 2007. Linear Programming Algorithms for Detecting Separated Data in Binary Logistic Regression Models. DPhil Thesis, Univ. Oxford.
See also the 'safeBinaryRegression' package.
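The bootstrap use mentioned above might look like the following sketch. It is not the example from the anova documentation: the simulated data, the glm() fit, the seed, and the number of replicates are all assumptions made for illustration, and the is_separated calls run only if spaMM and lpSolveAPI are installed.

```r
set.seed(123)  # arbitrary seed, for reproducibility only

# Hypothetical binary data with a single predictor.
n <- 30
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, size = 1, prob = plogis(0.5 + d$x))

# Stand-in for a fitted binomial model; the design matrix and the fitted
# probabilities are all that the bootstrap below needs.
fit <- glm(y ~ x, family = binomial(), data = d)
X <- model.matrix(fit)
p_hat <- fitted(fit)

# Parametric bootstrap: simulate new responses from the fitted probabilities.
n_boot <- 20  # deliberately small, for illustration
boot_y <- replicate(n_boot, rbinom(n, size = 1, prob = p_hat),
                    simplify = FALSE)

if (requireNamespace("spaMM", quietly = TRUE) &&
    requireNamespace("lpSolveAPI", quietly = TRUE)) {
  # Flag each simulated response vector that shows (quasi-)separation.
  flagged <- vapply(
    boot_y,
    function(y_star) isTRUE(spaMM::is_separated(X, y_star, verbose = FALSE)),
    logical(1)
  )
  cat("Bootstrap samples flagged as separated:", sum(flagged),
      "of", n_boot, "\n")
}
```

Flagged samples could then be discarded or refitted with care, since a fit on such a sample would have diverging coefficient estimates.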