Low-level interface to all-variable-subsets selection in ordinary linear regression.
lmSubsets_fit(x, y, weights = NULL, offset = NULL, include = NULL,
              exclude = NULL, nmin = NULL, nmax = NULL,
              tolerance = 0, nbest = 1, ..., pradius = NULL)
x          double[,]---the model matrix
y          double[]---the model response
weights    double[]---the model weights
offset     double[]---the model offset
include    logical[], integer[], character[]---the regressors to force in
exclude    logical[], integer[], character[]---the regressors to force out
nmin       integer---the minimum number of regressors
nmax       integer---the maximum number of regressors
tolerance  double[]---the approximation tolerances
nbest      integer---the number of best subsets
...        ignored
pradius    integer---the preordering radius
A list with the following components:
integer---number of observations in model (before weights processing)
integer---number of observations in model (after weights processing)
integer---number of regressors in model
double[]---model weights
logical---TRUE if the model contains an intercept term, FALSE otherwise
logical[]---regressors forced into the regression
logical[]---regressors forced out of the regression
integer[]---subset sizes
double[]---approximation tolerances
integer---number of best subsets
"data.frame"---submodel information
"data.frame"---variable subsets
The best variable-subset model for every subset size is determined, where the "best" model is the one with the lowest residual sum of squares (RSS).
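The selection criterion can be illustrated with a brute-force sketch in base R. It enumerates every subset of each size and keeps the one with the lowest RSS. This is only an illustration of the criterion, not the search strategy used by lmSubsets_fit(); the mtcars data, the chosen columns, and the helper rss() are assumptions made for the example.

```r
# Brute-force illustration on a built-in data set (base R only).
# The regressors and response below are arbitrary choices for the example.
x <- as.matrix(mtcars[, c("cyl", "disp", "hp", "wt", "qsec")])
y <- mtcars$mpg

# RSS of the least-squares fit on the given columns, with an intercept.
rss <- function(cols) {
  fit <- lm.fit(cbind(1, x[, cols, drop = FALSE]), y)
  sum(fit$residuals^2)
}

# For each subset size k, enumerate all subsets and keep the lowest-RSS one.
best <- lapply(seq_len(ncol(x)), function(k) {
  subsets <- combn(ncol(x), k, simplify = FALSE)
  values <- vapply(subsets, rss, numeric(1))
  i <- which.min(values)
  list(size = k, vars = colnames(x)[subsets[[i]]], rss = values[i])
})

for (b in best) {
  cat(b$size, ":", paste(b$vars, collapse = ", "),
      "RSS =", round(b$rss, 2), "\n")
}
```

Exhaustive enumeration grows exponentially with the number of regressors, so it is practical only for very small problems like this one.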
The regression data is specified with the x, y,
weights, and offset parameters. See
lm.fit() for further details.
To force regressors into or out of the regression, pass a logical,
integer, or character vector identifying them to the include or
exclude parameter, respectively.
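A minimal sketch of forcing regressors in and out, reusing the AirPollution data from the example on this page. The choice of the first and second columns is arbitrary, and the names keep_var and drop_var are introduced only for illustration. The call is guarded so that the snippet does nothing when the lmSubsets package is not installed.

```r
if (requireNamespace("lmSubsets", quietly = TRUE)) {
  library("lmSubsets")
  data("AirPollution", package = "lmSubsets")

  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]

  # Regressors identified by column name (character form of include/exclude).
  keep_var <- colnames(x)[1]  # arbitrary regressor to force in
  drop_var <- colnames(x)[2]  # arbitrary regressor to force out

  f_inc <- lmSubsets_fit(x, y, include = keep_var, exclude = drop_var)
  print(f_inc)
}
```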
The scope of the search can be limited to a range of subset sizes by
setting nmin and nmax, the minimum and maximum number of
regressors allowed in the regression, respectively.
A tolerance vector can be specified to speed up the search,
where tolerance[j] is the approximation tolerance applied to
subset models of size j.
The number of submodels returned for each subset size is determined by
the nbest parameter.
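The search-limiting parameters can be combined in one call. The following is a sketch under assumptions: the values chosen for nmin, nmax, nbest, and tolerance are arbitrary, and a single tolerance value is passed in the same scalar form as the default shown in the usage. The call is guarded so that the snippet does nothing when the lmSubsets package is not installed.

```r
if (requireNamespace("lmSubsets", quietly = TRUE)) {
  library("lmSubsets")
  data("AirPollution", package = "lmSubsets")

  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]

  # Restrict the search to subsets of 2 to 5 regressors and request
  # the 3 best submodels for each size.
  f_rng <- lmSubsets_fit(x, y, nmin = 2, nmax = 5, nbest = 3)
  print(f_rng)

  # Same size limit, with a nonzero approximation tolerance to trade
  # exactness for speed (0.1 is an arbitrary illustrative value).
  f_tol <- lmSubsets_fit(x, y, nmax = 5, tolerance = 0.1)
  print(f_tol)
}
```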
The preordering radius is given with the pradius parameter.
Hofmann M, Gatu C, Kontoghiorghes EJ, Colubi A, Zeileis A (2020). lmSubsets: Exact variable-subset selection in linear regression for R. Journal of Statistical Software, 93, 1--21. doi:10.18637/jss.v093.i03.
lmSubsets() for the high-level interface
lmSelect_fit() for best-subset regression
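library("lmSubsets")
data("AirPollution", package = "lmSubsets")
# model matrix: every column except the response
x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
# response vector
y <- AirPollution[, names(AirPollution) == "mortality"]
# best subset for every size, with default settings
f <- lmSubsets_fit(x, y)
f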