Low-level interface to all-variable-subsets selection in ordinary linear regression.
lmSubsets_fit(x, y, weights = NULL, offset = NULL, include = NULL,
exclude = NULL, nmin = NULL, nmax = NULL,
tolerance = 0, nbest = 1, ..., pradius = NULL)
x: double[,]---the model matrix
y: double[]---the model response
weights: double[]---the model weights
offset: double[]---the model offset
include: logical[], integer[], character[]---the regressors to force in
exclude: logical[], integer[], character[]---the regressors to force out
nmin: integer---the minimum number of regressors
nmax: integer---the maximum number of regressors
tolerance: double[]---the approximation tolerances
nbest: integer---the number of best subsets
...: ignored
pradius: integer---the preordering radius
A list with the following components:
integer---number of observations in model (before weights processing)
integer---number of observations in model (after weights processing)
integer---number of regressors in model
double[]---model weights
logical---is TRUE if model contains an intercept term, FALSE otherwise
logical[]---regressors forced into the regression
logical[]---regressors forced out of the regression
integer[]---subset sizes
double[]---approximation tolerances
integer---number of best subsets
"data.frame"---submodel information
"data.frame"---variable subsets
The best variable-subset model for every subset size is determined, where the "best" model is the one with the lowest residual sum of squares (RSS).
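For intuition, the definition above can be spelled out by exhaustive enumeration. The following is a minimal base-R sketch, not a description of how lmSubsets_fit() searches internally. The mtcars data, the four candidate regressors, the helper rss(), and the convention that the size counts only non-intercept regressors are all assumptions made for this illustration.

```r
# Brute-force illustration of "best subset per size" using base R only.
# mtcars and the four chosen regressors are assumptions for this sketch.
y <- mtcars$mpg
x <- cbind(intercept = 1, as.matrix(mtcars[, c("wt", "hp", "qsec", "drat")]))

# Hypothetical helper: RSS of the least-squares fit on the given columns.
rss <- function(cols) {
  sum(lm.fit(x[, cols, drop = FALSE], y)$residuals^2)
}

regressors <- 2:ncol(x)  # column 1 (the intercept) is always kept

# For every subset size k, enumerate all subsets and keep the lowest RSS.
best <- lapply(seq_along(regressors), function(k) {
  subsets <- combn(regressors, k, simplify = FALSE)
  scores <- vapply(subsets, function(s) rss(c(1, s)), numeric(1))
  list(size = k,
       rss = min(scores),
       vars = colnames(x)[subsets[[which.min(scores)]]])
})

for (b in best) {
  cat(b$size, format(b$rss, digits = 6), paste(b$vars, collapse = ", "), "\n")
}
```

Enumeration of this kind grows exponentially with the number of regressors, so it is only practical for very small problems.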
The regression data is specified with the x, y, weights, and offset parameters. See lm.fit() for further details.
To force regressors into or out of the regression, a list of regressors can be passed as an argument to the include or exclude parameters, respectively.
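As a hedged sketch of how include and exclude might be used, the call below forces one regressor in by integer index and one out by column name. Which columns are chosen is arbitrary and only for illustration. The requireNamespace() guard is an addition so the snippet does nothing when the package is not installed.

```r
if (requireNamespace("lmSubsets", quietly = TRUE)) {
  data("AirPollution", package = "lmSubsets")
  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]

  # Force the first regressor in (integer index) and the last one out
  # (character name); the choice of columns is arbitrary.
  f <- lmSubsets::lmSubsets_fit(x, y,
                                include = 1L,
                                exclude = colnames(x)[ncol(x)])
  print(f)
}
```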
The scope of the search can be limited to a range of subset sizes by setting nmin and nmax, the minimum and maximum number of regressors allowed in the regression, respectively.
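A minimal sketch of restricting the search to a range of sizes follows. The bounds 2 and 5 are arbitrary values picked for illustration, and the requireNamespace() guard is an addition so the snippet is skipped when the package is unavailable.

```r
if (requireNamespace("lmSubsets", quietly = TRUE)) {
  data("AirPollution", package = "lmSubsets")
  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]

  # Only consider submodels with between 2 and 5 regressors
  # (arbitrary bounds chosen for this example).
  f <- lmSubsets::lmSubsets_fit(x, y, nmin = 2L, nmax = 5L)
  print(f)
}
```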
A tolerance vector can be specified to speed up the search, where tolerance[j] is the approximation tolerance applied to subset models of size j.
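The sketch below builds a tolerance vector with one entry per subset size. It assumes the vector should have as many entries as there are columns in x. The value 0.1, and the decision to keep the three smallest sizes exact, are arbitrary illustrations, not recommendations.

```r
if (requireNamespace("lmSubsets", quietly = TRUE)) {
  data("AirPollution", package = "lmSubsets")
  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]

  # One tolerance per subset size j (assumed length: ncol(x)).
  tol <- rep(0.1, ncol(x))
  tol[1:3] <- 0  # keep the search exact for the smallest sizes

  f <- lmSubsets::lmSubsets_fit(x, y, tolerance = tol)
  print(f)
}
```

With the default tolerance = 0 the search is exact; nonzero entries trade exactness for speed at the corresponding sizes.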
The number of submodels returned for each subset size is determined by the nbest parameter.
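For example, the following hedged sketch asks for the three best submodels of each size instead of the default single one. The value 3 is arbitrary, and the requireNamespace() guard is an addition so the snippet is skipped when the package is not installed.

```r
if (requireNamespace("lmSubsets", quietly = TRUE)) {
  data("AirPollution", package = "lmSubsets")
  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]

  # Keep the 3 lowest-RSS submodels per subset size (default: nbest = 1).
  f <- lmSubsets::lmSubsets_fit(x, y, nbest = 3L)
  print(f)
}
```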
The preordering radius is given with the pradius parameter.
Hofmann M, Gatu C, Kontoghiorghes EJ, Colubi A, Zeileis A (2020). lmSubsets: Exact variable-subset selection in linear regression for R. Journal of Statistical Software, 93(3), 1--21. doi:10.18637/jss.v093.i03.
lmSubsets() for the high-level interface
lmSelect_fit() for best-subset regression
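library("lmSubsets")

# Load the example data shipped with the package
data("AirPollution", package = "lmSubsets")

# Model matrix: every column except the response
x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
# Response vector
y <- AirPollution[, "mortality"]

# Find the best subset of every size and print the result
f <- lmSubsets_fit(x, y)
f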