lmSubsets_fit: All-subsets regression

Description

Low-level interface to all-variable-subsets selection in ordinary linear regression.

Usage

lmSubsets_fit(x, y, weights = NULL, offset = NULL, include = NULL,
              exclude = NULL, nmin = NULL, nmax = NULL,
              tolerance = 0, nbest = 1, ..., pradius = NULL)

Arguments

x

double[,]---the model matrix

y

double[]---the model response

weights

double[]---the model weights

offset

double[]---the model offset

include

logical[], integer[], character[]---the regressors to force in

exclude

logical[], integer[], character[]---the regressors to force out

nmin

integer---the minimum number of regressors

nmax

integer---the maximum number of regressors

tolerance

double[]---the approximation tolerances

nbest

integer---the number of best subsets

...

ignored

pradius

integer---the preordering radius

Value

A list with the following components:

NOBS

integer---number of observations in model (before weights processing)

nobs

integer---number of observations in model (after weights processing)

nvar

integer---number of regressors in model

weights

double[]---model weights

intercept

logical---is TRUE if model contains an intercept term, FALSE otherwise

include

logical[]---regressors forced into the regression

exclude

logical[]---regressors forced out of the regression

size

integer[]---subset sizes

tolerance

double[]---approximation tolerances

nbest

integer---number of best subsets

submodel

"data.frame"---submodel information

subset

"data.frame"---variable subsets

Details

The best variable-subset model for every subset size is determined, where the "best" model is the one with the lowest residual sum of squares (RSS).
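The selection criterion can be illustrated with a brute-force sketch in base R. This is not the search used by lmSubsets_fit; it enumerates every subset with combn() and refits with lm.fit(), which is only feasible for a handful of regressors. The mtcars data, the helper rss_of, and the always-included intercept are assumptions made for the illustration.

```r
# Brute-force illustration only: for each subset size, enumerate all
# subsets and keep the one with the lowest residual sum of squares (RSS).
x <- as.matrix(mtcars[, c("cyl", "disp", "hp", "wt", "qsec")])
y <- mtcars$mpg

# Hypothetical helper: RSS of the least-squares fit on the given columns.
# An intercept column is always added (an assumption of this sketch).
rss_of <- function(cols) {
  fit <- lm.fit(cbind(1, x[, cols, drop = FALSE]), y)
  sum(fit$residuals^2)
}

best_by_size <- lapply(seq_len(ncol(x)), function(k) {
  subsets <- combn(ncol(x), k, simplify = FALSE)  # all index sets of size k
  rss <- vapply(subsets, rss_of, numeric(1))
  best <- which.min(rss)
  list(size = k, vars = colnames(x)[subsets[[best]]], rss = rss[best])
})

# One line per subset size: the winning regressors and their RSS
for (b in best_by_size) {
  cat(b$size, ":", paste(b$vars, collapse = " + "),
      " RSS =", format(b$rss), "\n")
}
```

The RSS of the best subset cannot increase with the subset size, which is why the comparison is made within each size rather than across sizes.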

The regression data is specified with the x, y, weights, and offset parameters. See lm.fit() for further details.
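As a sketch of the low-level calling convention referred to above, the following base R code builds a model matrix and a response from a formula and passes them, with weights and an offset, to lm.wfit() (the weighted counterpart of lm.fit()). The mtcars data, the unit weights, and the zero offset are placeholders chosen for the illustration.

```r
# Build the model matrix and response the way lm() does internally
mf <- model.frame(mpg ~ wt + hp + qsec, data = mtcars)
x <- model.matrix(attr(mf, "terms"), mf)  # double[,], has an "(Intercept)" column
y <- model.response(mf)                   # double[]

# Placeholder weights and offset, one value per observation
w   <- rep(1, nrow(x))
off <- rep(0, nrow(x))

# Low-level fit: matrix and vectors instead of formula and data frame
fit <- lm.wfit(x, y, w = w, offset = off)
print(fit$coefficients)
```

Note that model.matrix() adds the "(Intercept)" column for this formula; a matrix assembled by hand contains only the columns supplied.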

To force regressors into or out of the regression, a vector identifying the regressors (a logical mask, integer indices, or column names) can be passed as an argument to the include or exclude parameters, respectively.
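A minimal sketch of forcing regressors in and out, using the AirPollution data from the Examples section. Regressors are identified by column name here; the choice of the first and second column is arbitrary and serves only as an illustration. The call is guarded so that the code is a no-op when the lmSubsets package is not installed.

```r
f_inc <- NULL
if (requireNamespace("lmSubsets", quietly = TRUE)) {
  data("AirPollution", package = "lmSubsets")
  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]

  # Arbitrary choice for illustration: keep the first column in every
  # submodel and drop the second column from every submodel.
  f_inc <- lmSubsets::lmSubsets_fit(x, y,
                                    include = colnames(x)[1],
                                    exclude = colnames(x)[2])

  print(f_inc$include)  # regressors forced into the regression
  print(f_inc$exclude)  # regressors forced out of the regression
}
```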

The scope of the search can be limited to a range of subset sizes by setting nmin and nmax, the minimum and maximum number of regressors allowed in the regression, respectively.

A tolerance vector can be specified to speed up the search, where tolerance[j] is the approximation tolerance applied to subset models of size j.

The number of submodels returned for each subset size is determined by the nbest parameter.
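The search-control parameters described above (nmin, nmax, tolerance, nbest) can be combined in one call, sketched below on the AirPollution data from the Examples section. The particular values are arbitrary, and passing a scalar tolerance is an assumption based on the scalar default in the Usage section. The call is guarded so that the code is a no-op when the lmSubsets package is not installed.

```r
f_ctl <- NULL
if (requireNamespace("lmSubsets", quietly = TRUE)) {
  data("AirPollution", package = "lmSubsets")
  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]

  f_ctl <- lmSubsets::lmSubsets_fit(
    x, y,
    nmin = 2, nmax = 5,  # restrict the search to a range of subset sizes
    tolerance = 0.1,     # assumed: a scalar tolerance, like the default 0
    nbest = 3            # request three submodels per subset size
  )

  print(f_ctl$size)            # subset sizes
  print(f_ctl$nbest)           # number of best subsets
  print(head(f_ctl$submodel))  # submodel information
}
```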

The preordering radius is given with the pradius parameter.

References

Hofmann M, Gatu C, Kontoghiorghes EJ, Colubi A, Zeileis A (2020). lmSubsets: Exact variable-subset selection in linear regression for R. Journal of Statistical Software, 93(3), 1--21. doi:10.18637/jss.v093.i03.

Examples

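## Attach the package that provides lmSubsets_fit() and the data
library("lmSubsets")
data("AirPollution", package = "lmSubsets")

## Model matrix: every column except the response
x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
## Response vector
y <- AirPollution[, names(AirPollution) == "mortality"]

## All-subsets search with default settings
f <- lmSubsets_fit(x, y)
f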