lmSubsets (version 0.5-2)

lmSelect: Best-subset regression

Description

Best-variable-subset selection in ordinary linear regression.

Usage

lmSelect(formula, ...)

# S3 method for default
lmSelect(formula, data, subset, weights, na.action, model = TRUE, x = FALSE, y = FALSE, contrasts = NULL, offset, ...)

Arguments

formula, data, subset, weights, na.action, model, x, y, contrasts, offset

Arguments of the standard formula interface; see lm() for details.

...

Further arguments, forwarded to lmSelect_fit().

Value

An object of class "lmSelect", i.e., a list containing the components returned by lmSelect_fit().

Further components include call, na.action, weights, offset, contrasts, xlevels, terms, mf, x, and y. See lm() for more information.

Details

The lmSelect() generic provides various methods to conveniently specify the regressor and response variables. The standard formula interface (see lm()) can be used, or the model information can be extracted from an already fitted "lm" object. The model matrix and response can also be passed in directly.

After processing the arguments, the call is forwarded to lmSelect_fit().
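The three ways of specifying the model described above can be sketched as follows. This is a minimal illustration, not taken from the package documentation: it assumes that the lmSubsets package is installed, that all columns of AirPollution are numeric, and that the matrix method takes the response as its second argument. The object names (sel_formula, sel_lm, sel_matrix) are made up for this example.

```r
## Sketch only: runs only if the lmSubsets package is available.
if (requireNamespace("lmSubsets", quietly = TRUE)) {
  library("lmSubsets")
  data("AirPollution", package = "lmSubsets")

  ## 1. standard formula interface
  sel_formula <- lmSelect(mortality ~ ., data = AirPollution, nbest = 5)

  ## 2. model information extracted from an already fitted "lm" object
  fit <- lm(mortality ~ ., data = AirPollution)
  sel_lm <- lmSelect(fit, nbest = 5)

  ## 3. model matrix and response passed in directly
  ##    (assumption: the response is the second argument of the matrix method)
  x <- as.matrix(AirPollution[, setdiff(names(AirPollution), "mortality")])
  y <- AirPollution$mortality
  sel_matrix <- lmSelect(x, y, nbest = 5)
}
```

In all three cases the arguments not consumed by the method (here nbest) are passed on to lmSelect_fit().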

Examples

# NOT RUN {
## load data
data("AirPollution", package = "lmSubsets")


###################
##  basic usage  ##
###################

## fit 20 best subsets (BIC)
lm_best <- lmSelect(mortality ~ ., data = AirPollution, nbest = 20)
lm_best

## summary statistics
summary(lm_best)

## visualize
plot(lm_best)


########################
##  custom criterion  ##
########################

## the same as above, but with a custom criterion:
M <- nrow(AirPollution)

ll <- function (rss) {
  -M/2 * (log(2 * pi) - log(M) + log(rss) + 1)
}

aic <- function (size, rss, k = 2) {
  -2 * ll(rss) + k * (size + 1)
}

bic <- function (size, rss) {
  aic(size, rss, k = log(M))
}

lm_cust <- lmSelect(mortality ~ ., data = AirPollution,
                    penalty = bic, nbest = 20)
lm_cust
# }
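The custom criterion in the example is built from the Gaussian log-likelihood written as a function of the residual sum of squares. The following sketch checks those formulas against base R's logLik(), AIC() and BIC() for an ordinary lm() fit. It uses only base R and the built-in mtcars data, not lmSubsets; treating size as the number of estimated coefficients including the intercept is an assumption made for this check.

```r
## Sanity check of the ll/aic/bic formulas using base R only.
fit <- lm(mpg ~ wt + hp, data = mtcars)

M    <- nrow(mtcars)            # number of observations
rss  <- sum(residuals(fit)^2)   # residual sum of squares
size <- length(coef(fit))       # assumption: counts the intercept as well

## Gaussian log-likelihood expressed through the RSS
ll <- function (rss) {
  -M/2 * (log(2 * pi) - log(M) + log(rss) + 1)
}

## the "+ 1" accounts for the estimated error variance
aic <- function (size, rss, k = 2) {
  -2 * ll(rss) + k * (size + 1)
}

bic <- function (size, rss) {
  aic(size, rss, k = log(M))
}

## compare with the reference implementations in stats
stopifnot(
  isTRUE(all.equal(ll(rss), as.numeric(logLik(fit)))),
  isTRUE(all.equal(aic(size, rss), AIC(fit))),
  isTRUE(all.equal(bic(size, rss), BIC(fit)))
)
```

If the checks pass silently, the hand-written bic() agrees with stats::BIC() for this fit, which is consistent with the example's claim that the custom criterion reproduces the default BIC-based selection.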