# lmSubsets

##### All-Subsets Regression

All-subsets regression for linear models estimated by ordinary least squares (OLS).

- Keywords
- regression

##### Usage

`lmSubsets(formula, …)`

# S3 method for default
lmSubsets(formula, data, subset, weights, na.action, model = TRUE,
x = FALSE, y = FALSE, contrasts = NULL, offset, …)

# S3 method for matrix
lmSubsets(formula, y, intercept = TRUE, …)

lmSubsets_fit(x, y, weights = NULL, offset = NULL, include = NULL,
exclude = NULL, nmin = NULL, nmax = NULL,
tolerance = 0, nbest = 1, …, pradius = NULL)

##### Arguments

- formula, data, subset, weights, na.action, model, x, y, contrasts, offset
Standard formula interface.

- intercept
Include intercept.

- include, exclude
Force regressors in or out.

- nmin, nmax
Minimum and maximum number of regressors.

- tolerance
Approximation tolerance (vector).

- nbest
Number of best subsets.

- …
Forwarded to `lmSubsets.default` and `lmSubsets_fit`.

- pradius
Preordering radius.

##### Details

The `lmSubsets` generic provides various methods to conveniently
specify the regressor and response variables. The standard
`formula` interface (see `lm`) can be used, or
the information can be extracted from an already fitted `"lm"`
object. The regressor matrix and response variable can also be passed
in directly (see Examples).

The call is forwarded to `lmSubsets_fit`, which provides a
low-level matrix interface.
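As a rough illustration of what moving from a formula to a matrix interface involves, the sketch below uses only base R and the built-in `mtcars` data. It shows the usual `lm`-style path from a formula to a regressor matrix and response vector, with `lm.fit` standing in as a matrix-level OLS routine. How `lmSubsets` builds these objects internally is not documented on this page, so this is an assumption about the general pattern rather than a description of the package.

```r
# Base-R sketch: reduce a formula call to a regressor matrix and a response.
# Illustration only; lm.fit() is a stand-in for a low-level matrix interface.
mf <- model.frame(mpg ~ wt + hp, data = mtcars)
x  <- model.matrix(attr(mf, "terms"), mf)  # includes an "(Intercept)" column
y  <- model.response(mf)

# Fit once through the matrix interface and once through the formula interface
fit_matrix  <- lm.fit(x, y)
fit_formula <- lm(mpg ~ wt + hp, data = mtcars)

# Both routes yield the same OLS coefficients
same_coef <- all.equal(unname(fit_matrix$coefficients),
                       unname(coef(fit_formula)))
print(same_coef)
```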

The `nbest` best subset models for every subset size are
computed, where the "best" models are the models with the lowest
residual sum of squares (RSS). The scope of the search can be limited
to a range of subset sizes by setting `nmin` and `nmax`. A
tolerance vector (expanded if necessary) may be specified to speed up
the search, where `tolerance[j]` is the tolerance applied to
subset models of size `j`.
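To make the notion of "the `nbest` lowest-RSS models per subset size" concrete, here is a brute-force sketch in base R. It is not the package's algorithm (the references describe branch-and-bound methods). The helper `best_subsets_bruteforce` is a hypothetical name introduced for illustration, and "size" here counts regressors without the intercept, which may differ from the package's convention.

```r
# Brute-force illustration of "nbest lowest-RSS models per subset size".
# Hypothetical helper, base R only; NOT how lmSubsets performs the search.
best_subsets_bruteforce <- function(data, response, nmin = 1, nmax = NULL,
                                    nbest = 1) {
  regressors <- setdiff(names(data), response)
  if (is.null(nmax)) nmax <- length(regressors)
  results <- list()
  for (size in nmin:nmax) {
    # Enumerate every combination of `size` regressors
    candidates <- combn(regressors, size, simplify = FALSE)
    # Residual sum of squares of the OLS fit for each candidate
    rss <- vapply(candidates, function(vars) {
      fit <- lm(reformulate(vars, response = response), data = data)
      sum(residuals(fit)^2)
    }, numeric(1))
    keep <- head(order(rss), nbest)  # indices of the nbest lowest-RSS models
    results[[length(results) + 1]] <- data.frame(
      size = size,
      rank = seq_along(keep),
      rss = rss[keep],
      variables = vapply(candidates[keep], paste, character(1),
                         collapse = " + ")
    )
  }
  do.call(rbind, results)
}

dat <- mtcars[, c("mpg", "wt", "hp", "disp", "qsec")]
res <- best_subsets_bruteforce(dat, "mpg", nmin = 1, nmax = 3, nbest = 2)
print(res)
```

Exhaustive enumeration like this grows combinatorially with the number of regressors, so it is only practical for very small problems.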

By way of `include` and `exclude`, variables may be forced
into or out of the regression, respectively.
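The effect of forced inclusion and exclusion on the candidate set can be sketched in base R as follows. The helper `candidate_subsets` is hypothetical and only enumerates which subsets remain eligible; it says nothing about how the package implements the constraint. The names `nox`, `so2`, and `whitecollar` come from the Examples section, while `v1` and `v2` are placeholder regressors.

```r
# Hypothetical helper: list the subsets of a given size that honor forced
# inclusion and exclusion. Illustration only, not lmSubsets internals.
candidate_subsets <- function(regressors, size, include = NULL,
                              exclude = NULL) {
  stopifnot(all(include %in% regressors),
            all(exclude %in% regressors),
            length(intersect(include, exclude)) == 0)
  free   <- setdiff(regressors, c(include, exclude))  # still selectable
  n_free <- size - length(include)                    # slots left to fill
  if (n_free < 0 || n_free > length(free)) return(list())
  if (n_free == 0) return(list(include))
  lapply(combn(free, n_free, simplify = FALSE),
         function(extra) c(include, extra))
}

# "v1" and "v2" are placeholder names, not variables from AirPollution
regs <- c("nox", "so2", "whitecollar", "v1", "v2")
sets <- candidate_subsets(regs, size = 3,
                          include = c("nox", "so2"),
                          exclude = "whitecollar")
print(sets)
```

Every remaining candidate contains the forced-in variables and none contains the excluded one, so the search space shrinks accordingly.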

The extent to which variables are preordered is controlled with the
`pradius` parameter.

A set of standard extractor functions for fitted model objects is
available for objects of class `"lmSubsets"`. See
`methods` for more details.

The `summary` method can be called to obtain summary statistics.

##### Value

An object of class `"lmSubsets"`, i.e., a list with the
following components:

Number of observations, of variables.

`TRUE` if model has intercept term; `FALSE` otherwise.

Included, excluded regressors.

Subset sizes.

Approximation tolerance (vector).

Number of best subsets.

Submodel information.

Selected variables.

Further components include `call`, `na.action`, `weights`, `offset`, `contrasts`, `xlevels`, `terms`, `mf`, `x`, and `y`. See `lm` for more information.

##### References

Hofmann M, Gatu C, Kontoghiorghes EJ, Colubi A, Zeileis A (2020).
lmSubsets: Exact Variable-Subset Selection in Linear Regression for
R. *Journal of Statistical Software*, **93**, 1--21.
doi:10.18637/jss.v093.i03.

Hofmann M, Gatu C, Kontoghiorghes EJ (2007). Efficient Algorithms for
Computing the Best Subset Regression Models for Large-Scale Problems.
*Computational Statistics & Data Analysis*, **52**, 16--29.
doi:10.1016/j.csda.2007.03.017.

Gatu C, Kontoghiorghes EJ (2006). Branch-and-Bound Algorithms for
Computing the Best Subset Regression Models. *Journal of
Computational and Graphical Statistics*, **15**, 139--156.
doi:10.1198/106186006x100290.

##### Examples

```
## attach the package so that lmSubsets() and its methods are available
library(lmSubsets)

## load data (with logs for relative potentials)
data("AirPollution", package = "lmSubsets")

###################
##  basic usage  ##
###################

## canonical example: fit all subsets
lm_all <- lmSubsets(mortality ~ ., data = AirPollution, nbest = 5)
lm_all

## plot RSS and BIC
plot(lm_all)

## summary statistics
summary(lm_all)

############################
##  forced in-/exclusion  ##
############################

lm_force <- lmSubsets(lm_all, include = c("nox", "so2"),
                      exclude = "whitecollar")
lm_force

########################
##  matrix interface  ##
########################

## same as above
x <- as.matrix(AirPollution)
lm_mat <- lmSubsets(x, y = "mortality")
lm_mat
```

*Documentation reproduced from package lmSubsets, version 0.5-1, License: GPL (>= 3)*