regsubsets: functions for model selection

Description

Model selection by exhaustive search, forward or backward stepwise selection, or sequential replacement.

Usage

regsubsets(x=, ...)
# S3 method for formula
regsubsets(x=, data=, weights=NULL, nbest=1, nvmax=8,
 force.in=NULL, force.out=NULL, intercept=TRUE,
 method=c("exhaustive", "backward", "forward", "seqrep"),
 really.big=FALSE, nested=(nbest==1), ...)
# S3 method for default
regsubsets(x=, y=, weights=rep(1, length(y)), nbest=1, nvmax=8,
 force.in=NULL, force.out=NULL, intercept=TRUE,
 method=c("exhaustive", "backward", "forward", "seqrep"),
 really.big=FALSE, nested=(nbest==1), ...)
# S3 method for biglm
regsubsets(x, nbest=1, nvmax=8, force.in=NULL,
 method=c("exhaustive", "backward", "forward", "seqrep"),
 really.big=FALSE, nested=(nbest==1), ...)
# S3 method for regsubsets
summary(object, all.best=TRUE, matrix=TRUE, matrix.logical=FALSE, df=NULL, ...)
# S3 method for regsubsets
coef(object, id, vcov=FALSE, ...)
# S3 method for regsubsets
vcov(object, id, ...)

Value

regsubsets returns an object of class "regsubsets" containing no user-serviceable parts. It is designed to be processed by summary.regsubsets.

summary.regsubsets returns an object with elements

which: A logical matrix indicating which elements are in each model
rsq: The r-squared for each model
rss: Residual sum of squares for each model
adjr2: Adjusted r-squared
cp: Mallows' Cp
bic: Schwarz's information criterion, BIC
outmat: A version of the which component that is formatted for printing
obj: A copy of the regsubsets object

The coef method returns a coefficient vector or list of vectors; the vcov method returns a matrix or list of matrices.
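
As a minimal sketch of how the summary components might be used, the following picks the model with the smallest BIC on the built-in swiss data and lists the terms it contains. The object names fit, s and best are illustrative only and are not part of the package.

```r
library(leaps)

# Best subset of each size on the built-in swiss data
fit <- regsubsets(Fertility ~ ., data = swiss)
s <- summary(fit)

# One row per recorded model, one column per term
dim(s$which)

# Row index of the model with the smallest BIC
best <- which.min(s$bic)

# Names of the terms included in that model
colnames(s$which)[s$which[best, ]]
```

The same pattern applies to the other criteria, for example which.max(s$adjr2) or which.min(s$cp).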

Arguments

x: design matrix or model formula for full model, or biglm object
data: Optional data frame
y: response vector
weights: weight vector
nbest: number of subsets of each size to record
nvmax: maximum size of subsets to examine
force.in: index to columns of design matrix that should be in all models
force.out: index to columns of design matrix that should be in no models
intercept: Add an intercept?
method: Use exhaustive search, forward selection, backward selection or sequential replacement to search.
really.big: Must be TRUE to perform exhaustive search on more than 50 variables.
nested: See the Note below: if nested=FALSE, models with columns 1, 1 and 2, 1-3, and so on, will also be considered
object: regsubsets object
all.best: Show all the best subsets or just one of each size
matrix: Show a matrix of the variables in each model or just summary statistics
matrix.logical: With matrix=TRUE, whether the matrix is logical (TRUE/FALSE) or character ("*"/" ")
df: Specify a number of degrees of freedom for the summary statistics. The default is n-1.
id: Which model or models (ordered as in the summary output) to return coefficients and variance matrix for
vcov: If TRUE, return the variance-covariance matrix as an attribute
...: Other arguments for future methods
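
To illustrate the method and nvmax arguments, here is a small sketch comparing exhaustive search with forward selection on the swiss data. The names ex and fw are illustrative, and nvmax=5 is chosen only because swiss has five predictors.

```r
library(leaps)

# Exhaustive search (the default) and forward selection on the same data
ex <- summary(regsubsets(Fertility ~ ., data = swiss, nvmax = 5))
fw <- summary(regsubsets(Fertility ~ ., data = swiss, nvmax = 5,
                         method = "forward"))

# Residual sum of squares by subset size for each method
cbind(exhaustive = ex$rss, forward = fw$rss)
```

Because exhaustive search examines every subset, its residual sum of squares at each size cannot exceed that of forward selection.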

Details

This function returns the best models of each size up to nvmax. Because model selection criteria such as AIC, BIC, CIC, DIC, ... differ only in how they compare models of different sizes, the results do not depend on the choice of cost-complexity tradeoff.
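
The point can be checked with a short sketch, assuming the swiss data and nbest=3 purely for illustration: within a given size, ranking the recorded models by RSS, Cp, or BIC gives the same order, and the criteria only disagree when comparing across sizes.

```r
library(leaps)

# Keep the three best subsets of each size
fit <- regsubsets(Fertility ~ ., data = swiss, nbest = 3)
s <- summary(fit)

# Number of terms (including the intercept) in each recorded model
size <- rowSums(s$which)

# Within each size, ranking by RSS, Cp, or BIC gives the same order
same_order <- sapply(split(seq_along(size), size), function(idx) {
  identical(order(s$rss[idx]), order(s$cp[idx])) &&
    identical(order(s$rss[idx]), order(s$bic[idx]))
})
all(same_order)

# Row index of the preferred model under each criterion
c(adjr2 = which.max(s$adjr2), cp = which.min(s$cp), bic = which.min(s$bic))
```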

When x is a biglm object it is assumed to be the full model, so force.out is not relevant. If there is an intercept it is forced in by default; specify force.in as a logical vector with FALSE as the first element to allow the intercept to be dropped.

The model search does not actually fit each model, so the returned object does not contain coefficients or standard errors. Coefficients and the variance-covariance matrix for one or more models can be obtained with the coef and vcov methods.
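
As a hedged cross-check of the coef method, the sketch below extracts the coefficients of the best three-variable model and compares them with an ordinary lm() refit of the same variables. The names cf, vars and refit are illustrative, and the comparison is done by name because the two orderings need not match.

```r
library(leaps)

fit <- regsubsets(Fertility ~ ., data = swiss)

# Coefficients of the best three-variable model, computed on request
cf <- coef(fit, 3)

# Refit the same model with lm() as a cross-check
vars <- setdiff(names(cf), "(Intercept)")
refit <- lm(reformulate(vars, response = "Fertility"), data = swiss)

# Compare by name, since the two orderings need not match
all.equal(unname(cf[names(coef(refit))]), unname(coef(refit)))
```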

Examples

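library(leaps)
data(swiss)
# Default method: design matrix and response vector
a <- regsubsets(as.matrix(swiss[, -1]), swiss[, 1])
summary(a)
# Formula method, keeping the two best subsets of each size
b <- regsubsets(Fertility ~ ., data = swiss, nbest = 2)
summary(b)

# Coefficients for the best models of sizes 1 to 3
coef(a, 1:3)
# Variance-covariance matrix for the best three-variable model
vcov(a, 3)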