selection: Selecting a subset of `q` variables

Description

Main function for selecting the best subset of $q$ variables. The selection procedure can be used with models fitted by lm, glm or gam.

Usage

selection(x, y, q, criterion = "deviance", method = "lm", family = "gaussian", 
seconds = FALSE, nmodels = 1)

Arguments

x

A data frame containing all the covariates.

y

A vector with the response values.

q

An integer specifying the size of the subset of variables to be selected.

criterion

The cross-validation-based information criterion to be used. The default is the deviance ("deviance"). The other criteria provided are the coefficient of determination ("R2") and the residual variance ("variance").

method

A character string specifying which regression method is used: linear models ("lm"), generalized linear models ("glm") or generalized additive models ("gam").

family

The distribution and link to use in fitting, given as one of "gaussian", "binomial" or "poisson".

seconds

A logical value, FALSE by default. If TRUE, the function returns a few of the best (equivalent) models rather than only the single best one.

nmodels

Number of secondary models to be returned when seconds = TRUE.
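To make the roles of `q` and `criterion` more concrete, the following is a minimal sketch in base R of what "the best subset of size q under a cross-validated deviance" can mean for linear models. It is an illustration only and is not the algorithm implemented in the package: it searches all subsets exhaustively, and `cv_deviance()` and `best_subset()` are hypothetical helper names that are not part of FWDselect. The built-in `mtcars` data set is assumed purely for demonstration.

```r
# Illustrative sketch only; NOT the FWDselect implementation.
# cv_deviance() and best_subset() are hypothetical helpers.

# k-fold cross-validated deviance (sum of squared prediction errors)
# of a linear model that uses the columns of x indexed by `vars`.
cv_deviance <- function(x, y, vars, k = 5, seed = 1) {
  set.seed(seed)  # same folds for every candidate subset
  n <- length(y)
  folds <- sample(rep(seq_len(k), length.out = n))
  dat <- data.frame(y = y, x[, vars, drop = FALSE])
  total <- 0
  for (i in seq_len(k)) {
    test <- folds == i
    fit <- lm(y ~ ., data = dat[!test, , drop = FALSE])
    pred <- predict(fit, newdata = dat[test, , drop = FALSE])
    total <- total + sum((dat$y[test] - pred)^2)
  }
  total
}

# Exhaustive search over all subsets of size q (feasible only for few covariates).
best_subset <- function(x, y, q, k = 5) {
  candidates <- combn(ncol(x), q, simplify = FALSE)
  scores <- vapply(candidates,
                   function(v) cv_deviance(x, y, v, k = k),
                   numeric(1))
  best <- candidates[[which.min(scores)]]
  list(variables = names(x)[best], numbers = best, deviance = min(scores))
}

x <- mtcars[, -1]   # covariates
y <- mtcars$mpg     # response
res <- best_subset(x, y, q = 2)
res$variables
```

An exhaustive search grows combinatorially with the number of covariates, which is one reason a dedicated selection routine is preferable in practice.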

Value

Best model

The best model. If seconds = TRUE, the best alternative models are also returned.

Variable name

Names of the selected variables.

Variable number

Numbers of the selected variables.

Information criterion

Information criterion used and its value.

Prediction

The prediction of the best model.

Examples

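library(FWDselect)
data(pollution)

# Column 19 is used as the response; the remaining columns are the covariates.
x <- pollution[, -19]
y <- pollution[, 19]

# Best single-variable linear model, chosen by deviance.
obj1 <- selection(x, y, q = 1, method = "lm", criterion = "deviance")
obj1

# Same search, additionally returning two secondary models.
obj11 <- selection(x, y, q = 1, method = "lm", criterion = "deviance",
                   seconds = TRUE, nmodels = 2)
obj11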
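The examples above use method = "lm" only. As a hedged sketch of how the `method` and `family` arguments might be combined for a binary response, the following simulates a small data set and requests a subset of two variables with a binomial GLM. The simulated data and the object name `obj_glm` are assumptions made for illustration, and the call is only attempted if FWDselect is installed.

```r
# Sketch under assumptions: simulated data, not one of the package's data sets.
set.seed(42)
n <- 200
x <- data.frame(matrix(rnorm(n * 5), ncol = 5))  # covariates named X1..X5

# In this simulation only X1 and X3 influence the binary outcome.
y <- rbinom(n, size = 1, prob = plogis(1.5 * x$X1 - 1 * x$X3))

if (requireNamespace("FWDselect", quietly = TRUE)) {
  # Arguments follow the Usage section: method = "glm" with family = "binomial".
  obj_glm <- FWDselect::selection(x, y, q = 2, method = "glm",
                                  family = "binomial", criterion = "deviance")
  print(obj_glm)
}
```

With data generated this way one would hope to see X1 and X3 selected, although the outcome depends on the random sample.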
