
FWDselect (version 1.1)

selection: Selecting a subset of q variables

Description

Main function for selecting the best subset of $q$ variables. Note that the selection procedure can be used with lm, glm or gam functions.
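As a rough illustration of what selecting a subset of q variables can look like, the sketch below runs a greedy forward search in base R. It is not the package's implementation: `forward_subset` is a hypothetical helper, the search strategy is an assumption suggested only by the package name, and it scores candidates by in-sample residual deviance from `lm` rather than by a cross-validated criterion. The built-in `mtcars` data stand in for real covariates.

```r
# Hypothetical helper: greedy forward search for q covariates with lm().
# Scores candidates by in-sample residual deviance; not the FWDselect code.
forward_subset <- function(x, y, q) {
  stopifnot(is.data.frame(x), q >= 1, q <= ncol(x))
  selected <- character(0)
  for (step in seq_len(q)) {
    candidates <- setdiff(names(x), selected)
    # Deviance of the model obtained by adding each remaining covariate
    dev <- sapply(candidates, function(v) {
      d <- data.frame(y = y, x[, c(selected, v), drop = FALSE])
      deviance(lm(y ~ ., data = d))
    })
    # Keep the covariate that lowers the deviance the most
    selected <- c(selected, candidates[which.min(dev)])
  }
  selected
}

x <- mtcars[, -1]   # covariates from a built-in data set
y <- mtcars$mpg     # response
best2 <- forward_subset(x, y, q = 2)
print(best2)
```

A greedy search fits far fewer models than an exhaustive search over all subsets of size q, at the cost of possibly missing the globally best subset.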

Usage

selection(x, y, q, criterion = "deviance", method = "lm", family = "gaussian", 
seconds = FALSE, nmodels = 1)

Arguments

x
A data frame containing all the covariates.
y
A vector with the response values.
q
An integer specifying the size of the subset of variables to be selected.
criterion
The cross-validation-based information criterion to be used. The default is the deviance ("deviance"). The other options are the coefficient of determination ("R2") and the residual variance ("variance").
method
A character string specifying which regression method is used, i.e., linear models ("lm"), generalized linear models ("glm") or generalized additive models ("gam").
family
This is a family object specifying the distribution and link to use in fitting: "gaussian", "binomial" or "poisson".
seconds
A logical value, FALSE by default. If TRUE, the function returns not only the single best model but also a few of the best alternative (equivalent) models.
nmodels
Number of secondary models to be returned.
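
To make the three `criterion` options more concrete, the sketch below computes cross-validated versions of the deviance, R2 and residual variance for a fixed set of covariates under `lm`. It is a sketch under assumptions, not the package's code: `cv_criteria` is a hypothetical helper, the 5-fold scheme and the exact formulas are choices made here for illustration, and `mtcars` is used as stand-in data.

```r
# Hypothetical helper: k-fold cross-validated criteria for an lm() fit
# on the covariates named in `vars`. Fold scheme and formulas are assumptions.
cv_criteria <- function(x, y, vars, k = 5, seed = 1) {
  set.seed(seed)
  d <- data.frame(y = y, x[, vars, drop = FALSE])
  # Assign each observation to one of k folds at random
  fold <- sample(rep(seq_len(k), length.out = nrow(d)))
  pred <- numeric(nrow(d))
  for (i in seq_len(k)) {
    test <- fold == i
    fit <- lm(y ~ ., data = d[!test, , drop = FALSE])
    pred[test] <- predict(fit, newdata = d[test, , drop = FALSE])
  }
  res <- y - pred  # out-of-fold residuals
  c(deviance = sum(res^2),                       # Gaussian deviance (RSS)
    R2 = 1 - sum(res^2) / sum((y - mean(y))^2),  # out-of-sample R2
    variance = var(res))                         # variance of the residuals
}

x <- mtcars[, -1]
y <- mtcars$mpg
crit_one <- cv_criteria(x, y, vars = "wt")
crit_two <- cv_criteria(x, y, vars = c("wt", "cyl"))
print(rbind(crit_one, crit_two))
```

Lower deviance or variance, or higher R2, indicates the better candidate subset under this scheme.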

Value

  • Best model: The best model. If seconds = TRUE, the best alternative models are also returned.
  • Variable name: Names of the selected variables.
  • Variable number: Numbers of the selected variables.
  • Information criterion: The information criterion used and its value.
  • Prediction: The prediction of the best model.

Examples

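library(FWDselect)
data(pollution)
x <- pollution[, -19]  # covariates: every column except the 19th
y <- pollution[, 19]   # response: the 19th column

# Best single-variable linear model, judged by deviance
obj1 <- selection(x, y, q = 1, method = "lm", criterion = "deviance")
obj1

# Same search, also returning two alternative models
obj11 <- selection(x, y, q = 1, method = "lm", criterion = "deviance",
                   seconds = TRUE, nmodels = 2)
obj11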
