stabs (version 0.6-3)

Fitting Functions: Fit Functions for Stability Selection

Description

Functions that fit a model until \(q\) variables are selected and that return the indices (and names) of the selected variables.

Usage

## package lars:
lars.lasso(x, y, q, ...)
lars.stepwise(x, y, q, ...)

## package glmnet:
glmnet.lasso(x, y, q, type = c("conservative", "anticonservative"), ...)
glmnet.lasso_maxCoef(x, y, q, ...)

Arguments

x

a matrix containing the predictors or an object of class "mboost".

y

a vector or matrix containing the outcome.

q

number of (unique) variables (or groups of variables, depending on the model) that are selected on each subsample.

type

a character vector specifying whether the number of selected variables per subsample is \(\leq q\) (type = "conservative") or \(\geq q\) (type = "anticonservative"). The conservative version ensures that the PFER is controlled.

...

additional arguments passed to the underlying fitting function. See the example on glmnet.lasso_maxCoef in stabsel for the specification of additional arguments via stabsel.
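
The difference between the two values of type can be illustrated on a toy selection path. The following is only a sketch in base R: the path matrix is made up, and pick_step is a hypothetical helper that is not part of stabs and does not claim to reproduce how glmnet.lasso chooses its step internally. It merely shows that when several variables enter in a single step, no step may select exactly \(q\) variables, so one has to stop either below or above \(q\).

```r
## Toy selection path: rows are variables, columns are steps (TRUE = selected).
## The values are invented purely for illustration.
path <- cbind(step1 = c(TRUE, FALSE, FALSE, FALSE, FALSE),
              step2 = c(TRUE, TRUE,  FALSE, FALSE, FALSE),
              step3 = c(TRUE, TRUE,  TRUE,  TRUE,  FALSE),
              step4 = c(TRUE, TRUE,  TRUE,  TRUE,  TRUE))
rownames(path) <- paste0("x", 1:5)

## Hypothetical helper (NOT part of stabs): choose one step of the path.
pick_step <- function(path, q, type = c("conservative", "anticonservative")) {
    type <- match.arg(type)
    n_sel <- colSums(path)            # number of selected variables per step
    if (type == "conservative") {
        ok <- which(n_sel <= q)       # steps with at most q variables
        if (length(ok) == 0) {
            ## no admissible step: select nothing
            return(setNames(rep(FALSE, nrow(path)), rownames(path)))
        }
        step <- max(ok)               # last step that does not exceed q
    } else {
        ok <- which(n_sel >= q)       # steps with at least q variables
        step <- if (length(ok) > 0) min(ok) else ncol(path)
    }
    path[, step]
}

sel_cons <- pick_step(path, q = 3, type = "conservative")
sel_anti <- pick_step(path, q = 3, type = "anticonservative")
sum(sel_cons)   # 2 variables, i.e. <= q
sum(sel_anti)   # 4 variables, i.e. >= q
```

In this toy path the step sizes are 1, 2, 4 and 5, so with q = 3 the conservative choice stops at two variables and the anticonservative choice at four.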

Value

A named list with elements

selected

logical. A vector that indicates which variables were selected.

path

logical. A matrix that indicates which variable was selected in which step. Each row represents one variable, the columns represent the steps.
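
To make the return structure concrete, here is a minimal sketch of a function that takes the same arguments (x, y, q, ...) and returns a list with the elements selected and path in the shape described above. The function marginal.topq and its ranking rule (absolute marginal correlation) are hypothetical and chosen only because they need nothing beyond base R; they are not one of the fitting functions shipped with stabs.

```r
## Hypothetical fit function (NOT part of stabs): rank variables by absolute
## marginal correlation with y and select the q top-ranked ones.
marginal.topq <- function(x, y, q, ...) {
    score <- abs(drop(cor(x, y)))
    ord <- order(score, decreasing = TRUE)[seq_len(q)]

    ## logical vector: which variables were selected
    selected <- logical(ncol(x))
    names(selected) <- colnames(x)
    selected[ord] <- TRUE

    ## logical matrix: one row per variable, one column per step;
    ## column k marks the k top-ranked variables
    path <- vapply(seq_len(q),
                   function(k) seq_len(ncol(x)) %in% ord[seq_len(k)],
                   FUN.VALUE = logical(ncol(x)))
    rownames(path) <- colnames(x)

    list(selected = selected, path = path)
}

## simulated data, for illustration only
set.seed(1)
x <- matrix(rnorm(500), nrow = 100, ncol = 5,
            dimnames = list(NULL, paste0("x", 1:5)))
y <- 2 * x[, 1] - 2 * x[, 3] + rnorm(100)

fit <- marginal.topq(x, y, q = 2)
fit$selected
fit$path
```

The last column of path coincides with selected, because the final step is the one at which q variables have been reached.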

Details

All fitting functions are named after the package and the type of model that is fitted: package_name.model, e.g., glmnet.lasso stands for a lasso model that is fitted using the package glmnet.

glmnet.lasso_maxCoef fits a lasso model with a given penalty parameter and returns the q largest coefficients. To use glmnet.lasso_maxCoef, one must specify the penalty parameter lambda, either directly via the ... argument or, when calling stabsel, via args.fitfun = list(lambda = ). Note that for the other fitting functions the penalty parameter usually cannot be specified; it is chosen such that q variables are selected. For an example of how to use glmnet.lasso_maxCoef, see stabsel.
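
As a rough sketch of the second route, the call below passes lambda to glmnet.lasso_maxCoef through the args.fitfun argument of stabsel. The data are simulated and the value lambda = 0.05 is an arbitrary assumption made for illustration, not a recommendation; in practice one would choose lambda from the data. The block only runs if both stabs and glmnet are installed.

```r
## simulated data, for illustration only
set.seed(1)
x <- matrix(rnorm(720), nrow = 80, ncol = 9,
            dimnames = list(NULL, paste0("x", 1:9)))
y <- x[, 1] - x[, 2] + rnorm(80)

stab.maxCoef <- NULL
if (requireNamespace("stabs", quietly = TRUE) &&
    requireNamespace("glmnet", quietly = TRUE)) {
    stab.maxCoef <- stabs::stabsel(x = x, y = y,
                                   fitfun = stabs::glmnet.lasso_maxCoef,
                                   ## lambda is handed on to the fit function;
                                   ## 0.05 is an arbitrary illustrative value
                                   args.fitfun = list(lambda = 0.05),
                                   cutoff = 0.75, PFER = 1)
    print(stab.maxCoef)
}
```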

See Also

stabsel for stability selection itself, and quic.graphical_model for stability selection for graphical models.

Examples

# NOT RUN {
  if (require("TH.data")) {
      ## make data set available
      data("bodyfat", package = "TH.data")
  } else {
      ## simulate some data if TH.data is not available.
      ## Note that results are nonsense with this data.
      bodyfat <- matrix(rnorm(720), nrow = 72, ncol = 10)
  }
  
  if (require("lars")) {
      ## selected variables
      lars.lasso(bodyfat[, -2], bodyfat[,2], q = 3)$selected
      lars.stepwise(bodyfat[, -2], bodyfat[,2], q = 3)$selected
  }
  
  if (require("glmnet")) {
      glmnet.lasso(bodyfat[, -2], bodyfat[,2], q = 3)$selected
      ## selection path
      glmnet.lasso(bodyfat[, -2], bodyfat[,2], q = 3)$path
  
      ## Using the anticonservative glmnet.lasso (see args.fitfun):
      stab.glmnet <- stabsel(x = bodyfat[, -2], y = bodyfat[,2],
                             fitfun = glmnet.lasso, 
                             args.fitfun = list(type = "anticonservative"), 
                             cutoff = 0.75, PFER = 1)
  }
# }
