glmnet: Formula interface for elastic net modelling with glmnet

Description

Formula interface for elastic net modelling with glmnet

Usage

glmnet(x, ...)
# S3 method for default
glmnet(x, y, ...)
# S3 method for formula
glmnet(
  formula,
  data,
  family = c("gaussian", "binomial", "poisson", "multinomial", "cox", "mgaussian"),
  alpha = 1,
  ...,
  weights = NULL,
  offset = NULL,
  subset = NULL,
  na.action = getOption("na.action"),
  drop.unused.levels = FALSE,
  xlev = NULL,
  sparse = FALSE,
  use.model.frame = FALSE,
  relax = FALSE
)
# S3 method for glmnet.formula
predict(object, newdata, offset = NULL, na.action = na.pass, ...)
# S3 method for glmnet.formula
coef(object, ...)
# S3 method for glmnet.formula
print(
  x,
  digits = max(3, getOption("digits") - 3),
  print.deviance.ratios = FALSE,
  ...
)
# S3 method for relaxed.formula
print(
  x,
  digits = max(3, getOption("digits") - 3),
  print.deviance.ratios = FALSE,
  ...
)
# S3 method for relaxed.formula
predict(object, newdata, offset = NULL, na.action = na.pass, ...)
# S3 method for relaxed.formula
coef(object, ...)

Value

For glmnet.formula, an object of class either glmnet.formula or relaxed.formula, depending on the value of the relax argument. This is essentially the same object as that created by glmnet::glmnet, but with extra components to allow formula usage.

Arguments

x: For the default method, a matrix of predictor variables.
...: For glmnet.formula and glmnet.default, other arguments to be passed to glmnet::glmnet; for the predict and coef methods, arguments to be passed to their counterparts in package glmnet.
y: For the default method, a response vector or matrix (for a multinomial response).
formula: A model formula; interaction terms are allowed and will be expanded per the usual rules for linear models.
data: A data frame or matrix containing the variables in the formula.
family: The model family. See glmnet::glmnet for how to specify this argument.
alpha: The elastic net mixing parameter. See glmnet::glmnet for more details.
weights: An optional vector of case weights to be used in the fitting process. If missing, defaults to an unweighted fit.
offset: An optional vector of offsets, an a priori known component to be included in the linear predictor.
subset: An optional vector specifying the subset of observations to be used to fit the model.
na.action: A function which indicates what should happen when the data contains missing values. For the predict method, na.action = na.pass will predict missing values with NA; na.omit or na.exclude will drop them.
drop.unused.levels: Should factors have unused levels dropped? Defaults to FALSE.
xlev: A named list of character vectors giving the full set of levels to be assumed for each factor.
sparse: Should the model matrix be in sparse format? This can save memory when dealing with many factor variables, each with many levels.
use.model.frame: Should the base model.frame function be used when constructing the model matrix? This is the standard method that most R modelling functions use, but it has some disadvantages. The default is to avoid model.frame and construct the model matrix term-by-term; see Details.
relax: For glmnet.formula, whether to perform a relaxed (non-regularised) fit after the regularised one. Requires glmnet 3.0 or later.
object: For the predict and coef methods, an object of class glmnet.formula or relaxed.formula.
newdata: For the predict method, a data frame containing the observations for which to calculate predictions.
digits: Significant digits in printed output.
print.deviance.ratios: Whether to print the table of deviance ratios, as per glmnet::print.glmnet.

Details

The glmnet function in this package is an S3 generic with a formula and a default method. The former calls the latter, and the latter is simply a direct call to the glmnet function in package glmnet. All the arguments to glmnet::glmnet are (or should be) supported.
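
The generic-plus-two-methods layout described above can be sketched in base R. This is only an illustration of the dispatch pattern: fit_model and its methods are hypothetical names, and the default method here is a stand-in that returns its inputs rather than calling glmnet::glmnet.

```r
# Hypothetical generic mirroring the structure described above;
# 'fit_model' is an illustrative name, not part of this package.
fit_model <- function(x, ...) UseMethod("fit_model")

# Default method: takes a predictor matrix and a response.
# In the real package this step is a direct call to glmnet::glmnet;
# here a stand-in simply records what it was given.
fit_model.default <- function(x, y, ...) {
  list(nobs = nrow(x), nvars = ncol(x), y = y)
}

# Formula method: build x and y from the formula, then call the default method.
fit_model.formula <- function(formula, data, ...) {
  mf <- model.frame(formula, data)
  x <- model.matrix(formula, mf)[, -1, drop = FALSE]  # drop the intercept column
  y <- model.response(mf)
  fit_model.default(x, y, ...)
}

# Dispatch is on the class of the first argument.
res_formula <- fit_model(mpg ~ cyl + hp, data = mtcars)
res_matrix  <- fit_model(as.matrix(mtcars[, c("cyl", "hp")]), mtcars$mpg)
```

Both calls end up in the default method, which is why arguments accepted by the underlying fitting function can be passed through either interface via the ... argument.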

There are two ways in which the matrix of predictors can be generated. The default, with use.model.frame = FALSE, is to process the additive terms in the formula independently. With wide datasets, this is much faster and more memory-efficient than the standard R approach which uses the model.frame and model.matrix functions. However, the resulting model object is not exactly the same as if the standard approach had been used; in particular, it lacks a bona fide terms object. If you require interoperability with other packages that assume the standard model object structure, set use.model.frame = TRUE. See discussion for more information on this topic.
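
The difference between the two routes can be illustrated with a small base R sketch. The term-by-term version below is an assumption-laden illustration of the general idea for a purely additive formula; it is not the package's actual implementation, and the toy data frame df is invented for the example.

```r
# Toy data, invented for illustration
df <- data.frame(
  y = c(1.2, 0.7, 3.1, 2.4),
  a = c(0.5, 1.5, 2.5, 3.5),
  f = factor(c("u", "v", "u", "w"))
)

# Standard route: build one model frame for the whole formula,
# then expand it into a model matrix (intercept column dropped).
mf <- model.frame(y ~ a + f, data = df)
x_standard <- model.matrix(attr(mf, "terms"), mf)[, -1, drop = FALSE]

# Term-by-term route (illustrative only): expand each additive term
# on its own, touching only the column it needs, then bind the blocks.
term_names <- c("a", "f")
blocks <- lapply(term_names, function(nm) {
  model.matrix(reformulate(nm), data = df[nm])[, -1, drop = FALSE]
})
x_termwise <- do.call(cbind, blocks)
```

In this sketch the two matrices contain the same columns, but only the standard route produces a terms object alongside the matrix, which is the kind of structure other packages may expect to find.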

The predict and coef methods are wrappers for the corresponding methods in the glmnet package. The former constructs a predictor model matrix from its newdata argument and passes that as the newx argument to glmnet:::predict.glmnet.
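
A minimal usage sketch of the two methods follows, assuming the package providing this formula interface is glmnetUtils and that both it and glmnet are installed. The penalty value s = 0.5 is arbitrary and chosen only for illustration; s is passed through ... to the corresponding glmnet methods.

```r
library(glmnetUtils)  # assumed package name for this formula interface

# Fit a regularisation path using the formula method
fit <- glmnet(mpg ~ cyl + disp + hp, data = mtcars)

# Predictions for new observations: newdata is a data frame, and the
# model matrix is constructed from it internally.
# s = 0.5 is an arbitrary penalty value chosen for illustration.
pred <- predict(fit, newdata = mtcars[1:5, ], s = 0.5)

# Coefficients (intercept plus one per predictor) at the same penalty
cf <- coef(fit, s = 0.5)
```

The point of the wrapper is that newdata can be supplied as a data frame containing the variables in the formula, rather than as a hand-built newx matrix.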

Examples

glmnet(mpg ~ ., data=mtcars)

glmnet(Species ~ ., data=iris, family="multinomial")

if (FALSE) {

# Leukemia example dataset from Trevor Hastie's website
download.file("https://web.stanford.edu/~hastie/glmnet/glmnetData/Leukemia.RData",
              "Leukemia.RData")
load("Leukemia.RData")
leuk <- do.call(data.frame, Leukemia)
glmnet(y ~ ., leuk, family="binomial")
}