This function is an implementation of the glinternet model of Lim and Hastie (2015), for fitting interactions between pairs of variables in a model. The method creates interaction matrices and enforces hierarchy using the overlap group lasso. Once the augmented model matrix is set up, glintnet uses grpnet to fit the overlap group lasso path. It hence inherits all the capabilities of grpnet, and in particular can fit interaction models for all the GLM families.
glintnet(
  X,
  glm,
  offsets = NULL,
  intr_keys = NULL,
  intr_values,
  levels = NULL,
  n_threads = 1,
  save.X = FALSE,
  ...
)
A list of class "glintnet", which inherits from class "grpnet". This has a few additional components, such as pairs, groups and levels. Users typically use methods like predict(), print(), plot(), etc. to examine the object.
X: A dense matrix, which can include factors with levels coded as non-negative integers starting at 0.
glm: GLM family/response object. This is an expression that represents the family, the response and other arguments such as weights, if present. The choices are glm.gaussian(), glm.binomial(), glm.poisson(), glm.multinomial(), glm.cox(), and glm.multigaussian(). This is a required argument, and there is no default. In the simple example below, we use glm.gaussian(y).
offsets: Offsets, default is NULL. If present, this is a fixed vector or matrix corresponding to the shape of the natural parameter, and is added to the fit.
intr_keys: List of feature indices. This is a list of all features with which interactions can be formed. Default is 1:p, where p is the number of columns in X.
intr_values: List of integer vectors of feature indices. For each of the m <= p indices listed in intr_keys, there is a vector of indices indicating which columns are candidates for interaction with that feature. If a vector is NULL, that means all other features are candidates for interactions. The default is a list of length m where each element is NULL; that is, rep(list(NULL), m).
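As an illustration of how these two arguments work together (a sketch with hypothetical objects X and y, not taken from the example below): to restrict feature 1 to interact only with features 2 and 3, while allowing feature 4 to interact with all other features:

```r
# Hypothetical sketch: feature 1 may pair only with features 2 and 3;
# feature 4 (value NULL) may pair with all other features.
intr_keys   <- c(1, 4)
intr_values <- list(c(2, 3), NULL)
# fit <- glintnet(X, glm.gaussian(y),
#                 intr_keys = intr_keys, intr_values = intr_values)
```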
levels: Number of levels for each of the columns of X, with 1 representing a quantitative feature. A factor with K levels should be represented by the numbers 0,1,...,K-1.
n_threads: Number of threads, default 1.
save.X: Logical flag, default FALSE. If TRUE, the internally constructed X matrix is returned.
...: Additional named arguments to grpnet.
James Yang, Trevor Hastie, and Balasubramanian Narasimhan
Maintainer: Trevor Hastie
hastie@stanford.edu
The input matrix can be composed of quantitative variables or columns representing factors. The argument levels indicates which are quantitative, and which are factors. The latter are represented by numbers starting at 0, up to one less than the number of levels (sorry!). Each of the factors is converted to a "one-hot" matrix, and hence a group of columns is created for each of these. This is done using the matrix utility function matrix.one_hot(). In addition, interaction matrices are created. For each pair of variables for which an interaction is considered, a matrix is created consisting of the cross-product of each of the constituent matrices, as described in the "glinternet" reference. Once this much bigger matrix is established, the model is handed to grpnet to produce the fit.
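The cross-product construction can be sketched in plain R. This is an illustration of the idea only, not the package's internal code:

```r
# Illustration: the interaction block between a quantitative column x
# and a factor f (coded 0,...,K-1) multiplies x into each one-hot column.
x <- rnorm(6)
f <- c(0, 1, 2, 0, 1, 2)
F <- model.matrix(~ factor(f) - 1)  # one-hot encoding, 6 x 3
intx <- x * F                       # columnwise product: 6 x 3 interaction block
```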
Lim, Michael and Hastie, Trevor (2015) Learning interactions via hierarchical group-lasso regularization, JCGS, doi:10.1080/10618600.2014.938812.
Yang, James and Hastie, Trevor (2024) A Fast and Scalable Pathwise-Solver for Group Lasso and Elastic Net Penalized Regression via Block-Coordinate Descent. arXiv, doi:10.48550/arXiv.2405.08631.
Friedman, J., Hastie, T. and Tibshirani, R. (2010) Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, Vol. 33(1), 1-22, doi:10.18637/jss.v033.i01.
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, doi:10.18637/jss.v039.i05.
Tibshirani, Robert, Bien, J., Friedman, J., Hastie, T., Simon, N., Taylor, J. and Tibshirani, Ryan (2012) Strong Rules for Discarding Predictors in Lasso-type Problems, JRSSB, Vol. 74(2), 245-266, https://arxiv.org/abs/1011.2234.
cv.glintnet, predict.glintnet, plot.glintnet, print.glintnet.
set.seed(0)
n <- 500
d_cont <- 5     # number of continuous features
d_disc <- 5     # number of categorical features
Z_cont <- matrix(rnorm(n * d_cont), n, d_cont)
levels <- sample(2:5, d_disc, replace = TRUE)
Z_disc <- matrix(0, n, d_disc)
for (i in seq(d_disc)) Z_disc[, i] <- sample(0:(levels[i] - 1), n, replace = TRUE)
Z <- cbind(Z_cont, Z_disc)
levels <- c(rep(1, d_cont), levels)   # 1 marks a quantitative column

# generate a response with one true interaction
xmat <- model.matrix(~ Z_cont[, 1] * factor(Z_disc[, 2]))
nc <- ncol(xmat)
beta <- rnorm(nc)
y <- xmat %*% beta + rnorm(n) * 1.5

fit <- glintnet(Z, glm.gaussian(y), levels = levels, intr_keys = 1)
print(fit)
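Downstream use of the fitted object can then look like the following sketch (the newx argument name is an assumption carried over from grpnet's predict method):

```r
# Sketch: examine and use the fitted path object
plot(fit)                       # coefficient paths along the lambda sequence
pred <- predict(fit, newx = Z)  # predictions at each lambda
```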