Fits regularization paths for sparse group-lasso penalized learning problems at a
sequence of regularization parameters lambda.
Note that the objective function for least squares is
$$\mathrm{RSS}/(2n) + \lambda \cdot \mathrm{penalty}.$$
Users can also adjust the penalty by supplying different penalty factors (see pf_group and pf_sparse).
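As a rough illustration of the objective above, the sketch below evaluates it for a given coefficient vector in base R. The helper name `sgl_objective` is hypothetical and not part of the package. The penalty form (a weighted mix of group-wise Euclidean norms and an l1 term, controlled by `asparse`) is assumed from the usual sparse group lasso formulation and the argument descriptions on this page; any internal rescaling of the penalty factors is ignored.

```r
# Hypothetical helper (not part of sparsegl): least-squares sparse group-lasso
# objective, RSS / (2n) + lambda * penalty, under the assumed penalty form.
sgl_objective <- function(X, y, beta, group, lambda, asparse = 0.05,
                          pf_group = sqrt(as.numeric(table(group))),
                          pf_sparse = rep(1, ncol(X)), b0 = 0) {
  n <- nrow(X)
  rss <- sum((y - b0 - X %*% beta)^2)
  # Group part: one Euclidean norm per group, weighted by pf_group.
  group_norms <- as.numeric(tapply(beta, group, function(b) sqrt(sum(b^2))))
  l2_part <- sum(pf_group * group_norms)
  # Sparse part: absolute values of single coefficients, weighted by pf_sparse.
  l1_part <- sum(pf_sparse * abs(beta))
  rss / (2 * n) + lambda * ((1 - asparse) * l2_part + asparse * l1_part)
}

# Toy usage: 4 predictors in 2 groups of size 2.
X <- diag(4)
y <- rep(0, 4)
beta <- c(3, 4, 0, 0)
group <- c(1, 1, 2, 2)
obj <- sgl_objective(X, y, beta, group, lambda = 1, asparse = 0.5)
```

With `beta` set to zero the penalty vanishes and the value reduces to RSS/(2n).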
sparsegl(
x,
y,
group = NULL,
family = c("gaussian", "binomial"),
nlambda = 100,
lambda.factor = ifelse(nobs < nvars, 0.01, 1e-04),
lambda = NULL,
pf_group = sqrt(bs),
pf_sparse = rep(1, nvars),
intercept = TRUE,
asparse = 0.05,
standardize = TRUE,
lower_bnd = -Inf,
upper_bnd = Inf,
weights = NULL,
offset = NULL,
warm = NULL,
trace_it = 0,
dfmax = as.integer(max(group)) + 1L,
pmax = min(dfmax * 1.2, as.integer(max(group))),
eps = 1e-08,
maxit = 3e+06
)
An object with S3 class "sparsegl". Among the list components:
call The call that produced this object.
b0 Intercept sequence of length length(lambda).
beta A p x length(lambda) sparse matrix of coefficients.
df The number of features with nonzero coefficients for each value of
lambda.
dim Dimension of coefficient matrix.
lambda The actual sequence of lambda values used.
npasses Total number of iterations summed over all lambda values.
jerr Error flag, for warnings and errors, 0 if no error.
group A vector of consecutive integers describing the grouping of the
coefficients.
nobs The number of observations used to estimate the model.
If sparsegl() was called with a stats::family() method, this may also
contain information about the deviance and the family used in fitting.
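To make the relationship between the `beta`, `df`, and `dim` components concrete, here is a small sketch using a dense base-R matrix as a stand-in for the sparse coefficient matrix. The numbers are invented for illustration, and the object names are hypothetical.

```r
# Toy coefficient matrix: 4 features (rows) by 3 lambda values (columns).
# A dense matrix stands in for the sparse matrix stored in the fitted object.
beta <- matrix(c(0,   0,    0, 0,
                 1.2, 0,    0, 0,
                 1.5, -0.3, 0, 0.7), nrow = 4)

# Number of features with nonzero coefficients at each lambda (the "df" idea).
df_by_lambda <- colSums(beta != 0)

# Dimension of the coefficient matrix (the "dim" idea): p x length(lambda).
beta_dim <- dim(beta)
```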
x Double. A matrix of predictors, of dimension \(n \times p\); each row
is a vector of measurements and each column is a feature. Objects of class
Matrix::sparseMatrix are supported.
y Double/Integer/Factor. The response variable.
Quantitative for family="gaussian" and for other exponential families.
If family="binomial", y should be either a factor with two levels or
a vector of integers taking 2 unique values. For a factor, the last level
in alphabetical order is the target class.
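The factor convention described above can be checked in base R before fitting. The sketch below only mirrors the stated rule (last level in alphabetical order is the target class). The variable names are hypothetical, and how the package encodes the response internally is not shown on this page.

```r
# factor() sorts levels alphabetically by default.
y <- factor(c("control", "treated", "treated", "control"))

# The last level is the one treated as the target class.
target <- levels(y)[nlevels(y)]

# Equivalent 0/1 coding, with 1 marking the target class.
y01 <- as.integer(y == target)
```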
group Integer. A vector of consecutive integers describing the grouping of the coefficients (see example below).
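When group membership is available only as arbitrary labels, a vector of consecutive integers can be built in base R. The sketch below assumes, as the wording suggests, that the columns of each group should sit next to each other, so it also computes a column ordering. The label values are invented.

```r
# Hypothetical group labels, one per column of x, not yet contiguous.
labels <- c("geneA", "geneB", "geneA", "geneC", "geneB")

# Integer code per label, in order of first appearance.
codes <- match(labels, unique(labels))

# Column order that places members of the same group side by side.
ord <- order(codes)

# Group vector of consecutive integers for the reordered columns.
group <- codes[ord]

# The predictor matrix would be reordered the same way: x[, ord]
```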
family Character or function. Specifies the generalized linear model to use. Valid options are:
"gaussian" - least squares loss (regression, the default),
"binomial" - logistic loss (classification).
For any other type, a valid stats::family() object may be passed. Note
that these will generally be much slower to estimate than the built-in
options passed as strings. For example, family = "gaussian" and
family = gaussian() will produce the same results, but the first
will be much faster.
nlambda The number of lambda values. Default is 100.
lambda.factor A multiplicative factor for the minimal lambda in the
lambda sequence, where min(lambda) = lambda.factor * max(lambda).
max(lambda) is the smallest value of lambda for which all coefficients
are zero. The default depends on the relationship between \(n\)
(the number of rows in the matrix of predictors) and \(p\)
(the number of predictors). If \(n \geq p\), the
default is 0.0001. If \(n < p\), the default is 0.01.
A very small value of lambda.factor will lead to a
saturated fit. This argument has no effect if there is a user-defined
lambda sequence.
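The relationship min(lambda) = lambda.factor * max(lambda) can be sketched in base R. The helper names below are hypothetical. The log-linear spacing between the two endpoints is an assumption borrowed from common practice in path algorithms, since this page does not state how intermediate values are placed.

```r
# Default factor, following the rule in the usage section.
default_lambda_factor <- function(nobs, nvars) {
  ifelse(nobs < nvars, 0.01, 1e-04)
}

# Hypothetical helper: decreasing sequence from lambda_max down to
# lambda.factor * lambda_max, equally spaced on the log scale (assumed).
lambda_seq <- function(lambda_max, nlambda = 100, lambda.factor = 1e-04) {
  exp(seq(log(lambda_max), log(lambda.factor * lambda_max), length.out = nlambda))
}

lam <- lambda_seq(lambda_max = 2, nlambda = 5, lambda.factor = 0.01)
```

Here `lambda_max` stands for the smallest lambda giving the null model, which the fitting routine computes from the data.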
lambda A user-supplied lambda sequence. The default, NULL,
results in an automatic computation based on nlambda, the smallest value
of lambda that would give the null model (all coefficient estimates equal
to zero), and lambda.factor. Supplying a value of lambda overrides
this behaviour. It is likely better to supply a
decreasing sequence of lambda values than a single (small) value. If
supplied, the user-defined lambda sequence is automatically sorted in
decreasing order.
pf_group Penalty factor on the groups, a vector of the same
length as the total number of groups. Separate penalty weights can be applied
to each group of \(\beta\)s to allow differential shrinkage.
Can be 0 for some
groups, which implies no shrinkage, and results in that group always being
included in the model (depending on pf_sparse). Default value for each
entry is the square-root of the corresponding size of each group.
Because this default is typical, these penalties are not rescaled.
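The default group penalty factor, the square root of each group's size, can be reproduced in base R from a group vector. This is a sketch with an invented grouping; `bs` is used here for the vector of group sizes, matching the name in the usage section.

```r
# Three groups of sizes 3, 2 and 4.
group <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)

# Group sizes, then the default penalty factor sqrt(bs).
bs <- as.integer(table(group))
pf_group <- sqrt(bs)

# Setting an entry to 0 removes the group-level shrinkage for that group.
pf_group_custom <- pf_group
pf_group_custom[2] <- 0
```

The customised vector would then be passed as the pf_group argument.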
pf_sparse Penalty factor on the \(\ell_1\)-norm, a vector of the same length as the
total number of columns in x. Each value corresponds to one predictor.
Can be 0 for some predictors, which
implies that predictor will receive only the group penalty.
Note that these are internally rescaled so that the sum is the same as
the number of predictors.
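The rescaling described above (weights summing to the number of predictors) amounts to a single proportional adjustment. The helper below is a hypothetical base-R sketch of that arithmetic, not the package's internal code.

```r
# Hypothetical helper: rescale weights so they sum to the number of predictors.
rescale_pf_sparse <- function(pf_sparse) {
  pf_sparse * length(pf_sparse) / sum(pf_sparse)
}

# Four predictors; the first receives no l1 penalty.
pf <- c(0, 2, 2, 4)
pf_rescaled <- rescale_pf_sparse(pf)
```

Only the relative sizes of the entries matter after rescaling, and zero entries stay at zero.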
intercept Whether to include an intercept in the model. Default is TRUE.
asparse The relative weight to put on the \(\ell_1\)-norm in
sparse group lasso. Default is 0.05 (resulting in 0.95 on the
\(\ell_2\)-norm).
standardize Logical flag for variable standardization (scaling) prior to fitting the model. Default is TRUE.
lower_bnd Lower bound for coefficient values, a vector of length 1
or of length the number of groups. Must be non-positive numbers only.
Default value for each entry is -Inf.
upper_bnd Upper bound for coefficient values, a vector of length 1
or of length the number of groups. Must be non-negative numbers only.
Default value for each entry is Inf.
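The length and sign requirements on the two bound arguments can be checked before fitting. The helper `expand_bound` below is hypothetical and only mirrors the rules stated above; the package's own validation may differ.

```r
# Hypothetical helper: recycle a length-1 bound to one value per group and
# check the documented sign constraint.
expand_bound <- function(bnd, n_groups, side = c("lower", "upper")) {
  side <- match.arg(side)
  if (length(bnd) == 1L) bnd <- rep(bnd, n_groups)
  stopifnot(length(bnd) == n_groups)
  if (side == "lower") {
    stopifnot(all(bnd <= 0))  # lower bounds must be non-positive
  } else {
    stopifnot(all(bnd >= 0))  # upper bounds must be non-negative
  }
  bnd
}

lower <- expand_bound(-Inf, n_groups = 3, side = "lower")
upper <- expand_bound(c(0, 1, Inf), n_groups = 3, side = "upper")
```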
weights Double vector. Optional observation weights. These can
only be used with a stats::family() object.
offset Double vector. Optional offset (constant predictor without a
corresponding coefficient). These can only be used with a
stats::family() object.
warm List created with make_irls_warmup(). This can only be used
with a stats::family() object, and is not typically necessary even then.
trace_it Scalar integer. Larger values print more output during
the IRLS loop. Typical values are 0 (no printing), 1 (some printing
and a progress bar), and 2 (more detailed printing).
This can only be used with a stats::family() object.
dfmax Limit the maximum number of groups in the model. Default is no limit.
pmax Limit the maximum number of groups ever to be nonzero. For example, once a group enters the model, no matter how many times it exits or re-enters the model through the path, it will be counted only once.
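The default expressions for dfmax and pmax in the usage section can be evaluated by hand to see why they impose no effective limit. The grouping below is invented for illustration.

```r
# Four groups of five predictors each.
group <- rep(1:4, each = 5)

# Defaults as written in the usage section.
dfmax <- as.integer(max(group)) + 1L
pmax <- min(dfmax * 1.2, as.integer(max(group)))
```

With four groups, dfmax exceeds the number of groups and pmax equals it, so neither default restricts the path.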
eps Convergence termination tolerance. Default value is 1e-8.
maxit Maximum number of outer-loop iterations allowed at a fixed lambda
value. Default is 3e6. If models do not converge, consider increasing
maxit.
Liang, X., Cohen, A., Sólon Heinsfeld, A., Pestilli, F., and
McDonald, D.J. 2024.
sparsegl: An R Package for Estimating Sparse Group Lasso.
Journal of Statistical Software, Vol. 110(6): 1–23.
doi:10.18637/jss.v110.i06.
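library(sparsegl)

# Simulate a Gaussian response with 4 groups of 5 predictors each
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)

# Fit with the built-in least squares loss
fit <- sparsegl(X, y, group = groups)

# Fit a Poisson model by passing a stats::family() object
yp <- rpois(n, abs(X %*% beta_star))
fit_pois <- sparsegl(X, yp, group = groups, family = poisson())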