path.sparsestep: Approximate path algorithm for the SparseStep model

Description

Fits the entire regularization path for SparseStep using a golden section search. Note that this algorithm is approximate: there is no guarantee that the solutions _between_ the computed values of \(\lambda\) do not differ from those calculated. For instance, if solutions are calculated at \(\lambda_{i}\) and \(\lambda_{i+1}\), this algorithm ensures that the solution at \(\lambda_{i+1}\) has one more zero than the solution at \(\lambda_{i}\) (provided the recursion depth is large enough). There is, however, no guarantee that no other solutions exist between \(\lambda_{i}\) and \(\lambda_{i+1}\). This is an ongoing research topic.
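The recursive idea described above can be illustrated with a toy sketch in base R. It is not the package's internal code: `n_zeros`, `jumps`, and `refine` are hypothetical names, the model fit is replaced by a simple step function, and the exact splitting rule and stopping criterion used by path.sparsestep may differ.

```r
# Toy stand-in for "fit the model at lambda and count the zero coefficients".
# Hypothetical: a step function that gains one zero at each value in `jumps`.
jumps <- c(0.3, 1.1, 2.7, 6.0)
n_zeros <- function(lambda) sum(lambda >= jumps)

# Recursively split [left, right] at a golden-ratio point until the two
# endpoints differ by at most one zero, or the depth budget is used up.
refine <- function(left, right, depth = 0, max.depth = 10) {
  if (n_zeros(right) - n_zeros(left) <= 1 || depth >= max.depth) {
    return(c(left, right))
  }
  invphi <- (sqrt(5) - 1) / 2  # inverse golden ratio, about 0.618
  mid <- left + invphi * (right - left)
  unique(c(refine(left, mid, depth + 1, max.depth),
           refine(mid, right, depth + 1, max.depth)))
}

lambdas <- refine(0, 10, max.depth = 25)
zeros <- vapply(lambdas, n_zeros, integer(1))

# Adjacent grid points differ by at most one zero, but nothing is checked
# strictly between two grid points: that is the approximation.
print(data.frame(lambda = lambdas, zeros = zeros))
```

With too small a depth budget the recursion stops early, and adjacent values may then differ by more than one zero.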

Note that this path algorithm is not faster than running the sparsestep function with the same \(\lambda\) sequence.

Usage

path.sparsestep(
  x,
  y,
  max.depth = 10,
  gamma0 = 1000,
  gammastop = 1e-04,
  IMsteps = 2,
  gammastep = 2,
  normalize = TRUE,
  intercept = TRUE,
  force.zero = TRUE,
  threshold = 1e-07,
  XX = NULL,
  Xy = NULL,
  use.XX = TRUE,
  use.Xy = TRUE,
  quiet = FALSE
)
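To get a feel for the default gamma arguments in the signature above, the sketch below enumerates the gamma values they imply. It assumes that gamma starts at gamma0, is divided by gammastep after each round, and that the loop runs while gamma is still larger than gammastop; the exact loop condition inside the package is an assumption here.

```r
gamma0 <- 1000
gammastop <- 1e-04
gammastep <- 2
IMsteps <- 2

# Assumed schedule: gamma0, gamma0 / gammastep, ... while gamma > gammastop
gammas <- numeric(0)
g <- gamma0
while (g > gammastop) {
  gammas <- c(gammas, g)
  g <- g / gammastep
}

n_gamma <- length(gammas)       # 24 values with the defaults
total_steps <- n_gamma * IMsteps  # 48 majorization steps per lambda
```

Under these assumptions, a larger gammastep shortens the schedule, and IMsteps multiplies the work done for every value of lambda.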

Arguments

x

matrix of predictors

y

response

max.depth

maximum recursion depth

gamma0

starting value of the gamma parameter

gammastop

stopping value of the gamma parameter

IMsteps

number of steps of the majorization algorithm to perform for each value of gamma

gammastep

factor by which gamma is decreased at each step

normalize

if TRUE, each variable is standardized to have unit L2 norm, otherwise it is left alone.

intercept

if TRUE, an intercept is included in the model (and not penalized), otherwise no intercept is included

force.zero

if TRUE, absolute coefficients smaller than the provided threshold value are set to absolute zero as a post-processing step, otherwise no thresholding is performed

threshold

threshold value to use for setting coefficients to absolute zero

XX

The X'X matrix; useful for repeated runs where X'X stays the same

Xy

The X'y matrix; useful for repeated runs where X'y stays the same

use.XX

whether or not to compute X'X and return it

use.Xy

whether or not to compute X'y and return it

quiet

if TRUE, do not print search information while running
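The effect of force.zero and threshold can be pictured with a few lines of base R. This is only a sketch of the post-processing rule as described above, applied to a made-up coefficient vector; `beta_raw` and `beta_thresholded` are hypothetical names, not package objects.

```r
threshold <- 1e-07

# Made-up coefficients, some of which are numerically tiny
beta_raw <- c(0.8, -3e-09, 0, 2.5e-08, -1.2)

# Set coefficients whose absolute value is below the threshold to exactly zero
beta_thresholded <- beta_raw
beta_thresholded[abs(beta_thresholded) < threshold] <- 0

beta_thresholded
```

Without this step, coefficients that are effectively zero would still count as nonzero when the sparsity of a solution is assessed.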

Value

A "sparsestep" S3 object is returned, for which print, predict, coef, and plot methods exist. It has the following items:

call

The call that was used to construct the model.

lambda

The value(s) of lambda used to construct the model.

gamma0

The gamma0 value of the model.

gammastop

The gammastop value of the model

IMsteps

The IMsteps value of the model

gammastep

The gammastep value of the model

intercept

Boolean indicating if an intercept was fitted in the model

force.zero

Boolean indicating whether coefficients below the threshold were forced to zero.

threshold

The threshold used when forcing coefficients to zero

beta

The resulting coefficients stored in a sparse matrix format (dgCMatrix). This matrix has dimensions nvar x nlambda

The intercept vector, of length nlambda (one intercept for each value of lambda)

normx

Vector used to normalize the columns of x

meanx

Vector of column means of x

XX

The matrix X'X if use.XX was set to TRUE

Xy

The matrix X'y if use.Xy was set to TRUE
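The meanx and normx items can be related to the normalize and intercept arguments with a small base R sketch. It assumes that columns are first centered by their means and then scaled to unit L2 norm; whether the package centers in exactly this way, and only when intercept = TRUE, is an assumption rather than something stated on this page.

```r
set.seed(1)
x <- matrix(rnorm(100 * 20), 100, 20)

# Column means, as would be stored in meanx (assumed)
meanx <- colMeans(x)
xc <- sweep(x, 2, meanx, "-")

# L2 norms of the centered columns, as would be stored in normx (assumed)
normx <- sqrt(colSums(xc^2))
xs <- sweep(xc, 2, normx, "/")

# Every standardized column now has unit L2 norm
range(colSums(xs^2))

# The stored vectors are enough to map back to the original scale
x_back <- sweep(sweep(xs, 2, normx, "*"), 2, meanx, "+")
```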

References

Van den Burg, G.J.J., Groenen, P.J.F. and Alfons, A. (2017). SparseStep: Approximating the Counting Norm for Sparse Regularization, arXiv preprint arXiv:1701.06967 [stat.ME]. URL https://arxiv.org/abs/1701.06967.

Examples


# NOT RUN {
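# Simulate 100 observations of 20 predictors and an unrelated response
set.seed(1)
x <- matrix(rnorm(100*20), 100, 20)
y <- rnorm(100)
# Fit the approximate regularization path with the default settings
pth <- path.sparsestep(x, y)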

# }
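A slightly longer sketch of how the returned object might be inspected is given below. It only runs when the sparsestep package is installed, and it relies on the items listed under Value (lambda, beta, XX, Xy); reusing the returned cross-products in a second call with identical data and settings is an assumption about the intended use of the XX and Xy arguments.

```r
if (requireNamespace("sparsestep", quietly = TRUE)) {
  library(sparsestep)
  library(Matrix)  # beta is stored as a dgCMatrix

  set.seed(42)
  x <- matrix(rnorm(100 * 20), 100, 20)
  y <- rnorm(100)
  pth <- path.sparsestep(x, y, quiet = TRUE)

  # Number of nonzero coefficients at each lambda on the path
  nnz <- colSums(as.matrix(pth$beta) != 0)
  print(data.frame(lambda = pth$lambda, nonzero = nnz))

  # Second run on the same data, passing back the stored cross-products
  pth2 <- path.sparsestep(x, y, XX = pth$XX, Xy = pth$Xy, quiet = TRUE)
}
```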
