grpnet (version 0.5)

visualize.shrink: Plots grpnet Shrinkage Operator or its Estimator

Description

Makes a plot or returns a data frame containing the group elastic net shrinkage operator (or its estimator) evaluated at a sequence of input values.

Usage

visualize.shrink(x = seq(-5, 5, length.out = 1001), 
                penalty = c("LASSO", "MCP", "SCAD"), 
                alpha = 1, 
                lambda = 1, 
                gamma = 4, 
                fitted = FALSE,
                plot = TRUE,
                subtitle = TRUE,
                legend = TRUE,
                location = "top",
                ...)

Value

If plot = TRUE, then a plot of the shrinkage operator (or shrunken estimator) is produced.

If plot = FALSE, then a data frame containing the evaluations is returned.

Arguments

x

sequence of values at which to evaluate the shrinkage operator.

penalty

which penalty or penalties should be plotted?

alpha

elastic net tuning parameter (between 0 and 1).

lambda

overall tuning parameter (non-negative).

gamma

additional hyperparameter for MCP (>1) or SCAD (>2).

fitted

if FALSE (default), then the shrinkage operator is plotted; otherwise the shrunken estimator is plotted.

plot

if TRUE (default), then the result is plotted; otherwise the result is returned as a data frame.

subtitle

if TRUE (default), then the hyperparameter values are displayed in the subtitle.

legend

if TRUE (default), then a legend is included to distinguish the different penalty types.

location

the legend's location; ignored if legend = FALSE.

...

additional arguments passed to the plot function, e.g., xlim, ylim, etc.

Author

Nathaniel E. Helwig <helwig@umn.edu>

Details

The updates for the group elastic net estimator have the form $$\boldsymbol\beta_{\alpha, \lambda}^{(t+1)} = S_{\lambda_1, \lambda_2}(\|\mathbf{b}_{\alpha, \lambda}^{(t+1)}\|) \mathbf{b}_{\alpha, \lambda}^{(t+1)}$$ where \(S_{\lambda_1, \lambda_2}(\cdot)\) is a shrinkage and selection operator, and $$\mathbf{b}_{\alpha, \lambda}^{(t+1)} = \boldsymbol\beta_{\alpha, \lambda}^{(t)} + (\delta_{(t)} \epsilon)^{-1} \mathbf{g}^{(t)}$$ is the unpenalized update with \(\mathbf{g}^{(t)}\) denoting the current gradient.

Note that \(\lambda_1 = \lambda \alpha\) is the L1 tuning parameter, \(\lambda_2 = \lambda (1-\alpha)\) is the L2 tuning parameter, \(\delta_{(t)}\) is an upper-bound on the weights appearing in the Fisher information matrix, and \(\epsilon\) is the largest eigenvalue of the Gram matrix \(n^{-1} \mathbf{X}^\top \mathbf{X}\).
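For the LASSO penalty, the shrinkage operator has the familiar soft-threshold form. The base-R sketch below is an illustration written from the \(\lambda_1\)/\(\lambda_2\) decomposition above (it is not grpnet's internal code, and the function name soft_shrink is made up for this example):

```r
# Sketch of the elastic net soft-threshold shrinkage operator for the
# LASSO penalty: S(z) = max(0, 1 - lambda1/|z|) / (1 + lambda2).
soft_shrink <- function(z, alpha = 1, lambda = 1) {
  lambda1 <- lambda * alpha        # L1 tuning parameter
  lambda2 <- lambda * (1 - alpha)  # L2 tuning parameter
  pmax(0, 1 - lambda1 / abs(z)) / (1 + lambda2)
}

z <- seq(-5, 5, length.out = 11)
# operator S(z) versus shrunken estimator S(z) * z
cbind(z, operator = soft_shrink(z), estimator = soft_shrink(z) * z)
```

Multiplying the operator by its input gives the shrunken estimator, which is what fitted = TRUE displays; with alpha = 1 and lambda = 1 this reduces to ordinary soft-thresholding, sign(z) * max(|z| - 1, 0).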

References

Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348-1360. doi:10.1198/016214501753382273

Helwig, N. E. (2024). Versatile descent algorithms for group regularization and variable selection in generalized linear models. Journal of Computational and Graphical Statistics. doi:10.1080/10618600.2024.2362232

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58, 267-288. doi:10.1111/j.2517-6161.1996.tb02080.x

Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894-942. doi:10.1214/09-AOS729

See Also

visualize.penalty for plotting the penalty function

Examples

# plot shrinkage operator
visualize.shrink()

# plot shrunken estimator
visualize.shrink(fitted = TRUE)
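
To work with the evaluations directly (e.g., for custom plotting), set plot = FALSE; the sketch below uses only the arguments documented above, and the exact column layout of the returned data frame is not specified here:

```r
# return the evaluations as a data frame instead of plotting
library(grpnet)
shrink.df <- visualize.shrink(penalty = c("LASSO", "MCP", "SCAD"),
                              fitted = TRUE, plot = FALSE)
head(shrink.df)
```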
