huge (version 1.0.1)

lasso.stars: Stability Approach to Regularization Selection for Lasso

Description

Implements the Stability Approach to Regularization Selection (StARS) for Lasso

Usage

lasso.stars(x, y, rep.num = 20, lambda = NULL, nlambda = 100, 
lambda.min.ratio = 0.001, stars.thresh = 0.1, sample.ratio = NULL, 
alpha = 1, verbose = TRUE)

Arguments

x
The n by d data matrix representing n observations in d dimensions
y
The n-dimensional response vector
rep.num
The number of subsamples drawn for StARS. The default value is 20.
lambda
A sequence of decreasing positive numbers to control regularization. Typical usage is to leave lambda = NULL and have the program compute its own lambda sequence based on nlambda and lambda.min.ratio.
nlambda
The number of regularization parameters. The default value is 100.
lambda.min.ratio
The smallest value for lambda, as a fraction of the upper bound (MAX) of the regularization parameter, which makes all estimates equal to 0. The program can automatically generate lambda as a sequence of
stars.thresh
The threshold of the variability in StARS. The default value is 0.1. The alternative value is 0.05. Only applicable when criterion = "stars"
sample.ratio
The subsampling ratio. The default value is 10*sqrt(n)/n when n > 144 and 0.8 when n <= 144, where n is the sample size.
alpha
The tuning parameter for the elastic-net regression. The default value is 1 (lasso).
verbose
If verbose = FALSE, tracing information printing is disabled. The default value is TRUE.
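
The default for sample.ratio described above can be written out as a small helper. This is a minimal sketch based only on the rule stated in the argument description; the name default_sample_ratio is hypothetical and is not a function exported by the package.

```r
# Hypothetical helper (not part of huge): the default subsampling ratio
# as described for sample.ratio, where n is the sample size.
default_sample_ratio = function(n) {
  if (n > 144) 10 * sqrt(n) / n else 0.8
}

default_sample_ratio(50)    # n <= 144, so the ratio is 0.8
default_sample_ratio(400)   # n > 144, so the ratio is 10*sqrt(400)/400 = 0.5
```

Under this rule each subsample has roughly 10*sqrt(n) observations once n exceeds 144, so the ratio shrinks as the sample size grows.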

Value

An object with S3 class "stars" is returned:
  • path: The solution path of regression coefficients (in a d by nlambda matrix).
  • lambda: The regularization parameters used in Lasso.
  • opt.index: The index of the optimal regularization parameter.
  • opt.beta: The optimal regression coefficients.
  • opt.lambda: The optimal regularization parameter.
  • Variability: The variability along the solution path.

Details

StARS selects the optimal regularization parameter based on the variability of the solution path. It chooses the least sparse graph among all solutions with the same variability. The alternative threshold 0.05 is chosen under the assumption that the model is correctly specified. In applications, the model is usually only an approximation of the true model, so 0.1 is a safer choice. The implementation is based on the popular package "glmnet".
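
The selection rule can be illustrated with a short sketch. It assumes that variability at each lambda is the average of 2*p*(1-p) over the coefficients, where p is the fraction of subsamples in which a coefficient is nonzero, and that lambda is ordered from most to least sparse. The function stars_select_sketch and its input freq are hypothetical and written for illustration only; this is not the package's internal code.

```r
# Hypothetical illustration of a StARS-style selection rule (not huge's code).
# freq: d by nlambda matrix; freq[j, k] is the fraction of subsamples in which
#       coefficient j is nonzero at the k-th lambda (lambda decreasing in k).
stars_select_sketch = function(freq, thresh = 0.1) {
  # Assumed variability measure: mean of 2*p*(1-p) over coefficients
  variability = colMeans(2 * freq * (1 - freq))
  # Make the variability non-decreasing along the path
  monotone = cummax(variability)
  # Least sparse solution whose variability stays within the threshold
  ok = which(monotone <= thresh)
  opt.index = if (length(ok) > 0) max(ok) else 1
  list(variability = variability, opt.index = opt.index)
}

# Toy selection frequencies for 3 coefficients at 4 values of lambda
freq = cbind(c(0, 0, 0), c(1, 0, 0), c(1, 0.9, 0), c(1, 0.5, 0.5))
stars_select_sketch(freq, thresh = 0.1)$opt.index
stars_select_sketch(freq, thresh = 0.05)$opt.index
```

With the toy input, the looser threshold 0.1 admits a denser solution than 0.05, which matches the role of stars.thresh described above.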

References

1. Tuo Zhao and Han Liu. HUGE: A Package for High-dimensional Undirected Graph Estimation. Technical Report, Carnegie Mellon University, 2010.
2. Han Liu, Kathryn Roeder and Larry Wasserman. Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models. Advances in Neural Information Processing Systems, 2010.
3. Jerome Friedman, Trevor Hastie and Rob Tibshirani. Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, Vol.33, No.1, 2008.

See Also

huge.select, glmnet, huge and huge-package

Examples

# Generate data: 50 observations, 80 predictors, 3 nonzero coefficients
x = matrix(rnorm(50*80),50,80)
beta = c(3,2,1.5,rep(0,77))
y = rnorm(50) + x%*%beta

# StARS for Lasso with the default settings
z1 = lasso.stars(x,y)
summary(z1)
plot(z1)

# StARS for Lasso with the stricter variability threshold 0.05
z2 = lasso.stars(x,y, stars.thresh = 0.05)
summary(z2)
plot(z2)

# StARS for Lasso with 50 subsamples
z3 = lasso.stars(x,y,rep.num = 50)
summary(z3)
plot(z3)