
picasso (version 0.4.7)

picasso: Pathwise Calibrated Sparse Shooting Algorithm

Description

The function "picasso" implements the user interface.

Usage

picasso(X, Y, lambda = NULL, nlambda = NULL, lambda.min.ratio = NULL,
        lambda.min = NULL, family = "gaussian", method = "l1", alg = "greedy",
        opt = "naive", gamma = 3, df = NULL, sym = "or", standardize = TRUE,
        perturb = TRUE, max.act.in = 3, truncation = 0.01, prec = 1e-4,
        max.ite = 1e3, verbose = TRUE)

Arguments

X
For sparse linear regression and sparse logistic regression, X is an $n$ by $d$ design matrix. For the sparse column inverse operator (family = "graph"), there are two options: (1) X is an $n$ by $d$ raw data matrix; (2) X is a $d$ by $d$ sample covariance matrix (the matrix $S$ in the Details section).
Y
For sparse linear regression and sparse logistic regression, Y is an $n$ dimensional response vector. For the sparse column inverse operator, no input for Y is needed.
lambda
A sequence of decreasing positive values to control the regularization. Typical usage is to leave the input lambda = NULL and have the program compute its own lambda sequence based on nlambda and lambda.min.ratio. A sketch of supplying a custom sequence is given after the argument list.
nlambda
The number of values used in lambda. Default value is 100.
lambda.min.ratio
The smallest value for lambda, as a fraction of the upper bound (MAX) of the regularization parameter. The program can automatically generate lambda as a sequence of length nlambda, decreasing from MAX to lambda.min.ratio*MAX. The default value is 0.05.
lambda.min
The smallest value for lambda. If lambda.min.ratio is provided, then lambda.min is set to lambda.min.ratio*MAX, where MAX is the upper bound of the regularization parameter. The default value is 0.05*MAX.
family
Options for the model. Sparse linear regression and sparse multivariate regression are applied if family = "gaussian", sparse logistic regression is applied if family = "binomial", and the sparse column inverse operator is applied if family = "graph". The default value is "gaussian".
method
Options for the regularization. The Lasso penalty is applied if method = "l1", the MCP penalty is applied if method = "mcp", and the SCAD penalty is applied if method = "scad". The default value is "l1".
alg
Options for updating active sets. The cyclic selection rule is applied if alg = "cyclic", the greedy selection rule is applied if alg = "greedy", and the proximal gradient selection rule is applied if alg = "proximal". The default value is "greedy".
opt
Options for updating residuals. The naive update rule is applied if opt = "naive", and the covariance update rule is applied if opt = "cov". The default value is "naive".
gamma
The concavity parameter for MCP and SCAD. The default value is 3.
df
Maximum degree of freedom for the covariance update. The default value is 2*n.
sym
Symmetrization of output graphs. If sym = "and", the edge between node i and node j is selected only when both node i and node j are selected as neighbors of each other. If sym = "or", the edge is selected when either node i or node j is selected as a neighbor of the other. The default value is "or".
standardize
Variables are standardized to have mean zero and unit standard deviation if standardize = TRUE. The default value is TRUE.
perturb
If perturb = TRUE, a positive value is added to the diagonal of Sigma to guarantee that Sigma is positive definite. The user can also specify a numeric value for perturb. The default value is perturb = TRUE.
max.act.in
The maximum number of active variables to add into the active set when alg = "greedy". The default value is 3.
truncation
The critical value for updating active sets when alg = "cyclic". The default value is 1e-2.
prec
Stopping precision. The default value is 1e-4.
max.ite
The iteration limit. The default value is 1e3.
verbose
Tracing information is disabled if verbose = FALSE. The default value is TRUE.
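
As a sketch of how the regularization arguments interact, a user-specified decreasing lambda sequence can be supplied directly instead of the automatically generated one. The snippet below is illustrative only and not taken from the package examples; the data generation mirrors the Examples section.

library(picasso)
set.seed(1)
## Generate a design matrix and a sparse regression response, as in the Examples section
n = 100
d = 400
X = matrix(rnorm(n*d), n, d)
Y = X %*% c(3, 2, 0, 1.5, rep(0, d-4)) + rnorm(n)
## Supply a decreasing lambda sequence; nlambda, lambda.min.ratio and lambda.min
## only describe the automatically generated sequence and are not needed here
lambda.user = exp(seq(log(0.5), log(0.01), length.out = 20))
fit.user = picasso(X, Y, lambda = lambda.user)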

Value

  • An object with S3 class "lasso", "binomial", or "scio", corresponding to sparse linear regression, sparse logistic regression, and the sparse column inverse operator respectively, is returned with the following components:
  • beta: A matrix of regression estimates whose columns correspond to the regularization parameters for sparse linear regression and sparse logistic regression. A list of matrices of estimates, one per regularization parameter, for the sparse column inverse operator.
  • intercept: The intercepts corresponding to the regularization parameters for sparse linear regression and sparse logistic regression.
  • Y: The value of Y used in the program.
  • X: The value of X used in the program.
  • lambda: The sequence of regularization parameters lambda used in the program.
  • nlambda: The number of values used in lambda.
  • family: The family from the input.
  • method: The method from the input.
  • alg: The alg from the input.
  • sym: The sym from the input.
  • path: A list of d by d adjacency matrices of estimated graphs, forming a graph path corresponding to lambda.
  • sparsity: The sparsity levels of the graph path for the sparse column inverse operator.
  • standardize: The standardize from the input.
  • perturb: The perturb from the input.
  • df: The degree of freedom (number of nonzero coefficients) along the solution path for sparse linear regression and sparse logistic regression.
  • ite: A list of two vectors where the i-th entries of ite[[1]] and ite[[2]] give the outer and inner iteration counts for the i-th regularization parameter, respectively.
  • verbose: The verbose from the input.
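
For instance, with out.l1.cyclic denoting the sparse linear regression fit produced in the Examples section below, the components listed above can be inspected directly (a minimal sketch, not part of the original documentation):

dim(out.l1.cyclic$beta)        ## d by nlambda matrix of estimates along the path
out.l1.cyclic$lambda           ## the regularization sequence actually used
out.l1.cyclic$df               ## number of nonzero coefficients at each lambda
head(out.l1.cyclic$intercept)  ## intercepts along the path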

Details

For sparse linear regression,

$$\min_{\beta} \frac{1}{2n} \| Y - X \beta \|_2^2 + \lambda R(\beta),$$ where $R(\beta)$ can be the $\ell_1$ norm, MCP, or SCAD regularizer. For sparse logistic regression,

$$\min_{\beta} \frac{1}{n}\sum_{i=1}^n \left(\log(1+e^{x_i^T \beta}) - y_i x_i^T \beta\right) + \lambda R(\beta),$$ where $R(\beta)$ can be the $\ell_1$ norm, MCP, or SCAD regularizer. For the sparse column inverse operator, $$\min_{\beta} \frac{1}{2} \beta^T S \beta - e^T \beta + \lambda R(\beta),$$ where $S$ is the sample covariance matrix, $e$ is a standard basis vector, and $R(\beta)$ can be the $\ell_1$ norm, MCP, or SCAD regularizer.
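
As a concrete check of the first formulation, the $\ell_1$-penalized least-squares objective can be evaluated at any column of the estimated path. The sketch below is illustrative only; it assumes out.l1.cyclic, X and Y from the sparse linear regression example in the Examples section, and ignores the intercept for simplicity.

## Evaluate (1/(2n)) * ||Y - X b||_2^2 + lambda * ||b||_1 at the k-th
## solution on the path
k = 5
b = out.l1.cyclic$beta[, k]
obj = sum((Y - X %*% b)^2) / (2 * nrow(X)) + out.l1.cyclic$lambda[k] * sum(abs(b))
obj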

References

1. J. Friedman, T. Hastie, H. Hofling and R. Tibshirani. Pathwise coordinate optimization. The Annals of Applied Statistics, 2007.
2. C.H. Zhang. Nearly unbiased variable selection under minimax concave penalty. Annals of Statistics, 2010.
3. J. Fan and R. Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 2001.
4. R. Tibshirani, J. Bien, J. Friedman, T. Hastie, N. Simon, J. Taylor and R. Tibshirani. Strong rules for discarding predictors in lasso-type problems. Journal of the Royal Statistical Society: Series B, 2012.
5. T. Zhao and H. Liu. Accelerated Path-following Iterative Shrinkage Algorithm. Journal of Computational and Graphical Statistics, 2015.
6. T. Zhao, H. Liu, and T. Zhang. A General Theory of Pathwise Coordinate Optimization. Technical Report, Princeton University.

See Also

picasso-package.

Examples

################################################################
## Sparse linear regression
## Generate the design matrix and regression coefficient vector
n = 100
d = 400
X = matrix(rnorm(n*d), n, d)
beta = c(3,2,0,1.5,rep(0,d-4))

## Generate response using Gaussian noise, and fit sparse linear models
noise = rnorm(n)
Y = X%*%beta + noise
out.l1.cyclic = picasso(X, Y, nlambda=10)
out.l1.greedy = picasso(X, Y, nlambda=10, alg="greedy")
out.mcp.greedy = picasso(X, Y, nlambda=10, method="mcp")

## Visualize the solution path
plot(out.l1.cyclic)
plot(out.l1.greedy)
plot(out.mcp.greedy)
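
## Illustrative sketch (not part of the original examples): extract the
## estimate at one regularization level and form in-sample fitted values,
## using the beta and intercept components documented in the Value section;
## this assumes the intercept is stored as a vector along the path.
k = 5
beta.hat = out.l1.cyclic$beta[, k]
Y.hat = out.l1.cyclic$intercept[k] + X %*% beta.hat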


################################################################
## Sparse logistic regression
## Generate the design matrix and regression coefficient vector
n = 100
d = 400
X = matrix(rnorm(n*d), n, d)
beta = c(3,2,0,1.5,rep(0,d-4))

## Generate response and fit sparse logistic models
p = exp(X%*%beta)/(1+exp(X%*%beta))
Y = rbinom(n,rep(1,n),p)
out.l1.cyclic = picasso(X, Y, nlambda=10, family="binomial")
out.l1.greedy = picasso(X, Y, nlambda=10, family="binomial", alg="greedy")
out.mcp.greedy = picasso(X, Y, nlambda=10, family="binomial", method="mcp")

## Visualize the solution path
plot(out.l1.cyclic)
plot(out.l1.greedy)
plot(out.mcp.greedy)

## Estimate of Bernoulli parameters
p.l1 = out.l1.cyclic$p
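
## Illustrative sketch (not part of the original examples): recompute fitted
## probabilities from the coefficient path at one regularization level via the
## logistic link; assumes the intercept is stored as a vector along the path.
k = 5
eta = out.l1.cyclic$intercept[k] + X %*% out.l1.cyclic$beta[, k]
p.hat = 1/(1 + exp(-eta))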


################################################################
## Sparse column inverse operator
## generating data
n = 100
d = 200
D = scio.generator(n=n,d=d,graph="band",g=1)
plot(D)

## sparse precision matrix estimation
out1 = picasso(D$data, nlambda=10, family="graph")
plot(out1)
scio.plot(out1$path[[4]])
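
## Illustrative sketch (not part of the original examples): inspect the
## sparsity levels along the graph path (see the Value section) and refit
## with the "and" symmetrization rule for comparison.
out1$sparsity
out2 = picasso(D$data, nlambda=10, family="graph", sym="and")
plot(out2)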
