
picasso (version 0.4.7)

picasso: Pathwise Calibrated Sparse Shooting Algorithm

Description

The function "picasso" implements the user interface.

Usage

picasso(X, Y, lambda = NULL, nlambda = NULL, lambda.min.ratio = NULL,
        lambda.min = NULL, family = "gaussian", method = "l1", alg = "greedy",
        opt = "naive", gamma = 3, df = NULL, sym = "or", standardize = TRUE,
        perturb = TRUE, max.act.in = 3, truncation = 0.01, prec = 1e-4,
        max.ite = 1e3, verbose = TRUE)

Arguments

X
For sparse linear regression and sparse logistic regression, X is an $n$ by $d$ design matrix. For the sparse column inverse operator (family = "graph"), there are two options: (1) X is an $n$ by $d$ raw data matrix; (2) X is a $d$ by $d$ sample covariance matrix (the matrix $S$ in the Details section).
Y
For sparse linear regression and sparse logistic regression, Y is an $n$ dimensional response vector. For the sparse column inverse operator, no input for Y is needed.
lambda
A sequence of decreasing positive values to control the regularization. Typical usage is to leave the input lambda = NULL and have the program compute its own lambda sequence based on nlambda and lambda.min.ratio. A sketch of supplying a custom sequence is given after the argument list.
nlambda
The number of values used in lambda. Default value is 100.
lambda.min.ratio
The smallest value for lambda, as a fraction of the upper bound (MAX) of the regularization parameter. The program can automatically generate lambda as a sequence of length nlambda, decreasing from MAX to lambda.min.ratio*MAX. The default value is 0.05.
lambda.min
The smallest value for lambda. If lambda.min.ratio is provided, then lambda.min is set to lambda.min.ratio*MAX, where MAX is the upper bound of the regularization parameter. The default value is 0.05*MAX.
family
Options for the model. Sparse linear regression and sparse multivariate regression are applied if family = "gaussian", sparse logistic regression is applied if family = "binomial", and the sparse column inverse operator is applied if family = "graph". The default value is "gaussian".
method
Options for the regularization. The Lasso penalty is applied if method = "l1", the MCP penalty is applied if method = "mcp", and the SCAD penalty is applied if method = "scad". The default value is "l1".
alg
Options for updating active sets. The cyclic selection rule is applied if alg = "cyclic", the greedy selection rule is applied if alg = "greedy", and the proximal gradient selection rule is applied if alg = "proximal". The default value is "greedy".
opt
Options for updating residuals. The naive update rule is applied if opt = "naive", and the covariance update rule is applied if opt = "cov". The default value is "naive".
gamma
The concavity parameter for MCP and SCAD. The default value is 3.
df
Maximum degree of freedom for the covariance update. The default value is 2*n.
sym
Symmetrization of output graphs. If sym = "and", the edge between node i and node j is selected only when both node i and node j are selected as neighbors of each other. If sym = "or", the edge is selected when either node i or node j is selected as a neighbor of the other. The default value is "or".
standardize
Variables are standardized to have mean zero and unit standard deviation if standardize = TRUE. The default value is TRUE.
perturb
If perturb = TRUE, a positive value is added to the diagonal of Sigma to guarantee that Sigma is positive definite. The user can also specify a numeric value for perturb. The default value is perturb = TRUE.
max.act.in
The maximum number of active variables to add into the active set when alg = "greedy". The default value is 3.
truncation
The critical value for updating active sets when alg = "cyclic". The default value is 1e-2.
prec
Stopping precision. The default value is 1e-4.
max.ite
The iteration limit. The default value is 1e3.
verbose
Tracing information is disabled if verbose = FALSE. The default value is TRUE.
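
As a sketch of how the regularization arguments interact, a user-specified decreasing lambda sequence can be supplied directly instead of the automatically generated one. The snippet below is illustrative only and not taken from the package examples; the data generation mirrors the Examples section.

library(picasso)
set.seed(1)
## Generate a design matrix and a sparse regression response, as in the Examples section
n = 100
d = 400
X = matrix(rnorm(n*d), n, d)
Y = X %*% c(3, 2, 0, 1.5, rep(0, d-4)) + rnorm(n)
## Supply a decreasing lambda sequence; nlambda, lambda.min.ratio and lambda.min
## only describe the automatically generated sequence and are not needed here
lambda.user = exp(seq(log(0.5), log(0.01), length.out = 20))
fit.user = picasso(X, Y, lambda = lambda.user)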

Value

  • An object with S3 class "lasso", "binomial", or "scio", corresponding to sparse linear regression, sparse logistic regression, and the sparse column inverse operator respectively, is returned with the following components:
  • beta: A matrix of regression estimates whose columns correspond to the regularization parameters for sparse linear regression and sparse logistic regression. A list of matrices of estimates, one per regularization parameter, for the sparse column inverse operator.
  • intercept: The intercepts corresponding to the regularization parameters for sparse linear regression and sparse logistic regression.
  • Y: The value of Y used in the program.
  • X: The value of X used in the program.
  • lambda: The sequence of regularization parameters lambda used in the program.
  • nlambda: The number of values used in lambda.
  • family: The family from the input.
  • method: The method from the input.
  • alg: The alg from the input.
  • sym: The sym from the input.
  • path: A list of d by d adjacency matrices of estimated graphs, forming a graph path corresponding to lambda.
  • sparsity: The sparsity levels of the graph path for the sparse column inverse operator.
  • standardize: The standardize from the input.
  • perturb: The perturb from the input.
  • df: The degree of freedom (number of nonzero coefficients) along the solution path for sparse linear regression and sparse logistic regression.
  • ite: A list of two vectors where the i-th entries of ite[[1]] and ite[[2]] give the outer and inner iteration counts for the i-th regularization parameter, respectively.
  • verbose: The verbose from the input.
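
For instance, with out.l1.cyclic denoting the sparse linear regression fit produced in the Examples section below, the components listed above can be inspected directly (a minimal sketch, not part of the original documentation):

dim(out.l1.cyclic$beta)        ## d by nlambda matrix of estimates along the path
out.l1.cyclic$lambda           ## the regularization sequence actually used
out.l1.cyclic$df               ## number of nonzero coefficients at each lambda
head(out.l1.cyclic$intercept)  ## intercepts along the path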

Details

For sparse linear regression,

$$\min_{\beta} \frac{1}{2n} \| Y - X \beta \|_2^2 + \lambda R(\beta),$$ where $R(\beta)$ can be the $\ell_1$ norm, MCP, or SCAD regularizer. For sparse logistic regression,

$$\min_{\beta} \frac{1}{n}\sum_{i=1}^n \left(\log(1+e^{x_i^T \beta}) - y_i x_i^T \beta\right) + \lambda R(\beta),$$ where $R(\beta)$ can be the $\ell_1$ norm, MCP, or SCAD regularizer. For the sparse column inverse operator, $$\min_{\beta} \frac{1}{2} \beta^T S \beta - e^T \beta + \lambda R(\beta),$$ where $S$ is the sample covariance matrix, $e$ is a standard basis vector, and $R(\beta)$ can be the $\ell_1$ norm, MCP, or SCAD regularizer.
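
As a concrete check of the first formulation, the $\ell_1$-penalized least-squares objective can be evaluated at any column of the estimated path. The sketch below is illustrative only; it assumes out.l1.cyclic, X and Y from the sparse linear regression example in the Examples section, and ignores the intercept for simplicity.

## Evaluate (1/(2n)) * ||Y - X b||_2^2 + lambda * ||b||_1 at the k-th
## solution on the path
k = 5
b = out.l1.cyclic$beta[, k]
obj = sum((Y - X %*% b)^2) / (2 * nrow(X)) + out.l1.cyclic$lambda[k] * sum(abs(b))
obj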

References

1. J. Friedman, T. Hastie, H. Hofling and R. Tibshirani. Pathwise coordinate optimization. The Annals of Applied Statistics, 2007.
2. C.H. Zhang. Nearly unbiased variable selection under minimax concave penalty. Annals of Statistics, 2010.
3. J. Fan and R. Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 2001.
4. R. Tibshirani, J. Bien, J. Friedman, T. Hastie, N. Simon, J. Taylor and R. Tibshirani. Strong rules for discarding predictors in lasso-type problems. Journal of the Royal Statistical Society: Series B, 2012.
5. T. Zhao and H. Liu. Accelerated Path-following Iterative Shrinkage Algorithm. Journal of Computational and Graphical Statistics, 2015.
6. T. Zhao, H. Liu, and T. Zhang. A General Theory of Pathwise Coordinate Optimization. Technical Report, Princeton University.

See Also

picasso-package.

Examples

################################################################
## Sparse linear regression
## Generate the design matrix and regression coefficient vector
n = 100
d = 400
X = matrix(rnorm(n*d), n, d)
beta = c(3,2,0,1.5,rep(0,d-4))

## Generate response using Gaussian noise, and fit sparse linear models
noise = rnorm(n)
Y = X%*%beta + noise
out.l1.cyclic = picasso(X, Y, nlambda=10)
out.l1.greedy = picasso(X, Y, nlambda=10, alg="greedy")
out.mcp.greedy = picasso(X, Y, nlambda=10, method="mcp")

## Visualize the solution path
plot(out.l1.cyclic)
plot(out.l1.greedy)
plot(out.mcp.greedy)
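
## Illustrative sketch (not part of the original examples): extract the
## estimate at one regularization level and form in-sample fitted values,
## using the beta and intercept components documented in the Value section;
## this assumes the intercept is stored as a vector along the path.
k = 5
beta.hat = out.l1.cyclic$beta[, k]
Y.hat = out.l1.cyclic$intercept[k] + X %*% beta.hat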


################################################################
## Sparse logistic regression
## Generate the design matrix and regression coefficient vector
n = 100
d = 400
X = matrix(rnorm(n*d), n, d)
beta = c(3,2,0,1.5,rep(0,d-4))

## Generate response and fit sparse logistic models
p = exp(X%*%beta)/(1+exp(X%*%beta))
Y = rbinom(n,rep(1,n),p)
out.l1.cyclic = picasso(X, Y, nlambda=10, family="binomial")
out.l1.greedy = picasso(X, Y, nlambda=10, family="binomial", alg="greedy")
out.mcp.greedy = picasso(X, Y, nlambda=10, family="binomial", method="mcp")

## Visualize the solution path
plot(out.l1.cyclic)
plot(out.l1.greedy)
plot(out.mcp.greedy)

## Estimate of Bernoulli parameters
p.l1 = out.l1.cyclic$p
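
## Illustrative sketch (not part of the original examples): recompute fitted
## probabilities from the coefficient path at one regularization level via the
## logistic link; assumes the intercept is stored as a vector along the path.
k = 5
eta = out.l1.cyclic$intercept[k] + X %*% out.l1.cyclic$beta[, k]
p.hat = 1/(1 + exp(-eta))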


################################################################
## Sparse column inverse operator
## generating data
n = 100
d = 200
D = scio.generator(n=n,d=d,graph="band",g=1)
plot(D)

## sparse precision matrix estimation
out1 = picasso(D$data, nlambda=10, family="graph")
plot(out1)
scio.plot(out1$path[[4]])
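
## Illustrative sketch (not part of the original examples): inspect the
## sparsity levels along the graph path (see the Value section) and refit
## with the "and" symmetrization rule for comparison.
out1$sparsity
out2 = picasso(D$data, nlambda=10, family="graph", sym="and")
plot(out2)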
