fusedlasso: Compute the fused lasso solution path for a general graph, or a 1d or 2d grid

Description

These functions produce the solution path for a general fused lasso problem. The fusedlasso function takes either a penalty matrix or a graph object from the igraph package. The fusedlasso1d and fusedlasso2d functions are convenience functions that construct the penalty matrix over a 1d or 2d grid.

Usage

fusedlasso(y, X, D, graph, gamma = 0, approx = FALSE, maxsteps = 2000,
           minlam = 0, tol = 1e-11, verbose = FALSE, fileback = FALSE)  
fusedlasso1d(y, pos, X, gamma = 0, approx = FALSE, maxsteps = 2000,
             minlam = 0, tol = 1e-11, verbose = FALSE)
fusedlasso2d(y, X, dim1, dim2, gamma = 0, approx = FALSE, maxsteps = 2000, 
	     minlam = 0, tol = 1e-11, verbose = FALSE, fileback = FALSE)

Arguments

a numeric response vector. Alternatively, for fusedlasso2d with no matrix X passed, y can be a matrix (its dimensions corresponding to the underlying 2d grid). Note that when y is given as a

pos

only for fusedlasso1d, these are the optional positions of the positions in the 1d grid. If missing, the 1d grid is assumed to have unit spacing. Note that arbitrary positions pos cannot be used in the presence of a

an optional matrix of predictor variables, with observations along the rows, and variables along the columns. It is assumed to have full column rank (linear independent predictors), but this is not checked, for efficiency. A rank deficient

only for fusedlasso, this is the penalty matrix, i.e., the oriented incidence matrix over the underlying graph (the orientation of each edge being arbitrary). Only one of D or graph needs to be specified.

graph

only for fusedlasso, this is the underlying graph as an igraph object from the igraph package. Only one of D or graph needs to be specified.

dim1

only for fusedlasso2d, this is the number of rows in the underlying 2d grid. If missing and y is given as a matrix, it is assumed to be the number of rows of y.

dim2

only for fusedlasso2d, this is the number of columns in the underlying 2d grid. If missing and y is given as a matrix, it is assumed to be the number of columns of y.

gamma

a numeric variable greater than or equal to 0, indicating the ratio of the two tuning parameters, one for the fusion penalty, and the other for the pure $\ell_1$ penalty. Default is 0. See "Details" for more information.

approx

a logical variable indicating if the approximate solution path should be used (with no dual coordinates leaving the boundary). Default is FALSE. Note that for the 1d fused lasso, with identity predicor matrix, this approximat

maxsteps

an integer specifying the maximum number of steps for the algorithm to take before termination. Default is 2000.

minlam

a numeric variable indicating the value of lambda at which the path should terminate. Default is 0.

tol

a numeric variable giving the tolerance used in the calculation of the hitting and leaving times. A larger value is more conservative, and may cause the algorithm to miss some hitting or leaving events (do not change unless you know what you'r

verbose

a logical variable indicating if progress should be reported after each knot in the path.

fileback

only for fusedlasso or fusedlasso2d, this is a logical variable indicating if file backing should be used. It can also be a character string indicating the file name to use for file backing. See "Details" below for mo

Value

If fileback is TRUE, a special list is silently returned which should be processed via filebackout. Otherwise the function returns an object of class "fusedlasso", and subclass "genlasso". This is a list with at least following components:
lambdavalues of lambda at which the solution path changes slope, i.e., kinks or knots.
betaa matrix of primal coefficients, each column corresponding to a knot in the solution path.
fita matrix of fitted values, each column corresponding to a knot in the solution path.
ua matrix of dual coefficients, each column corresponding to a knot in the solution path.
hita vector of logical values indicating if a new variable in the dual solution hit the box contraint boundary. A value of FALSE indicates a variable leaving the boundary.
dfa vector giving an unbiased estimate of the degrees of freedom of the fit at each knot in the solution path.
ythe observed response vector. Useful for plotting and other methods.
completepatha logical variable indicating whether the complete path was computed (terminating the path early with the maxsteps or minlam options results in a value of FALSE).
blsthe least squares solution, i.e., the solution at lambda = 0. This can be NULL when completepath is FALSE.
gammathe value of the lambda ratio.
callthe matched call.

Details

The fused lasso estimate minimizes the criterion $$1/2 \sum_{i=1}^n (y_i - x_i^T \beta_i)^2 + \lambda \sum_{(i,j) \in E} |\beta_i - \beta_j| + \gamma \cdot \lambda \sum_{i=1}^p |\beta_i|,$$ where $x_i$ is the ith row of the predictor matrix and $E$ is the edge set of the underlying graph. The solution $\hat{\beta}$ is computed as a function of the regularization parameter $\lambda$, for a fixed value of $\gamma$. The default is to set $\gamma=0$, which corresponds to pure fusion of the coefficient vector $\beta$. A choice $\gamma>0$ introduces both sparsity and fusion in the coefficient vector, with a higher value placing more priority on sparsity.

If the predictor matrix is the identity, and the primal solution path $\beta$ is desired at several levels of the ratio parameter $\gamma$, it is much more efficient to compute the solution path once with $\gamma=0$, and then use soft-thresholding via the softthresh function. For large runs a potentially substantial increase in speed can be achieved by storing the solution path on disk rather than in memory, which is achieved by setting fileback equal to TRUE. The output can be read via the filebackout function. For very large problems, where the path object cannot be read safely into memory, the user can use a customized call to the readlines function.

Finally, for the image denoising problem, i.e., the fused lasso over a 2d grid with identity predictor matrix, it is easy to specify a huge graph with a seemingly small amount of data. For instance, running the 2d fused lasso (with identity predictor matrix) on an image at standard 1080p HD resolution yields a graph with over 2 million edges. Moreover, in image denoising problems---somewhat unlike most other applications of the fused lasso (and generalized lasso)---a solution is often desired near the dense end of the path ($\lambda=0$) as opposed to the regularized end ($\lambda=\infty$). The dual path algorithm implemented by the fusedlasso2d function begins at the fully regularized end and works its way down to the dense end. For a problem with many edges (dual variables), if a solution at the dense is desired, then it must usually pass through a huge number knots in the path. Hence it is not advisable to run fusedlasso2d on image denoising problems of large scale, as the dual solution path is computationally infeasible. It should be noted that a faster algorithm for the 2d fused lasso solution path (when the predictor matrix is the identity), which begins at the dense end of the path, is available in the flsa package.

References

Tibshirani, R. J. and Taylor, J. (2011), "The solution path of the generalized lasso", Annals of Statistics 39 (3) 1335--1371.

Tibshirani, R., Saunders, M., Rosset, S., Zhu, J. and Knight, K. (2005), "Sparsity and smoothness via the fused lasso", Journal of the Royal Statistics Society: Series B 67(1), 91--108.

Examples

Run this code

# Fused lasso on a custom graph
set.seed(0)
edges = c(1,2,1,3,1,5,2,4,2,5,3,6,3,7,3,8,6,7,6,8)
gr = graph(edges=edges,directed=FALSE)
plot(gr)
y = c(1,1,0,1,1,0,0,0) + rnorm(8,0.1)

# Can either pass the graph object directly, or 
# first construct the penalty matrix, and then 
# pass this
a1 = fusedlasso(y,graph=gr)
D = getDgSparse(gr)
a2 = fusedlasso(y,D=D)

plot(a1,numbers=TRUE)

# The 2d fused lasso with a predictor matrix X
set.seed(0)
dim1 = dim2 = 16
p = dim1*dim2
n = 300
X = matrix(rnorm(n*p),nrow=n)
beta0 = matrix(0,dim1,dim2)
beta0[(row(beta0)-dim1/2)^2 + (col(beta0)-dim2/2)^2 <=
(min(dim1,dim2)/3)^2] = 1
y = X %*% as.numeric(beta0) + rnorm(n)

# Takes about 30 seconds for the full solution path
out = fusedlasso2d(y,X,dim1=dim1,dim2=dim2)

# Grab the solution at 8 values of lambda over the path 
a = coef(out,nlam=8)

# Plot these against the true coefficients
par(mar=c(1,1,2,1),mfrow=c(3,3))
cols = terrain.colors(30)
zlim = range(c(range(beta0),range(a$beta)))
image(beta0,col=cols,zlim=zlim,axes=FALSE)

for (i in 1:8) {
  image(matrix(a$beta[,i],nrow=dim1),col=cols,zlim=zlim,
  axes=FALSE)
  mtext(bquote(lambda==.(sprintf("%.3f",a$lambda[i]))))
}

Run the code above in your browser using DataLab