rpca: Decompose a matrix into a low-rank component and a sparse component by solving Principal Components Pursuit

Description

This function decomposes a rectangular matrix M into a low-rank component, and a sparse component, by solving a convex program called Principal Component Pursuit.

Usage

rpca(M, 
     lambda = 1/sqrt(max(dim(M))), mu = prod(dim(M))/(4 * sum(abs(M))), 
     term.delta = 10^(-7), max.iter = 5000, trace = FALSE,
     thresh.nuclear.fun = thresh.nuclear, thresh.l1.fun = thresh.l1, 
     F2norm.fun = F2norm)

Arguments

a rectangular matrix that is to be decomposed into a low-rank component and a sparse component $M = L + S$ .

lambda

parameter of the convex problem $\|L\|_{*} + \lambda \|S\|_{1}$ which is minimized in the Principal Components Pursuit algorithm. The default value is the one suggested in Candès, E. J., section 1.4, and together with reasonable assumptions abo

parameter from the augumented Lagrange multiplier formulation of the PCP, Candès, E. J., section 5. Default value is the one suggested in references.

term.delta

The algorithm terminates when $\|M-L-S\|_{F} \leq \delta \|M\|_{F}$ where $\|\ \|_{F}$ is Frobenius norm of a matrix.

max.iter

Maximal number of iterations of the augumented Lagrange multiplier algorithm. A warning is issued if the algorithm does not converge by then.

trace

Print out information with every iteration.

thresh.nuclear.fun, thresh.l1.fun, F2norm.fun

Arguments for internal use only.

Value

The function returns two matrices S and L, which have the property that $L+S \simeq M$, where the quality of the approximation depends on the argument term.delta, and the convergence of the algorithm.
SThe sparse component of the matrix decomposition.
LThe low-rank component of the matrix decomposition.
L.svdThe singular value decomposition of L, as returned by the function La.svd .
convergence$convergedTRUE if the algorithm converged with respect to term.delta.
convergence$iterationsNumber of performed iterations.
convergence$final.deltaThe final iteration delta which is compared with term.delta.
convergence$all.deltaAll delta from all iterations.

encoding

UTF-8

Details

These functions decompose a rectangular matrix M into a low-rank component, and a sparse component, by solving a convex program called Principal Component Pursuit: $$\textrm{minimize}\quad \|L\|_{*} + \lambda \|S\|_{1}$$ $$\textrm{subject to}\quad L+S = M$$ where $\|L\|_{*}$ is the nuclear norm of L (sum of singular values).

References

Candès, E. J., Li, X., Ma, Y., & Wright, J. (2011). Robust principal component analysis?. Journal of the ACM (JACM), 58(3), 11.

Yuan, X., & Yang, J. (2009). Sparse and low-rank matrix decomposition via alternating direction methods. preprint, 12.

Examples

Run this code

data(iris)
M <- as.matrix(iris[,1:4])
Mcent <- sweep(M,2,colMeans(M))

res <- rpca(Mcent)

## Check convergence and number of iterations
with(res$convergence,list(converged,iterations))
## Final delta F2 norm divided by F2norm(Mcent)
with(res$convergence,final.delta)

## Check properites of the decomposition
with(res,c(
all(abs( L+S - Mcent ) < 10^-5),
all( L == L.svd$u%*%(L.svd$d*L.svd$vt) )
))
# [1] TRUE TRUE

## The low rank component has rank 2
length(res$L.svd$d)
## However, the sparse component is not sparse 
## - thus this data set is not the best example here.
mean(res$S==0)

## Plot the first (the only) two principal components
## of the low-rank component L
rpc<-res$L.svd$u%*%diag(res$L.svd$d)
plot(jitter(rpc[,1:2],amount=.001),col=iris[,5])

## Compare with classical principal components
pc <- prcomp(M,center=TRUE)
plot(pc$x[,1:2],col=iris[,5])
points(rpc[,1:2],col=iris[,5],pch="+")

## "Sparse" elements distribution
plot(density(abs(res$S),from=0))
curve(dexp(x,rate=1/mean(abs(res$S))),add=TRUE,lty=2)

## Plot measurements against measurements corrected by sparse components
par(mfcol=c(2,2))
for(i in 1:4) {
plot(M[,i],M[,i]-res$S[,i],col=iris[,5],xlab=colnames(M)[i])
}

Run the code above in your browser using DataLab