svdr: Find a few approximate largest singular values and corresponding singular vectors of a matrix.

Description

The randomized method for truncated SVD by P. G. Martinsson and colleagues finds a few approximate largest singular values and corresponding singular vectors of a sparse or dense matrix. It is a fast and memory-efficient way to compute a partial SVD, similar in performance for many problems to irlba. The svdr method is a block method and may produce more accurate estimations with less work for problems with clustered large singular values (see the examples). In other problems, irlba may exhibit faster convergence.

Usage

svdr(x, k, tol = 1e-05, it = 100L, extra = min(10L, dim(x) - k),
  center = NULL, Q = NULL, return.Q = FALSE)

Arguments

numeric real- or complex-valued matrix or real-valued sparse matrix.

dimension of subspace to estimate (number of approximate singular values to compute).

tol

stop iteration when the largest absolute relative change in estimated singular values from one iteration to the next falls below this value.

maximum number of algorithm iterations.

extra

number of extra vectors of dimension ncol(x), larger values generally improve accuracy (with increased computational cost).

center

optional column centering vector whose values are implicitly subtracted from each column of A without explicitly forming the centered matrix (preserving sparsity). Optionally specify center=TRUE as shorthand for center=colMeans(x). Use for efficient principal components computation.

optional initial random matrix, defaults to a matrix of size ncol(x) by k + extra with entries sampled from a normal random distribution.

return.Q

if TRUE return the Q matrix for restarting (see examples).

Value

Returns a list with entries:

d:: k approximate singular values
u:: k approximate left singular vectors
v:: k approximate right singular vectors
mprod:: total number of matrix products carried out
Q:: optional subspace matrix (when return.Q=TRUE)

Details

Also see an alternate implementation (rsvd) of this method by N. Benjamin Erichson in the https://cran.r-project.org/package=rsvd package.

References

Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions N. Halko, P. G. Martinsson, J. Tropp. Sep. 2009.

Examples

Run this code

# NOT RUN {
set.seed(1)

A <- matrix(runif(400), nrow=20)
svdr(A, 3)$d

# Compare with svd
svd(A)$d[1:3]

# Compare with irlba
irlba(A, 3)$d

# }
# NOT RUN {
# A problem with clustered large singular values where svdr out-performs irlba.
tprolate <- function(n, w=0.25)
{
  a <- rep(0, n)
  a[1] <- 2 * w
  a[2:n] <- sin( 2 * pi * w * (1:(n-1)) ) / ( pi * (1:(n-1)) )
  toeplitz(a)
}

x <- tprolate(512)
set.seed(1)
tL <- system.time(L <- irlba(x, 20))
tR <- system.time(R <- svdr(x, 20))
S <- svd(x)
plot(S$d)
data.frame(time=c(tL[3], tR[3]),
           error=sqrt(c(crossprod(L$d - S$d[1:20]), crossprod(R$d - S$d[1:20]))),
           row.names=c("IRLBA", "Randomized SVD"))

# But, here is a similar problem with clustered singular values where svdr
# doesn't out-perform irlba as easily...clusters of singular values are,
# in general, very hard to deal with!
# (This example based on https://github.com/bwlewis/irlba/issues/16.)
set.seed(1)
s <- svd(matrix(rnorm(200 * 200), 200))
x <- s$u %*% (c(exp(-(1:100)^0.3) * 1e-12 + 1, rep(0.5, 100)) * t(s$v))
tL <- system.time(L <- irlba(x, 5))
tR <- system.time(R <- svdr(x, 5))
S <- svd(x)
plot(S$d)
data.frame(time=c(tL[3], tR[3]),
           error=sqrt(c(crossprod(L$d - S$d[1:5]), crossprod(R$d - S$d[1:5]))),
           row.names=c("IRLBA", "Randomized SVD"))
# }

Run the code above in your browser using DataLab