RGeode (version 0.1.0)

randSVD: Randomized Singular Value Decomposition.

Description

Compute the near-optimal low-rank singular value decomposition (SVD) of a rectangular matrix. The algorithm follows a randomized approach.

Usage

randSVD(A, k = NULL, l = NULL, its = 2, sdist = "unif")

Arguments

A

array_like a real/complex input matrix (or data frame), with dimensions \((m, n)\). It is the real/complex matrix being approximated.

k

int, optional determines the target rank of the low-rank decomposition and should satisfy \(k << min(m,n)\). Set by default to 6.

l

int, optional block size of the block Lanczos iterations; \(l\) must be a positive integer greater than \(k\), and defaults \(l=k+2\).

its

int, optional number of full iterations of a block Lanczos method to conduct; \(its\) must be a nonnegative integer, and defaults to 2.

sdist

str \(c('normal', 'unif')\), optional Specifies the sampling distribution. \('unif'\) : (default) Uniform `[-1,1]`. \('normal\)' : Normal `~N(0,1)`.

Value

randSVD returns a list containing the following three components:

d

array_like \((k,k)\) matrix in the rank-k approximation USV' to A, where A is \((m,n)\); the entries of \(S\) are all nonnegative, and its only nonzero entries appear in nonincreasing order on the diagonal.

u

matrix \((m, k)\) matrix in the rank-\(k\) approximation \(A = U * diag(S) * V'\) to A; the columns of U are orthonormal and are called Left singular vect. We want to remark that this is the transpose matrix, hence the vectors are on the rows of our matrix.

v

matrix \((n, k)\) matrix in the rank-\(k\) approximation \(A = U * diag(S) * V'\) to A; the columns of V are orthonormal and are called Right singular vect.

Details

Randomized SVD (randSVD) is a fast algorithm to compute the approximate low-rank SVD of a rectangular \((m,n)\) matrix \(A\) using a probabilistic algorithm. Given the decided rank \(k << n\), rSVD factors the input matrix \(A\) as \(A = U * diag(S) * V'\), which is the typical SVD form. Precisely, the columns of U are orthonormal, as are the columns of V, the entries of S are all nonnegative, and the only nonzero entries of S appear in non-increasing order on its diagonal. The dimensions are: U is \((m,k)\), V is \((n,k)\), and S is \((k,k)\), when A is \((m,n)\).

Increasing \(its\) or \(l\) improves the accuracy of the approximation USV' to A.

The parameter \(its\) specifies the number of normalized power iterations (subspace iterations) to reduce the approximation error. This is recommended if the the singular values decay slowly. In practice 1 or 2 iterations achieve good results, however, computing power iterations increases the computational time. The number of power iterations is set to \(its=2\) by default.

References

  • [1] N. Halko, P. Martinsson, and J. Tropp. "Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions" (2009). (available at arXiv http://arxiv.org/abs/0909.4061).

  • [2] S. Voronin and P.Martinsson. "RSVDPACK: Subroutines for computing partial singular value decompositions via randomized sampling on single core, multi core, and GPU architectures" (2015). (available at `arXiv http://arxiv.org/abs/1502.05366).

  • [3] N. Benjamin Erichson. "Randomized Singular Value Decomposition (rsvd): R package" (2016). (available in the CRAN).

  • [4] Nathan Halko, Per-Gunnar Martinsson, and Joel Tropp. "Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions" (2009). (available at http://arxiv.org).

  • [5] V. Rokhlin, A. Szlam, M. Tygert. "A randomized algorithm for principal component analysis" (2009). (available at https://arxiv.org/abs/0809.2274). The implementation of rand SVD is inspired by the MatLab implementation of RandPCA by M. Tygert.

Examples

Run this code
# NOT RUN {
#Simulate a general matrix with 1000 rows and 1000 columns
vy= rnorm(1000*1000,0,1)
y= matrix(vy,1000,1000,byrow=TRUE)

#Compute the randSVD for the first hundred components of the matrix y and measure the time
start.time <- Sys.time()
prova1= randSVD(y,k=100)
Sys.time()- start.time

#Compare with a classical SVD
start.time <- Sys.time()
prova2= svd(y)
Sys.time()- start.time


# }

Run the code above in your browser using DataCamp Workspace