irlba v2.2.1

0

Monthly downloads

0th

Percentile

Fast Truncated Singular Value Decomposition and Principal Components Analysis for Large Dense and Sparse Matrices

Fast and memory efficient methods for truncated singular value decomposition and principal components analysis of large sparse and dense matrices.

Readme

irlba

Implicitly-restarted Lanczos methods for fast truncated singular value decomposition of sparse and dense matrices (also referred to as partial SVD). IRLBA stands for Augmented, Implicitly Restarted Lanczos Bidiagonalization Algorithm. The package provides the following functions (see help on each for details and examples).

  • irlba() partial SVD function
  • svdr() alternate partial SVD function based on randomized SVD
  • prcomp_irlba() principal components function similar to the prcomp function in stats package for computing the first few principal components of large matrices
  • partial_eigen() a very limited partial eigenvalue decomposition for symmetric matrices (see the RSpectra package for more comprehensive truncated eigenvalue decomposition)

Help documentation for each function includes extensive documentation and examples. Also see the package vignette, vignette("irlba", package="irlba").

What's new in Version 2.2.1?

We include stronger convergence criteria and a new argument svtol for that. The new approach helps guarantee more accurate solutions for some difficult problems. The tradeoff is that the default behavior is a little slower than before because at least two Lanczos iterations are always run. The new convergence behavior can be disabled with svtol=Inf.

The new package version includes a new function, svdr()--another state of the art truncated SVD method based on the randomized SVD algorithm of Gunnar Martinsson and others. Both irlba() and svdr() work well. Svdr uses a block method and may exhibit better convergence in problems where the largest singular values are clustered. See the documentation and examples in the package. (Block versions of irlba exists, but are not yet implemented by this R package--something coming in the future.)

We re-introduced a solver for estimating the smallest singular values of a matrix and associated singular vector spaces. The solver is based on the oringial Harmonic Ritz vector augmentation method of Baglama and Reichel. Beware that this method is somewhat experimental and may fail to converge, or may converge poorly, to estimated singular values for very ill-conditioned matrices. Along with block methods for irlba, this is an active area of work--feel free to contribute!

Deprecated features

The mult argument is deprecated and will be removed in a future version. We now recommend simply defining a custom class with a custom multiplcation operator. The example below illustrates the old and new approaches.

library(irlba)
set.seed(1)
A <- matrix(rnorm(100), 10)

# ------------------ old way ----------------------------------------------
# A custom matrix multiplication function that scales the columns of A
# (cf the scale option). This function scales the columns of A to unit norm.
col_scale <- sqrt(apply(A, 2, crossprod))
mult <- function(x, y)
        {
          # check if x is a  vector
          if (is.vector(x))
          {
            return((x %*% y) / col_scale)
          }
          # else x is the matrix
          x %*% (y / col_scale)
        }
irlba(A, 3, mult=mult)$d
## [1] 1.820227 1.622988 1.067185

# Compare with:
irlba(A, 3, scale=col_scale)$d
## [1] 1.820227 1.622988 1.067185

# Compare with:
svd(sweep(A, 2, col_scale, FUN=`/`))$d[1:3]
## [1] 1.820227 1.622988 1.067185

# ------------------ new way ----------------------------------------------
setClass("scaled_matrix", contains="matrix", slots=c(scale="numeric"))
setMethod("%*%", signature(x="scaled_matrix", y="numeric"), function(x ,y) x@.Data %*% (y / x@scale))
setMethod("%*%", signature(x="numeric", y="scaled_matrix"), function(x ,y) (x %*% y@.Data) / y@scale)
a <- new("scaled_matrix", A, scale=col_scale)

irlba(a, 3)$d
## [1] 1.820227 1.622988 1.067185

We have learned that using R's existing S4 system is simpler, easier, and more flexible than using custom arguments with idiosyncratic syntax and behavior. We've even used the new approach to implement distributed parallel matrix products for very large problems with amazingly little code.

Wishlist

  • Optional block implementation for some use cases
  • More Matrix classes supported in the fast code path
  • Help improving the solver for smallest singular values in tricky cases (basically, for ill-conditioned problems)

References

  • Augmented Implicitly Restarted Lanczos Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput. 2005. (http://www.math.uri.edu/~jbaglama/papers/paper14.pdf)
  • Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions N. Halko, P. G. Martinsson, J. Tropp. Sep. 2009.

Status

Travis CI status Codecov

Functions in irlba

Name Description
irlba Find a few approximate singular values and corresponding singular vectors of a matrix.
partial_eigen Find a few approximate largest eigenvalues and corresponding eigenvectors of a symmetric matrix.
prcomp_irlba Principal Components Analysis
svdr Find a few approximate largest singular values and corresponding singular vectors of a matrix.
No Results!

Vignettes of irlba

Name
irlba.Rnw
No Results!

Last month downloads

Details

Type Package
Date 2017-05-16
LinkingTo Matrix
License GPL-3
BugReports https://github.com/bwlewis/irlba/issues
RoxygenNote 5.0.1
NeedsCompilation yes
Packaged 2017-05-16 20:24:46 UTC; blewis
Repository CRAN
Date/Publication 2017-05-17 07:28:01 UTC

Include our badge in your README

[![Rdoc](http://www.rdocumentation.org/badges/version/irlba)](http://www.rdocumentation.org/packages/irlba)