gmodels (version 2.15.0)

fast.prcomp: Efficient computation of principal components and singular value decompositions.

Description

The standard prcomp and svd functions are very inefficient for wide matrices. fast.prcomp and fast.svd are modified versions that remain efficient even for very wide matrices.

Usage

fast.prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE, tol = NULL)
fast.svd(x, nu = min(n, p), nv = min(n, p), ...)

Arguments

x
data matrix
retx, center, scale., tol
See documentation for prcomp
nu, nv, ...
See documentation for svd

Value

Details

The current implementation of the function svd in S-Plus and R is much slower when operating on a matrix with a large number of columns than on the transpose of this matrix, which has a large number of rows. As a consequence, prcomp, which uses svd, is also very slow when applied to matrices with a large number of columns. For R, the simple solution is to use La.svd instead of svd. A suitable patch to prcomp has been submitted. In the meantime, the function fast.prcomp has been provided as a short-term work-around.
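As a rough illustration of the La.svd route described above, the sketch below computes principal components from La.svd directly. The helper name pca_via_la_svd is hypothetical and is not the gmodels source; it only assumes base R behaviour, namely that La.svd returns d, u and vt (the transpose of v) and that prcomp divides the singular values by sqrt(n - 1).

```r
# Hypothetical helper for illustration only; not the gmodels implementation.
pca_via_la_svd <- function(x, center = TRUE, scale. = FALSE) {
  # Center (and optionally scale) the columns, as prcomp does
  x <- scale(as.matrix(x), center = center, scale = scale.)

  # La.svd returns vt, the transpose of the right singular vectors
  s <- La.svd(x, nu = 0)
  v <- t(s$vt)

  list(
    sdev     = s$d / sqrt(max(1, nrow(x) - 1)),  # component standard deviations
    rotation = v,                                # loadings, one column per component
    x        = x %*% v                           # scores
  )
}

# Small wide test matrix: 20 rows, 100 columns
set.seed(1)
m <- matrix(rnorm(20 * 100), nrow = 20, ncol = 100)

p_ref <- prcomp(m)
p_alt <- pca_via_la_svd(m)

# Largest absolute difference between the two sets of standard deviations
sdev_diff <- max(abs(p_ref$sdev - p_alt$sdev))
```

Columns of the rotation matrix are only determined up to sign, so compare them through their absolute values rather than directly.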

For S-Plus, the solution is to replace the standard svd with a version that checks the dimensions of the matrix and performs the computation on the transposed matrix if it is wider than tall.
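The dimension check described above can be sketched as follows. This is written in R for illustration, with a hypothetical function name svd_via_transpose; it is not the S-Plus or gmodels code. It relies on the identity that if t(x) = U D V', then x = V D U', so the u and v components swap roles. For simplicity the sketch does not accept the nu and nv arguments.

```r
# Hypothetical helper for illustration only; not the S-Plus or gmodels source.
svd_via_transpose <- function(x) {
  x <- as.matrix(x)
  if (ncol(x) > nrow(x)) {
    # Wide matrix: decompose the (tall) transpose instead
    s <- svd(t(x))
    # t(x) = U D V'  implies  x = V D U', so swap u and v
    list(d = s$d, u = s$v, v = s$u)
  } else {
    # Tall or square matrix: use svd directly
    svd(x)
  }
}

# Wide test matrix: 10 rows, 300 columns
set.seed(2)
wide <- matrix(rnorm(10 * 300), nrow = 10, ncol = 300)
s <- svd_via_transpose(wide)

# Reconstruction error of u %*% diag(d) %*% t(v) against the input
recon_err <- max(abs(wide - s$u %*% diag(s$d) %*% t(s$v)))
```

The result has the same shape as the output of svd(wide): u is 10 x 10, d has length 10, and v is 300 x 10.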

See Also

prcomp, svd, La.svd

Examples

library(gmodels)  # provides fast.svd and fast.prcomp

# create a wide test matrix (50 rows, 2000 columns)
set.seed(4943546)
nr <- 50
nc <- 2000
x  <- matrix( rnorm(nr * nc), nrow = nr, ncol = nc )
tx <- t(x)

# SVD directly on matrix is SLOW:
system.time( val.x <- svd(x)$u )

# SVD on t(matrix) is FAST:
system.time( val.tx <- svd(tx)$v )

# and the results are equivalent (up to the sign of each column):
max( abs( abs(val.x) - abs(val.tx) ) )

# Time gap disappears using fast.svd:
system.time( val.x <- fast.svd(x)$u )
system.time( val.tx <- fast.svd(tx)$v )
max( abs( abs(val.x) - abs(val.tx) ) )


library(stats)

# prcomp directly on matrix is SLOW:
system.time( pr.x <- prcomp(x) )

# fast.prcomp is much faster
system.time( fast.pr.x <- fast.prcomp(x) )

# and the results are equivalent
max( abs( pr.x$sdev - fast.pr.x$sdev ) )
max( abs( abs(pr.x$rotation[, 1:49]) - abs(fast.pr.x$rotation[, 1:49]) ) )
max( abs( abs(pr.x$x) - abs(fast.pr.x$x) ) )

# (except for the last and least significant component):
max( abs( abs(pr.x$rotation[, 50]) - abs(fast.pr.x$rotation[, 50]) ) )
