rppca: Fast pedigree PCA using sparse matrices and randomised linear algebra

Description

Fast pedigree PCA using sparse matrices and randomised linear algebra

Usage

rppca(X, ...)
# S3 method for spam
rppca(
  X,
  method = "randSVD",
  rank = 10,
  depth = 3,
  numVectors,
  totVar = NULL,
  center = FALSE,
  ...
)
# S3 method for pedigree
rppca(
  X,
  method = "randSVD",
  rank = 10,
  depth = 3,
  numVectors,
  totVar = NULL,
  center = FALSE,
  ...
)

Value

A list containing:

x: the principal components
sdev: the variance components of each PC. Note that the total variance is not known per se, and thus these components cannot be used to compute the proportion of the total variance accounted for by each PC. However, if nVecTraceEst is specified, rppca will estimate the total variance and return variance proportions.
vProp: the estimated variance proportions accounted for by each PC. Only returned if totVar is set.
scale: always FALSE
center: logical indicating whether or not the implicit data matrix was centred
rotation: the right singular vectors of the relationship matrix. Only returned if returnRotation == TRUE
varProps: proportion of the total variance explained by each PC. Only returned if starting from a pedigree object without centring, or if totVar is supplied.

Arguments

X: A representation of a pedigree, see Details.
...: optional arguments passed to methods
method: string, either "randSVD" (the default) or "rspec"; see Details
rank: integer, the number of principal components to return
depth: integer, number of iterations for generating the range matrix
numVectors: integer > rank, the number of random vectors to be sampled when generating the range matrix, defaults to ceiling(rank*1.5).
totVar: scalar, (optional) the total variance, required for computation of variance proportions when using an L-inverse matrix as input
center: logical, whether or not to (implicitly) centre the additive relationship matrix
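
To give a feel for what rank, depth and numVectors control, the following is a minimal base-R sketch of a generic randomised SVD (a random range finder refined by repeated multiplication) on a small dense symmetric matrix. It is an illustration under assumptions, not the code that rppca runs: the helper rand_svd_sketch and the toy matrix A are hypothetical, and rppca operates on sparse pedigree representations rather than on a dense matrix.

```r
# Hypothetical helper (not part of the package): generic randomised SVD
# of a symmetric matrix A, keeping the leading `rank` components.
rand_svd_sketch <- function(A, rank = 10, depth = 3,
                            numVectors = ceiling(rank * 1.5)) {
  stopifnot(depth >= 1, numVectors >= rank, numVectors <= ncol(A))
  # Random test matrix with numVectors columns
  Q <- matrix(rnorm(ncol(A) * numVectors), ncol(A), numVectors)
  # Multiply by A and re-orthonormalise `depth` times to build the range matrix
  for (i in seq_len(depth)) {
    Q <- qr.Q(qr(A %*% Q))
  }
  # Project A onto the small subspace and compute an exact SVD there
  B <- crossprod(Q, A)                      # numVectors x n
  s <- svd(B, nu = rank, nv = 0)
  list(d = s$d[seq_len(rank)],              # approximate singular values
       u = Q %*% s$u)                       # approximate left singular vectors
}

# Toy input: a symmetric matrix that is close to rank 5
set.seed(1)
M <- matrix(rnorm(200 * 5), 200, 5)
A <- tcrossprod(M) + 0.01 * diag(200)

approx <- rand_svd_sketch(A, rank = 3)
exact  <- svd(A, nu = 0, nv = 0)$d[1:3]
max(abs(approx$d - exact) / exact)          # relative error of the sketch
```

In this sketch a larger numVectors widens the random subspace and a larger depth sharpens it, both at the cost of more matrix products.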

Details

The output slots are named like those of R's built-in prcomp function. rotation is not returned by default as it is the transpose of the PC scores, which are returned in x. scale and center are set to FALSE.
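
For comparison, the slot names that prcomp itself uses can be inspected on any numeric data set. The snippet below is only a base-R illustration using the built-in USArrests data; it does not call rppca.

```r
# prcomp on a built-in data set, to show the slot names referred to above
p <- prcomp(USArrests)
names(p)      # slot names of a prcomp result
dim(p$x)      # PC scores: one row per observation, one column per PC
```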

Which method performs better depends on the number of PCs requested, whether centring is applied, and on the structure of the pedigree. As a rule of thumb, "rspec" is faster than the default when rank is 8 or greater.

Examples

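library(randPedPCA)  # rppca() and the example data pedLInv and pedMeta
library(pedigreemm)  # assumed source of pedigree(); skip if already attached

# PCA starting from an L-inverse matrix
pc <- rppca(pedLInv)

# PCA starting from a pedigree object built from sire, dam and ID columns
ped <- pedigree(sire = pedMeta$fid,
                dam = pedMeta$mid,
                label = pedMeta$id)
pc2 <- rppca(ped)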
