Learn R Programming

decorrelate (version 0.1.6.3)

eclairs_sq: Compute eclairs decomp of squared correlation matrix

Description

Given the eclairs decomp of C, compute the eclairs decomp of C^2

Usage

eclairs_sq(ecl, rank1 = ecl$k, rank2 = Inf, varianceFraction = 1)

Value

compute the eclairs of C^2

Arguments

ecl

estimate of correlation matrix from eclairs() storing \(U\), \(d_1^2\), \(\lambda\) and \(\nu\)

rank1

use the first 'rank' singular vectors from the SVD. Using increasing rank1 will increase the accuracy of the estimation. But note that the computationaly complexity is O(P choose(rank, 2)), where P is the number of features in the dataset

rank2

rank of eclairs() decomposition returned

varianceFraction

fraction of variance to retain after truncating trailing eigen values

Details

Consider a data matrix \(X_{N x P}\) of \(P\) features and \(N\) samples where \(N << P\). Let the columns of X be scaled so that \(C_{P x P} = XX^T\). C is often too big to compute directly since it is O(P^2) and O(P^3) to invert. But we can compute the SVD of X in O(PN^2). The goal is to compute the SVD of the matrix C^2, given only the SVD of C in less than \(O(P^2)\) time. Here we compute this SVD of C^2 in \(O(PN^4)\) time, which is tractible for small N. Moreover, if we use an SVD X = UDV^T with of rank R, we can approximate the SVD of C^2 in O(PR^4) using only D and V

Examples

Run this code
# Compute correlations directly and using eclairs decomp

n <- 600 # number of samples
p <- 100 # number of features

# create correlation matrix
Sigma <- autocorr.mat(p, .9)

# draw data from correlation matrix Sigma
Y <- Rfast::rmvnorm(n, rep(0, p), sigma = Sigma, seed = 1)
rownames(Y) <- paste0("sample_", seq(n))
colnames(Y) <- paste0("gene_", seq(p))

# correlation computed directly
C <- cor(Y)

# correlation from eclairs decomposition
ecl <- eclairs(Y, compute = "cor")
C.eclairs <- getCor(ecl, lambda = 0)

all.equal(C, C.eclairs)

# Correlation of Y^2
#-------------------

# exact quadratic way
C <- 2 * cor(Y)^2

# faster low rank
ecl2 <- eclairs_sq(ecl)
C.eclairs <- 2 * getCor(ecl2, lambda = 0)

all.equal(C.eclairs, C)

Run the code above in your browser using DataLab