⚠️ There's a newer version (0.1.6.4) of this package.

Fast Probabilistic Whitening Transformation for Ultra-High Dimensional Data

Data whitening is a widely used preprocessing step that removes correlation structure, since statistical models often assume independence (Kessy et al. 2018). The typical procedure transforms the observed data by the inverse square root of the sample covariance matrix (Figure 1). For low-dimensional data (i.e. $n > p$), this produces transformed data with an identity sample covariance matrix. This procedure assumes either that the true covariance matrix is known, or that it is well estimated by the sample covariance matrix. Yet using the sample covariance matrix for this transformation can be problematic since 1) the computational complexity is $\mathcal{O}(p^3)$ and 2) it is not applicable to the high-dimensional (i.e. $n \ll p$) case, where the sample covariance matrix is no longer full rank.
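The classical procedure described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the simulated AR(1)-style data, the variable names, and the sizes are made up for the example, and no decorrelate functions are used.

```r
# Classical whitening with base R only (illustrative sketch).
set.seed(1)
n <- 200  # samples
p <- 5    # features, so n > p

# Simulate correlated features from an AR(1)-style correlation matrix
Sigma <- 0.8^abs(outer(1:p, 1:p, "-"))
X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)

# Eigen-decomposition of the sample covariance: S = V diag(d) V^T
S <- cov(X)
eig <- eigen(S, symmetric = TRUE)

# Symmetric inverse square root: W = V diag(d^(-1/2)) V^T
W <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)

# Center the data, then whiten
X_white <- scale(X, center = TRUE, scale = FALSE) %*% W

# Largest deviation of the whitened sample covariance from the identity
max_dev <- max(abs(cov(X_white) - diag(p)))

# With n < p the sample covariance is rank deficient, so W is undefined:
# count the eigenvalues that are numerically nonzero (at most n - 1)
n_wide <- 10
p_wide <- 50
X_wide <- matrix(rnorm(n_wide * p_wide), n_wide, p_wide)
ev_wide <- eigen(cov(X_wide), symmetric = TRUE, only.values = TRUE)$values
n_nonzero <- sum(ev_wide > 1e-8)
```

Here `max_dev` should be zero up to rounding error, while `n_nonzero` is smaller than `p_wide`, which is why the inverse square root cannot be formed in the high-dimensional case.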

Here we use a probabilistic model of the observed data to apply a whitening transformation. Our Gaussian Inverse Wishart Empirical Bayes (GIW-EB) model 1) substantially reduces computational complexity, and 2) regularizes the eigenvalues of the sample covariance matrix to improve out-of-sample performance.
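To make the two benefits concrete, here is a minimal base R sketch of whitening with a generic linear shrinkage of the eigenvalues, computed from a thin SVD so that no $p \times p$ matrix is formed. It is not the package's implementation: the shrinkage weight `lambda` is a hypothetical value fixed by hand (the GIW-EB model estimates its regularization by empirical Bayes), and the helper `whiten_shrunk` exists only for this example.

```r
# Generic linear-shrinkage whitening for n < p (illustrative sketch,
# NOT the decorrelate implementation).
set.seed(1)
n <- 20
p <- 100
lambda <- 0.2  # hypothetical shrinkage weight in (0, 1], fixed by hand

X_raw <- matrix(rnorm(n * p), n, p)
mu <- colMeans(X_raw)
X <- sweep(X_raw, 2, mu)  # centered training data

# Thin SVD of the centered data: cost O(n^2 p) rather than O(p^3)
sv <- svd(X)
keep <- sv$d > 1e-8 * sv$d[1]   # centering leaves rank at most n - 1
V <- sv$v[, keep, drop = FALSE]
ev <- sv$d[keep]^2 / (n - 1)    # nonzero eigenvalues of cov(X_raw)
nu <- sum(ev) / p               # average eigenvalue, used as shrinkage target

# Shrunk covariance: (1 - lambda) * S + lambda * nu * I.
# Eigenvalues within span(V) become ev_shrunk; all others become lambda * nu.
ev_shrunk <- (1 - lambda) * ev + lambda * nu

# Apply the inverse square root of the shrunk covariance to the rows of Z
whiten_shrunk <- function(Z) {
  Zc <- sweep(Z, 2, mu)         # center with the training means
  ZV <- Zc %*% V                # coordinates within span(V)
  ZV %*% (t(V) / sqrt(ev_shrunk)) + (Zc - ZV %*% t(V)) / sqrt(lambda * nu)
}

X_white <- whiten_shrunk(X_raw)                          # in-sample
X_new_white <- whiten_shrunk(matrix(rnorm(5 * p), 5, p)) # out-of-sample
```

Because every eigenvalue of the shrunk covariance is at least `lambda * nu`, the transformation stays well defined even though the sample covariance itself is rank deficient, and new observations with components outside the span of the training data are still handled.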

Installation

devtools::install_github("GabrielHoffman/decorrelate")

Install

install.packages('decorrelate')

Version

0.1.6.3

License

Artistic-2.0

Maintainer

Gabriel E Hoffman

Last Published

July 17th, 2025

Functions in decorrelate (0.1.6.3)

getCov: Get full covariance/correlation matrix from eclairs
getWhiteningMatrix: Get whitening matrix
mahalanobisDistance: Mahalanobis Distance
lm_eclairs: Fit linear model after decorrelating
fastcca: Fast canonical correlation analysis
optimal_SVHT_coef: Optimal Hard Threshold for Singular Values
logDet: Evaluate the log determinant
rmvnorm_eclairs: Draw from multivariate normal and t distributions
averageCorr: Summarize correlation matrix
quadForm: Evaluate quadratic form
plot,eclairs-method: Plot eclairs object
sv_threshold: Singular value thresholding
whiten: Decorrelation projection + eclairs
reform_decomp: Recompute eclairs after dropping features
dmult: Multiply by diagonal matrix
eclairs-class: Class eclairs
autocorr.mat: Create auto-correlation matrix
cca: Canonical correlation analysis
eclairs_sq: Compute eclairs decomp of squared correlation matrix
cov_transform: Estimate covariance matrix after applying transformation
eclairs_corMat: Estimate covariance/correlation with low rank and shrinkage
decorrelate: Decorrelation projection
eclairs: Estimate covariance/correlation with low rank and shrinkage
fastcca-class: Class fastcca
getShrinkageParams: Estimate shrinkage parameter by empirical Bayes
kappa,eclairs-method: Compute condition number
lm_each_eclairs: Fit linear model on each feature after decorrelating
mult_eclairs: Multiply by eclairs matrix