
matrixCorr (version 0.10.0)

schafer_corr: Schafer-Strimmer shrinkage correlation

Description

Computes a Schafer-Strimmer shrinkage correlation matrix for numeric data using a high-performance 'C++' backend. This stabilises Pearson correlation estimates by shrinking off-diagonal entries towards zero.

Usage

schafer_corr(data)

# S3 method for schafer_corr
print(
  x,
  digits = 4,
  n = NULL,
  topn = NULL,
  max_vars = NULL,
  width = NULL,
  show_ci = NULL,
  ...
)

# S3 method for schafer_corr
plot(
  x,
  title = "Schafer-Strimmer shrinkage correlation",
  cluster = TRUE,
  hclust_method = "complete",
  triangle = c("upper", "lower", "full"),
  show_value = TRUE,
  show_values = NULL,
  value_text_limit = 60,
  value_text_size = 3,
  palette = c("diverging", "viridis"),
  ...
)

# S3 method for schafer_corr
summary(
  object,
  n = NULL,
  topn = NULL,
  max_vars = NULL,
  width = NULL,
  show_ci = NULL,
  ...
)

Value

A symmetric numeric matrix of class schafer_corr where entry (i, j) is the shrunk correlation between the i-th and j-th numeric columns. Attributes:

  • method = "schafer_shrinkage"

  • description = "Schafer-Strimmer shrinkage correlation matrix"

  • package = "matrixCorr"

For any column with zero variance, the corresponding row and column of the result are set to NA (including the diagonal entry), matching pearson_corr() behaviour.

print(): invisibly returns x.

plot(): a ggplot object.

Arguments

data

A numeric matrix or a data frame with at least two numeric columns. Non-numeric columns are excluded; the remaining numeric columns must contain no NAs.

x

An object of class schafer_corr.

digits

Integer; number of decimal places to print.

n

Optional row threshold for compact preview output.

topn

Optional number of leading/trailing rows to show when truncated.

max_vars

Optional maximum number of visible columns; NULL derives this from console width.

width

Optional display width; defaults to getOption("width").

show_ci

One of "yes" or "no".

...

Additional arguments passed to ggplot2::theme().

title

Plot title.

cluster

Logical; if TRUE, reorder rows/cols by hierarchical clustering on distance \(1 - r\).

hclust_method

Linkage method for hclust; default "complete".

triangle

One of "full", "upper", "lower". Defaults to "upper".

show_value

Logical; if TRUE (default), overlay numeric values on the heatmap tiles (subject to value_text_limit).

show_values

Deprecated compatibility alias for show_value. If supplied, it overrides show_value.

value_text_limit

Integer threshold controlling when values are drawn.

value_text_size

Font size for values if shown.

palette

Character; "diverging" (default) or "viridis".

object

An object of class schafer_corr.
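
As an illustration of the cluster and hclust_method arguments, the reordering by hierarchical clustering on distance \(1 - r\) can be sketched in base R. This is only a sketch under the assumption that the plot method does something equivalent; the toy matrix and the names R_toy, ord and R_ordered are made up for the example and are not part of matrixCorr.

```r
# Toy correlation matrix (values invented purely for illustration)
vars  <- c("a", "b", "c", "d")
R_toy <- matrix(
  c(1.0, 0.1, 0.8, 0.2,
    0.1, 1.0, 0.2, 0.7,
    0.8, 0.2, 1.0, 0.1,
    0.2, 0.7, 0.1, 1.0),
  nrow = 4, byrow = TRUE,
  dimnames = list(vars, vars)
)

# Convert correlations to distances 1 - r and cluster with complete linkage
d   <- stats::as.dist(1 - R_toy)
ord <- stats::hclust(d, method = "complete")$order

# Apply the same permutation to rows and columns so the matrix stays symmetric
R_ordered <- R_toy[ord, ord]
R_ordered
```

Applying one permutation to both rows and columns keeps the matrix symmetric and places strongly correlated variables next to each other in the heatmap.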

Author

Thiago de Paula Oliveira

Details

Let \(R\) be the sample Pearson correlation matrix computed from \(n\) observations. The Schafer-Strimmer shrinkage estimator targets the identity matrix \(I\) in correlation space and uses the shrinkage intensity \(\hat\lambda = \frac{\sum_{i<j}\widehat{\mathrm{Var}}(r_{ij})}{\sum_{i<j} r_{ij}^2}\) (clamped to \([0,1]\)), where \(\widehat{\mathrm{Var}}(r_{ij}) \approx \frac{(1-r_{ij}^2)^2}{n-1}\). The returned estimator is \(R_{\mathrm{shr}} = (1-\hat\lambda)R + \hat\lambda I\), so the diagonal stays at 1 and every off-diagonal entry is scaled by \(1-\hat\lambda\).
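
The estimator described above can be written out in a few lines of base R. The following is a minimal sketch of the stated formula only, not the package's 'C++' backend: shrink_corr_sketch is a hypothetical helper name, the fallback of \(\hat\lambda = 1\) when all sample correlations are exactly zero is an assumption, and the zero-variance handling described under Value is not reproduced.

```r
# Hypothetical helper illustrating the formula; not part of matrixCorr
shrink_corr_sketch <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)                          # number of observations
  R <- stats::cor(X)                    # sample Pearson correlation matrix
  off <- upper.tri(R, diag = FALSE)     # index of pairs with i < j
  r <- R[off]

  var_r <- (1 - r^2)^2 / (n - 1)        # approximate Var(r_ij)
  denom <- sum(r^2)
  # Assumption: shrink fully when every sample correlation is exactly zero
  lambda <- if (denom > 0) sum(var_r) / denom else 1
  lambda <- min(1, max(0, lambda))      # clamp to [0, 1]

  # Convex combination of R and the identity target
  R_shr <- (1 - lambda) * R + lambda * diag(ncol(X))
  list(lambda = lambda, R = R_shr)
}

set.seed(1)
X_demo <- matrix(rnorm(30 * 5), nrow = 30, ncol = 5)
fit <- shrink_corr_sketch(X_demo)
fit$lambda
round(fit$R, 3)
```

Because the same \(\hat\lambda\) is applied to every pair, the shrunk matrix keeps the sign and ordering of the off-diagonal sample correlations and only reduces their magnitude.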

References

Schäfer, J. & Strimmer, K. (2005). A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4(1), Article 32.

See Also

print.schafer_corr, plot.schafer_corr, pearson_corr

Examples

## Multivariate normal with AR(1) dependence (Toeplitz correlation)
set.seed(1)
n <- 80; p <- 40; rho <- 0.6
d <- abs(outer(seq_len(p), seq_len(p), "-"))
Sigma <- rho^d

X <- MASS::mvrnorm(n, mu = rep(0, p), Sigma = Sigma)
colnames(X) <- paste0("V", seq_len(p))

Rshr <- schafer_corr(X)
print(Rshr, digits = 2, n = 6, max_vars = 6)
summary(Rshr)
plot(Rshr)

## Shrinkage typically moves the sample correlation closer to the truth
Rraw <- stats::cor(X)
off  <- upper.tri(Sigma, diag = FALSE)
mae_raw <- mean(abs(Rraw[off] - Sigma[off]))
mae_shr <- mean(abs(Rshr[off] - Sigma[off]))
print(c(MAE_raw = mae_raw, MAE_shrunk = mae_shr))
plot(Rshr, title = "Schafer-Strimmer shrinkage correlation")

# Interactive viewing (requires shiny)
if (interactive() && requireNamespace("shiny", quietly = TRUE)) {
  view_corr_shiny(Rshr)
}
