Learn R Programming

matrixCorr

matrixCorr computes correlation and related association matrices from small to high-dimensional data using simple, consistent functions and sensible defaults. It includes shrinkage and robust options for noisy or p >= n settings, plus convenient print/plot/summary methods. Performance-critical paths are implemented in C++ with BLAS/OpenMP and memory-aware symmetric updates. The API accepts base matrices and data frames and returns standard R objects via a consistent S3 interface.

Contributions from other researchers who want to add new correlation methods are very welcome. A central goal of matrixCorr is to keep efficient correlation and agreement estimation in one package with a common interface and consistent outputs, so methods can be extended, compared, and used without repeated translation across packages.

Supported measures include Pearson, Spearman, Kendall, distance correlation, partial correlation, robust biweight mid-correlation, percentage bend, Winsorized, skipped correlation, and latent categorical/ordinal correlations (tetrachoric, polychoric, polyserial, and biserial), plus repeated-measures correlation (rmcorr()); agreement tools cover Bland-Altman (two-method and repeated-measures) and Lin’s concordance correlation coefficient (including repeated-measures LMM/REML extensions).

Features

  • High-performance C++ backend using Rcpp
  • General correlations such as pearson_corr(), spearman_rho(), kendall_tau()
  • Robust correlation metrics (bicor(), pbcor(), wincor(), skipped_corr())
  • Distance correlation (dcor())
  • Partial correlation (pcorr())
  • Latent categorical/ordinal correlations (tetrachoric(), polychoric(), polyserial(), biserial())
  • Repeated-measures correlation (rmcorr())
  • Shrinkage for $p >> n$ (schafer_corr())
  • Agreement metrics
    • Bland-Altman (two-method ba() and repeated-measures ba_rm()),
    • Lin’s concordance correlation coefficient (pairwise ccc(), repeated-measures LMM/REML ccc_rm_reml() and non-parametric ccc_rm_ustat())
  • Interactive Shiny viewers for matrix-style outputs with a dedicated repeated-measures correlation viewer (view_rmcorr_shiny())

Installation

# Install from CRAN
install.packages("matrixCorr")

# Development version from GitHub
# install.packages("remotes")
remotes::install_github("Prof-ThiagoOliveira/matrixCorr")

Example

Correlation matrices (Pearson, Spearman, Kendall)

library(matrixCorr)

set.seed(1)
X <- as.data.frame(matrix(rnorm(300 * 6), ncol = 6))
names(X) <- paste0("V", 1:6)

R_pear <- pearson_corr(X)
R_spr  <- spearman_rho(X)
R_ken  <- kendall_tau(X)

print(R_pear, digits = 2)
#> Pearson correlation matrix: 
#>       V1    V2    V3    V4    V5    V6
#> V1  1.00  0.02  0.04 -0.02 -0.07  0.01
#> V2  0.02  1.00  0.04  0.03 -0.05  0.13
#> V3  0.04  0.04  1.00 -0.06  0.08 -0.14
#> V4 -0.02  0.03 -0.06  1.00  0.07  0.03
#> V5 -0.07 -0.05  0.08  0.07  1.00  0.04
#> V6  0.01  0.13 -0.14  0.03  0.04  1.00
summary(R_pear)
#> Correlation summary:
#>   class      : pearson_corr
#>   method     : pearson
#>   dimensions : 6 x 6
#>   n_variables: 6
#>   n_pairs    : 15
#>   symmetric  : yes
#>   missing    : 0
#>   min        : -0.1410
#>   max        : 0.1272
plot(R_spr)   # heatmap

Robust correlation (biweight mid-correlation)

set.seed(2)
Y <- X
# inject outliers
Y$V1[sample.int(nrow(Y), 8)] <- Y$V1[sample.int(nrow(Y), 8)] + 8

R_bicor <- bicor(Y)
print(R_bicor, digits = 2)
#> Biweight mid-correlation matrix (bicor): 
#>       V1    V2    V3    V4    V5    V6
#> V1  1.00  0.05  0.03 -0.02 -0.07  0.02
#> V2  0.05  1.00  0.05  0.03 -0.04  0.14
#> V3  0.03  0.05  1.00 -0.06  0.05 -0.16
#> V4 -0.02  0.03 -0.06  1.00  0.08  0.04
#> V5 -0.07 -0.04  0.05  0.08  1.00  0.05
#> V6  0.02  0.14 -0.16  0.04  0.05  1.00

Other robust correlations

R_pb   <- pbcor(Y)
R_win  <- wincor(Y)
R_skip <- skipped_corr(Y)

print(R_pb, digits = 2)
#> Percentage bend correlation matrix: 
#>       V1    V2    V3    V4    V5    V6
#> V1  1.00  0.03  0.02  0.01 -0.05  0.04
#> V2  0.03  1.00  0.06  0.02 -0.02  0.14
#> V3  0.02  0.06  1.00 -0.06  0.03 -0.16
#> V4  0.01  0.02 -0.06  1.00  0.07  0.05
#> V5 -0.05 -0.02  0.03  0.07  1.00  0.05
#> V6  0.04  0.14 -0.16  0.05  0.05  1.00

skipped_corr() is typically slower than pbcor() and wincor() because it performs pairwise bivariate outlier detection before computing the final correlation.

High-dimensional shrinkage correlation ($p >> n$)

set.seed(3)
n <- 60; p <- 200
Xd <- matrix(rnorm(n * p), n, p)
colnames(Xd) <- paste0("G", seq_len(p))

R_shr <- schafer_corr(Xd)
print(R_shr, digits = 2, max_rows = 6, max_cols = 6)
#> Schafer-Strimmer shrinkage correlation matrix: 
#>       G1    G2 G3 G4 G5 G6
#> G1  1.00 -0.01  0  0  0  0
#> G2 -0.01  1.00  0  0  0  0
#> G3  0.00  0.00  1  0  0  0
#> G4  0.00  0.00  0  1  0  0
#> G5  0.00  0.00  0  0  1  0
#> G6  0.00  0.00  0  0  0  1
#> ... omitted: 194 rows, 194 cols
summary(R_shr)
#> Correlation summary:
#>   class      : schafer_corr
#>   method     : schafer_shrinkage
#>   dimensions : 200 x 200
#>   n_variables: 200
#>   n_pairs    : 19900
#>   symmetric  : yes
#>   missing    : 0
#>   min        : -0.0144
#>   max        : 0.0130

Partial correlation matrix

R_part <- pcorr(X)
print(R_part, digits = 2)
#> Partial correlation (sample covariance)
#>       V1    V2    V3    V4    V5    V6
#> V1  1.00  0.02  0.05 -0.01 -0.07  0.02
#> V2  0.02  1.00  0.07  0.04 -0.06  0.14
#> V3  0.05  0.07  1.00 -0.06  0.10 -0.15
#> V4 -0.01  0.04 -0.06  1.00  0.08  0.02
#> V5 -0.07 -0.06  0.10  0.08  1.00  0.06
#> V6  0.02  0.14 -0.15  0.02  0.06  1.00
summary(R_part)
#> Correlation summary:
#>   class      : partial_corr
#>   method     : sample
#>   jitter     : 0
#>   dimensions : 6 x 6
#>   n_variables: 6
#>   n_pairs    : 15
#>   symmetric  : yes
#>   missing    : 0
#>   min        : -0.1518
#>   max        : 0.1368
plot(R_part)

pcorr() defaults to the sample partial correlation and also supports method = "oas", method = "ridge", and method = "glasso" for regularized estimation.

Distance correlation matrix

R_dcor <- dcor(X)
print(R_dcor, digits = 2)
#> Distance correlation (dCor) matrix: 
#>    V1   V2   V3 V4 V5   V6
#> V1  1 0.00 0.00  0  0 0.00
#> V2  0 1.00 0.00  0  0 0.02
#> V3  0 0.00 1.00  0  0 0.02
#> V4  0 0.00 0.00  1  0 0.00
#> V5  0 0.00 0.00  0  1 0.00
#> V6  0 0.02 0.02  0  0 1.00

dcor() uses an unbiased estimator with a fast univariate $O(n \log n)$ dispatch and an exact $O(n^2)$ fallback for robustness.

Latent categorical and ordinal correlations

set.seed(8)
n <- 800
Sigma_lat <- matrix(c(
  1.00, 0.55, 0.35, 0.20,
  0.55, 1.00, 0.40, 0.25,
  0.35, 0.40, 1.00, 0.45,
  0.20, 0.25, 0.45, 1.00
), 4, 4, byrow = TRUE)

Z <- matrix(rnorm(n * 4), n, 4) %*% chol(Sigma_lat)

X_bin <- data.frame(
  b1 = Z[, 1] > qnorm(0.70),
  b2 = Z[, 2] > qnorm(0.55),
  b3 = Z[, 3] > qnorm(0.50)
)

X_ord <- data.frame(
  o1 = ordered(cut(Z[, 2], breaks = c(-Inf, -0.5, 0.4, Inf),
                   labels = c("low", "mid", "high"))),
  o2 = ordered(cut(Z[, 3], breaks = c(-Inf, -1, 0, 1, Inf),
                   labels = c("1", "2", "3", "4")))
)

X_cont <- data.frame(x1 = Z[, 1], x2 = Z[, 4])

R_tet <- tetrachoric(X_bin)
R_pol <- polychoric(X_ord)
R_ps  <- polyserial(X_cont, X_ord)
R_bis <- biserial(X_cont, X_bin[, 1:2])

print(R_tet, digits = 2)
#> Tetrachoric correlation matrix: 
#>      b1   b2   b3
#> b1 1.00 0.55 0.39
#> b2 0.55 1.00 0.33
#> b3 0.39 0.33 1.00
summary(R_ps)
#> Latent correlation summary:
#>   class      : polyserial_corr
#>   method     : polyserial
#>   dimensions : 2 x 2
#>   n_pairs    : 4
#>   symmetric  : no
#>   missing    : 0
#>   min        : 0.2298
#>   max        : 0.5523
plot(R_pol)

Repeated-measures correlation

set.seed(2026)
n_subjects <- 20
n_rep <- 4
subject <- rep(seq_len(n_subjects), each = n_rep)
subj_eff_x <- rnorm(n_subjects, sd = 1.5)
subj_eff_y <- rnorm(n_subjects, sd = 2.0)
within_signal <- rnorm(n_subjects * n_rep)

dat_rm <- data.frame(
  subject = subject,
  x = subj_eff_x[subject] + within_signal + rnorm(n_subjects * n_rep, sd = 0.2),
  y = subj_eff_y[subject] + 0.8 * within_signal + rnorm(n_subjects * n_rep, sd = 0.3),
  z = subj_eff_y[subject] - 0.4 * within_signal + rnorm(n_subjects * n_rep, sd = 0.4)
)

fit_xy <- rmcorr(dat_rm, response = c("x", "y"), subject = "subject")
fit_mat <- rmcorr(dat_rm, response = c("x", "y", "z"), subject = "subject")

print(fit_xy)
#> Repeated-measures correlation:
#>   responses : x vs y
#>   based.on  : 80
#>   subjects  : 20
#>   df        : 59
#>   r         : 0.9448
#>   slope     : 0.7991
#>   p_value   : 2.769e-30
#>   CI 95.0%   : [0.9094, 0.9667]
print(fit_mat, digits = 2)
#> Repeated-measures correlation matrix: 
#>       x     y     z
#> x  1.00  0.94 -0.63
#> y  0.94  1.00 -0.62
#> z -0.63 -0.62  1.00
plot(fit_xy)
plot(fit_mat)

# Dedicated Shiny viewer for repeated-measures correlation matrices
# if (interactive() && requireNamespace("shiny", quietly = TRUE)) {
#   fit_mat_view <- rmcorr(
#     dat_rm,
#     response = c("x", "y", "z"),
#     subject = "subject",
#     keep_data = TRUE
#   )
#   view_rmcorr_shiny(fit_mat_view)
# }

Agreement analyses

Here we look at agreement analyses for Bland-Altman and repeated-measures concordance workflows.

Two-method Bland-Altman

set.seed(4)
x <- rnorm(120, 100, 10)
y <- x + 0.5 + rnorm(120, 0, 8)

ba_fit <- ba(x, y)
print(ba_fit)
#> Bland-Altman (n = 120) - LoA = mean +/- 1.96 * SD, 95% CI
#> 
#>  quantity        estimate lwr     upr    
#>  Mean difference 0.001    -1.450  1.453  
#>  Lower LoA       -15.741  -18.255 -13.226
#>  Upper LoA       15.743   13.229  18.258 
#> 
#> SD(differences): 8.032   LoA width: 31.484
summary(ba_fit)
#> Bland-Altman (two methods), 95% CI
#> 
#> Agreement estimates
#> 
#>  n   bias  sd_loa loa_low loa_up width  loa_multiplier
#>  120 0.001 8.032  -15.741 15.743 31.484 1.96          
#> 
#> Confidence intervals
#> 
#>  bias_lwr bias_upr lo_lwr  lo_upr  up_lwr up_upr
#>  -1.45    1.453    -18.255 -13.226 13.229 18.258
plot(ba_fit)

Repeated-measures Bland-Altman (pairwise matrix)

set.seed(5)
S <- 20; Tm <- 6
subj  <- rep(seq_len(S), each = Tm)
time  <- rep(seq_len(Tm), times = S)

true  <- rnorm(S, 50, 6)[subj] + (time - mean(time)) * 0.4
mA    <- true + rnorm(length(true), 0, 2)
mB    <- true + 1.0 + rnorm(length(true), 0, 2.2)
mC    <- 0.95 * true + rnorm(length(true), 0, 2.5)

dat <- rbind(
  data.frame(y = mA, subject = subj, method = "A", time = time),
  data.frame(y = mB, subject = subj, method = "B", time = time),
  data.frame(y = mC, subject = subj, method = "C", time = time)
)
dat$method <- factor(dat$method, levels = c("A","B","C"))

ba_rep <- ba_rm(
  data = dat, response = "y", subject = "subject",
  method = "method", time = "time",
  include_slope = FALSE, use_ar1 = FALSE
)
print(ba_rep)
#> Bland-Altman (row − column), 95% CI
#> 
#>  method1 method2 bias   sd_loa loa_low loa_up width  n  
#>  A       B        0.875 3.021  -5.045  6.796  11.841 120
#>  A       C       -2.609 3.156  -8.795  3.577  12.372 120
#>  B       C       -3.484 3.167  -9.692  2.724  12.416 120
summary(ba_rep)
#> Bland-Altman (pairwise), 95% CI
#> 
#> Agreement estimates
#> 
#>  method1 method2 n   bias   sd_loa loa_low loa_up width 
#>  A       B       120  0.875 3.021  -5.045  6.796  11.841
#>  A       C       120 -2.609 3.156  -8.795  3.577  12.372
#>  B       C       120 -3.484 3.167  -9.692  2.724  12.416
#> 
#> Confidence intervals
#> 
#>  bias_lwr bias_upr lo_lwr  lo_upr up_lwr up_upr
#>   0.335    1.416    -5.996 -4.094 5.845  7.747 
#>  -3.173   -2.044   -10.313 -7.276 2.059  5.095 
#>  -4.051   -2.918   -10.259 -9.126 2.157  3.291 
#> 
#> Model details
#> 
#>  sigma2_subject sigma2_resid residual_model
#>  0               9.124       iid           
#>  0               9.960       iid           
#>  0              10.032       iid
plot(ba_rep)

Lin’s concordance correlation coefficient

set.seed(6)
x <- rnorm(80, 100, 8)
y <- x + 0.4 + rnorm(80, 0, 3)

fit_ccc <- ccc(data.frame(A = x, B = y), ci = TRUE)
print(fit_ccc)
#> Concordance pairs (Lin's CCC, 95% CI)
#> 
#>  method1 method2 estimate lwr    upr   
#>  A       B       0.9262   0.8877 0.9519
summary(fit_ccc)
#> Concordance pairs (Lin's CCC, 95% CI)
#> 
#>  method1 method2 estimate lwr  upr 
#>  A       B       0.9262   0.89 0.95
plot(fit_ccc)

Use ccc_rm_ustat() when you have repeated measurements on the same subjects across methods and want a direct non-parametric repeated-measures CCC. Use ccc_rm_reml() when you want a model-based estimate from variance components, especially when subject effects, time effects, or within-subject correlation need to be modeled explicitly.

Repeated-measures Lin’s concordance correlation coefficient (LMM/REML)

set.seed(6)
S <- 30; Tm <- 8
id     <- factor(rep(seq_len(S), each = 2 * Tm))
method <- factor(rep(rep(c("A","B"), each = Tm), times = S))
time   <- rep(rep(seq_len(Tm), times = 2), times = S)

u  <- rnorm(S, 0, 0.8)[as.integer(id)]
g  <- rnorm(S * Tm, 0, 0.5)
g  <- g[(as.integer(id) - 1L) * Tm + as.integer(time)]
y  <- (method == "B") * 0.3 + u + g + rnorm(length(id), 0, 0.7)

dat_ccc <- data.frame(y, id, method, time)
fit_ccc_rm <- ccc_rm_reml(dat_ccc, response = "y", rind = "id",
                          method = "method", time = "time")
print(fit_ccc_rm)
#> Concordance pairs (Lin's CCC)
#> 
#>  method1 method2 estimate
#>  A       B       0.8987
summary(fit_ccc_rm)  # overall CCC, variance components, SEs/CI
#> Repeated-measures concordance (REML)
#> 
#> Concordance estimates
#> 
#>  method1 method2 estimate SB     se_ccc
#>  A       B       0.8987   0.0427 0.0172
#> 
#> Variance components
#> 
#>  sigma2_subject sigma2_subject_method sigma2_subject_time sigma2_error
#>  0.8149         0                     0.2965              0.4267      
#> 
#> AR(1) diagnostics
#> 
#>  ar1_rho ar1_rho_lag1 ar1_rho_mom ar1_pairs ar1_pval use_ar1 ar1_recommend
#>  -0.0533 -0.0533      -0.0533     420       0.2747   FALSE   FALSE
plot(fit_ccc_rm)

Contributing

Issues and pull requests are welcome. Please see CONTRIBUTING.md for guidelines and cran-comments.md/DESCRIPTION for package metadata.

License

MIT Thiago de Paula Oliveira

See inst/LICENSE for the full MIT license text.

Copy Link

Version

Install

install.packages('matrixCorr')

Monthly Downloads

176

Version

0.10.0

License

MIT + file LICENSE

Issues

Pull Requests

Stars

Forks

Maintainer

Thiago de Paula Oliveira

Last Published

April 3rd, 2026

Functions in matrixCorr (0.10.0)

build_LDZ

build_LDZ
bicor

Biweight mid-correlation (bicor)
biserial

Biserial Correlation Between Continuous and Binary Variables
ccc_rm_reml

Concordance Correlation via REML (Linear Mixed-Effects Model)
ba

Bland-Altman statistics with confidence intervals
abort_bad_arg

Abort with a standardised argument error
ccc

Pairwise Lin's concordance correlation coefficient
ba_rm

Bland-Altman for repeated measurements
ccc_lmm_reml_pairwise

ccc_lmm_reml_pairwise
abort_internal

Abort for internal errors (should not happen)
check_matrix_dims

Check matrix dimensions
check_prob_scalar

Check probability in unit interval
ccc_rm_ustat

Repeated-Measures Lin's Concordance Correlation Coefficient (CCC)
check_required_cols

Check that a data frame contains required columns
check_ar1_rho

Check AR(1) correlation parameter
check_same_length

Check that two vectors have the same length
deprecated-matrixCorr

Deprecated Compatibility Wrappers
check_bool

Check a single logical flag
check_inherits

Check object class
dcor

Pairwise Distance Correlation (dCor)
compute_ci_from_se

compute_ci_from_se
.vc_message

.vc_message
.ba_rep_two_methods

two-method helper
check_scalar_character

Check scalar character (non-empty)
check_scalar_int_pos

Check strictly positive scalar integer
.align_weights_to_levels

Align (optional named) weights to a subset of time levels
matrixCorr-package

matrixCorr
kendall_tau

Pairwise (or Two-Vector) Kendall's Tau Rank Correlation
check_weights

Check weights vector (non-negative, finite, correct length)
check_symmetric_matrix

Check matrix symmetry
match_arg

Match argument to allowed values
matrixCorr-internal

Validation and normalisation for correlation
check_scalar_numeric

Check scalar numeric (with optional bounds)
check_scalar_nonneg

Check a non-negative scalar (>= 0)
estimate_rho

estimate_rho
inform_if_verbose

Inform only when verbose
num_or_na_vec

num_or_na_vec
pearson_corr

Pairwise Pearson correlation
print.matrixCorr_ccc_ci

Print method for matrixCorr CCC objects with CIs
num_or_na

num_or_na
polychoric

Pairwise Polychoric Correlation
print.matrixCorr_ccc

Print method for matrixCorr CCC objects
print.summary_corr_matrix

Summary Method for Correlation Matrices
spearman_rho

Pairwise Spearman's rank correlation
print.summary_latent_corr

Summary Method for Latent Correlation Matrices
run_cpp

run_cpp
print.rmcorr

Methods for Pairwise rmcorr Objects
view_rmcorr_shiny

Interactive Shiny Viewer for Repeated-Measures Correlation
wincor

Pairwise Winsorized correlation
summary.ccc_rm_reml

Summary Method for ccc_rm_reml Objects
print.rmcorr_matrix

Methods for rmcorr Matrix Objects
rmcorr

Repeated-Measures Correlation (rmcorr)
skipped_corr

Pairwise skipped correlation
schafer_corr

Schafer-Strimmer shrinkage correlation
pbcor

Percentage bend correlation
polyserial

Polyserial Correlation Between Continuous and Ordinal Variables
view_corr_shiny

Interactive Shiny viewer for matrixCorr objects
tetrachoric

Pairwise Tetrachoric Correlation
print.ccc_ci

S3 print for legacy class ccc_ci
pcorr

Partial correlation matrix (sample / ridge / OAS / graphical lasso)