ccc_pairwise_u_stat: Repeated-Measures Lin's Concordance Correlation Coefficient (CCC)

Description

Computes all pairwise Lin's Concordance Correlation Coefficients (CCC) across multiple methods (L $\geq$ 2) for repeated-measures data. Each subject must be measured by all methods across the same set of time points or replicates.

CCC measures both accuracy (how close measurements are to the line of equality) and precision (Pearson correlation). Confidence intervals are optionally computed using a U-statistics-based estimator with Fisher's Z transformation.

Usage

ccc_pairwise_u_stat(
  data,
  response,
  method,
  subject,
  time = NULL,
  Dmat = NULL,
  delta = 1,
  ci = FALSE,
  conf_level = 0.95,
  n_threads = getOption("matrixCorr.threads", 1L),
  verbose = FALSE
)

Value

If ci = FALSE, a symmetric matrix of class "ccc" (estimates only). If ci = TRUE, a list with classes "ccc" and "ccc_ci" and elements:

est: CCC estimate matrix
lwr.ci: Lower bound matrix
upr.ci: Upper bound matrix

Arguments

data

A data frame containing the repeated-measures dataset.

response

Character. Name of the numeric outcome column.

method

Character. Name of the method column (factor with L $\geq$ 2 levels).

subject

Character. Column identifying subjects. Every subject must have measurements from all methods (and times, when supplied); rows with incomplete {subject, time, method} coverage are dropped per pair.

time

Character or NULL. Name of the time/repetition column. If NULL, one time point is assumed.

Dmat

Optional numeric weight matrix (T $\times$ T) for timepoints. Defaults to identity.

delta

Numeric. Power exponent used in the distance computations between method trajectories across time points. This controls the contribution of differences between measurements:

delta = 1 (default) uses absolute differences.
delta = 2 uses squared differences, more sensitive to larger deviations.
delta = 0 reduces to a binary distance (presence/absence of disagreement), analogous to a repeated-measures version of the kappa statistic.

The choice of delta should reflect the penalty you want to assign to measurement disagreement.
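
As a rough illustration of how the exponent changes the penalty, the sketch below applies the three settings to a small vector of differences. The helper power_distance() is hypothetical and not part of the package; in particular, treating delta = 0 as an explicit "differs or not" indicator is an assumption based on the description above (in R, 0^0 evaluates to 1, so the indicator is written out directly).

```r
# Hypothetical helper, for illustration only: penalty applied to a
# difference d between two measurements for a given delta.
power_distance <- function(d, delta) {
  if (delta == 0) {
    # Assumed binary distance: 1 if the measurements differ, 0 otherwise.
    as.numeric(d != 0)
  } else {
    abs(d)^delta
  }
}

d <- c(0, 0.5, -2)          # example differences between two methods
power_distance(d, delta = 1) # absolute differences: 0, 0.5, 2
power_distance(d, delta = 2) # squared differences:  0, 0.25, 4
power_distance(d, delta = 0) # disagreement flags:   0, 1, 1
```

With delta = 2 the largest difference dominates (4 versus 0.25), whereas delta = 0 treats every non-zero difference alike.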

ci

Logical. If TRUE, returns confidence intervals (default FALSE).

conf_level

Confidence level for CI (default 0.95).

n_threads

Integer ($\geq$ 1). Number of OpenMP threads to use for computation. Defaults to getOption("matrixCorr.threads", 1L).

verbose

Logical. If TRUE, prints diagnostic output (default FALSE).

Author

Thiago de Paula Oliveira

Details

This function computes pairwise Lin's Concordance Correlation Coefficient (CCC) between methods in a repeated-measures design using a U-statistics-based nonparametric estimator proposed by Carrasco et al. (2013). It is computationally efficient and robust, particularly for large-scale or balanced longitudinal designs.

Lin's CCC is defined as $$ \rho_c = \frac{2 \cdot \mathrm{cov}(X, Y)}{\sigma_X^2 + \sigma_Y^2 + (\mu_X - \mu_Y)^2} $$ where:

$X$ and $Y$ are paired measurements from two methods.
$\mu_X$, $\mu_Y$ are means, and $\sigma_X^2$, $\sigma_Y^2$ are variances.
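
To make the definition concrete, here is a minimal sketch of the sample CCC for two paired vectors with a single measurement per subject. The helper lin_ccc() is hypothetical and is not the estimator used by ccc_pairwise_u_stat; it simply plugs sample moments (with 1/n scaling) into the formula above.

```r
# Hypothetical helper: plug-in sample CCC for two paired numeric vectors.
lin_ccc <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 1)
  n <- length(x)
  # var() and cov() divide by (n - 1); rescale to 1/n moment estimators.
  k <- (n - 1) / n
  s_xy <- k * cov(x, y)
  s_xx <- k * var(x)
  s_yy <- k * var(y)
  2 * s_xy / (s_xx + s_yy + (mean(x) - mean(y))^2)
}

x <- c(1, 2, 3, 4, 5)
y <- x + 0.5      # perfectly correlated with x, but shifted by 0.5

cor(x, y)         # precision alone is perfect: 1
lin_ccc(x, y)     # 4 / 4.25, about 0.941: the shift reduces concordance
```

The example shows why CCC is stricter than Pearson correlation: a constant offset leaves the correlation at 1 but enters the denominator through $(\mu_X - \mu_Y)^2$.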

U-statistics Estimation

For repeated measures across $T$ time points and $n$ subjects:

All $n(n-1)$ ordered pairs of distinct subjects are used to compute U-statistic estimators of the within-method and cross-method distances.
If delta > 0, pairwise distances are raised to the power delta before the time-weighting kernel matrix $D$ is applied.
If delta = 0, the method reduces to a version similar to a repeated-measures kappa.
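
The pairing idea can be sketched for the simplest case: one time point and squared differences (delta = 2). For that case the CCC can be written as one minus the ratio of the observed disagreement $E[(X - Y)^2]$ to the disagreement expected if $X$ and $Y$ were independent, and the latter can be estimated by pairing the measurement of subject $i$ from one method with that of subject $j \neq i$ from the other. The helper ccc_u_sketch() below is hypothetical; it omits the time weighting by $D$ and is not claimed to match the package's internal implementation.

```r
# Hypothetical helper: U-statistic style CCC, single time point, delta = 2.
ccc_u_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 1)
  n <- length(x)
  # Observed disagreement: squared differences within the same subject.
  observed <- mean((x - y)^2)
  # Disagreement under independence: x from subject i against y from
  # subject j, averaged over all n * (n - 1) ordered pairs with i != j.
  d2 <- outer(x, y, function(a, b) (a - b)^2)
  expected <- (sum(d2) - sum(diag(d2))) / (n * (n - 1))
  1 - observed / expected
}

set.seed(42)
x <- rnorm(50, mean = 10, sd = 2)
y <- x + rnorm(50, mean = 0.3, sd = 0.5)  # second method: small bias plus noise
ccc_u_sketch(x, y)
```

Building the full n-by-n matrix with outer() keeps the sketch short but costs memory quadratic in the number of subjects; it is meant only to show which pairs enter the estimator.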

Confidence Intervals

Confidence intervals are constructed using a Fisher Z-transformation of the CCC. Specifically,

The CCC is transformed using $Z = 0.5 \log((1 + \rho_c) / (1 - \rho_c))$.
Standard errors are computed from the asymptotic variance of the U-statistic.
Normal-based intervals are computed on the Z-scale and then back-transformed to the CCC scale.
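
The three steps can be sketched as follows, assuming an estimate and its standard error on the CCC scale are already available. The helper fisher_z_ci() and the input values are hypothetical; carrying the standard error to the Z-scale with the delta method (dividing by $1 - \rho_c^2$) is an assumption about how the steps fit together, not a statement about the package's internal code.

```r
# Hypothetical helper: Fisher-Z interval from an estimate and its standard error.
fisher_z_ci <- function(est, se, conf_level = 0.95) {
  stopifnot(abs(est) < 1, se > 0, conf_level > 0, conf_level < 1)
  z <- atanh(est)                 # equals 0.5 * log((1 + est) / (1 - est))
  se_z <- se / (1 - est^2)        # delta method: dZ/d(rho) = 1 / (1 - rho^2)
  crit <- qnorm(1 - (1 - conf_level) / 2)
  # Normal interval on the Z-scale, back-transformed with tanh().
  c(lwr.ci = tanh(z - crit * se_z), upr.ci = tanh(z + crit * se_z))
}

# Illustrative values only, not output from ccc_pairwise_u_stat
fisher_z_ci(est = 0.80, se = 0.05)
```

Because the interval is built on the Z-scale, the back-transformed bounds always stay inside (-1, 1) and are generally asymmetric around the estimate.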

Assumptions

The design must be balanced: every subject must have complete observations for all methods and time points.
The method is nonparametric and does not require assumptions of normality or linear mixed effects.
Weights (Dmat) allow differential importance of time points.
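
Before calling the function it can be useful to confirm that the data are balanced. The check below is a minimal sketch in base R; the column names subject, time and method are illustrative and should be replaced with those in your data.

```r
# Balanced toy data: 10 subjects x 2 time points x 3 methods
df <- expand.grid(subject = 1:10,
                  time = 1:2,
                  method = c("A", "B", "C"))

# A balanced design has exactly one row per subject/time/method cell.
cells <- table(df$subject, df$time, df$method)
is_balanced <- all(cells == 1)
is_balanced

# Removing a single row leaves one cell empty, so the check fails.
cells_missing <- table(df[-1, c("subject", "time", "method")])
is_balanced_missing <- all(cells_missing == 1)
is_balanced_missing
```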

For unbalanced or complex hierarchical data (e.g., missing timepoints, covariate adjustments), consider using ccc_lmm_reml, which uses a variance components approach via linear mixed models.

References

Lin L (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics, 45: 255-268.

Lin L (2000). A note on the concordance correlation coefficient. Biometrics, 56: 324-325.

Carrasco JL, Jover L (2003). Estimating the concordance correlation coefficient: a new approach. Computational Statistics & Data Analysis, 47(4): 519-539.

Examples


set.seed(123)
df <- expand.grid(subject = 1:10,
                  time = 1:2,
                  method = c("A", "B", "C"))
df$y <- rnorm(nrow(df), mean = match(df$method, c("A", "B", "C")), sd = 1)

# CCC matrix (no CIs)
ccc1 <- ccc_pairwise_u_stat(df, response = "y", method = "method",
                            subject = "subject", time = "time")
print(ccc1)
summary(ccc1)
plot(ccc1)

# With confidence intervals
ccc2 <- ccc_pairwise_u_stat(df, response = "y", method = "method",
                            subject = "subject", time = "time", ci = TRUE)
print(ccc2)
summary(ccc2)
plot(ccc2)

# Interactive viewing (requires shiny)
if (interactive() && requireNamespace("shiny", quietly = TRUE)) {
  view_corr_shiny(ccc2)
}

#------------------------------------------------------------------------
# Choosing delta based on distance sensitivity
#------------------------------------------------------------------------
# Absolute distance (L1 norm) - robust
ccc_pairwise_u_stat(df, response = "y", method = "method",
                    subject = "subject", time = "time", delta = 1)

# Squared distance (L2 norm) - amplifies large deviations
ccc_pairwise_u_stat(df, response = "y", method = "method",
                    subject = "subject", time = "time", delta = 2)

# Presence/absence of disagreement (like kappa)
ccc_pairwise_u_stat(df, response = "y", method = "method",
                    subject = "subject", time = "time", delta = 0)
