ba_rm: Bland-Altman for repeated measurements

Description

Repeated-measures Bland-Altman (BA) analysis for method comparison based on a mixed-effects model fitted to subject-time matched paired differences. The fitted model includes a subject-specific random intercept and, optionally, an AR(1) residual correlation structure within subject.

The function accepts either exactly two methods or $\ge 3$ methods. With exactly two methods it returns a single fitted BA object. With $\ge 3$ methods it fits the same model to every unordered method pair and returns pairwise matrices of results.

Required variables

response: numeric measurements.
subject: subject identifier.
method: method label with at least two distinct levels.
time: replicate/time key used to form within-subject pairs.

For any analysed pair of methods, only records where both methods are present for the same subject and integer-coerced time contribute to the fit. Rows with missing values in any required field are excluded for that analysed pair.

Usage

ba_rm(
  data = NULL,
  response,
  subject,
  method,
  time,
  loa_multiplier = 1.96,
  conf_level = 0.95,
  include_slope = FALSE,
  use_ar1 = FALSE,
  ar1_rho = NA_real_,
  max_iter = 200L,
  tol = 1e-06,
  verbose = FALSE
)
# S3 method for ba_repeated
print(
  x,
  digits = 3,
  ci_digits = 3,
  n = NULL,
  topn = NULL,
  max_vars = NULL,
  width = NULL,
  show_ci = NULL,
  ...
)
# S3 method for ba_repeated_matrix
print(
  x,
  digits = 3,
  ci_digits = 3,
  n = NULL,
  topn = NULL,
  max_vars = NULL,
  width = NULL,
  show_ci = NULL,
  style = c("pairs", "matrices"),
  ...
)
# S3 method for ba_repeated
plot(
  x,
  title = "Bland-Altman (repeated measurements)",
  subtitle = NULL,
  point_alpha = 0.7,
  point_size = 2.2,
  line_size = 0.8,
  shade_ci = TRUE,
  shade_alpha = 0.08,
  smoother = c("none", "loess", "lm"),
  symmetrize_y = TRUE,
  show_points = TRUE,
  show_value = TRUE,
  ...
)
# S3 method for ba_repeated_matrix
plot(
  x,
  pairs = NULL,
  against = NULL,
  facet_scales = c("free_y", "fixed"),
  title = "Bland-Altman (repeated, pairwise)",
  point_alpha = 0.6,
  point_size = 1.8,
  line_size = 0.7,
  shade_ci = TRUE,
  shade_alpha = 0.08,
  smoother = c("none", "loess", "lm"),
  show_points = TRUE,
  show_value = TRUE,
  ...
)

Value

Either a "ba_repeated" object (exactly two methods) or a "ba_repeated_matrix" object ($\ge 3$ methods).

If exactly two methods are supplied, the returned "ba_repeated" object is a list with components:

means: numeric vector of paired means $(y_1 + y_2)/2$ used for plotting helpers.
diffs: numeric vector of paired differences $y_2 - y_1$ used for plotting helpers.
based.on: integer number of complete subject-time pairs used.
mean.diffs: scalar model-based BA centre. When include_slope = FALSE, this is the fitted intercept of the paired- difference model. When include_slope = TRUE, this is the fitted mean difference at the centred reference paired mean used internally by the backend.
lower.limit, upper.limit: scalar limits of agreement, computed as $\mu_0 \pm \texttt{loa\_multiplier}\sqrt{\sigma_u^2 + \sigma_e^2}$.
lines: named numeric vector with entries lower, mean, and upper.
CI.lines: named numeric vector containing Wald confidence interval bounds for the bias and both LoA endpoints: mean.diff.ci.lower, mean.diff.ci.upper, lower.limit.ci.lower, lower.limit.ci.upper, upper.limit.ci.lower, upper.limit.ci.upper.
loa_multiplier: scalar LoA multiplier actually used.
critical.diff: scalar LoA half-width $\texttt{loa\_multiplier} \times \texttt{sd\_loa}$.
include_slope: logical, copied from the call.
beta_slope: proportional-bias slope on the original paired-mean scale when include_slope = TRUE; otherwise NA.
sigma2_subject: estimated variance of the subject-level random intercept on paired differences.
sigma2_resid: estimated residual variance on paired differences.
use_ar1: logical, copied from the call.
residual_model: either "ar1" or "iid", indicating the final residual structure actually used.
ar1_rho: AR(1) correlation actually used in the final fit when residual_model == "ar1"; otherwise NA.
ar1_estimated: logical indicating whether ar1_rho was estimated internally (TRUE) or supplied by the user (FALSE) when the final residual model is AR(1); otherwise NA.
data_long: stored long-format data frame used downstream for plotting and reconstruction. It uses canonical internal column names .response, .subject, .method, and .time.
mapping: named list identifying those stored canonical column names for response, subject, method, and time.

The confidence level is stored as attr(x, "conf.level").

If $\ge 3$ methods are supplied, the returned "ba_repeated_matrix" object is a list with components:

bias: numeric $m \times m$ matrix of model-based BA centres. For indices $(j, k)$ with $j < k$, bias[j, k] estimates $\mathrm{method}_k - \mathrm{method}_j$. Thus the matrix orientation is column minus row, not row minus column. The diagonal is NA.
sd_loa: numeric $m \times m$ matrix of LoA SDs, $\sqrt{\sigma_u^2 + \sigma_e^2}$. This matrix is symmetric.
loa_lower, loa_upper: numeric $m \times m$ matrices of LoA endpoints corresponding to bias. These satisfy $\texttt{loa\_lower}[j,k] = -\texttt{loa\_upper}[k,j]$ and $\texttt{loa\_upper}[j,k] = -\texttt{loa\_lower}[k,j]$.
width: numeric $m \times m$ matrix of LoA widths, loa_upper - loa_lower. This matrix is symmetric.
n: integer $m \times m$ matrix giving the number of complete subject-time pairs used for each analysed contrast. Pairs with fewer than two complete matches are left as NA in the estimate matrices.
mean_ci_low, mean_ci_high: numeric $m \times m$ matrices of Wald confidence interval bounds for bias.
loa_lower_ci_low, loa_lower_ci_high: numeric $m \times m$ matrices of Wald confidence interval bounds for the lower LoA.
loa_upper_ci_low, loa_upper_ci_high: numeric $m \times m$ matrices of Wald confidence interval bounds for the upper LoA.
slope: optional numeric $m \times m$ matrix of proportional-bias slopes on the original paired-mean scale when include_slope = TRUE; otherwise NULL. This matrix is antisymmetric in sign because each fitted contrast is reversed across the transpose.
methods: character vector of method levels defining matrix row and column order.
loa_multiplier: scalar LoA multiplier actually used.
conf_level: scalar confidence level used for the reported Wald intervals.
use_ar1: logical, copied from the call.
ar1_rho: scalar equal to the user-supplied common ar1_rho when use_ar1 = TRUE and a value was supplied; otherwise NA. This field does not store the per-pair estimated AR(1) parameters.
residual_model: character $m \times m$ matrix whose entries are "ar1", "iid", or NA, indicating the final residual structure used for each pair.
sigma2_subject: numeric $m \times m$ matrix of estimated subject-level random-intercept variances.
sigma2_resid: numeric $m \times m$ matrix of estimated residual variances.
ar1_rho_pair: optional numeric $m \times m$ matrix giving the AR(1) correlation actually used for each pair when the final residual model is "ar1"; otherwise NA for that entry. Present only when use_ar1 = TRUE.
ar1_estimated: optional logical $m \times m$ matrix indicating whether the pair-specific ar1_rho_pair was estimated internally (TRUE) or supplied by the user (FALSE) for entries whose final residual model is "ar1"; otherwise NA. Present only when use_ar1 = TRUE.
data_long: stored long-format data frame used downstream for plotting and reconstruction. It uses canonical internal column names .response, .subject, .method, and .time.
mapping: named list identifying those stored canonical column names for response, subject, method, and time.

Arguments

data: Optional data frame-like object. If supplied, response, subject, method, and time may be column names. Objects not already inheriting from data.frame are first coerced with as.data.frame().
response: Numeric response vector, or a single character string naming the response column in data.
subject: Subject identifier vector (integer, numeric, or factor), or a single character string naming the subject column in data.
method: Method label vector (character, factor, integer, or numeric), or a single character string naming the method column in data. At least two distinct method levels are required.
time: Replicate/time index vector (integer or numeric), or a single character string naming the time column in data. Values are coerced to integer before pairing and before AR(1) contiguity checks.
loa_multiplier: Positive scalar giving the SD multiplier used to form the limits of agreement. Default is 1.96. In the exported ba_rm() interface this default is fixed and does not depend on conf_level.
conf_level: Confidence level for Wald confidence intervals for the reported bias and both LoA endpoints. Must lie in (0, 1). Default 0.95.
include_slope: Logical. If TRUE, the model includes the paired mean as a fixed effect and estimates a proportional-bias slope. The reported BA centre remains the fitted mean difference at the centred reference paired mean used internally by the backend; the returned LoA remain horizontal bands and are not regression-adjusted curves.
use_ar1: Logical. If TRUE, request an AR(1) residual structure within subject over contiguous integer time blocks.
ar1_rho: Optional AR(1) parameter. Must satisfy abs(ar1_rho) < 0.999 when supplied. If NA and use_ar1 = TRUE, the backend estimates rho separately for each analysed pair.
max_iter: Maximum number of EM/GLS iterations used by the backend.
tol: Convergence tolerance for the backend EM/GLS iterations.
verbose: Logical. If TRUE, print progress for the pairwise $\ge 3$-method path. It has no material effect in the exactly-two-method path.
x: A "ba_repeated_matrix" object.
digits: Number of digits for estimates (default 3).
ci_digits: Number of digits for CI bounds (default 3).
n: Optional row threshold for compact preview output.
topn: Optional number of leading/trailing rows to show when truncated.
max_vars: Optional maximum number of visible columns; NULL derives this from console width.
width: Optional display width; defaults to getOption("width").
show_ci: One of "yes" or "no".
...: Additional theme adjustments passed to ggplot2::theme(...) (e.g., plot.title.position = "plot", axis.title.x = element_text(size=11)).
style: Show as pairs or matrix format?
title: Plot title (character scalar). Defaults to "Bland-Altman (repeated measurements)" for two methods and "Bland-Altman (repeated, pairwise)" for the faceted matrix plot.
subtitle: Optional subtitle (character scalar). If NULL, a compact summary is shown using the fitted object.
point_alpha: Numeric in [0, 1]. Transparency for scatter points drawn at (pair mean, pair difference) when point data are available. Passed to ggplot2::geom_point(alpha = ...). Default 0.7.
point_size: Positive numeric. Size of scatter points; passed to ggplot2::geom_point(size = ...). Default 2.2.
line_size: Positive numeric. Line width for horizontal bands (bias and both LoA) and, when requested, the proportional-bias line. Passed to ggplot2::geom_hline(linewidth = ...) (and geom_abline). Default 0.8.
shade_ci: Logical. If TRUE and confidence intervals are available in the object (CI.lines for two methods; *_ci_* matrices for the pairwise case), semi-transparent rectangles are drawn to indicate CI bands for the bias and each LoA. If FALSE, dashed horizontal CI lines are drawn instead. Has no effect if CIs are not present. Default TRUE.
shade_alpha: Numeric in [0, 1]. Opacity of the CI shading rectangles when shade_ci = TRUE. Passed to ggplot2::annotate("rect", alpha = ...). Default 0.08.
smoother: One of "none", "loess", or "lm". Adds an overlaid trend for differences vs means when points are drawn, to visualise proportional bias. "lm" fits a straight line with no SE ribbon; "loess" draws a locally-smoothed curve (span 0.9) with no SE ribbon; "none" draws no smoother. Ignored if show_points = FALSE or if no point data are available.
symmetrize_y: Logical (two-method plot only). If TRUE, the y-axis is centred at the estimated bias and expanded symmetrically to cover all elements used to compute the range (bands, CIs, and points if shown). Default TRUE.
show_points: Logical. If TRUE, per-pair points are drawn when present in the fitted object (two-method path) or when they can be reconstructed from x$data_long and x$mapping (pairwise path). If FALSE or if point data are unavailable, only the bands (and optional CI indicators) are drawn. Default TRUE.
show_value: Logical; included for a consistent plotting interface. Repeated-measures Bland-Altman plots do not overlay numeric cell values, so this argument currently has no effect.
pairs: (Faceted pairwise plot only.) Optional character vector of labels specifying which method contrasts to display. Labels must match the "row - column" convention used by print()/summary() (e.g., "B - A"). Defaults to all upper-triangle pairs.
against: (Faceted pairwise plot only.) Optional single method name. If supplied, facets are restricted to contrasts of the chosen method against all others. Ignored when pairs is provided.
facet_scales: (Faceted pairwise plot only.) Either "free_y" (default) to allow each facet its own y-axis limits, or "fixed" for a common scale across facets. Passed to ggplot2::facet_wrap(scales = ...).

Author

Thiago de Paula Oliveira

Details

For a selected pair of methods $(a, b)$, the backend first forms complete within-subject pairs at matched subject and integer-coerced time. Let $$ d_{it} = y_{itb} - y_{ita}, \qquad m_{it} = \frac{y_{ita} + y_{itb}}{2}, $$ where $d_{it}$ is the paired difference and $m_{it}$ is the paired mean for subject $i$ at time/replicate $t$. Only complete subject-time matches contribute to that pairwise fit.

If multiple rows are present for the same subject-time-method combination within an analysed pair, the backend keeps the last encountered value for that combination when forming the pair. The function therefore implicitly assumes at most one observation per subject-time-method cell for each analysed contrast.

The fitted model for each analysed pair is $$ d_{it} = \beta_0 + \beta_1 x_{it} + u_i + \varepsilon_{it}, $$ where $x_{it} = m_{it}$ if include_slope = TRUE and the term is omitted otherwise; $u_i \sim \mathcal{N}(0, \sigma_u^2)$ is a subject-specific random intercept; and the within-subject residual vector satisfies $\mathrm{Cov}(\varepsilon_i) = \sigma_e^2 R_i$.

When use_ar1 = FALSE, $R_i = I$. When use_ar1 = TRUE, the backend works with the residual precision matrix $C_i = R_i^{-1}$ over contiguous time blocks within subject and uses $\sigma_e^2 C_i^{-1}$ as the residual covariance.

AR(1) residual structure

Within each subject, paired observations are ordered by integer-coerced time. AR(1) correlation is applied only over strictly contiguous runs satisfying $t_{k+1} = t_k + 1$. Gaps break the run. Negative times, and any isolated positions not belonging to a contiguous run, are treated as independent singletons.

For a contiguous run of length $L$ and correlation parameter $\rho$, the block precision matrix is $$ C = \frac{1}{1-\rho^2} \begin{bmatrix} 1 & -\rho & & & \\ -\rho & 1+\rho^2 & -\rho & & \\ & \ddots & \ddots & \ddots & \\ & & -\rho & 1+\rho^2 & -\rho \\ & & & -\rho & 1 \end{bmatrix}, $$ with a very small ridge added to the diagonal for numerical stability.

If use_ar1 = TRUE and ar1_rho is supplied, that value is used after validation and clipping to the admissible numerical range handled by the backend.

If use_ar1 = TRUE and ar1_rho = NA, the backend estimates rho separately for each analysed pair by:

fitting the corresponding iid model;
computing a moments-based lag-1 estimate from detrended residuals within contiguous blocks, used only as a seed; and
refining that seed by a short profile search over rho using the profiled REML log-likelihood.

In the exported ba_rm() wrapper, if an AR(1) fit for a given analysed pair fails specifically because the backend EM/GLS routine did not converge to admissible finite variance-component estimates, the wrapper retries that pair with iid residuals. If the iid refit succeeds, the final reported residual model for that pair is "iid" and a warning is issued. Other AR(1) failures are not simplified and are propagated as errors.

Internal centring and scaling for the proportional-bias slope

When include_slope = TRUE, the paired mean regressor is centred and scaled internally before fitting. Let $\bar m$ be the mean of the observed paired means. The backend chooses a scaling denominator from:

the sample SD;
the IQR-based scale $\mathrm{IQR}(m)/1.349$;
the MAD-based scale $1.4826\,\mathrm{MAD}(m)$.

It uses the first of these that is not judged near-zero relative to the largest finite positive candidate scale, under a threshold proportional to $\sqrt{\epsilon_{\mathrm{mach}}}$. If all candidate scales are treated as near-zero, the fit stops with an error because the proportional-bias slope is not estimable on the observed paired-mean scale.

The returned beta_slope is back-transformed to the original paired-mean scale. The returned BA centre is the fitted mean difference at the centred reference paired mean $\bar m$, not the original-scale intercept coefficient.

Estimation

The backend uses a stabilised EM/GLS scheme.

Conditional on current variance components, the fixed effects are updated by GLS using the marginal precision of the paired differences after integrating out the random subject intercept. The resulting fixed-effect covariance used in the confidence-interval calculations is the GLS covariance $$ \mathrm{Var}(\hat\beta \mid \hat\theta) = \left( \sum_i X_i^\top V_i^{-1} X_i \right)^{-1}. $$

Given updated fixed effects, the variance components are refreshed by EM using the conditional moments of the subject random intercept and the residual quadratic forms. Variance updates are ratio-damped and clipped to admissible ranges for numerical stability.

Reported BA centre and limits of agreement

The reported BA centre is always model-based.

When include_slope = FALSE, it is the fitted intercept of the paired- difference mixed model.

When include_slope = TRUE, it is the fitted mean difference at the centred reference paired mean used internally by the backend.

The reported limits of agreement are $$ \mu_0 \pm \texttt{loa\_multiplier}\sqrt{\sigma_u^2 + \sigma_e^2}, $$ where $\mu_0$ is the reported model-based BA centre. These LoA are for a single new paired difference from a random subject under the fitted model.

Under the implemented parameterisation, AR(1) correlation affects the off-diagonal within-subject covariance structure and therefore the estimation of the model parameters and their uncertainty, but not the marginal variance of a single paired difference. Consequently rho does not appear explicitly in the LoA point-estimate formula.

Confidence intervals

The backend returns Wald confidence intervals for the reported BA centre and for both LoA endpoints.

These intervals combine:

the conditional GLS uncertainty in the fixed effects at the fitted covariance parameters; and
a delta-method propagation of covariance-parameter uncertainty from the observed information matrix of the profiled REML log-likelihood.

The covariance-parameter vector is profiled on transformed scales: log-variances for $\sigma_u^2$ and $\sigma_e^2$, and, when rho is estimated internally under AR(1), a transformed correlation parameter mapped back by $\rho = 0.95\tanh(z)$.

Numerical central finite differences are used to approximate both the observed Hessian of the profiled REML log-likelihood and the gradients of the reported derived quantities. The resulting variances are combined and the final intervals are formed with the normal quantile corresponding to conf_level.

Exactly two methods versus $\ge 3$ methods

With exactly two methods, at least two complete subject-time pairs are required; otherwise the function errors.

With $\ge 3$ methods, the function analyses every unordered pair of method levels. For a given pair with fewer than two complete subject-time matches, that contrast is skipped and the corresponding matrix entries remain NA.

For a fitted contrast between methods in matrix positions $(j, k)$ with $j < k$, the stored orientation is: $$ \texttt{bias}[j,k] \approx \mathrm{method}_k - \mathrm{method}_j. $$ Hence the transposed entry changes sign, while sd_loa and width are symmetric.

Identifiability and safeguards

Separate estimation of the residual and subject-level variance components requires sufficient complete within-subject replication after pairing. If the paired data are not adequate to separate these components, the fit stops with an identifiability error.

If the model is conceptually estimable but no finite positive pooled within-subject variance can be formed during initialisation, the backend uses $0.5 \times v_{\mathrm{ref}}$ only as a temporary positive starting value for the EM routine and records a warning string in the backend output. The exported wrapper does not otherwise modify the final estimates.

If the EM/GLS routine fails to reach admissible finite variance-component estimates, the backend throws an explicit convergence error rather than returning fallback estimates.

Examples

Run this code

# -------- Simulate repeated-measures data --------
set.seed(1)

# design (no AR)
# subjects
S   <- 30L
# replicates per subject
Tm  <- 15L
subj <- rep(seq_len(S), each = Tm)
time <- rep(seq_len(Tm), times = S)

# subject signal centered at 0 so BA "bias" won't be driven by the mean level
mu_s  <- rnorm(S, mean = 0, sd = 8)
# constant within subject across replicates
true  <- mu_s[subj]

# common noise (no AR, i.i.d.)
sd_e <- 2
e0   <- rnorm(length(true), 0, sd_e)

# --- Methods ---
# M1: signal + noise
y1 <- true + e0

# M2: same precision as M1; here identical so M3 can be
#     almost perfectly the inverse of both M1 and M2
y2 <- y1 + rnorm(length(true), 0, 0.01)

# M3: perfect inverse of M1 and M2
y3 <- -y1   # = -(true + e0)

# M4: unrelated to all others (pure noise, different scale)
y4 <- rnorm(length(true), 3, 6)

data <- rbind(
  data.frame(y = y1, subject = subj, method = "M1", time = time),
  data.frame(y = y2, subject = subj, method = "M2", time = time),
  data.frame(y = y3, subject = subj, method = "M3", time = time),
  data.frame(y = y4, subject = subj, method = "M4", time = time)
)
data$method <- factor(data$method, levels = c("M1","M2","M3","M4"))

# quick sanity checks
with(data, {
  Y <- split(y, method)
  round(cor(cbind(M1 = Y$M1, M2 = Y$M2, M3 = Y$M3, M4 = Y$M4)), 3)
})

# Run BA (no AR)
ba4 <- ba_rm(
  data = data,
  response = "y", subject = "subject", method = "method", time = "time",
  loa_multiplier = 1.96, conf_level = 0.95,
  include_slope = FALSE, use_ar1 = FALSE
)
summary(ba4)
plot(ba4)

# -------- Simulate repeated-measures with AR(1) data --------
set.seed(123)
S <- 40L                      # subjects
Tm <- 50L                     # replicates per subject
methods <- c("A","B","C")     # N = 3 methods
rho <- 0.4                    # AR(1) within-subject across time

ar1_sim <- function(n, rho, sd = 1) {
  z <- rnorm(n)
  e <- numeric(n)
  e[1] <- z[1] * sd
  if (n > 1) for (t in 2:n) e[t] <- rho * e[t-1] + sqrt(1 - rho^2) * z[t] * sd
  e
}

# Subject baseline + time trend (latent "true" signal)
subj <- rep(seq_len(S), each = Tm)
time <- rep(seq_len(Tm), times = S)
# subject effects
mu_s  <- rnorm(S, 50, 7)
trend <- rep(seq_len(Tm) - mean(seq_len(Tm)), times = S) * 0.8
true  <- mu_s[subj] + trend

# Method-specific biases (B has +1.5 constant; C has slight proportional bias)
bias  <- c(A = 0, B = 1.5, C = -0.5)
# proportional component on "true"
prop  <- c(A = 0.00, B = 0.00, C = 0.10)

# Build long data: for each method, add AR(1) noise within subject over time
make_method <- function(meth, sd = 3) {
  e <- unlist(lapply(split(seq_along(time), subj),
                     function(ix) ar1_sim(length(ix), rho, sd)))
  y <- true * (1 + prop[meth]) + bias[meth] + e
  data.frame(y = y, subject = subj, method = meth, time = time,
             check.names = FALSE)
}

data <- do.call(rbind, lapply(methods, make_method))
data$method <- factor(data$method, levels = methods)

# -------- Repeated BA (pairwise matrix) ---------------------
baN <- ba_rm(
  response = data$y, subject = data$subject, method = data$method, time = data$time,
  loa_multiplier = 1.96, conf_level = 0.95,
  include_slope = FALSE,         # estimate proportional bias per pair
  use_ar1 = TRUE, ar1_rho = rho
)

# Matrices (row - column orientation)
print(baN)
summary(baN)

# Faceted BA scatter by pair
plot(baN, smoother = "lm", facet_scales = "free_y")

# -------- Two-method AR(1) path (A vs B only) ------------------------------
data_AB <- subset(data, method %in% c("A","B"))
baAB <- ba_rm(
  response = data_AB$y, subject = data_AB$subject,
  method = droplevels(data_AB$method), time = data_AB$time,
  include_slope = FALSE, use_ar1 = TRUE, ar1_rho = 0.4
)
print(baAB)
plot(baAB)

Run the code above in your browser using DataLab