coat: Conditional Method Agreement Trees (COAT)

Description

Tree models capturing the dependence of method agreement on covariates. The classic Bland-Altman analysis is used for modeling method agreement while the covariate dependency can be learned either nonparametrically via conditional inference trees (CTree) or using model-based recursive partitioning (MOB).

Usage

coat(
  formula,
  data,
  subset,
  na.action,
  weights,
  means = FALSE,
  type = c("ctree", "mob"),
  minsize = 10L,
  minbucket = minsize,
  minsplit = NULL,
  ...
)

Value

Object of class coat, inheriting either from constparty (if ctree is used) or modelparty (if mob is used).

Arguments

formula: symbolic description of the model of type y1 + y2 ~ x1 + ... + xk. The left-hand side should specify a pair of measurements (y1 and y2) for the Bland-Altman analysis. The right-hand side can specify any number of potential split variables for the tree.
data, subset, na.action: arguments controlling the formula processing via model.frame.
weights: optional numeric vector of weights (case/frequency weights, by default).
means: logical. Should the intra-individual mean values of measurements be included as a potential split variable?
type: character string specifying the type of tree to be fit. Either "ctree" (default) or "mob".
minsize, minbucket: integer. The minimum number of observations in a subgroup. Only one of the two arguments should be used (see also below).
minsplit: integer. The minimum number of observations to consider splitting. Must be at least twice the minimal subgroup size (minsize or minbucket). If set to NULL (the default) it is set to be at least 2.5 times the minimal subgroup size.
...: further control arguments, either passed to ctree_control or mob_control, respectively.

Details

Conditional method agreement trees (COAT) employ unbiased recursive partitioning to detect and model dependency on covariates in the classic Bland-Altman analysis. One of two recursive partitioning techniques can be used to find subgroups of a pair of measurements, defined by splits in covariates: either nonparametric conditional inference trees (CTree) or parametric model-based trees (MOB). In both cases, each subgroup is associated with two parameter estimates: the mean of the measurement difference (“Bias”) and the corresponding sample standard deviation (“SD”), which can be used to construct the limits of agreement (i.e., the corresponding confidence intervals).
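As a minimal base R sketch of the two parameter estimates described above, the following computes Bias, SD, and limits of agreement for a single group of simulated paired measurements. The data are made up for illustration, and the 95% level with multiplier qnorm(0.975) is an assumption (the conventional Bland-Altman choice), not something stated on this page.

```r
## Simulated paired measurements (illustrative data, not from the package)
set.seed(1)
y1 <- rnorm(50, mean = 100, sd = 10)
y2 <- y1 + rnorm(50, mean = 2, sd = 5)

## Bland-Altman quantities for a single (sub)group
d    <- y1 - y2   # measurement differences
bias <- mean(d)   # "Bias": mean of the differences
s    <- sd(d)     # "SD": sample standard deviation (n - 1 denominator)

## Limits of agreement, ASSUMING the conventional 95% level,
## i.e., Bias -/+ qnorm(0.975) * SD (roughly 1.96 * SD)
loa <- bias + c(lower = -1, upper = 1) * qnorm(0.975) * s

round(c(Bias = bias, SD = s, loa), 2)
```

In a fitted tree, the same pair of estimates would be reported separately for each terminal subgroup.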

The minimum number of observations in a subgroup defaults to 10, so that the mean and variance of the measurement differences can be estimated reasonably for the Bland-Altman analysis. The default can be changed with the argument minsize or, equivalently, minbucket. (The different names stem from slightly different conventions in the underlying tree functions.) Consequently, the minimum number of observations to consider splitting (minsplit) must be at least twice the minimum number of observations per subgroup (which would allow only one possible split, though). By default, minsplit is 2.5 times minsize. Users are encouraged to consider whether it is sensible to increase or decrease these defaults for their application. Finally, further control parameters can be specified through the ... argument; see ctree_control and mob_control, respectively, for details.
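The relationship between the subgroup size and minsplit can be sketched in a few lines of base R. The helper resolve_minsplit() below is hypothetical and is not part of coat; it only mirrors the documented rules (default of 2.5 times the subgroup size, lower bound of twice the subgroup size). Rounding up with ceiling() is an assumption.

```r
## Hypothetical helper (NOT part of coat): derive minsplit from minsize
resolve_minsplit <- function(minsize = 10L, minsplit = NULL) {
  if (is.null(minsplit)) {
    ## documented default: 2.5 times the minimal subgroup size
    ## (rounding up is an assumption made for this sketch)
    minsplit <- ceiling(2.5 * minsize)
  }
  ## documented constraint: at least twice the minimal subgroup size
  stopifnot(minsplit >= 2 * minsize)
  as.integer(minsplit)
}

resolve_minsplit()                      # default subgroup size of 10
resolve_minsplit(minsize = 15L)         # larger subgroups
resolve_minsplit(20L, minsplit = 40L)   # smallest admissible minsplit
```

In coat() itself the corresponding values are passed directly, e.g., via the minsize and minsplit arguments listed under Usage.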

In addition to the standard specification of the two response measurements in the formula via y1 + y2 ~ ..., it is also possible to use y1 - y2 ~ .... The latter may be more intuitive for users who think of it as a model for the difference of two measurements. Finally, cbind(y1, y2) ~ ... also works. Internally, all of these are processed in the same way, namely as a bivariate dependent variable that can then be modeled and plotted appropriately.
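The three formula variants can be tried side by side on a small simulated data set. This is only a sketch: the data frame d and its columns y1, y2, and x1 are made up for illustration, and the coat() calls run only if the coat package is installed.

```r
## Simulated data (illustrative only): two measurements and one covariate
set.seed(2)
n  <- 200
x1 <- runif(n)
y1 <- rnorm(n, mean = 50, sd = 5)
y2 <- y1 + rnorm(n, mean = ifelse(x1 > 0.5, 3, 0), sd = 2)
d  <- data.frame(y1, y2, x1)

if (requireNamespace("coat", quietly = TRUE)) {
  library("coat")
  ## three ways of specifying the same pair of measurements
  t_plus  <- coat(y1 + y2 ~ x1, data = d)
  t_minus <- coat(y1 - y2 ~ x1, data = d)
  t_cbind <- coat(cbind(y1, y2) ~ x1, data = d)
  print(t_plus)
}
```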

To add the means of the measurement pair as a potential splitting variable, there are also different equivalent strategies. The standard specification would be via the means argument: y1 + y2 ~ x1 + ..., means = TRUE. Alternatively, the user can also extend the formula argument via y1 + y2 ~ x1 + ... + means(y1, y2).
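Both ways of adding the means can be sketched as follows, again on simulated data that are not part of the package. The hand-computed avg is shown only to make explicit what the intra-individual mean of a pair is, assuming it is the simple average (y1 + y2) / 2; the coat() calls run only if the package is installed.

```r
## Simulated data (illustrative only): bias depends on measurement magnitude
set.seed(3)
n  <- 200
y1 <- rnorm(n, mean = 50, sd = 10)
y2 <- y1 + rnorm(n, mean = 0.05 * (y1 - 50), sd = 2)
x1 <- runif(n)
d  <- data.frame(y1, y2, x1)

## intra-individual mean of each pair, computed by hand
## (ASSUMED to be the simple average of the two measurements)
avg <- (d$y1 + d$y2) / 2

if (requireNamespace("coat", quietly = TRUE)) {
  library("coat")
  ## via the means argument
  t_arg     <- coat(y1 + y2 ~ x1, data = d, means = TRUE)
  ## via the formula
  t_formula <- coat(y1 + y2 ~ x1 + means(y1, y2), data = d)
}
```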

The SD is estimated by the usual sample standard deviation in each subgroup, i.e., with the sum of squared deviations divided by \(n - 1\). Note that the inference in the MOB algorithm internally uses the maximum likelihood estimate (divided by \(n\)) instead so that the fluctuation tests for parameter instability can be applied.
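The difference between the two SD estimates is easy to verify in base R. The sketch below uses simulated differences (made up for illustration) and shows that the two estimates differ only by the factor sqrt((n - 1) / n).

```r
## Simulated measurement differences (illustrative only)
set.seed(4)
d <- rnorm(12, mean = 1, sd = 3)
n <- length(d)

sd_sample <- sd(d)                        # divides by n - 1
sd_ml     <- sqrt(mean((d - mean(d))^2))  # maximum likelihood: divides by n

## the two estimates differ only by the factor sqrt((n - 1) / n)
all.equal(sd_ml, sd_sample * sqrt((n - 1) / n))
```

For subgroups near the default minimum size of 10 the difference is noticeable, while it becomes negligible in large subgroups.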

References

Karapetyan S, Zeileis A, Henriksen A, Hapfelmeier A (2023). “Tree Models for Assessing Covariate-Dependent Method Agreement.” arXiv 2306.04456, arXiv.org E-Print Archive. doi:10.48550/arXiv.2306.04456

Examples


 if(!requireNamespace("MethComp")) {
  if(interactive() || is.na(Sys.getenv("_R_CHECK_PACKAGE_NAME_", NA))) {
    stop("the MethComp package is required for this example but is not installed")
  } else q() }

## package and data (reshaped to wide format)
library("coat")
data("scint", package = "MethComp")
scint_wide <- reshape(scint, v.names = "y", timevar = "meth", idvar = "item", direction = "wide")

## coat based on ctree() without and with mean values of paired measurements as predictor
tr1 <- coat(y.DTPA + y.DMSA ~ age + sex, data = scint_wide)
tr2 <- coat(y.DTPA + y.DMSA ~ age + sex, data = scint_wide, means = TRUE)

## display
print(tr1)
plot(tr1)

print(tr2)
plot(tr2)

## tweak various graphical arguments of the panel function (just for illustration):
## different colors, nonparametric bootstrap percentile confidence intervals, ...
plot(tr1, tp_args = list(
  xscale = c(0, 150), linecol = "deeppink",
  confint = TRUE, B = 250, cilevel = 0.5, cicol = "gold"
))
