confounder_sensitivity: Confounder sensitivity summaries

Description

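Computes performance metrics separately within each stratum (level) of one or more confounders, so that differences in performance across strata can reveal potential confounding. Requires sample metadata in `coldata` whose rows are aligned with the samples used in the fit.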

Usage

confounder_sensitivity(
  fit,
  confounders = NULL,
  metric = NULL,
  min_n = 10,
  coldata = NULL,
  numeric_bins = 4,
  learner = NULL
)

Value

A data.frame with per-confounder, per-level metrics and counts.

Arguments

fit: A `LeakFit` object returned by `fit_resample()`.
confounders: Character vector of columns in `coldata` to evaluate. Defaults to common batch/study identifiers when available.
metric: Metric name to compute within each stratum. Defaults to the first metric used in the fit (or task defaults if unavailable).
min_n: Minimum number of samples per stratum; strata with fewer samples return `NA` metrics.
coldata: Optional data.frame of sample metadata. Defaults to `fit@splits@info$coldata` when available.
numeric_bins: Integer number of quantile bins for numeric confounders with many unique values.
learner: Optional character scalar. When predictions include multiple learners, selects the learner to summarize.
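
To make the `numeric_bins` and `min_n` arguments concrete, the following base R sketch shows one way quantile binning and a minimum-size check could work. It is an illustration under stated assumptions, not the package's internal code: the helper `bin_numeric()` and the variable `age` are hypothetical names introduced here, and the package may choose its break points differently.

```r
# Hypothetical helper (not part of the package): split a numeric vector
# into quantile bins so that each bin holds roughly the same number of samples.
bin_numeric <- function(x, numeric_bins = 4) {
  probs <- seq(0, 1, length.out = numeric_bins + 1)
  breaks <- unique(stats::quantile(x, probs = probs, na.rm = TRUE))
  cut(x, breaks = breaks, include.lowest = TRUE)
}

set.seed(1)
age <- rnorm(40, mean = 50, sd = 10)  # toy numeric confounder (assumed data)
min_n <- 10

age_bin <- bin_numeric(age, numeric_bins = 4)
counts <- table(age_bin)

# Strata with fewer than min_n samples would be reported with an NA metric.
too_small <- counts < min_n
print(counts)
print(too_small)
```

With 40 distinct values and four bins, each stratum holds 10 samples and passes the `min_n = 10` check; raising `numeric_bins` to 5 would leave 8 samples per stratum, all below the threshold.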

Examples

set.seed(42)
df <- data.frame(
  subject = rep(1:15, each = 2),
  outcome = factor(rep(c(0, 1), 15)),
  batch = factor(rep(c("A", "B", "C"), 10)),
  x1 = rnorm(30),
  x2 = rnorm(30)
)
splits <- make_split_plan(df, outcome = "outcome",
                          mode = "subject_grouped", group = "subject",
                          v = 3, progress = FALSE)
custom <- list(
  glm = list(
    fit = function(x, y, task, weights, ...) {
      stats::glm(y ~ ., data = as.data.frame(x),
                 family = stats::binomial(), weights = weights)
    },
    predict = function(object, newdata, task, ...) {
      as.numeric(stats::predict(object, newdata = as.data.frame(newdata),
                                type = "response"))
    }
  )
)
fit <- fit_resample(df, outcome = "outcome", splits = splits,
                    learner = "glm", custom_learners = custom,
                    metrics = "auc", refit = FALSE, seed = 1)
confounder_sensitivity(fit, confounders = "batch", coldata = df)
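
The idea behind a per-stratum summary can also be sketched without the package. The base R code below computes a rank-based AUC within each level of a grouping variable and masks strata smaller than `min_n`. The functions `auc_rank()` and `stratified_auc()`, the toy data, and the output column names are assumptions made for illustration; the data.frame returned by `confounder_sensitivity()` may be structured differently.

```r
# Rank-based AUC (Mann-Whitney form); NA when a stratum has only one class.
auc_rank <- function(score, label) {
  pos <- label == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  if (n_pos == 0 || n_neg == 0) return(NA_real_)
  r <- rank(score)  # average ranks handle tied scores
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical helper: one row per stratum with its size and AUC.
stratified_auc <- function(score, label, stratum, min_n = 10) {
  idx <- split(seq_along(score), stratum)
  data.frame(
    level = names(idx),
    n = unname(lengths(idx)),
    auc = unname(vapply(idx, function(i) {
      if (length(i) < min_n) NA_real_ else auc_rank(score[i], label[i])
    }, numeric(1))),
    stringsAsFactors = FALSE
  )
}

# Toy predictions (assumed data): batch A separates perfectly, batch B does not.
score <- c(0.9, 0.8, 0.2, 0.1, 0.6, 0.4, 0.7, 0.3)
label <- c(1, 1, 0, 0, 0, 1, 1, 0)
batch <- c("A", "A", "A", "A", "B", "B", "B", "B")

res <- stratified_auc(score, label, batch, min_n = 4)
print(res)  # AUC is 1.00 in batch A and 0.75 in batch B
```

A gap like the one between batches A and B is the kind of signal a stratified summary is meant to surface: performance that depends on the stratum suggests the model may be tracking the confounder rather than the outcome.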
