bioLeak (version 0.2.0)

confounder_sensitivity: Confounder sensitivity summaries

Description

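Computes performance metrics separately within each stratum (level) of one or more confounders, so that performance differences between strata can surface potential confounding. Requires sample metadata in `coldata` that is aligned with the samples in the fit.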

Usage

confounder_sensitivity(
  fit,
  confounders = NULL,
  metric = NULL,
  min_n = 10,
  coldata = NULL,
  numeric_bins = 4,
  learner = NULL
)

Value

A data.frame with per-confounder, per-level metrics and counts.

Arguments

fit

A [LeakFit] object from [fit_resample()].

confounders

Character vector of columns in `coldata` to evaluate. Defaults to common batch/study identifiers when available.

metric

Metric name to compute within each stratum. Defaults to the first metric used in the fit (or task defaults if unavailable).

min_n

Minimum number of samples per stratum; strata with fewer samples return NA metrics.
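
The effect of a minimum stratum size can be illustrated with a minimal base R sketch. This is not bioLeak's implementation: `stratum_accuracy()` is a hypothetical helper, accuracy stands in for whatever metric the fit uses, and the toy vectors are made up for illustration.

```r
# Hypothetical helper (not part of bioLeak): accuracy within each stratum,
# with NA returned for strata that have fewer than `min_n` samples.
stratum_accuracy <- function(truth, pred, stratum, min_n = 10) {
  # Row indices grouped by stratum level
  idx <- split(seq_along(truth), stratum)
  data.frame(
    level = names(idx),
    n = vapply(idx, length, integer(1)),
    accuracy = vapply(idx, function(i) {
      if (length(i) < min_n) NA_real_ else mean(truth[i] == pred[i])
    }, numeric(1)),
    row.names = NULL
  )
}

# Toy labels and predictions: 10 samples in stratum "A", 2 in stratum "B"
truth   <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1)
pred    <- c(1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0)
stratum <- rep(c("A", "B"), times = c(10, 2))

res <- stratum_accuracy(truth, pred, stratum, min_n = 5)
print(res)
```

Stratum "B" has only two samples, so its metric is reported as NA rather than as an unstable estimate, while its count is still shown.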

coldata

Optional data.frame of sample metadata. Defaults to `fit@splits@info$coldata` when available.

numeric_bins

Integer number of quantile bins for numeric confounders with many unique values.
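
Quantile binning of a numeric confounder can be sketched in base R as follows. This is an illustration only: `bin_numeric()` and the `age` variable are hypothetical, and the package's own binning rules (including how it decides that a variable has "many" unique values) may differ.

```r
# Hypothetical helper (not part of bioLeak): cut a numeric vector into
# roughly equal-sized bins at its empirical quantiles.
bin_numeric <- function(x, bins = 4) {
  probs <- seq(0, 1, length.out = bins + 1)
  # unique() guards against duplicated break points when x has heavy ties
  breaks <- unique(stats::quantile(x, probs = probs, na.rm = TRUE))
  cut(x, breaks = breaks, include.lowest = TRUE)
}

set.seed(1)
age <- runif(40, min = 20, max = 80)   # made-up numeric confounder
age_bin <- bin_numeric(age, bins = 4)
table(age_bin)
```

The resulting factor can then be treated like any categorical confounder, with one stratum per bin.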

learner

Optional character scalar. When predictions include multiple learners, selects the learner to summarize.

Examples

# Simulated data: 15 subjects with two samples each, spread across three batches
set.seed(42)
df <- data.frame(
  subject = rep(1:15, each = 2),
  outcome = factor(rep(c(0, 1), 15)),
  batch = factor(rep(c("A", "B", "C"), 10)),
  x1 = rnorm(30),
  x2 = rnorm(30)
)
# Subject-grouped resampling plan
splits <- make_split_plan(df, outcome = "outcome",
                          mode = "subject_grouped", group = "subject",
                          v = 3, progress = FALSE)
# Custom logistic-regression learner, defined by a fit and a predict function
custom <- list(
  glm = list(
    fit = function(x, y, task, weights, ...) {
      stats::glm(y ~ ., data = as.data.frame(x),
                 family = stats::binomial(), weights = weights)
    },
    predict = function(object, newdata, task, ...) {
      as.numeric(stats::predict(object, newdata = as.data.frame(newdata),
                                type = "response"))
    }
  )
)
# Fit the learner across the splits, scoring with AUC
fit <- fit_resample(df, outcome = "outcome", splits = splits,
                    learner = "glm", custom_learners = custom,
                    metrics = "auc", refit = FALSE, seed = 1)
# Summarize the metric within each level of `batch`
confounder_sensitivity(fit, confounders = "batch", coldata = df)
