PEAXAI_peer: Identify Benchmark Peers Based on Estimated Efficiency Probabilities

Description

Identifies peer units (i.e., reference benchmarks) for each decision-making unit (DMU) based on predicted probabilities of technical efficiency. Given a fitted classification model that estimates the probability of being efficient, the function selects, for each DMU, its nearest efficient peer by unweighted or importance-weighted Euclidean distance. Multiple efficiency thresholds can be specified to assess different levels of benchmarking stringency.

Usage

PEAXAI_peer(
  data,
  x,
  y,
  final_model,
  efficiency_thresholds,
  weighted = FALSE,
  relative_importance = NULL
)

Value

A named list of matrices, one per efficiency threshold. Each matrix gives, for every DMU, the index of its closest efficient peer. Peers are selected with unweighted distances when weighted = FALSE and with importance-weighted distances when weighted = TRUE.

Arguments

data: A data.frame or matrix containing input and output variables used in the efficiency model.
x: Integer vector indicating the column indices of input variables in data.
y: Integer vector indicating the column indices of output variables in data.
final_model: A fitted classification model used to estimate efficiency probabilities. Supported classes: "train" (from caret) or "glm" (binomial).
efficiency_thresholds: Numeric vector indicating the minimum probability values required to consider a DMU as efficient.
weighted: Logical. If TRUE, peers are selected using weighted Euclidean distances based on variable importance. If FALSE (default), unweighted distances are used.
relative_importance: Optional named numeric vector indicating the relative importance of each input/output variable (used when weighted = TRUE).
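
The role of efficiency_thresholds can be illustrated with a minimal base R sketch. It is not the package's internal code: the toy data, the 0/1 label `efficient`, and the `>=` comparison are assumptions made for illustration, based on the argument description above ("minimum probability values").

```r
# Toy data (hypothetical): two inputs, one output, and a 0/1 efficiency label.
set.seed(1)
toy <- data.frame(
  x1 = runif(30, 1, 10),
  x2 = runif(30, 1, 10),
  y1 = runif(30, 1, 10)
)
toy$efficient <- rbinom(
  30, 1, plogis(0.4 * toy$y1 - 0.2 * toy$x1 - 0.2 * toy$x2)
)

# Binomial glm that estimates the probability of being efficient.
fit <- glm(efficient ~ x1 + x2 + y1, data = toy, family = binomial)
prob <- predict(fit, newdata = toy, type = "response")

# For each threshold, the DMUs treated as efficient (candidate peers).
thresholds <- c(0.75, 0.85, 0.95)
efficient_sets <- lapply(thresholds, function(t) which(prob >= t))
names(efficient_sets) <- as.character(thresholds)

# Number of candidate peers per threshold; it cannot grow as the threshold rises.
lengths(efficient_sets)
```

Under this reading, a stricter threshold can only remove candidate peers, so the peer assigned to a DMU may change from one threshold to the next.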

Details

Efficiency is defined probabilistically: a DMU counts as efficient at a given threshold when its estimated probability of efficiency reaches that threshold, so raising the threshold shrinks the set of candidate peers. When weighted = TRUE, variable weights (e.g., derived from feature importance) enter the distance calculation, so that more important variables have more influence on which peer is selected.
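
The effect of the weighted option can be sketched in base R as follows. This is an illustrative reimplementation under stated assumptions, not the function's actual source: `nearest_peer` is a hypothetical helper, the weights are normalised to sum to one, and no scaling of the variables is applied.

```r
# Hypothetical helper: index of the nearest efficient DMU for every row of Z.
# Z: numeric matrix of inputs/outputs; prob: estimated efficiency probabilities;
# w: non-negative variable weights (equal weights give plain Euclidean distance).
nearest_peer <- function(Z, prob, threshold, w = rep(1, ncol(Z))) {
  Z <- as.matrix(Z)
  eff <- which(prob >= threshold)            # candidate peers
  if (length(eff) == 0) return(rep(NA_integer_, nrow(Z)))
  w <- w / sum(w)                            # assumption: weights sum to 1
  vapply(seq_len(nrow(Z)), function(i) {
    diffs <- sweep(Z[eff, , drop = FALSE], 2, Z[i, ])  # peer minus DMU i
    d <- sqrt(drop((diffs^2) %*% w))         # weighted Euclidean distance
    eff[which.min(d)]
  }, integer(1))
}

Z <- rbind(c(1, 2), c(2, 6), c(6, 2), c(3, 5))
prob <- c(0.2, 0.9, 0.9, 0.3)

peers_unweighted <- nearest_peer(Z, prob, threshold = 0.75)
peers_weighted   <- nearest_peer(Z, prob, threshold = 0.75, w = c(0.1, 0.9))
```

In this toy case the first DMU is matched to DMU 2 without weights but to DMU 3 once the second variable dominates the distance, which is the kind of shift that weighting by importance can produce. Efficient DMUs are their own nearest peer in this sketch.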

Examples

# \donttest{
  data("firms", package = "PEAXAI")

  data <- subset(
    firms,
    autonomous_community == "Comunidad Valenciana"
  )

  x <- 1:4
  y <- 5
  RTS <- "vrs"
  imbalance_rate <- NULL

  trControl <- list(
    method = "cv",
    number = 3
  )

  # glm method
  methods <- list(
    "glm" = list(
      weights = "dinamic"
    )
  )

  metric_priority <- c("Balanced_Accuracy", "ROC_AUC")

  models <- PEAXAI_fitting(
    data = data, x = x, y = y, RTS = RTS,
    imbalance_rate = imbalance_rate,
    methods = methods,
    trControl = trControl,
    metric_priority = metric_priority,
    verbose = FALSE,
    seed = 1
  )

  final_model <- models[["best_model_fit"]][["glm"]]

  relative_importance <- PEAXAI_global_importance(
    data = data, x = x, y = y,
    final_model = final_model,
    background = "real", target = "real",
    importance_method = list(name = "PI", n.repetitions = 5)
  )

  efficiency_thresholds <- seq(0.75, 0.95, 0.1)

  directional_vector <- list(
    relative_importance = relative_importance,
    scope = "global",
    baseline = "mean"
  )

  targets <- PEAXAI_targets(
    data = data, x = x, y = y,
    final_model = final_model,
    efficiency_thresholds = efficiency_thresholds,
    directional_vector = directional_vector,
    n_expand = 0.5, n_grid = 50, max_y = 2, min_x = 1
  )

  peers <- PEAXAI_peer(
    data = data, x = x, y = y,
    final_model = final_model,
    efficiency_thresholds = efficiency_thresholds,
    weighted = FALSE
  )
# }