svem_select_from_score_table: Select best row and diverse candidates from an SVEM score table

Description

Given a scored random-search table (e.g. from svem_score_random()), pick a single "best" row under a chosen objective column and sample a small, diverse set of medoid candidates from the top of that ranking. Any per-response CI columns (e.g. *_lwr / *_upr) present in score_table are carried through unchanged.

Optionally, a string label can be supplied to annotate the returned best row and candidates by appending that label to a "Notes_from_SVEMnet" column. If "Notes_from_SVEMnet" is missing, it is created. If it exists and is nonempty, the label is appended with "; " as a separator.
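The labelling rule above can be sketched in base R. This is an illustration, not SVEMnet's internal code: append_note() is a hypothetical helper, and treating NA or "" notes as empty (so the label replaces them rather than being appended) is an assumption.

```r
# Hypothetical helper (not part of SVEMnet): add a label to a notes column.
append_note <- function(df, label, col = "Notes_from_SVEMnet") {
  if (is.null(label) || nrow(df) == 0L) return(df)
  # Create the notes column if it is missing
  if (!col %in% names(df)) df[[col]] <- rep(NA_character_, nrow(df))
  old <- as.character(df[[col]])
  # Assumption: NA and "" both count as "empty" notes
  empty <- is.na(old) | !nzchar(old)
  # Empty notes become the label; nonempty notes get "; " plus the label
  df[[col]] <- ifelse(empty, label, paste(old, label, sep = "; "))
  df
}

demo <- data.frame(x = 1:3,
                   Notes_from_SVEMnet = c("", "round 1", NA),
                   stringsAsFactors = FALSE)
demo <- append_note(demo, "best by score")
demo$Notes_from_SVEMnet
# "best by score" "round 1; best by score" "best by score"
```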

Usage

svem_select_from_score_table(
  score_table,
  target = "score",
  direction = c("max", "min"),
  k = 5,
  top_type = c("frac", "n"),
  top = 0.1,
  predictor_cols = NULL,
  label = NULL
)

Value

A list with components:

best: One-row data frame at the optimum of target under the specified direction, including any columns present in score_table (e.g. *_lwr / *_upr).
candidates: Data frame of medoid candidates (possibly empty or NULL) drawn from the top set of the ranking on target, as defined by top_type and top, with all columns carried through from score_table.
call: The matched call, including all arguments used to create this selection object.

Arguments

score_table: Data frame with predictors, responses, scores, and uncertainty_measure, typically scored$score_table from svem_score_random(). When medoids are requested (k > 0), the predictor columns used for clustering are taken from the "svem_predictor_cols" attribute by default. If that attribute is missing, a heuristic is used (see predictor_cols). If you accidentally pass the full scored list, an error is raised reminding you to use scored$score_table.
target: Character scalar naming the column in score_table to optimize (e.g. "score", "wmt_score", "uncertainty_measure").
direction: Either "max" or "min" indicating whether larger or smaller values of target are preferred.
k: Integer; desired number of medoid candidates to return. If k <= 0, only the best row is returned and no clustering is performed.
top_type: Either "frac" or "n" specifying whether top is a fraction of rows or an integer count.
top: Value for the top set: a fraction in (0,1] if top_type = "frac", or an integer >= 1 if top_type = "n".
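How direction, top_type, and top interact can be sketched as follows. top_set_rows() is a hypothetical helper written for this illustration. Rounding a fractional top up with ceiling() and clamping the result to between 1 and nrow(score_table) are assumptions, not documented behaviour.

```r
# Hypothetical helper: rows forming the "top set" of a score table.
top_set_rows <- function(score_table, target = "score",
                         direction = c("max", "min"),
                         top_type = c("frac", "n"), top = 0.1) {
  direction <- match.arg(direction)
  top_type <- match.arg(top_type)
  n <- nrow(score_table)
  # Order rows so the preferred values of `target` come first
  ord <- order(score_table[[target]], decreasing = (direction == "max"))
  # Size of the top set (assumption: fractions are rounded up)
  n_top <- if (top_type == "frac") ceiling(top * n) else as.integer(top)
  n_top <- max(1L, min(n, n_top))
  score_table[ord[seq_len(n_top)], , drop = FALSE]
}

tbl <- data.frame(x = 1:20, score = (1:20) / 20)
top_frac <- top_set_rows(tbl, top_type = "frac", top = 0.25)  # 5 of 20 rows
top_n    <- top_set_rows(tbl, top_type = "n", top = 3)        # 3 rows
best_min <- top_set_rows(tbl, direction = "min", top_type = "n", top = 1)
```

Under these assumptions, the first row of the ordered table corresponds to the best row, and the remaining rows of the top set are the pool from which medoids are drawn.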
predictor_cols: Optional character vector of predictor column names used to measure diversity in the PAM step when k > 0. When NULL (default), the function first tries attr(score_table, "svem_predictor_cols"). If that is unavailable, it falls back to a heuristic that prefers non-derived predictor columns (excluding e.g. *_pred, *_des, *_lwr, *_upr, *_ciw_w, *_p_in_spec_mean, *_in_spec_point, score, wmt_score, uncertainty_measure, p_joint_mean, joint_in_spec_point, candidate_type, selection_label, Notes_from_SVEMnet). If no usable predictor columns can be inferred, a warning is issued and only best is returned.
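The fallback heuristic and the PAM step can be sketched as below. This is a simplified illustration under stated assumptions, not the package's implementation: the toy top_rows table is made up, the exclusion patterns are copied from the list above, standardizing predictors with scale() is an assumption, and cluster::pam() is used as one readily available PAM implementation.

```r
library(cluster)  # provides pam(); a recommended package shipped with R

set.seed(1)
# Toy stand-in for the top set of a score table (illustrative data only)
top_rows <- data.frame(
  A = runif(30), B = runif(30),
  score = runif(30), A_pred = runif(30), A_lwr = runif(30)
)

# Heuristic sketch: drop derived columns by suffix or by exact name
derived_suffix <- "(_pred|_des|_lwr|_upr|_ciw_w|_p_in_spec_mean|_in_spec_point)$"
derived_names <- c("score", "wmt_score", "uncertainty_measure", "p_joint_mean",
                   "joint_in_spec_point", "candidate_type", "selection_label",
                   "Notes_from_SVEMnet")
nm <- names(top_rows)
predictor_cols <- nm[!grepl(derived_suffix, nm) & !(nm %in% derived_names)]

# PAM on standardized numeric predictors (scaling is an assumption)
k <- 5
X <- scale(top_rows[, predictor_cols, drop = FALSE])
k_eff <- min(k, nrow(X) - 1L)  # pam() requires k < number of rows
fit <- pam(X, k = k_eff)

# Medoids are actual rows, so every column is carried through
candidates <- top_rows[fit$id.med, , drop = FALSE]
```

Because medoids are observed rows rather than cluster averages, each candidate is a setting that was actually scored. Categorical or mixed-type predictors would need a different dissimilarity (for example cluster::daisy() with Gower distance), which this sketch does not cover.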
label: Optional character scalar. When non-NULL, this label is appended into a "Notes_from_SVEMnet" column for the returned best row and candidates. If "Notes_from_SVEMnet" is missing, it is created; if present and nonempty, the label is appended using "; " as separator.
