Calculates thresholds that maximize a statistic (goal) for cues.
cuerank(formula = NULL, data = NULL, goal = "bacc", sens.w = 0.5,
cost.outcomes = c(0, 1, 1, 0), cost.cues = NULL, numthresh.method = "o",
rounding = NULL, factor.directions = c("=", "!="),
numeric.directions = c(">", "
formula. A formula specifying a binary criterion as a function of multiple variables
dataframe. A dataframe containing variables in formula
character. A string indicating the statistic to maximize: "acc" = overall accuracy, "bacc" = balanced accuracy, "wacc" = weighted accuracy, "d" = dprime
numeric. A number from 0 to 1 indicating how to weight sensitivity relative to specificity.
numeric. A vector of length 4 specifying the costs of a hit, false alarm, miss, and correct rejection, respectively. For example, cost.outcomes = c(0, 10, 20, 0) means that a false alarm costs 10 and a miss costs 20, while correct decisions have no cost.
dataframe. A dataframe with two columns specifying the cost of each cue. The first column should be a vector of cue names, and the second column should be a numeric vector of costs. Cues in the dataset that are not present in cost.cues are assumed to have 0 cost.
character. A string indicating how to calculate cue splitting thresholds. "m" = median split, "o" = split that maximizes the goal.
integer. An integer indicating digit rounding for non-integer numeric cue thresholds. The default is NULL, which means no rounding. A value of 0 rounds all possible thresholds to the nearest integer, 1 rounds to the nearest 0.1, and so on.
character. A vector of possible directions for factor values. c("=", "!=") allows both equality and inequality, while "=" only allows for equality.
character. A vector of possible directions for numeric values. c(">", "<") allows only strict inequalities, while c("<=", "<", ">=", ">") is more flexible.
logical. Should FALSE logical values be considered as potential thresholds? This is only relevant for very special algorithms.
logical. Should ongoing diagnostics be printed?
dataframe. An optional dataframe specifying existing cue thresholds, directions, names, and classes.
A dataframe containing thresholds and marginal classification statistics for each cue
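The goal statistics and outcome costs described in the arguments can be made concrete with a small base R sketch. It uses the standard textbook definitions of accuracy, balanced accuracy, weighted accuracy, and d-prime; the helper name classification_stats is hypothetical, and the package itself may apply corrections (for example, for sensitivity or specificity values of exactly 0 or 1) or report cost differently, so treat this as an illustration rather than the package's implementation.

```r
# Illustrative only: standard definitions, not code taken from the package.
# hi = hits, fa = false alarms, mi = misses, cr = correct rejections.
classification_stats <- function(hi, fa, mi, cr,
                                 sens.w = 0.5,
                                 cost.outcomes = c(0, 1, 1, 0)) {
  n    <- hi + fa + mi + cr
  sens <- hi / (hi + mi)  # sensitivity (hit rate)
  spec <- cr / (cr + fa)  # specificity (correct rejection rate)

  c(
    acc    = (hi + cr) / n,                         # overall accuracy
    bacc   = (sens + spec) / 2,                     # balanced accuracy
    wacc   = sens.w * sens + (1 - sens.w) * spec,   # weighted accuracy
    dprime = qnorm(sens) - qnorm(1 - spec),         # d-prime, no correction
    # Assumption: cost is averaged over cases. cost.outcomes is ordered
    # hit, false alarm, miss, correct rejection.
    cost   = sum(c(hi, fa, mi, cr) * cost.outcomes) / n
  )
}

# A false alarm costs 10 and a miss costs 20; correct decisions are free.
stats <- classification_stats(hi = 40, fa = 10, mi = 10, cr = 40,
                              cost.outcomes = c(0, 10, 20, 0))
print(round(stats, 3))
```

With sens.w = 0.5, weighted accuracy reduces to balanced accuracy; moving sens.w toward 1 rewards thresholds that catch more true cases at the expense of more false alarms.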
# NOT RUN {
# Load the package that provides cuerank() and the mushrooms dataset
library(FFTrees)

# What are the best thresholds for each cue in the mushrooms dataset?
mushrooms.cues <- cuerank(formula = poisonous ~ .,
                          data = mushrooms)
# }
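
To show what a goal-maximizing threshold means for a single numeric cue, here is a minimal brute-force sketch in base R. It is not the package's algorithm: the function best_numeric_threshold and the toy age/disease data are made up for illustration, the goal is fixed to balanced accuracy, and ties are resolved by keeping the first candidate found.

```r
# Illustrative only: a brute-force search on made-up data, not the package's code.
best_numeric_threshold <- function(cue, criterion,
                                   directions = c(">", "<"),
                                   rounding = NULL) {
  # Every observed cue value is a candidate threshold
  candidates <- sort(unique(cue))
  if (!is.null(rounding)) {
    candidates <- unique(round(candidates, rounding))
  }

  best <- NULL
  for (direction in directions) {
    for (threshold in candidates) {
      # Predict TRUE when "cue <direction> threshold" holds
      prediction <- do.call(direction, list(cue, threshold))
      sens <- mean(prediction[criterion])
      spec <- mean(!prediction[!criterion])
      bacc <- (sens + spec) / 2
      # Keep the first candidate that strictly improves balanced accuracy
      if (is.null(best) || bacc > best$bacc) {
        best <- data.frame(threshold = threshold, direction = direction,
                           bacc = bacc, stringsAsFactors = FALSE)
      }
    }
  }
  best
}

# Hypothetical data: does age predict a binary criterion?
age     <- c(22, 35, 41, 48, 53, 60, 64, 71)
disease <- c(FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE)

optimal <- best_numeric_threshold(age, disease)
print(optimal)  # threshold 41, direction ">", bacc 0.875

# A median split uses a fixed threshold and ignores the criterion
median_split <- median(age)
print(median_split)  # 50.5
```

On this toy data the searched threshold (age > 41) reaches a balanced accuracy of 0.875, while the median split (age > 50.5) reaches 0.75, which is the kind of difference the "o" and "m" threshold methods are meant to capture.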