riskyr (version 0.2.0)

comp_accu_prob: Compute exact accuracy metrics based on probabilities.

Description

comp_accu_prob computes a list of exact accuracy metrics from a sufficient and valid set of three essential probabilities: the prevalence prev, the sensitivity sens (or its complement mirt), and the specificity spec (or its complement fart).

Usage

comp_accu_prob(prev = prob$prev, sens = prob$sens, mirt = NA,
  spec = prob$spec, fart = NA, tol = 0.01, w = 0.5)

Arguments

prev

The condition's prevalence prev (i.e., the probability of the condition being TRUE).

sens

The decision's sensitivity sens (i.e., the conditional probability of a positive decision provided that the condition is TRUE). sens is optional when its complement mirt is provided.

mirt

The decision's miss rate mirt (i.e., the conditional probability of a negative decision provided that the condition is TRUE). mirt is optional when its complement sens is provided.

spec

The decision's specificity value spec (i.e., the conditional probability of a negative decision provided that the condition is FALSE). spec is optional when its complement fart is provided.

fart

The decision's false alarm rate fart (i.e., the conditional probability of a positive decision provided that the condition is FALSE). fart is optional when its complement spec is provided.

tol

A numeric tolerance value for is_complement. Default: tol = .01.

w

The weighting parameter w (from 0 to 1) for computing weighted accuracy wacc. Default: w = .50 (i.e., yielding balanced accuracy bacc).
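
The complement arguments can be illustrated with a brief sketch. This is an assumption based on the argument descriptions above (not taken from the package's documented examples): a missing sens or spec should be completed from mirt or fart, respectively.

# Hypothetical calls, assuming missing probabilities are completed from
# their complements (mirt = 1 - sens, fart = 1 - spec):
comp_accu_prob(prev = .25, sens = NA, mirt = .40, spec = NA, fart = .15)
# should match the equivalent call stated in terms of sens and spec:
comp_accu_prob(prev = .25, sens = .60, spec = .85)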

Notes:

  • Accuracy metrics describe the correspondence of decisions (or predictions) to actual conditions (or truth).

    There are several possible interpretations of accuracy:

    1. as probabilities (i.e., acc being the proportion of correct classifications, or the ratio dec_cor/N),

    2. as frequencies (e.g., as classifying a population of N individuals into cases of dec_cor vs. dec_err),

    3. as correlations (e.g., see mcc in accu).

  • Exact accuracy values computed from probabilities (by comp_accu_prob) may differ from accuracy values computed from (possibly rounded) frequencies (by comp_accu_freq).

    When frequencies are rounded to integers (see the default of round = TRUE in comp_freq and comp_freq_prob) the accuracy metrics computed by comp_accu_freq correspond to these rounded values. Use comp_accu_prob to obtain exact accuracy metrics from probabilities.

Value

A list accu containing current accuracy metrics.

Details

Currently computed accuracy metrics include:

  1. acc: Overall accuracy as the proportion (or probability) of correctly classifying cases or of dec_cor cases:

    (a) from prob: acc = (prev * sens) + ((1 - prev) * spec)

    (b) from freq: acc = dec_cor/N = (hi + cr)/(hi + mi + fa + cr)

    When frequencies in freq are not rounded, (b) coincides with (a).

    Values range from 0 (no correct prediction) to 1 (perfect prediction).

  2. wacc: Weighted accuracy, computed as a weighted average of the sensitivity sens (aka. hit rate HR, TPR, power, or recall) and the specificity spec (aka. TNR), in which sens is multiplied by a weighting parameter w (ranging from 0 to 1) and spec is multiplied by w's complement (1 - w):

    wacc = (w * sens) + ((1 - w) * spec)

    If w = .50, wacc becomes balanced accuracy bacc.

  3. mcc: The Matthews correlation coefficient (with values ranging from -1 to +1):

    mcc = ((hi * cr) - (fa * mi)) / sqrt((hi + fa) * (hi + mi) * (cr + fa) * (cr + mi))

    A value of mcc = 0 implies random performance; mcc = 1 implies perfect performance.

    See Wikipedia: Matthews correlation coefficient for additional information.

  4. f1s: The harmonic mean of the positive predictive value PPV (aka. precision) and the sensitivity sens (aka. hit rate HR, TPR, power or recall):

    f1s = 2 * (PPV * sens) / (PPV + sens)

    See Wikipedia: F1 score for additional information.

Note that some accuracy metrics can be interpreted as probabilities (e.g., acc) and some as correlations (e.g., mcc).

Also, accuracy can be viewed as a probability (e.g., the ratio of dec_cor to N) or as a frequency type (containing dec_cor and dec_err).

comp_accu_prob computes exact accuracy metrics from probabilities. When the underlying frequencies are rounded (see the default of round = TRUE in comp_freq and comp_freq_prob), the accuracy metrics computed by comp_accu_freq correspond to these rounded values.
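
The metrics listed above can also be traced by hand in base R. The following is a minimal sketch of the stated formulas (not the package's own implementation); hi, mi, fa, and cr denote unrounded proportions of N rather than rounded frequencies:

prev <- 1/3; sens <- 2/3; spec <- 3/4; w <- 1/2

# Joint proportions (unrounded analogues of the four frequencies):
hi <- prev * sens               # hits / true positives
mi <- prev * (1 - sens)         # misses
fa <- (1 - prev) * (1 - spec)   # false alarms
cr <- (1 - prev) * spec         # correct rejections

acc  <- (prev * sens) + ((1 - prev) * spec)   # overall accuracy (= hi + cr)
wacc <- (w * sens) + ((1 - w) * spec)         # weighted (here: balanced) accuracy
mcc  <- ((hi * cr) - (fa * mi)) /
  sqrt((hi + fa) * (hi + mi) * (cr + fa) * (cr + mi))
PPV  <- hi / (hi + fa)                        # positive predictive value (precision)
f1s  <- 2 * (PPV * sens) / (PPV + sens)       # F1 score

c(acc = acc, wacc = wacc, mcc = mcc, f1s = f1s)
# These values should agree with comp_accu_prob(prev = 1/3, sens = 2/3, spec = 3/4).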

References

Consult Wikipedia: Confusion matrix for additional information.

See Also

accu for all accuracy metrics; comp_accu_freq computes accuracy metrics from frequencies; num for basic numeric parameters; freq for current frequency information; txt for current text settings; pal for current color settings; popu for a table of the current population.

Other metrics: accu, acc, comp_accu_freq, comp_acc, comp_err, err

Other functions computing probabilities: comp_FDR, comp_FOR, comp_NPV, comp_PPV, comp_accu_freq, comp_acc, comp_comp_pair, comp_complement, comp_complete_prob_set, comp_err, comp_fart, comp_mirt, comp_ppod, comp_prob_freq, comp_prob, comp_sens, comp_spec

Examples

# NOT RUN {
comp_accu_prob()  # => accuracy metrics for prob of current scenario
comp_accu_prob(prev = .2, sens = .5, spec = .5)  # medium accuracy, but cr > hi.

# Extreme cases:
comp_accu_prob(prev = NaN, sens = NaN, spec = NaN)  # returns list of NA values
comp_accu_prob(prev = 0, sens = NaN, spec = 1)      # returns list of NA values
comp_accu_prob(prev = 0, sens = 0, spec = 1)     # perfect acc = 1, but f1s is NaN
comp_accu_prob(prev = .5, sens = .5, spec = .5)  # random performance
comp_accu_prob(prev = .5, sens = 1,  spec = 1)   # perfect accuracy
comp_accu_prob(prev = .5, sens = 0,  spec = 0)   # zero accuracy, but f1s is NaN
comp_accu_prob(prev = 1,  sens = 1,  spec = 0)   # perfect, but see wacc (0.5) and mcc (0)

# Effects of w:
comp_accu_prob(prev = .5, sens = .6, spec = .4, w = 1/2)  # equal weights to sens and spec
comp_accu_prob(prev = .5, sens = .6, spec = .4, w = 2/3)  # more weight on sens: wacc up
comp_accu_prob(prev = .5, sens = .6, spec = .4, w = 1/3)  # more weight on spec: wacc down

# Contrasting comp_accu_freq and comp_accu_prob:
# (a) comp_accu_freq (based on rounded frequencies):
freq1 <- comp_freq(N = 10, prev = 1/3, sens = 2/3, spec = 3/4)   # => rounded frequencies!
accu1 <- comp_accu_freq(freq1$hi, freq1$mi, freq1$fa, freq1$cr)  # => accu1 (based on rounded freq).
# accu1

# (b) comp_accu_prob (based on probabilities):
accu2 <- comp_accu_prob(prev = 1/3, sens = 2/3, spec = 3/4)      # => exact accu (based on prob).
# accu2
all.equal(accu1, accu2)  # => 4 differences!
#
# (c) comp_accu_freq (exact values, i.e., without rounding):
freq3 <- comp_freq(N = 10, prev = 1/3, sens = 2/3, spec = 3/4, round = FALSE)
accu3 <- comp_accu_freq(freq3$hi, freq3$mi, freq3$fa, freq3$cr)  # => accu3 (based on EXACT freq).
# accu3
all.equal(accu2, accu3)  # => TRUE (qed).


# }
