comp_freq: Compute frequencies from (3 essential) probabilities.

Description

comp_freq computes frequencies (typically as rounded integers) given 3 basic probabilities -- prev, sens, and spec -- for a population of N individuals. It returns a list of 11 key frequencies freq as its output.

Usage

comp_freq(
  prev = num$prev,
  sens = num$sens,
  spec = num$spec,
  N = num$N,
  round = TRUE,
  sample = FALSE
)

Value

A list freq containing 11 key frequency values.

Arguments

prev

The condition's prevalence prev (i.e., the probability of the condition being TRUE).

sens

The decision's sensitivity sens (i.e., the conditional probability of a positive decision provided that the condition is TRUE).

spec

The decision's specificity value spec (i.e., the conditional probability of a negative decision provided that the condition is FALSE).

N

The number of individuals in the population. If N is unknown (NA), a suitable minimum value is computed by comp_min_N.

round

Boolean value that determines whether frequency values are rounded to the nearest integer. Default: round = TRUE.

Note: The former n_digits parameter (the number of digits to which frequency values were rounded when round = FALSE; default: n_digits = 5) has been removed.

sample

Boolean value that determines whether frequency values are sampled from N, given the probability values of prev, sens, and spec. Default: sample = FALSE.

Note: Sampling uses sample() and returns integer values.
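A minimal base-R sketch of what sampling frequencies from probabilities could look like. It is not riskyr's implementation: the helper sample_freq_sketch is hypothetical and only illustrates that sampled frequencies are whole numbers that sum to N but vary between runs.

```r
# Hypothetical helper (not part of riskyr): sample 4 frequencies from 3 probabilities.
sample_freq_sketch <- function(prev, sens, spec, N) {
  # Step 1: sample each individual's condition (TRUE with probability prev).
  cond <- sample(c(TRUE, FALSE), size = N, replace = TRUE,
                 prob = c(prev, 1 - prev))
  n_true  <- sum(cond)
  n_false <- N - n_true

  # Step 2: sample decisions within each condition subset.
  hi <- sum(sample(c(TRUE, FALSE), size = n_true, replace = TRUE,
                   prob = c(sens, 1 - sens)))  # positive decisions | condition TRUE
  cr <- sum(sample(c(TRUE, FALSE), size = n_false, replace = TRUE,
                   prob = c(spec, 1 - spec)))  # negative decisions | condition FALSE

  c(hi = hi, mi = n_true - hi, fa = n_false - cr, cr = cr)
}

set.seed(1)  # only to make this particular draw reproducible
f <- sample_freq_sketch(prev = .5, sens = .5, spec = .5, N = 100)
f
sum(f)  # the 4 sampled frequencies always add up to N
```

Without set.seed(), repeated calls return different frequencies for the same probabilities.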

Details

In addition to prev, both sens and spec are necessary arguments. If only their complements mirt or fart are known, use the wrapper function comp_freq_prob which also accepts mirt and fart as inputs (but requires that the entire set of provided probabilities is sufficient and consistent). Alternatively, use comp_complement, comp_comp_pair, or comp_complete_prob_set to obtain the 3 essential probabilities.
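If only the complements are known, the required inputs can also be derived by hand. A small base-R sketch with made-up example values (it does not call any riskyr function):

```r
# Base-R sketch: derive sens and spec from their complements.
mirt <- .10  # miss rate (example value)
fart <- .20  # false alarm rate (example value)

sens <- 1 - mirt  # sensitivity is the complement of the miss rate
spec <- 1 - fart  # specificity is the complement of the false alarm rate

c(sens = sens, spec = spec)
```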

comp_freq is the frequency counterpart to the probability function comp_prob.

By default, comp_freq and its wrapper function comp_freq_prob round frequencies to nearest integers to avoid decimal values in freq (i.e., round = TRUE by default). When frequencies are rounded, probabilities computed from freq may differ from exact probabilities. Using the option round = FALSE turns off rounding.
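The effect of rounding can be illustrated with plain arithmetic. The following base-R sketch (not riskyr's code) uses prev = .1, sens = .9, and N = 10 and compares the sensitivity recovered from exact and from rounded frequencies:

```r
# Base-R sketch (not riskyr's code): effect of rounding on recomputed probabilities.
prev <- .1
sens <- .9
N    <- 10

cond_true <- prev * N          # 1 individual for whom the condition is TRUE
hi_exact  <- sens * cond_true  # exact frequency of hits: 0.9
hi_round  <- round(hi_exact)   # rounded frequency of hits: 1

hi_exact / cond_true  # sensitivity from exact frequencies: 0.9
hi_round / cond_true  # sensitivity from rounded frequencies: 1
```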

Key relationships between probabilities and frequencies:

Three perspectives on a population:

A population of N individuals can be split into 2 subsets of frequencies in 3 different ways:
1. by condition:
  
  N = cond_true + cond_false
  
  The frequency cond_true depends on the prevalence prev and the frequency cond_false depends on the prevalence's complement 1 - prev.
2. by decision:
  
  N = dec_pos + dec_neg
  
  The frequency dec_pos depends on the proportion of positive decisions ppod and the frequency dec_neg depends on the proportion of negative decisions 1 - ppod.
3. by accuracy (i.e., correspondence of decision to condition):
  
  N = dec_cor + dec_err

Each perspective combines 2 pairs of the 4 essential frequencies (hi, mi, fa, cr).
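The three perspectives can be checked with a short base-R sketch. It computes unrounded frequencies directly from example probabilities and is an illustration under these assumptions, not the code used by comp_freq:

```r
# Base-R sketch (unrounded; not riskyr's code): 4 frequencies from 3 probabilities.
prev <- .1
sens <- .9
spec <- .8
N    <- 1000

cond_true  <- N * prev        # individuals for whom the condition is TRUE
cond_false <- N * (1 - prev)  # individuals for whom the condition is FALSE

hi <- cond_true  * sens        # hits (true positives)
mi <- cond_true  * (1 - sens)  # misses (false negatives)
cr <- cond_false * spec        # correct rejections (true negatives)
fa <- cond_false * (1 - spec)  # false alarms (false positives)

# The other 2 perspectives regroup the same 4 frequencies into pairs:
dec_pos <- hi + fa  # positive decisions
dec_neg <- mi + cr  # negative decisions
dec_cor <- hi + cr  # correct decisions
dec_err <- mi + fa  # erroneous decisions

# Each perspective splits the same population of N individuals:
all.equal(cond_true + cond_false, N)
all.equal(dec_pos + dec_neg, N)
all.equal(dec_cor + dec_err, N)
```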

When providing probabilities, the population size N is a free parameter (independent of the essential probabilities prev, sens, and spec).

If N is unknown (NA), a suitable minimum value can be computed by comp_min_N.

Defining probabilities in terms of frequencies:

Probabilities determine, describe, or are defined as the relationships between frequencies. Thus, they can be computed as ratios between frequencies:
1. prevalence prev:
  
  prev = cond_true/N = (hi + mi) / (hi + mi + fa + cr)
2. sensitivity sens:
  
  sens = hi/cond_true = hi / (hi + mi) = (1 - mirt)
3. miss rate mirt:
  
  mirt = mi/cond_true = mi / (hi + mi) = (1 - sens)
4. specificity spec:
  
  spec = cr/cond_false = cr / (fa + cr) = (1 - fart)
5. false alarm rate fart:
  
  fart = fa/cond_false = fa / (fa + cr) = (1 - spec)
6. proportion of positive decisions ppod:
  
  ppod = dec_pos/N = (hi + fa) / (hi + mi + fa + cr)
7. positive predictive value PPV:
  
  PPV = hi/dec_pos = hi / (hi + fa) = (1 - FDR)
8. negative predictive value NPV:
  
  NPV = cr/dec_neg = cr / (mi + cr) = (1 - FOR)
9. false detection rate FDR:
  
  FDR = fa/dec_pos = fa / (hi + fa) = (1 - PPV)
10. false omission rate FOR:
  
  FOR = mi/dec_neg = mi / (mi + cr) = (1 - NPV)
11. accuracy acc:
  
  acc = dec_cor/N = (hi + cr) / (hi + mi + fa + cr)
12. rate of hits, given accuracy p_acc_hi:
  
  p_acc_hi = hi/dec_cor = (1 - cr/dec_cor)
13. rate of false alarms, given inaccuracy p_err_fa:
  
  p_err_fa = fa/dec_err = (1 - mi/dec_err)

Beware of rounding and sampling issues! If frequencies are rounded (by round = TRUE in comp_freq) or sampled from probabilities (by sample = TRUE), then any probabilities computed from freq may differ from original and exact probabilities.
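The ratio definitions above can be sketched in base R. The four frequencies are made-up example values; the variable names follow the definitions in this section, but the code does not use any riskyr function:

```r
# Base-R sketch: probabilities as ratios of the 4 essential frequencies.
hi <- 90   # hits
mi <- 10   # misses
fa <- 180  # false alarms
cr <- 720  # correct rejections
N  <- hi + mi + fa + cr

prev <- (hi + mi) / N    # prevalence
sens <- hi / (hi + mi)   # sensitivity
spec <- cr / (fa + cr)   # specificity
ppod <- (hi + fa) / N    # proportion of positive decisions
PPV  <- hi / (hi + fa)   # positive predictive value
NPV  <- cr / (mi + cr)   # negative predictive value
acc  <- (hi + cr) / N    # accuracy

round(c(prev = prev, sens = sens, spec = spec, ppod = ppod,
        PPV = PPV, NPV = NPV, acc = acc), 3)
```

Because these example frequencies are exact (neither rounded nor sampled), the ratios reproduce prev = .1, sens = .9, and spec = .8 without error.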

Functions translating between representational formats: comp_prob_prob, comp_prob_freq, comp_freq_prob, comp_freq_freq (see documentation of comp_prob_prob for details).

Examples

comp_freq()          # ok, using current defaults
length(comp_freq())  # 11 key frequencies

# Rounding:
comp_freq(prev = .5, sens = .5, spec = .5, N = 1)   # yields fa = 1 (see ?round for reason)
comp_freq(prev = .1, sens = .9, spec = .8, N = 10)  # 1 hit (TP, rounded)
comp_freq(prev = .1, sens = .9, spec = .8, N = 10, round = FALSE)    # hi = .9
comp_freq(prev = 1/3, sens = 6/7, spec = 2/3, N = 1, round = FALSE)  # hi = 0.2857143

# Sampling (from probabilistic description):
comp_freq_prob(prev = .5, sens = .5, spec = .5, N = 100, sample = TRUE)  # freq values vary

# Extreme cases:
comp_freq(prev = 1, sens = 1, spec = 1, 100)  # ok, N hits (TP)
comp_freq(prev = 1, sens = 1, spec = 0, 100)  # ok, N hits
comp_freq(prev = 1, sens = 0, spec = 1, 100)  # ok, N misses (FN)
comp_freq(prev = 1, sens = 0, spec = 0, 100)  # ok, N misses
comp_freq(prev = 0, sens = 1, spec = 1, 100)  # ok, N correct rejections (TN)
comp_freq(prev = 0, sens = 1, spec = 0, 100)  # ok, N false alarms (FP)

# Watch out for:
comp_freq(prev = 1, sens = 1, spec = 1, N = NA)  # ok, but warning that N = 1 was computed
comp_freq(prev = 1, sens = 1, spec = 1, N =  0)  # ok, but all 0 + warning (extreme case: N hits)
comp_freq(prev = .5, sens = .5, spec = .5, N = 10, round = TRUE)   # ok, rounded (see mi and fa)
comp_freq(prev = .5, sens = .5, spec = .5, N = 10, round = FALSE)  # ok, not rounded

# Ways to fail:
comp_freq(prev = NA,  sens = 1, spec = 1,  100)   # NAs + warning (prev NA)
comp_freq(prev = 1,  sens = NA, spec = 1,  100)   # NAs + warning (sens NA)
comp_freq(prev = 1,  sens = 1,  spec = NA, 100)   # NAs + warning (spec NA)
comp_freq(prev = 8,  sens = 1,  spec = 1,  100)   # NAs + warning (prev beyond range)
comp_freq(prev = 1,  sens = 8,  spec = 1,  100)   # NAs + warning (sens beyond range)