EPX (version 1.0.4)

IE: Calculate Initial Enhancement

Description

Calculates initial enhancement (IE): the precision at one specific shortlist length (cutoff), normalised by the proportion of relevant observations in the whole sample (Tomal et al. 2015). Because IE is a rescaling of precision, IE and the average hit rate (AHR) are expected to lead to similar conclusions when used as assessment metrics for the EPX algorithm.

Usage

IE(y, phat, cutoff = length(y)/2, ...)

Arguments

y

True (binary) response vector where 1 is the rare/relevant class.

phat

Numeric vector of estimated probabilities of relevance.

cutoff

Shortlist cutoff length; it must not exceed the length of y. The default is half the sample size.

...

Further arguments passed to or from other methods.

Value

Numeric value of IE.

Details

Let \(c\) be the cutoff and \(h(c)\) the hit rate at \(c\), that is, the precision of the shortlist of length \(c\). Let \(A\) be the total number of relevant observations and \(N\) the total number of observations. IE is defined as $$IE = \frac{h(c)}{A / N}$$ The IE calculation is the same whether or not phat contains ties.
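The definition can be checked by hand with a few lines of base R. The helper below, `ie_manual`, is a hypothetical name used only for illustration and is not part of EPX. It is a minimal sketch that assumes the shortlist consists of the `cutoff` observations with the largest phat, with ties left in their original order; EPX itself may treat ties differently.

```r
# Hypothetical helper, not part of EPX: IE computed directly from the definition.
ie_manual <- function(y, phat, cutoff) {
  stopifnot(length(y) == length(phat), cutoff >= 1, cutoff <= length(y))
  # Indices of the `cutoff` largest estimated probabilities.
  # Assumption of this sketch: order() leaves tied values in their original order.
  top <- order(-phat)[seq_len(cutoff)]
  hit_rate  <- sum(y[top]) / cutoff   # h(c): precision of the shortlist
  base_rate <- sum(y) / length(y)     # A / N: proportion of relevant observations
  hit_rate / base_rate
}

resp <- c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0)
prob <- (10:1) * 0.1
ie_manual(resp, prob, cutoff = 3)     # (2/3) / (3/10)
```

Here two of the three shortlisted observations are relevant, so \(h(3) = 2/3\), while \(A / N = 3/10\). An IE above 1 means the shortlist is richer in relevant observations than the sample as a whole.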

References

Tomal, J. H., Welch, W. J., & Zamar, R. H. (2015). Ensembling classification models based on phalanxes of variables with applications in drug discovery. The Annals of Applied Statistics, 9(1), 69-93. doi:10.1214/14-AOAS778

Examples

library(EPX)

## IE when there are no ties in phat:
resp <- c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0)
prob <- (10:1) * 0.1
IE(y = resp, phat = prob, cutoff = 3)
# expected answer: (2/3) / (3/10)

## IE when there are ties in phat:
resp <- c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0)
prob <- c(1, 1, 1, 0.4, 0.4, 0.3, 0.2, 0.15, 0.1, 0)
IE(y = resp, phat = prob, cutoff = 3)
# expected answer: same as above