caret (version 4.20)

sensitivity: Calculate sensitivity, specificity and predictive values

Description

These functions calculate the sensitivity, specificity or predictive values of a measurement system compared to reference results (the truth or a gold standard). The measurement and "truth" data must have the same two possible outcomes, and one of the outcomes must be thought of as the "positive" result.

The sensitivity is defined as the proportion of positive results out of the number of samples which were actually positive; the specificity is the corresponding proportion of negative results out of the samples which were actually negative. When there are no positive results, sensitivity is not defined and a value of NA is returned. Similarly, when there are no negative results, specificity is not defined and a value of NA is returned. Similar statements are true for predictive values.

The positive predictive value is defined as the proportion of predicted positives that are actually positive, while the negative predictive value is defined as the proportion of predicted negatives that are actually negative.
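As a rough illustration of these definitions, the base R sketch below computes sensitivity and specificity directly from two factors and returns NA when the denominator is zero. The helper names `sens_sketch` and `spec_sketch` and the toy data are hypothetical and are not part of caret; the package's own methods may differ in details such as input checks.

```r
# Hypothetical helpers (not part of caret): direct translations of the
# definitions, assuming two-level factors that share the same levels.
sens_sketch <- function(data, reference, positive = levels(reference)[1]) {
  n_pos <- sum(reference == positive)   # samples that are actually positive
  if (n_pos == 0) return(NA_real_)      # undefined without actual positives
  sum(data == positive & reference == positive) / n_pos
}

spec_sketch <- function(data, reference, negative = levels(reference)[2]) {
  n_neg <- sum(reference == negative)   # samples that are actually negative
  if (n_neg == 0) return(NA_real_)      # undefined without actual negatives
  sum(data == negative & reference == negative) / n_neg
}

# Toy data, made up for illustration
lv    <- c("event", "no_event")
truth <- factor(c("event", "event", "event", "no_event", "no_event"), levels = lv)
pred  <- factor(c("event", "event", "no_event", "no_event", "event"), levels = lv)

sens_sketch(pred, truth)   # 2 of the 3 actual events were predicted as events
spec_sketch(pred, truth)   # 1 of the 2 actual non-events was predicted as such

# With no positive samples in the reference, sensitivity is undefined
all_neg <- factor(rep("no_event", 3), levels = lv)
sens_sketch(all_neg, all_neg)
```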

Usage

sensitivity(data, ...)
## S3 method for class 'default':
sensitivity(data, reference, positive = levels(reference)[1], ...)
## S3 method for class 'table':
sensitivity(data, positive = rownames(data)[1], ...)
## S3 method for class 'matrix':
sensitivity(data, positive = rownames(data)[1], ...)

specificity(data, ...)
## S3 method for class 'default':
specificity(data, reference, negative = levels(reference)[-1], ...)
## S3 method for class 'table':
specificity(data, negative = rownames(data)[-1], ...)
## S3 method for class 'matrix':
specificity(data, negative = rownames(data)[-1], ...)

posPredValue(data, ...)
## S3 method for class 'default':
posPredValue(data, reference, positive = levels(reference)[1], prevalence = NULL, ...)
## S3 method for class 'table':
posPredValue(data, positive = rownames(data)[1], prevalence = NULL, ...)
## S3 method for class 'matrix':
posPredValue(data, positive = rownames(data)[1], prevalence = NULL, ...)

negPredValue(data, ...)
## S3 method for class 'default':
negPredValue(data, reference, negative = levels(reference)[2], prevalence = NULL, ...)
## S3 method for class 'table':
negPredValue(data, negative = rownames(data)[-1], prevalence = NULL, ...)
## S3 method for class 'matrix':
negPredValue(data, negative = rownames(data)[-1], prevalence = NULL, ...)

Arguments

data
for the default functions, a factor containing the discrete measurements. For the table or matrix functions, a table or matrix object, respectively.
reference
a factor containing the reference values
positive
a character string that defines the factor level corresponding to the "positive" results
negative
a character string that defines the factor level corresponding to the "negative" results
prevalence
a numeric value for the rate of the "positive" class of the data
...
not currently used
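
Because the defaults for positive and negative shown under Usage are taken from the order of the factor levels (or the row names of a table), it can help to check that order before calling these functions. A small base R sketch, using the made-up labels "no" and "yes":

```r
# Made-up labels for illustration; factor() sorts levels alphabetically
x <- factor(c("yes", "no", "no", "yes"))
levels(x)[1]    # "no": this level would be picked up as the default 'positive'

# Option 1: set the level order explicitly when creating the factor
x1 <- factor(c("yes", "no", "no", "yes"), levels = c("yes", "no"))
levels(x1)[1]   # "yes"

# Option 2: move an existing level to the front
x2 <- relevel(x, ref = "yes")
levels(x2)[1]   # "yes"
```

Passing positive or negative explicitly avoids relying on the level order at all.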

Value

  • A number between 0 and 1 (or NA).

Details

Suppose a 2x2 table with notation

                 Reference
Predicted     Event    No Event
Event           A         B
No Event        C         D

The formulas used here are:

$$Sensitivity = A/(A+C)$$

$$Specificity = D/(B+D)$$

$$Prevalence = (A+C)/(A+B+C+D)$$

$$PPV = (Sensitivity * Prevalence)/((Sensitivity * Prevalence) + ((1-Specificity)*(1-Prevalence)))$$

$$NPV = (Specificity * (1-Prevalence))/(((1-Sensitivity)*Prevalence) + (Specificity*(1-Prevalence)))$$

See the references for discussions of the statistics.
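
The formulas can be checked by hand. The sketch below is plain base R and does not call caret; the helper names `ppv_sketch` and `npv_sketch` are hypothetical. It plugs in the cell counts implied by the two-class data in the Examples section (A = 231, B = 32, C = 27, D = 54) and shows how a user-supplied prevalence changes the predictive values.

```r
# Cell counts in the A/B/C/D notation of the 2x2 table
A <- 231; B <- 32; C <- 27; D <- 54

sens <- A / (A + C)
spec <- D / (B + D)
prev <- (A + C) / (A + B + C + D)

# Hypothetical helpers implementing the PPV and NPV formulas
ppv_sketch <- function(sens, spec, prev) {
  (sens * prev) / ((sens * prev) + ((1 - spec) * (1 - prev)))
}
npv_sketch <- function(sens, spec, prev) {
  (spec * (1 - prev)) / (((1 - sens) * prev) + (spec * (1 - prev)))
}

# At the observed prevalence the formulas reduce to A/(A+B) and D/(C+D)
all.equal(ppv_sketch(sens, spec, prev), A / (A + B))
all.equal(npv_sketch(sens, spec, prev), D / (C + D))

# A lower assumed prevalence lowers the PPV and raises the NPV
ppv_sketch(sens, spec, 0.25)
npv_sketch(sens, spec, 0.25)
```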

References

Kuhn, M. (2008), "Building predictive models in R using the caret package," Journal of Statistical Software, (http://www.jstatsoft.org/v28/i05/).

Altman, D.G., Bland, J.M. (1994), "Diagnostic tests 1: sensitivity and specificity," British Medical Journal, vol 308, 1552.

Altman, D.G., Bland, J.M. (1994), "Diagnostic tests 2: predictive values," British Medical Journal, vol 309, 102.

See Also

confusionMatrix

Examples

###################
## 2 class example

lvs <- c("normal", "abnormal")
truth <- factor(rep(lvs, times = c(86, 258)),
                levels = rev(lvs))
pred <- factor(
               c(
                 rep(lvs, times = c(54, 32)),
                 rep(lvs, times = c(27, 231))),               
               levels = rev(lvs))

xtab <- table(pred, truth)

sensitivity(pred, truth)
sensitivity(xtab)
posPredValue(pred, truth)
posPredValue(pred, truth, prevalence = 0.25)

specificity(pred, truth)
negPredValue(pred, truth)
negPredValue(xtab)
negPredValue(pred, truth, prevalence = 0.25)


prev <- seq(0.001, .99, length = 20)
npvVals <- ppvVals <- prev  * NA
for (i in seq_along(prev)) {
  ppvVals[i] <- posPredValue(pred, truth, prevalence = prev[i])
  npvVals[i] <- negPredValue(pred, truth, prevalence = prev[i])
}

plot(prev, ppvVals,
     ylim = c(0, 1),
     type = "l",
     ylab = "",
     xlab = "Prevalence (i.e. prior)")
points(prev, npvVals, type = "l", col = "red")
abline(h=sensitivity(pred, truth), lty = 2)
abline(h=specificity(pred, truth), lty = 2, col = "red")
legend(.5, .5,
       c("ppv", "npv", "sens", "spec"),
       col = c("black", "red", "black", "red"),
       lty = c(1, 1, 2, 2))

###################
## 3 class example

library(MASS)

fit <- lda(Species ~ ., data = iris)
model <- predict(fit)$class

irisTabs <- table(model, iris$Species)

## When passing factors, an error occurs with more
## than two levels
## (wrapped in try() so the remaining examples still run)
try(sensitivity(model, iris$Species))

## When passing a table, more than two levels can
## be used
sensitivity(irisTabs, "versicolor")
specificity(irisTabs, c("setosa", "virginica"))
