coords: Coordinates of a ROC curve

Description

This function returns the coordinates of the ROC curve at one or several specified point(s).

Usage

coords(...)
# S3 method for auc
coords(auc, ...)
# S3 method for roc
coords(roc, x, input="threshold", ret=c("threshold",
"specificity", "sensitivity"),  ignore.partial.auc=FALSE,
as.list=FALSE, drop=TRUE, best.method=c("youden", "closest.topleft"),
best.weights=c(1, 0.5), transpose = FALSE, as.matrix=FALSE, ...)
# S3 method for smooth.roc
coords(smooth.roc, x, input, ret=c("specificity",
"sensitivity"), ignore.partial.auc=FALSE, as.list=FALSE, drop=TRUE, 
best.method=c("youden", "closest.topleft"), best.weights=c(1, 0.5), 
transpose = FALSE, as.matrix=FALSE, ...)

Value

A data.frame with ret as columns and as many rows as given by x.

In all cases where input="specificity" or input="sensitivity" and interpolation was required, threshold is returned as NA.

Note that if x is a character (“all”, “local maximas” or “best”), you cannot predict the dimensions of the return value. Even “best” may return more than one value (for example, both extreme points if the ROC curve is below the identity line).

Arguments

auc, roc, smooth.roc: a “roc” object from the roc function, or a “smooth.roc” object from the smooth function, or an “auc” object from the auc function.
x: the coordinates to look for. Numeric (if so, their meaning is defined by the input argument) or one of “all” (all the points of the ROC curve), “local maximas” (the local maximas of the ROC curve) or “best” (see best.method argument). If missing or NULL, defaults to “all”.
input: If x is numeric, the kind of input coordinate (x). Typically one of “threshold”, “specificity” or “sensitivity”, but can be any of the monotone coordinates available; see the “Valid input” column under “Available coordinates”. Can be shortened like ret. Defaults to “threshold”. Note that “threshold” is not allowed in coords.smooth.roc and that the argument is ignored when x is a character.
ret: The coordinates to return. See “Available coordinates” section below. Alternatively, the single value “all” can be used to return every coordinate available.
ignore.partial.auc: If TRUE and the roc object contains a partial AUC specification, that specification is ignored.
as.list: DEPRECATED. Whether the returned object must be a list. Will be removed in a future version.
drop: DEPRECATED. If TRUE the result is coerced to the lowest possible dimension, as per Extract. By default only drops if transpose = TRUE and either ret or x is of length 1.
best.method: if x="best", the method to determine the best threshold. Defaults to "youden". See details in the ‘Best thresholds’ section.
best.weights: if x="best", the weights to determine the best threshold. See details in the ‘Best thresholds’ section.
transpose: DEPRECATED. Whether to return the thresholds in columns (TRUE) or rows (FALSE). Since pROC 1.16 the default value is FALSE. See coords_transpose for more details on the change.
as.matrix: DEPRECATED. If transpose is FALSE, whether to return a matrix (TRUE) or a data.frame (FALSE, the default). A data.frame is more convenient and flexible to use, but incurs a slight speed penalty. Consider setting this argument to TRUE if you are calling the function repeatedly.
...: further arguments passed from other methods. Ignored.

Details

This function takes a “roc” or “smooth.roc” object as first argument, on which the coordinates will be determined. The coordinates are defined by the x and input arguments. “threshold” coordinates cannot be determined in a smoothed ROC.

If input="threshold", the coordinates for the threshold are reported, even if the exact threshold does not define the ROC curve. The following convenience characters are allowed: “all”, “local maximas” and “best”. They return, respectively, all the thresholds, only the thresholds defining local maximas (upper angles of the ROC curve), or only the threshold(s) corresponding to the best sum of sensitivity + specificity. Note that “best” can return more than one threshold. If x is a character and ignore.partial.auc=FALSE, the coordinates are limited to the thresholds within the partial AUC if it has been defined, and not necessarily to the whole curve.

For input="specificity" and input="sensitivity", the function checks whether the specificity or sensitivity is one of the points of the ROC curve (in roc$specificities or roc$sensitivities). More than one point may match (in step curves), in which case only the coordinates of the upper-left-most point are returned. Otherwise, the sensitivity and specificity of the point are interpolated and NA is returned as threshold.
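
The lookup described above can be illustrated with a minimal base R sketch. It assumes linear interpolation between the two neighbouring points. The helper se_at_sp and the data are hypothetical, and this is not pROC's own code.

```r
# Hypothetical ROC points (proportions), not taken from a real data set.
sp <- c(1.0, 0.8, 0.5, 0.0)
se <- c(0.0, 0.4, 0.9, 1.0)

# Hypothetical helper: sensitivity at a requested specificity.
se_at_sp <- function(target, sp, se) {
  hit <- which(sp == target)
  if (length(hit) > 0) {
    # Exact match: in a step curve several points may share this specificity;
    # keep the highest sensitivity (the upper-left-most point).
    return(max(se[hit]))
  }
  # No exact match: interpolate linearly between the neighbouring points.
  approx(x = sp, y = se, xout = target, ties = max)$y
}

exact <- se_at_sp(0.8, sp, se)          # a point of the curve
interpolated <- se_at_sp(0.65, sp, se)  # between two points of the curve
```

In the interpolated case no observed threshold corresponds to the point, which is why coords reports NA as threshold there.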

The coords function in this package is a generic, but it might be superseded by functions in other packages such as colorspace or spatstat if they are loaded after pROC. In this case, call pROC::coords explicitly.

Best thresholds

If x="best", the best.method argument controls how the optimal threshold is determined.

“youden”

Youden's J statistic (Youden, 1950) is employed (default). The optimal cut-off is the threshold that maximizes the distance to the identity (diagonal) line. Can be shortened to “y”.

The optimality criterion is: $$max(sensitivities + specificities)$$

“closest.topleft”

The optimal threshold is the point closest to the top-left part of the plot with perfect sensitivity or specificity. Can be shortened to “c” or “t”.

The optimality criterion is: $$min((1 - sensitivities)^2 + (1- specificities)^2)$$
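
The two unweighted criteria can be compared on a toy curve. This is a base R sketch with hypothetical points, chosen so that the two methods disagree. It only evaluates the formulas above and does not call pROC.

```r
# Hypothetical ROC points (proportions), not taken from a real data set.
threshold <- c(0.1, 0.2, 0.3)
se <- c(1.00, 0.75, 0.40)
sp <- c(0.52, 0.75, 0.95)

youden <- se + sp                    # "youden": to be maximized
topleft <- (1 - se)^2 + (1 - sp)^2   # "closest.topleft": to be minimized

# Keep every optimal threshold, since ties are possible.
best_youden <- threshold[youden == max(youden)]
best_topleft <- threshold[topleft == min(topleft)]
```

Here best_youden is 0.1 while best_topleft is 0.2, so the two criteria need not select the same threshold.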

In addition, weights can be supplied if false positive and false negative predictions are not equivalent: a numeric vector of length 2 to the best.weights argument. The elements define, in order:

the relative cost of a false negative classification (as compared with a false positive classification)
the prevalence, or the proportion of cases in the population ($\frac{n_{cases}}{n_{controls}+n_{cases}}$).

The optimality criteria are modified as proposed by Perkins and Schisterman:

“youden”: $$max(sensitivities + r * specificities)$$
“closest.topleft”: $$min((1 - sensitivities)^2 + r * (1- specificities)^2)$$

with

$$r = \frac{1 - prevalence}{cost * prevalence}$$

By default, prevalence is 0.5 and cost is 1 so that no weight is applied in effect.

Note that several thresholds might be equally optimal.
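
As a worked example of the weighting, the following base R sketch computes r from a cost and a prevalence and applies it to both criteria. The helper weight_r and the ROC points are hypothetical and only mirror the formulas above.

```r
# Hypothetical helper: r from best.weights-style values (cost, prevalence).
weight_r <- function(cost, prevalence) {
  (1 - prevalence) / (cost * prevalence)
}

r_default <- weight_r(cost = 1, prevalence = 0.5)  # r = 1, no weighting
r_costly <- weight_r(cost = 5, prevalence = 0.2)   # (1 - 0.2) / (5 * 0.2) = 0.8

# Hypothetical ROC points (proportions), not taken from a real data set.
se <- c(1.00, 0.75, 0.40)
sp <- c(0.52, 0.75, 0.95)

weighted_youden <- se + r_costly * sp                    # to be maximized
weighted_topleft <- (1 - se)^2 + r_costly * (1 - sp)^2   # to be minimized
```

With r below 1, specificity counts for less in both criteria, which shifts the optimum towards more sensitive thresholds.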

Available coordinates

The following table lists the coordinates that are available in the ret and input arguments.

Value	Description	Formula	Synonyms	Valid input
`threshold`	The threshold value	-	-	Yes
`tn`	True negative count	-	-	Yes
`tp`	True positive count	-	-	Yes
`fn`	False negative count	-	-	Yes
`fp`	False positive count	-	-	Yes
`specificity`	Specificity	tn / (tn + fp)	tnr	Yes
`sensitivity`	Sensitivity	tp / (tp + fn)	recall, tpr	Yes
`accuracy`	Accuracy	(tp + tn) / N	-	No
`npv`	Negative Predictive Value	tn / (tn + fn)	-	No
`ppv`	Positive Predictive Value	tp / (tp + fp)	precision	No
`precision`	Precision	tp / (tp + fp)	ppv	No
`recall`	Recall	tp / (tp + fn)	sensitivity, tpr	Yes
`tpr`	True Positive Rate	tp / (tp + fn)	sensitivity, recall	Yes
`fpr`	False Positive Rate	fp / (tn + fp)	1-specificity	Yes
`tnr`	True Negative Rate	tn / (tn + fp)	specificity	Yes
`fnr`	False Negative Rate	fn / (tp + fn)	1-sensitivity	Yes
`fdr`	False Discovery Rate	fp / (tp + fp)	1-ppv	No
`lr_pos`	Positive Likelihood Ratio	se / (1 - sp)	-	No
`lr_neg`	Negative Likelihood Ratio	(1 - se) / (sp)	-	No
`youden`	Youden Index	se + r * sp	-	No
`closest.topleft`	Distance to the top left corner of the ROC space	- ((1 - se)^2 + r * (1 - sp)^2)	-	No

The value “threshold” is not allowed in coords.smooth.roc.

Values can be shortened (for example to “thr”, “sens” and “spec”, or even to “se”, “sp” or “1-np”). In addition, some values can be prefixed with 1- to get their complement: 1-specificity, 1-sensitivity, 1-accuracy, 1-npv, 1-ppv.

The values npe and ppe are automatically replaced with 1-npv and 1-ppv, respectively (and will therefore not appear as is in the output, but as 1-npv and 1-ppv instead). These must be used verbatim in ROC curves with percent=TRUE (i.e. “100-ppv” is never accepted).

The “youden” and “closest.topleft” are weighted with r, according to the value of the best.weights argument. See the “Best thresholds” section above for more details.

For ret, the single value “all” can be used to return every coordinate available.
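
The formulas in the table can be checked by hand from the four counts at a single threshold. The counts below are hypothetical, and the sketch uses base R only, without calling pROC.

```r
# Hypothetical confusion-matrix counts at one threshold.
tp <- 30; fn <- 10; tn <- 50; fp <- 10
n <- tp + fn + tn + fp

se <- tp / (tp + fn)   # sensitivity (recall, tpr)
sp <- tn / (tn + fp)   # specificity (tnr)

derived <- c(
  sensitivity = se,
  specificity = sp,
  accuracy = (tp + tn) / n,
  npv = tn / (tn + fn),
  ppv = tp / (tp + fp),    # precision
  fpr = fp / (tn + fp),    # 1 - specificity
  fnr = fn / (tp + fn),    # 1 - sensitivity
  fdr = fp / (tp + fp),    # 1 - ppv
  lr_pos = se / (1 - sp),
  lr_neg = (1 - se) / sp
)
```

The complements listed in the table hold by construction: fpr equals 1 - specificity, and fdr equals 1 - ppv.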

References

Neil J. Perkins, Enrique F. Schisterman (2006) “The Inconsistency of "Optimal" Cutpoints Obtained using Two Criteria based on the Receiver Operating Characteristic Curve”. American Journal of Epidemiology 163(7), 670-675. DOI: 10.1093/aje/kwj063.

Xavier Robin, Natacha Turck, Alexandre Hainard, et al. (2011) “pROC: an open-source package for R and S+ to analyze and compare ROC curves”. BMC Bioinformatics, 12, 77. DOI: 10.1186/1471-2105-12-77.

W. J. Youden (1950) “Index for rating diagnostic tests”. Cancer, 3, 32-35. DOI: 10.1002/1097-0142(1950)3:1<32::aid-cncr2820030106>3.0.CO;2-3.

Examples

# Load the package and create a ROC curve:
library(pROC)
data(aSAH)
roc.s100b <- roc(aSAH$outcome, aSAH$s100b, percent = TRUE)

# Get the coordinates of S100B threshold 0.55
coords(roc.s100b, 0.55)

# Get the coordinates at 50% sensitivity
coords(roc=roc.s100b, x=50, input="sensitivity")
# Can be abbreviated:
coords(roc.s100b, 50, "se")

# Works with smoothed ROC curves
coords(smooth(roc.s100b), 90, "specificity")

# Get the sensitivities for all thresholds
cc <- coords(roc.s100b, "all", ret="sensitivity")
print(cc$sensitivity)

# Get the best threshold
coords(roc.s100b, "best", ret="threshold")

# Get the best threshold according to different methods
roc.ndka <- roc(aSAH$outcome, aSAH$ndka, percent=TRUE)
coords(roc.ndka, "best", ret="threshold", 
       best.method="youden") # default
coords(roc.ndka, "best", ret="threshold", 
       best.method="closest.topleft")

# and with different weights
coords(roc.ndka, "best", ret="threshold", 
       best.method="youden", best.weights=c(50, 0.2))
coords(roc.ndka, "best", ret="threshold", 
       best.method="closest.topleft", best.weights=c(5, 0.2))
       
# This is available with the plot.roc function too:
plot(roc.ndka, print.thres="best", print.thres.best.method="youden",
                                 print.thres.best.weights=c(50, 0.2)) 

# Return more values:
coords(roc.s100b, "best", ret=c("threshold", "specificity", "sensitivity", "accuracy",
                           "precision", "recall"))

# Return all values
coords(roc.s100b, "best", ret = "all")
                           
# You can use coords to plot for instance a sensitivity + specificity vs. cut-off diagram
plot(specificity + sensitivity ~ threshold, 
     coords(roc.ndka, "all"), 
     type = "l", log="x", 
     subset = is.finite(threshold))

# Plot the Precision-Recall curve
plot(precision ~ recall, 
     coords(roc.ndka, "all", ret = c("recall", "precision")),
     type="l", ylim = c(0, 100))

# Alternatively plot the curve with TPR and FPR instead of SE/SP 
# (identical curve, only the axes change)
plot(tpr ~ fpr, 
     coords(roc.ndka, "all", ret = c("tpr", "fpr")),
     type="l")
