Learn R Programming

pROC (version 1.19.0.1)

roc: Build a ROC curve

Description

This is the main function of the pROC package. It builds a ROC curve and returns a “roc” object, a list of class “roc”. This object can be printed, plotted, or passed to the functions auc, ci, smooth.roc and coords. Additionally, two roc objects can be compared with roc.test.

Usage

roc(...)
# S3 method for formula
roc(formula, data, ...)
# S3 method for data.frame
roc(data, response, predictor,
ret = c("roc", "coords", "all_coords"), ...)
# S3 method for default
roc(response, predictor, controls, cases,
density.controls, density.cases,
levels=base::levels(as.factor(response)), percent=FALSE, na.rm=TRUE,
direction=c("auto", "<", "="">"), algorithm = 2, quiet = FALSE, 
smooth=FALSE, auc=TRUE, ci=FALSE, plot=FALSE, smooth.method="binormal",
smooth.n=512, ci.method=NULL, density=NULL, ...)
roc_(data, response, predictor, ret = c("roc", "coords", "all_coords"), ...)

Arguments

Value

If the data contained any NA value and na.rm=FALSE, NA is returned. Otherwise, if smooth=FALSE, a list of class

“roc” with the following fields:

auc

if called with auc=TRUE, a numeric of class “auc” as defined in auc.

ci

if called with ci=TRUE, a numeric of class “ci” as defined in ci.

response

the response vector. Patients whose response is not %in% levels are discarded. If NA values were removed, a na.action attribute similar to na.omit stores the row numbers.

predictor

the predictor vector converted to numeric as used to build the ROC curve. Patients whose response is not %in% levels are discarded. If NA values were removed, a na.action attribute similar to na.omit stores the row numbers.

original.predictor, original.response

the response and predictor vectors as passed in argument.

levels

the levels of the response as defined in argument.

controls

the predictor values for the control observations.

cases

the predictor values for the cases.

percent

if the sensitivities, specificities and AUC are reported in percent, as defined in argument.

direction

the direction of the comparison, as defined in argument.

sensitivities

the sensitivities defining the ROC curve.

specificities

the specificities defining the ROC curve.

thresholds

the thresholds at which the sensitivities and specificities were computed. See below for details.

call

how the function was called. See match.call for more details.

fun.sesp

DEPRECATED. The value will be removed in a future version.

If smooth=TRUE a list of class “smooth.roc” as returned by smooth, with or without additional elements

auc and ci (according to the call).

Details

This function's main job is to build a ROC object. See the “Value” section to this page for more details. Before returning, it will call (in this order) the smooth, auc, ci and plot.roc functions if smooth auc, ci and plot.roc (respectively) arguments are set to TRUE. By default, only auc is called.

Data can be provided as response, predictor, where the predictor is the numeric (or ordered) level of the evaluated signal, and the response encodes the observation class (control or case). The level argument specifies which response level must be taken as controls (first value of level) or cases (second). It can safely be ignored when the response is encoded as 0 and 1, but it will frequently fail otherwise. By default, the first two values of levels(as.factor(response)) are taken, and the remaining levels are ignored. This means that if your response is coded “control” and “case”, the levels will be inverted.

In some cases, it is more convenient to pass the data as controls, cases, but both arguments are ignored if response, predictor was specified to non-NULL values. It is also possible to pass density data with density.controls, density.cases, which will result in a smoothed ROC curve even if smooth=FALSE, but are ignored if response, predictor or controls, cases are provided.

Thresholds are selected as the means between any two consecutive values observed in the data. This choice is aimed to facilitate their interpretation, as any data point will be unambiguously positive or negative regardless of whether the comparison operator includes equality or not. As a corollary, thresholds do not correspond to actual values in the data.

Specifications for auc, ci and plot.roc are not kept if auc, ci or plot are set to FALSE. Especially, in the following case:


    myRoc <- roc(..., auc.polygon=TRUE, grid=TRUE, plot=FALSE)
    plot(myRoc)
  

the plot will not have the AUC polygon nor the grid. Similarly, when comparing “roc” objects, the following is not possible:


    roc1 <- roc(..., partial.auc=c(1, 0.8), auc=FALSE)
    roc2 <- roc(..., partial.auc=c(1, 0.8), auc=FALSE)
    roc.test(roc1, roc2)
  

This will produce a test on the full AUC, not the partial AUC. To make a comparison on the partial AUC, you must repeat the specifications when calling roc.test:


    roc.test(roc1, roc2, partial.auc=c(1, 0.8))
  

Note that if roc was called with auc=TRUE, the latter syntax will not allow redefining the AUC specifications. You must use reuse.auc=FALSE for that.

References

Tom Fawcett (2006) ``An introduction to ROC analysis''. Pattern Recognition Letters 27, 861--874. DOI: tools:::Rd_expr_doi("10.1016/j.patrec.2005.10.010").

Xavier Robin, Natacha Turck, Alexandre Hainard, et al. (2011) ``pROC: an open-source package for R and S+ to analyze and compare ROC curves''. BMC Bioinformatics, 7, 77. DOI: tools:::Rd_expr_doi("10.1186/1471-2105-12-77").

See Also

auc, ci, plot.roc, print.roc, roc.test

Examples

Run this code
data(aSAH)

# Basic example
roc(aSAH$outcome, aSAH$s100b,
    levels=c("Good", "Poor"))
# As levels aSAH$outcome == c("Good", "Poor"),
# this is equivalent to:
roc(aSAH$outcome, aSAH$s100b)
# In some cases, ignoring levels could lead to unexpected results
# Equivalent syntaxes:
roc(outcome ~ s100b, aSAH)
roc(aSAH$outcome ~ aSAH$s100b)
with(aSAH, roc(outcome, s100b))
with(aSAH, roc(outcome ~ s100b))

# With a formula:
roc(outcome ~ s100b, data=aSAH)

if (FALSE) {
library(dplyr)
aSAH %>%
    filter(gender == "Female") %>%
    roc(outcome, s100b)
}

# Using subset (only with formula)
roc(outcome ~ s100b, data=aSAH, subset=(gender == "Male"))
roc(outcome ~ s100b, data=aSAH, subset=(gender == "Female"))

# With numeric controls/cases
roc(controls=aSAH$s100b[aSAH$outcome=="Good"], cases=aSAH$s100b[aSAH$outcome=="Poor"])
# With ordered controls/cases
roc(controls=aSAH$wfns[aSAH$outcome=="Good"], cases=aSAH$wfns[aSAH$outcome=="Poor"])

# Inverted the levels: "Poor" are now controls and "Good" cases:
roc(aSAH$outcome, aSAH$s100b,
    levels=c("Poor", "Good"))

# The result was exactly the same because of direction="auto".
# The following will give an AUC < 0.5:
roc(aSAH$outcome, aSAH$s100b,
    levels=c("Poor", "Good"), direction="<")

# If we are sure about levels and direction auto-detection,
# we can turn off the messages:
roc(aSAH$outcome, aSAH$s100b, quiet = TRUE)

# If we prefer counting in percent:
roc(aSAH$outcome, aSAH$s100b, percent=TRUE)

# Plot and CI (see plot.roc and ci for more options):
roc(aSAH$outcome, aSAH$s100b,
    percent=TRUE, plot=TRUE, ci=TRUE)

# Smoothed ROC curve
roc(aSAH$outcome, aSAH$s100b, smooth=TRUE)
# this is not identical to
smooth(roc(aSAH$outcome, aSAH$s100b))
# because in the latter case, the returned object contains no AUC

Run the code above in your browser using DataLab