varoc: VAROC: value added receiver operating characteristics (ROC) curve

Description

ROC curve to visualize classification and continuity performances of biomarkers, diagnostic tests, or risk prediction models.

Usage

varoc(y,x,tmd.range=NULL,legend="right",lwd=3,digits=2)

Value

th: Threshold values.
tpf: True positive fraction at each threshold.
fpf: False positive fraction at each th.
tpm: True positive mean at each th.
fpm: False positive mean at each th.
tmd: Tail mean difference, i.e., tpm-fpm, at each th.
auc: Area under the ROC curve.
itmd: Integrated tmd over all thresholds.

Arguments

y: binary outcome, where y=1 if disease (or case) and y=0 if non-disease (or control).
x: continuous score, e.g. biomarker, diagnostic test, risk score.
tmd.range: minimum and maximum values of TMD, displayed on the plot.
legend: legend location, one of "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right" or "center".
lwd: line width.
digits: number of decimals.

Author

Yunro Chung [aut, cre]

Details

The varoc function summarizes the continuity performance of x using two key metrics: (i) the tail mean difference (TMD) at each cutoff c and (ii) the integrated TMD (ITMD). For (i), TMD(c) is the true positive mean, TPM(c), minus the false positive mean, FPM(c), where TPM(c) is E(x>c|y=1) and FPM(c) is E(x>c|y=0). For (ii), ITMD is a global measure of the continuity performance of x over all thresholds.

These measures are continuous versions of ROC curve-based measures. Specifically, TPM(c) and FPM(c) are continuous versions of the true positive fraction, TPF(c), and the false positive fraction, FPF(c), where TPF(c)=P(x>c|y=1) and FPF(c)=P(x>c|y=0). Thus, at a given cutoff c, a useful x has TPF(c)-FPF(c)>0 and TMD(c)>0, whereas a useless x has TPF(c)-FPF(c)=0 and TMD(c)=0. Globally, a useful x has area under the ROC curve (AUC)>0.5 and ITMD>0, whereas a useless x has AUC=0.5 and ITMD=0.
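The ROC-side quantities above can be sketched in base R. This is only an illustration of TPF(c)=P(x>c|y=1), FPF(c)=P(x>c|y=0) and the AUC, not the implementation used by varoc. The helper names empirical_roc and empirical_auc are hypothetical, and TPM, FPM and TMD are deliberately left out because their estimators are not spelled out here.

```r
# Hypothetical helpers (not part of the varoc package): empirical ROC quantities.

# Empirical TPF and FPF at every observed value of x, used as the cutoffs.
empirical_roc <- function(y, x) {
  stopifnot(length(y) == length(x), all(y %in% c(0, 1)))
  th <- sort(unique(x))
  tpf <- sapply(th, function(cutoff) mean(x[y == 1] > cutoff))  # P(x > c | y = 1)
  fpf <- sapply(th, function(cutoff) mean(x[y == 0] > cutoff))  # P(x > c | y = 0)
  list(th = th, tpf = tpf, fpf = fpf)
}

# Empirical AUC as P(x_case > x_control) + 0.5 * P(x_case == x_control).
empirical_auc <- function(y, x) {
  d <- outer(x[y == 1], x[y == 0], "-")
  mean((d > 0) + 0.5 * (d == 0))
}

# Same simulated markers as in the Examples section.
set.seed(100)
y <- c(rep(1, 100), rep(0, 100))
x_useless <- abs(c(rnorm(100, 0, 1), rnorm(100, 0, 1)))
x_useful <- abs(c(rnorm(100, 2, 1), rnorm(100, 0, 1)))

roc_useful <- empirical_roc(y, x_useful)
auc_useless <- empirical_auc(y, x_useless)
auc_useful <- empirical_auc(y, x_useful)
```

Under these assumptions, auc_useless should sit near 0.5 and auc_useful well above it, matching the useless/useful distinction described above.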

References

Danielle Brister and Yunro Chung, Value added receiver operating characteristics curve (in-progress)

Examples

set.seed(100)

n1=100
n0=100
y=c(rep(1,n1),rep(0,n0))

#1. useless marker
x1=abs(c(rnorm(n1,0,1),rnorm(n0,0,1)))
fit1=varoc(y=y,x=x1)

#2. useful marker
x2=abs(c(rnorm(n1,2,1),rnorm(n0,0,1)))
fit2=varoc(y=y,x=x2)

#3. markers 1 vs 2
opar=par(mfrow=c(1,2))
tmd.range=range(c(fit1$tmd,fit2$tmd))
fit1=varoc(y=y,x=x1,tmd.range=tmd.range)
fit2=varoc(y=y,x=x2,tmd.range=tmd.range)
par(opar) # restore the original graphical parameters
