power.roc.test: Sample size and power computation for ROC curves

Description

Computes sample size, power, significance level or minimum AUC for ROC curves.

Usage

power.roc.test(...)
# One or Two ROC curves test with roc objects:
# S3 method for roc
power.roc.test(roc1, roc2, sig.level = 0.05,
  power = NULL, alternative = c("two.sided", "one.sided"),
  reuse.auc = TRUE, method = c("delong", "bootstrap", "obuchowski"), ...)
# One ROC curve with a given AUC:
# S3 method for numeric
power.roc.test(auc = NULL, ncontrols = NULL,
  ncases = NULL, sig.level = 0.05, power = NULL, kappa = 1,
  alternative = c("two.sided", "one.sided"), ...)
# Two ROC curves with the given parameters:
# S3 method for list
power.roc.test(parslist, ncontrols = NULL,
  ncases = NULL, sig.level = 0.05, power = NULL, kappa = 1,
  alternative = c("two.sided", "one.sided"), ...)

Arguments

roc1, roc2

one or two “roc” objects from the roc function.

auc

expected AUC.

parslist

a list of parameters for the two ROC curves test with Obuchowski variance when no empirical ROC curve is known:

A1: binormal A parameter for ROC curve 1
B1: binormal B parameter for ROC curve 1
A2: binormal A parameter for ROC curve 2
B2: binormal B parameter for ROC curve 2
rn: correlation between the variables in control patients
ra: correlation between the variables in case patients
delta: the difference of AUC between the two ROC curves

For a partial AUC, the following additional parameters must be set:

FPR11: Upper bound of FPR (1 - specificity) of ROC curve 1
FPR12: Lower bound of FPR (1 - specificity) of ROC curve 1
FPR21: Upper bound of FPR (1 - specificity) of ROC curve 2
FPR22: Lower bound of FPR (1 - specificity) of ROC curve 2

ncontrols, ncases

number of control and case observations available.

sig.level

expected significance level (probability of type I error).

power

expected power of the test (1 - probability of type II error).

kappa

expected balance between control and case observations. Must be positive. Used only for sample size determination, that is, to determine ncontrols and ncases.

alternative

whether a one-sided or two-sided test is performed.

reuse.auc

if TRUE (default) and the “roc” objects contain an “auc” field, re-use these specifications for the test. See the AUC specification section for more details.

method

the method to compute variance and covariance, either “delong”, “bootstrap” or “obuchowski”. The first letter is sufficient. Used only for the two ROC curves power calculation. See the var and cov documentation for more details.

…

further arguments passed to or from other methods, especially auc (with reuse.auc=FALSE or no AUC in the ROC curve), cov and var (especially arguments method, boot.n and boot.stratified). Ignored (with a warning) with a parslist.

Value

An object of class power.htest (such as that given by power.t.test) with the supplied and computed values.
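As a minimal sketch of how such an object can be inspected, the example below uses power.t.test from the stats package, which returns the same class. It is an assumption (not verified here) that the object returned by power.roc.test stores its values under the argument names ncases, ncontrols, auc, sig.level and power.

```r
# power.t.test (stats package) also returns a "power.htest" object,
# so it can stand in to show how the result is structured.
res <- power.t.test(n = 20, delta = 1, sd = 1, sig.level = 0.05)

class(res)   # the class shared with power.roc.test results
names(res)   # the supplied and computed values are stored as list elements
res$power    # extract a single computed value

# Assumption: a power.roc.test result can be read the same way,
# e.g. res$ncases or res$power.
```

Printing the object gives the formatted summary; the list elements give the raw numbers for further computation.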

One ROC curve power calculation

If one or no ROC curves are passed to power.roc.test, a one ROC curve power calculation is performed. The function expects one of power, sig.level or auc, or both ncontrols and ncases, to be missing (NULL), so that the missing parameter is determined from the others with the formula by Obuchowski et al., 2004 (formulas 2 and 3, p. 1123). Because sig.level defaults to 0.05, it must be set to NULL explicitly to be computed.

For the sample size, ncases is computed directly from formulas 2 and 3 and ncontrols is deduced with kappa. AUC is optimized by uniroot while sig.level and power are solved as quadratic equations.

power.roc.test can also be passed a roc object from the roc function, but the empirical ROC will not be used, only the number of patients and the AUC.
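The one-curve calculation described above can be sketched in base R as follows. This is an illustrative reading of formulas 2 and 3 of Obuchowski et al. (2004), not the package source: the helper names obuchowski_var, ncases_needed and power_given_n are hypothetical, the test is assumed to be two-sided with an expected AUC above 0.5, and kappa is assumed to be the ratio ncontrols / ncases.

```r
# Formula 3 (as understood here): variance function of the AUC estimate
# for an AUC `theta` and a control/case ratio `kappa`.
obuchowski_var <- function(theta, kappa) {
  a <- qnorm(theta) * 1.414
  0.0099 * exp(-a^2 / 2) * ((5 * a^2 + 8) + (a^2 + 8) / kappa)
}

# Formula 2: number of cases for a two-sided test of AUC = 0.5.
ncases_needed <- function(auc, sig.level = 0.05, power = 0.9, kappa = 1) {
  z_alpha <- qnorm(1 - sig.level / 2)
  z_beta <- qnorm(power)
  (z_alpha * sqrt(obuchowski_var(0.5, kappa)) +
     z_beta * sqrt(obuchowski_var(auc, kappa)))^2 / (auc - 0.5)^2
}

# Formula 2 solved for z_beta: power for a given number of cases.
power_given_n <- function(auc, ncases, sig.level = 0.05, kappa = 1) {
  z_alpha <- qnorm(1 - sig.level / 2)
  z_beta <- (sqrt(ncases) * (auc - 0.5) -
               z_alpha * sqrt(obuchowski_var(0.5, kappa))) /
    sqrt(obuchowski_var(auc, kappa))
  pnorm(z_beta)
}

n <- ncases_needed(auc = 0.73, power = 0.95, kappa = 1.7)
ceiling(n)         # cases
ceiling(n * 1.7)   # controls, assuming kappa = ncontrols / ncases
```

Plugging the unrounded n back into power_given_n with the same auc and kappa returns the requested power, which is a quick consistency check on the two helpers.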

Two paired ROC curves power calculation

If two ROC curves are passed to power.roc.test, the function computes either the required sample size (if power is supplied), the significance level (if sig.level=NULL and power is supplied) or the power of a test of a difference between two AUCs, according to the formula by Obuchowski and McClish, 1997 (formulas 2 and 3, pp. 1530-1531). The null hypothesis is that the AUC of roc1 is the same as the AUC of roc2, with roc1 taken as the reference ROC curve.

For the sample size, ncases is computed directly from formula 2 and ncontrols is deduced from the ratio observed in roc1 and roc2. sig.level and power are solved as quadratic equations.

The variance and covariance of the ROC curve are computed with the var and cov functions. By default, the DeLong method using the algorithm by Sun and Xu (2014) is used for full AUCs and the bootstrap for partial AUCs. It is possible to force the use of Obuchowski's variance by specifying method="obuchowski".

Alternatively, when no empirical ROC curve is known, or if only one is available, a list can be passed to power.roc.test, with the contents defined in the “Arguments” section. The variance and covariance are computed from Table 1 and Equations 4 and 5 of Obuchowski and McClish (1997), pp. 1530-1531.

Power calculation for unpaired ROC curves is not implemented.

AUC specification

The comparison of the AUC of the ROC curves needs a specification of the AUC. The specification is defined by:

the “auc” field in the “roc” objects if reuse.auc is set to TRUE (default)
passing the specification to auc with … (arguments partial.auc, partial.auc.correct and partial.auc.focus). In this case, you must either ensure that the roc objects do not contain an auc field (if you called roc with auc=FALSE), or set reuse.auc=FALSE.

If reuse.auc=FALSE the auc function will always be called with … to determine the specification, even if the “roc” objects do contain an auc field.

Likewise, if the “roc” objects do not contain an auc field, the auc function will always be called with … to determine the specification.

Warning: if the roc object passed to power.roc.test contains an auc field and reuse.auc=TRUE, auc is not called and arguments such as partial.auc are silently ignored.

Acknowledgements

The authors would like to thank Christophe Combescure and Anne-Sophie Jannot for their help with the implementation of this section of the package.

References

Elisabeth R. DeLong, David M. DeLong and Daniel L. Clarke-Pearson (1988). “Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach”. Biometrics, 44, 837-845.

Nancy A. Obuchowski, Donna K. McClish (1997). “Sample size determination for diagnostic accuracy studies involving binormal ROC curve indices”. Statistics in Medicine, 16, 1529-1542. DOI: 10.1002/(SICI)1097-0258(19970715)16:13<1529::AID-SIM565>3.0.CO;2-H.

Nancy A. Obuchowski, Michael L. Lieber, Frank H. Wians Jr. (2004). “ROC Curves in Clinical Chemistry: Uses, Misuses, and Possible Solutions”. Clinical Chemistry, 50, 1118-1125. DOI: 10.1373/clinchem.2004.031823.

Xu Sun and Weichao Xu (2014). “Fast Implementation of DeLong's Algorithm for Comparing the Areas Under Correlated Receiver Operating Characteristic Curves”. IEEE Signal Processing Letters, 21, 1389-1393. DOI: 10.1109/LSP.2014.2337313.

Examples

# NOT RUN {
library(pROC)
data(aSAH)

#### One ROC curve ####

# Build a roc object:
rocobj <- roc(aSAH$outcome, aSAH$s100b)

# Determine power of one ROC curve:
power.roc.test(rocobj)
# Same as:
power.roc.test(ncases=41, ncontrols=72, auc=0.73, sig.level=0.05)
# sig.level=0.05 is implicit and can be omitted:
power.roc.test(ncases=41, ncontrols=72, auc=0.73)

# Determine ncases & ncontrols:
power.roc.test(auc=rocobj$auc, sig.level=0.05, power=0.95, kappa=1.7)
power.roc.test(auc=0.73, sig.level=0.05, power=0.95, kappa=1.7)

# Determine sig.level:
power.roc.test(ncases=41, ncontrols=72, auc=0.73, power=0.95, sig.level=NULL)

# Determine detectable AUC:
power.roc.test(ncases=41, ncontrols=72, sig.level=0.05, power=0.95)


#### Two ROC curves ####

###  Full AUC
roc1 <- roc(aSAH$outcome, aSAH$ndka)
roc2 <- roc(aSAH$outcome, aSAH$wfns)

## Sample size
# With DeLong variance (default)
power.roc.test(roc1, roc2, power=0.9)
# With Obuchowski variance
power.roc.test(roc1, roc2, power=0.9, method="obuchowski")

## Power test
# With DeLong variance (default)
power.roc.test(roc1, roc2)
# With Obuchowski variance
power.roc.test(roc1, roc2, method="obuchowski")

## Significance level
# With DeLong variance (default)
power.roc.test(roc1, roc2, power=0.9, sig.level=NULL)
# With Obuchowski variance
power.roc.test(roc1, roc2, power=0.9, sig.level=NULL, method="obuchowski")

### Partial AUC
roc3 <- roc(aSAH$outcome, aSAH$ndka, partial.auc=c(1, 0.9))
roc4 <- roc(aSAH$outcome, aSAH$wfns, partial.auc=c(1, 0.9))

## Sample size
# With bootstrap variance (default)
power.roc.test(roc3, roc4, power=0.9)
# With Obuchowski variance
power.roc.test(roc3, roc4, power=0.9, method="obuchowski")

## Power test
# With bootstrap variance (default)
power.roc.test(roc3, roc4)
# This is exactly equivalent:
power.roc.test(roc1, roc2, reuse.auc=FALSE, partial.auc=c(1, 0.9))
# With Obuchowski variance
power.roc.test(roc3, roc4, method="obuchowski")

## Significance level
# With bootstrap variance (default)
power.roc.test(roc3, roc4, power=0.9, sig.level=NULL)
# With Obuchowski variance
power.roc.test(roc3, roc4, power=0.9, sig.level=NULL, method="obuchowski")

## With only binormal parameters given
# From example 2 of Obuchowski and McClish, 1997.
ob.params <- list(A1=2.6, B1=1, A2=1.9, B2=1, rn=0.6, ra=0.6, FPR11=0,
                  FPR12=0.2, FPR21=0, FPR22=0.2, delta=0.037)

power.roc.test(ob.params, power=0.8, sig.level=0.05)
power.roc.test(ob.params, power=0.8, sig.level=NULL, ncases=107)
power.roc.test(ob.params, power=NULL, sig.level=0.05, ncases=107)

# }
