kroc: Kernel receiver operating characteristic (ROC) curve

Description

Kernel receiver operating characteristic (ROC) curve for 1- to 3-dimensional data.

Usage

kroc(x1, x2, H1, h1, hy, gridsize, gridtype, xmin, xmax, supp=3.7, eval.points,
   binned=FALSE, bgridsize, positive=FALSE, adj.positive, w, verbose=FALSE)
## S3 method for class 'kroc':
predict(object, ..., x)
## S3 method for class 'kroc':
summary(object, ...)

Arguments

x,x1,x2

vector/matrix of data values

H1,h1,hy

bandwidth matrix/scalar bandwidths. If these are missing, Hpi.kcde or hpi.kcde is called by default.

gridsize

vector of number of grid points

gridtype

not yet implemented

xmin,xmax

vector of minimum/maximum values for grid

supp

effective support for standard normal

eval.points

not yet implemented

binned

flag for binned estimation. Default is FALSE.

bgridsize

vector of binning grid sizes

positive

flag if 1-d data are positive. Default is FALSE.

adj.positive

adjustment applied to positive 1-d data

w

vector of weights. Default is a vector of all ones.

verbose

flag to print out progress information. Default is FALSE.

object

object of class kroc, output from kroc

...

other parameters

Value

A kernel ROC curve is an object of class kroc which is a list with fields:

x

list of data values x1, x2 - same as input

eval.points

points at which the estimate is evaluated

estimate

ROC curve estimate at eval.points

gridtype

"linear"

gridded

flag for estimation on a grid

binned

flag for binned estimation

names

variable names

w

weights

tail

"lower.tail"

h1

scalar bandwidth for first sample (1-d only)

H1

bandwidth matrix for first sample

hy

scalar bandwidth for ROC curve

indices

summary indices of ROC curve.

Details

In this set-up, the values in the first sample x1 should in general be larger than those in the second sample x2. The usual method for computing 1-d ROC curves is not valid for multivariate data. Duong (2015), based on Lloyd (1998), develops an alternative formulation $(F_{Y_1}(z), F_{Y_2}(z))$ based on the cumulative distribution functions of $Y_j = \bar{F}_1(\mathbf{X}_j), j=1,2$.

If the bandwidth H1 is missing from kroc, then the default bandwidth is the plug-in selector Hpi.kcde. Likewise for missing h1,hy. A bandwidth matrix H1 is required for x1 for d>1, but the second bandwidth hy is always a scalar since $Y_j$ are 1-d variables.
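The bandwidth defaults described above can be made explicit by calling the selector first and passing its result to kroc. The following is a minimal sketch for 1-d data, assuming that hpi.kcde accepts the data vector as its only required argument; the sample sizes and seed are arbitrary choices for illustration.

```r
library(ks)

set.seed(8192)
x <- rnorm.mixt(n=500, mus=0, sigmas=1, props=1)
y <- rnorm.mixt(n=500, mus=0.5, sigmas=1, props=1)

## plug-in scalar bandwidth for the first sample, selected explicitly
h1 <- hpi.kcde(x)

## hy is left missing here, so kroc selects it by its default rule
Rhat <- kroc(x1=x, x2=y, h1=h1)
```

For d>1, the analogous step would supply a bandwidth matrix through H1 (for example from Hpi.kcde), while hy remains a scalar.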

The effective support, binning, grid size, grid range and positive data parameters are the same as for kde.

The summary method for kroc objects prints out the summary indices of the ROC curve, as contained in the indices field, namely the AUC (area under the curve) and Youden index.
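To make the two summary indices concrete, the sketch below computes them for an empirical (unsmoothed) ROC curve in base R, using the common definitions: the AUC as the area under the curve, and the Youden index as the largest vertical distance between the curve and the diagonal. This is an illustration only and not the kernel estimator that kroc uses; the sample names neg and pos, the threshold sweep and the trapezoidal rule are assumptions made for the example.

```r
## Empirical ROC sketch in base R (illustrative; not the kroc computation)
set.seed(1)
neg <- rnorm(500, mean=0)   ## hypothetical lower-scoring sample
pos <- rnorm(500, mean=1)   ## hypothetical higher-scoring sample

## sweep thresholds from high to low so both rates rise from 0 to 1
thr <- c(Inf, sort(unique(c(neg, pos)), decreasing=TRUE), -Inf)
fpr <- sapply(thr, function(t) mean(neg > t))   ## false positive rate
tpr <- sapply(thr, function(t) mean(pos > t))   ## true positive rate

## AUC: trapezoidal rule over the (fpr, tpr) points
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

## Youden index: maximum of (true positive rate - false positive rate)
youden <- max(tpr - fpr)
```

An AUC near 0.5 with a Youden index near 0 indicates that the two samples are hard to distinguish, while values near 1 for both indicate near-complete separation.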

References

Duong, T. (2015) Non-parametric smoothed estimation of multivariate cumulative distribution and survival functions, and receiver operating characteristic curves. Journal of the Korean Statistical Society. In press. DOI:10.1016/j.jkss.2015.06.002.

Lloyd, C. (1998) Using smoothed receiver operating curves to summarize and compare diagnostic systems. Journal of the American Statistical Association. 93, 1356-1364.

Examples


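library(ks)

## two univariate normal samples
samp <- 1000
x <- rnorm.mixt(n=samp, mus=0, sigmas=1, props=1)
y <- rnorm.mixt(n=samp, mus=0.5, sigmas=1, props=1)

## kernel ROC curve with default plug-in bandwidths
Rhat <- kroc(x1=x, x2=y)
summary(Rhat)            ## print the summary indices (AUC, Youden index)
predict(Rhat, x=0.5)     ## evaluate the estimated ROC curve at x=0.5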
