calPlot2: Calibration plots for binary data

Description

Calibration plots for risk prediction models for a binary endpoint.

Usage

calPlot2(object, formula, data, splitMethod = "none", B = 1, M, showY, method = "nne", round = TRUE, bandwidth = NULL, q = 10, density = 55, add = FALSE, diag = !add, legend = !add, axes = !add, xlim, ylim, xlab = "Predicted event probability", ylab = "Observed proportion", col, lwd, lty, pch, cause = 1, percent = TRUE, giveToModel = NULL, na.action = na.fail, cores = 1, verbose = FALSE, ...)

Arguments

object

A named list of prediction models, where allowed entries are (1) R-objects for which a predictStatusProb method exists (see details), (2) a call that evaluates to such an R-object (see examples), (3) a one-column matrix of predicted probabilities with as many rows as data. For cross-validation all objects in this list must include their call.
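As an illustration of the matrix form of an entry, the following minimal sketch builds a one-column matrix of predicted probabilities from a logistic regression using base R only. The names dat, fit, and pmat are hypothetical and chosen for this example.

```r
set.seed(5)
# Simulated data (hypothetical, for illustration only)
dat <- data.frame(Y = rbinom(40, 1, 0.5), X1 = rnorm(40))
fit <- glm(Y ~ X1, data = dat, family = "binomial")

# One column of predicted event probabilities, one row per row of dat
pmat <- matrix(predict(fit, newdata = dat, type = "response"), ncol = 1)
dim(pmat)
```

A matrix of this shape has as many rows as the data, which is what the description above asks of a matrix entry in object.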

formula

A survival or event history formula. The left hand side is used to compute the expected event status. If formula is missing, the function tries to extract a formula from the first element in object.

data

A data frame in which to validate the prediction models and to fit the censoring model. If data is missing, the function tries to extract a data set from the first element in object.

splitMethod

Defines the internal validation design:

none/noPlan: Assess the models in the given data, usually either in the same data where they are fitted, or in independent test data.

BootCv: Bootstrap cross-validation. The prediction models are trained on B bootstrap samples that are drawn either with replacement and of the same size as the original data, or without replacement and of size M. The models are assessed in the observations that are NOT in the bootstrap sample.
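The BootCv design can be sketched in base R as follows. This only illustrates the resampling idea and is not the internal code of calPlot2; the helper boot_split and the values of n, B, and M are hypothetical.

```r
set.seed(1)
n <- 40   # number of observations (hypothetical)
B <- 5    # number of bootstrap cross-validation steps

# Hypothetical helper: one train/test split.
# If M is NULL, draw n indices with replacement;
# otherwise draw a subsample of size M without replacement.
boot_split <- function(n, M = NULL) {
  train <- if (is.null(M)) {
    sample.int(n, size = n, replace = TRUE)
  } else {
    sample.int(n, size = M, replace = FALSE)
  }
  # Test indices: observations NOT in the training sample
  list(train = train, test = setdiff(seq_len(n), train))
}

splits_boot <- lapply(seq_len(B), function(b) boot_split(n))
splits_sub  <- lapply(seq_len(B), function(b) boot_split(n, M = 30))

# Number of test observations in each step
sapply(splits_boot, function(s) length(s$test))
sapply(splits_sub,  function(s) length(s$test))
```

With subsampling, every step leaves exactly n - M observations for testing, whereas with the bootstrap the size of the test set varies from step to step.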

B

The number of cross-validation steps.

M

The size of the subsamples for cross-validation.

showY

If TRUE the observed data are shown as dots on the plot.

method

The method for estimating the calibration curve(s):

"nne": The expected event status is obtained in the nearest neighborhood around the predicted event probabilities.

"quantile": The expected event status is obtained in groups defined by quantiles of the predicted event probabilities.
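The idea behind the "quantile" method can be sketched in base R as follows. The simulated data and the names p, y, and calib are hypothetical, and the sketch is not the code used inside calPlot2.

```r
set.seed(2)
n <- 200
p <- runif(n)           # predicted event probabilities (simulated)
y <- rbinom(n, 1, p)    # binary outcomes, calibrated by construction
q <- 10                 # number of quantile groups

# Group boundaries at the quantiles of the predictions
breaks <- quantile(p, probs = seq(0, 1, length.out = q + 1))
groups <- cut(p, breaks = breaks, include.lowest = TRUE)

# Mean prediction and observed event proportion within each group
calib <- data.frame(
  predicted = as.numeric(tapply(p, groups, mean)),
  observed  = as.numeric(tapply(y, groups, mean))
)
calib
```

Plotting observed against predicted gives one point per group; for a well calibrated model the points should lie close to the diagonal.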

round

If TRUE predicted probabilities are rounded to two digits before smoothing. This may have a considerable effect on computing efficiency in large data sets.
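A small base R sketch of why rounding can help: after rounding to two digits there are at most 101 distinct predicted values, however large the data set is. The vector p is simulated for this illustration.

```r
set.seed(4)
p <- runif(10000)             # simulated predicted probabilities

length(unique(p))             # distinct values before rounding
length(unique(round(p, 2)))   # distinct values after rounding (at most 101)
```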

bandwidth

The bandwidth for method="nne".

q

The number of quantiles for method="quantile".

density

Gray scale for observations.

add

If TRUE the line(s) are added to an existing plot.

diag

If FALSE no diagonal line is drawn.

legend

If FALSE no legend is drawn.

axes

If FALSE no axes are drawn.

xlim

Limits of x-axis.

ylim

Limits of y-axis.

xlab

Label for x-axis.

ylab

Label for y-axis.

col

Vector with colors, one for each element of object. Passed to lines.

lwd

Vector with line widths, one for each element of object. Passed to lines.

lty

Vector with line styles, one for each element of object. Passed to lines.

pch

Passed to points.

cause

For competing risks models, the cause of failure or event of interest.

percent

If TRUE axes labels are multiplied by 100 and thus interpretable on a percent scale.

giveToModel

List with exactly one entry for each entry in object. Each entry names parts of the value of the fitted models that should be extracted and added to the value.

na.action

Passed to model.frame.

cores

Number of cores for parallel computing. Passed as the value of the argument mc.cores when calling mclapply.

verbose

If TRUE, report details of the progress, e.g. count the steps in cross-validation.

...

Used to control the subroutines: plot, axis, lines, legend. See SmartControl.

Value

A list with elements time, Frame, and bandwidth (NULL for method "quantile").

Details

For method "nne" the optimal bandwidth is obtained with the function dpik from the package KernSmooth for a box kernel function.
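The following sketch shows how a bandwidth could be selected with dpik for a box kernel and then used for a simple neighborhood average. The helper nne_mean is hypothetical and is a simplification: it averages outcomes within a fixed window around each prediction and is not the smoother used inside calPlot2.

```r
library(KernSmooth)

set.seed(3)
n <- 200
p <- runif(n)           # predicted event probabilities (simulated)
y <- rbinom(n, 1, p)    # binary outcomes

# Data-driven bandwidth for a box kernel
bw <- dpik(p, kernel = "box")

# Hypothetical helper: mean outcome among observations whose
# prediction lies within bw of the target value p0
nne_mean <- function(p0, p, y, bw) {
  near <- abs(p - p0) <= bw
  mean(y[near])
}

# Evaluate the smoothed calibration curve on the rounded predictions
grid <- sort(unique(round(p, 2)))
smoothed <- sapply(grid, nne_mean, p = p, y = y, bw = bw)
head(cbind(predicted = grid, observed = smoothed))
```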

References

TA Gerds, PK Andersen, and MW Kattan. Calibration plots for risk prediction models in the presence of competing risks. Statistics in Medicine, to appear, 2014.

Examples

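library(pec)  # package that provides calPlot2
set.seed(40)
N <- 40
# Binary outcome
Y <- rbinom(N, 1, 0.5)
# X1: standard normal, with a randomly shifted mean among the events
X1 <- rnorm(N)
X1[Y == 1] <- rnorm(sum(Y == 1), mean = rbinom(sum(Y == 1), 1, 0.5))
# X2: standard normal, with a randomly shifted mean among the non-events
X2 <- rnorm(N)
X2[Y == 0] <- rnorm(sum(Y == 0), mean = rbinom(sum(Y == 0), 3, 0.5))
dat <- data.frame(Y = Y, X1 = X1, X2 = X2)
# Two logistic regression models, one per predictor
lm1 <- glm(Y ~ X1, data = dat, family = "binomial")
lm2 <- glm(Y ~ X2, data = dat, family = "binomial")
# Calibration curves of both models, assessed in the data they were fitted on
calPlot2(list(lm1, lm2), data = dat)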