pec (version 2.2.6)

pec: Prediction error curves

Description

Evaluating the performance of risk prediction models in survival analysis. The Brier score is a weighted average of the squared distances between the observed survival status and the predicted survival probability of a model. Roughly, the weights correspond to the probabilities of not being censored. The weights can be estimated conditional on covariates. Prediction error curves are obtained when the Brier score is followed over time. Cross-validation based on bootstrap resampling or bootstrap subsampling can be applied to assess and compare the predictive power of various regression modelling strategies on the same set of data.

Usage

pec(object,...)
## S3 method for class 'list':
pec(object,
                     formula,
                     data,
                     times,
                     cause,
                     start,
                     maxtime,
                     exact=TRUE,
                     exactness=100,
                     fillChar=NA,
                     cens.model="cox",
                     ipcw.refit=FALSE,
                     splitMethod="none",
                     B,
                     M,
                     reference=TRUE,
                     model.args=NULL,
                     model.parms=NULL,
                     keep.index=FALSE,
                     keep.matrix=FALSE,
                     keep.models="Call",
                     keep.residuals=FALSE,
                     keep.pvalues=FALSE,
                     noinf.permute=FALSE,
                     multiSplitTest=FALSE,
                     testIBS,
                     testTimes,
                     confInt=FALSE,
                     confLevel=0.95,
                     verbose=TRUE,
                     savePath=NULL,
                     slaveseed=NULL,
                     ...)

## S3 method for class 'coxph':
pec(object,...)
## S3 method for class 'glm':
pec(object,...)
## S3 method for class 'cph':
pec(object,...)
## S3 method for class 'prodlim':
pec(object,...)
## S3 method for class 'survfit':
pec(object,...)
## S3 method for class 'aalen':
pec(object,...)

Arguments

object
A named list of prediction models, where allowed entries are (1) R-objects for which a predictSurvProb method exists (see details), (2) a call that evaluates to such an R-object (see examples).
formula
A survival formula. The left hand side is used to find the status response variable in data. For right censored data, the right hand side of the formula is used to specify conditional censoring models. For example, setting Surv(time,status)~x1+x2 together with cens.model="cox" fits a Cox regression model for the censoring times with predictors x1 and x2.
data
A data frame in which to validate the prediction models and to fit the censoring model. If data is missing, the data of the first model in list is used.
times
A vector of time points. At each time point the prediction error curves are estimated. If exact==TRUE the times are merged with all the unique values of the response variable. If times is missing, the time points are determined by start, maxtime and exactness.
cause
For competing risks, the event of interest. Defaults to the first state of the response, which is obtained by evaluating the left hand side of formula in data.
start
Minimal time for estimating the prediction error curves. If missing and formula defines a Surv or Hist object then start defaults to 0, otherwise to the smallest observed value of the response.
maxtime
Maximal time for estimating the prediction error curves. If missing the largest value of the response variable is used.
exact
Logical. If TRUE estimate the prediction error curves at all the unique values of the response variable. If times are given and exact=TRUE then the times are merged with the unique values of the response.
exactness
An integer that determines how many equidistant gridpoints are used between start and maxtime. The default is 100.
fillChar
Symbol used to fill-in places where the values of the prediction error curves are not available. The default is NA.
cens.model
Method for estimating the inverse probability of censoring weights: cox: a semi-parametric Cox proportional hazards model is fitted to the censoring times; marginal: the Kaplan-Meier estimator for the censoring times.
ipcw.refit
If TRUE the inverse probability of censoring weights are estimated separately in each training set during cross-validation.
splitMethod
SplitMethod for estimating the prediction error curves. none/noPlan: assess the models in the same data where they are fitted. boot: deprecated. cvK: K-fold cross-validation, i.e. the data are split into K disjoint subsets and each subset is used once for validation while the models are trained on the remaining data.
B
Number of bootstrap samples. The default depends on argument splitMethod. When splitMethod is in c("BootCv","Boot632","Boot632plus") the default is 100. For splitMethod="cvK", B is the number of cross-validation runs.
M
The size of the bootstrap samples for resampling without replacement. Ignored for resampling with replacement.
reference
Logical. If TRUE add the marginal Kaplan-Meier prediction model as a reference to the list of models.
model.args
List of extra arguments that can be passed to the predictSurvProb methods. The list must have an entry for each entry in object.
model.parms
Experimental. List with exactly one entry for each entry in object. Each entry names parts of the value of the fitted models that should be extracted and added to the value.
keep.index
Logical. If FALSE remove the bootstrap or cross-validation index from the output list which otherwise is included in the splitMethod part of the output list.
keep.matrix
Logical. If TRUE add all B prediction error curves from bootstrapping or cross-validation to the output.
keep.models
Logical. If TRUE keep the models in object. If "Call" keep only the call of these models.
keep.residuals
Logical. If TRUE keep the patient individual residuals at testTimes.
keep.pvalues
For multiSplitTest. If TRUE keep the p-values from the single splits.
noinf.permute
If TRUE the noinformation error is approximated using permutation.
multiSplitTest
If TRUE the test proposed by van de Wiel et al. (2009) is applied. Requires subsampling bootstrap cross-validation, i.e. that splitMethod equals "BootCv" and that M is specified.
testIBS
A range of time points for testing differences between models in the integrated Brier scores.
testTimes
A vector of time points for testing differences between models in the time-point specific Brier scores.
confInt
Experimental.
confLevel
Experimental.
verbose
If TRUE report details of the progress, e.g. count the steps in cross-validation.
savePath
Place in your filesystem (directory) where training models fitted during cross-validation are saved. If missing training models are not saved.
slaveseed
Vector of seeds, as long as B, to be given to the slaves in parallel computing.
...
Not used.

Value

  • A pec object. See also the help pages of the corresponding print, summary, and plot methods. The object includes the following components:
  • PredErr: The estimated prediction error according to the splitMethod. A matrix where each column represents the estimated prediction error of a fit at the time points in times.
  • AppErr: The training error or apparent error obtained when the model(s) are evaluated in the same data where they were trained. Only if splitMethod is one of "NoInf", "cvK", "BootCv", "Boot632" or "Boot632plus".
  • BootCvErr: The prediction error when the model(s) are trained in the bootstrap sample and evaluated in the data that are not in the bootstrap sample. Only if splitMethod is one of "Boot632" or "Boot632plus". When splitMethod="BootCv" the BootCvErr is stored in the component PredErr.
  • NoInfErr: The prediction error when the model(s) are evaluated in the permuted data. Only if splitMethod is one of "BootCv", "Boot632", or "Boot632plus". For splitMethod="NoInf" the NoInfErr is stored in the component PredErr.
  • weight: The weight used to linearly combine the AppErr and the BootCvErr. Only if splitMethod is one of "Boot632" or "Boot632plus".
  • overfit: Estimated overfit of the model(s). See Efron & Tibshirani (1997, Journal of the American Statistical Association) and Gerds & Schumacher (2007, Biometrics). Only if splitMethod is one of "Boot632" or "Boot632plus".
  • call: The call that produced the object.
  • time: The time points at which the prediction error curves change.
  • ipcw.fit: The fitted censoring model that was used for re-weighting the Brier score residuals. See Gerds & Schumacher (2006, Biometrical Journal).
  • n.risk: The number of subjects at risk at all time points.
  • models: The prediction models fitted in their own data.
  • cens.model: The censoring models.
  • maxtime: The latest time point where the prediction error curves are estimated.
  • start: The earliest time point where the prediction error curves are estimated.
  • exact: TRUE if the prediction error curves are estimated at all unique values of the response in the full data.
  • splitMethod: The splitMethod used for estimation of the overfitting bias.

Details

Missing data in the response or in the input matrix cause a failure. The status of the continuous response variable at cutpoints (times), i.e. status=1 if the response value exceeds the cutpoint and status=0 otherwise, is compared to predicted event status probabilities which are provided by the prediction models on the basis of covariates. The comparison is done with the Brier score: the quadratic difference between 0-1 response status and predicted probability. There are two different sources of bias when estimating prediction error in right censored survival problems: censoring and high flexibility of the prediction model. The first is controlled by inverse probability of censoring weighting, the second can be controlled by special Monte Carlo simulation. In each step, the resampling procedures reevaluate the prediction model. Technically this is done by replacing the argument object$call$data by the current subset or bootstrap sample of the full data.
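The IPCW-weighted Brier score at a single time point can be sketched in a few lines of base R. This is a hand-rolled illustration of the idea described above, not the pec implementation; the simulated data, rates, and variable names are invented for the example, and the left limit G(T-) is approximated by G(T) for simplicity.

```r
library(survival)  # ships with R; provides Surv() and survfit()

set.seed(1)
n      <- 200
etime  <- rexp(n, rate = 0.10)           # latent event times
ctime  <- rexp(n, rate = 0.05)           # latent censoring times
time   <- pmin(etime, ctime)             # observed time
status <- as.numeric(etime <= ctime)     # 1 = event observed, 0 = censored

t0   <- 5                                # evaluation time point (cutpoint)
pred <- rep(exp(-0.10 * t0), n)          # predicted survival probability at t0

# Kaplan-Meier estimate of the censoring distribution G(t) = P(C > t),
# obtained by reversing the status indicator
G    <- survfit(Surv(time, 1 - status) ~ 1)
Ghat <- stepfun(G$time, c(1, G$surv))    # right-continuous step function

# IPCW weights (Graf et al. 1999): subjects with an event before t0 are
# weighted by 1/G(T_i), subjects still at risk at t0 by 1/G(t0);
# observations censored before t0 get weight 0
w <- numeric(n)
ev <- time <= t0 & status == 1
w[ev]        <- 1 / Ghat(time[ev])
w[time > t0] <- 1 / Ghat(t0)

y     <- as.numeric(time > t0)           # observed event-free status at t0
brier <- mean(w * (y - pred)^2)          # weighted Brier score at t0
```

With the true survival probability plugged in as prediction, the estimate should be close to S(t0)(1 - S(t0)); pec computes this at a whole grid of time points to obtain the prediction error curve.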

For each prediction model there must be a predictSurvProb method: for example, to assess a prediction model which evaluates to a myclass object one defines a function called predictSurvProb.myclass with arguments object, newdata, cutpoints, train.data, ...

Such a function takes the object which was fitted with train.data and derives a matrix with predicted event status probabilities for each subject in newdata (rows) and each cutpoint (column) of the response variable that defines an event status.
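As a sketch, such a method for a hypothetical class "myclass" could look like this. The toy exponential model and the stored rate element are invented for illustration; a real method would derive subject-specific probabilities from the fitted model in object.

```r
# Hypothetical predictSurvProb method, following the argument list above
predictSurvProb.myclass <- function(object, newdata, cutpoints, train.data, ...) {
  # Toy model: a single exponential rate stored in the fitted object.
  # Returns a matrix with one row per subject in newdata and one column
  # per cutpoint, filled column-wise with P(T > cutpoint).
  matrix(exp(-object$rate * rep(cutpoints, each = nrow(newdata))),
         nrow = nrow(newdata), ncol = length(cutpoints))
}

fit <- structure(list(rate = 0.1), class = "myclass")
p <- predictSurvProb.myclass(fit,
                             newdata   = data.frame(x = 1:3),
                             cutpoints = c(1, 5, 10))
dim(p)  # 3 subjects (rows) x 3 cutpoints (columns)
```

Once such a method is defined, objects of that class can be passed to pec in the object list like any of the built-in model classes.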

Currently, predictSurvProb methods are available for a number of classes, including coxph, glm, cph, prodlim, survfit and aalen objects; see the package documentation for the complete list.

References

Ulla B. Mogensen, Hemant Ishwaran, Thomas A. Gerds (2012). Evaluating Random Forests for Survival Analysis Using Prediction Error Curves. Journal of Statistical Software, 50(11), 1-23. URL http://www.jstatsoft.org/v50/i11/.

E. Graf et al. (1999). Assessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine, 18, 2529-2545.

Efron, Tibshirani (1997). Improvement on Cross-Validation: The .632+ Bootstrap Method. Journal of the American Statistical Association, 92, 548-560.

Gerds, Schumacher (2006). Consistent estimation of the expected Brier score in general survival models with right-censored event times. Biometrical Journal, 48, 1029-1040.

Thomas A. Gerds, Martin Schumacher (2007). Efron-Type Measures of Prediction Error for Survival Analysis. Biometrics, 63(4), 1283-1287. doi:10.1111/j.1541-0420.2007.00832.x

Martin Schumacher, Harald Binder, and Thomas Gerds. Assessment of survival prediction models based on microarray data. Bioinformatics, 23(14):1768-74, 2007.

Mark A. van de Wiel, Johannes Berkhof, and Wessel N. van Wieringen (2009). Testing the prediction error difference between 2 predictors. Biostatistics, 10(3), 550-560. doi:10.1093/biostatistics/kxp011

See Also

plot.pec, summary.pec, R2, crps

Examples

# simulate an artificial data frame
# with survival response and two predictors

set.seed(130971)
library(prodlim)
library(survival) # for Surv() and coxph()
dat <- SimSurv(300)

# fit some candidate Cox models and compute the Kaplan-Meier estimate 

Models <- list("Cox.X1"=coxph(Surv(time,status)~X1,data=dat,y=TRUE),
               "Cox.X2"=coxph(Surv(time,status)~X2,data=dat,y=TRUE),
               "Cox.X1.X2"=coxph(Surv(time,status)~X1+X2,data=dat,y=TRUE))

# compute the apparent prediction error 
PredError <- pec(object=Models,
                  formula=Surv(time,status)~X1+X2,
                  data=dat,
                  exact=TRUE,
                  cens.model="marginal",
                  splitMethod="none",
                  B=0,
                  verbose=TRUE)

print(PredError,times=seq(5,30,5))
summary(PredError)
plot(PredError,xlim=c(0,30))
# compute the .632+ estimate of the generalization error 
set.seed(17100)
PredError.632plus <- pec(object=Models,
                  formula=Surv(time,status)~X1+X2,
                  data=dat,
                  exact=TRUE,
                  cens.model="marginal",
                  splitMethod="Boot632plus",
                  B=100,
                  verbose=TRUE)

print(PredError.632plus,times=seq(5,30,5))
summary(PredError.632plus)
plot(PredError.632plus,xlim=c(0,30))
# do the same again but now parallelized
set.seed(17100)
library(doMC)
registerDoMC()
PredError.632plus <- pec(object=Models,
                  formula=Surv(time,status)~X1+X2,
                  data=dat,
                  exact=TRUE,
                  cens.model="marginal",
                  splitMethod="Boot632plus",
                  B=100,
                  verbose=TRUE)

# assessing parametric survival models in learn/validation setting
learndat <- SimSurv(500)
testdat <- SimSurv(300)
library(rms)
f1 <- psm(Surv(time,status)~X1+X2,data=learndat)
f2 <- psm(Surv(time,status)~X1,data=learndat)
pf <- pec(list(f1,f2),formula=Surv(time,status)~1,data=testdat,maxtime=200)
plot(pf)
summary(pf)



# ---------------- competing risks -----------------

library(survival)
library(riskRegression)
library(cmprsk)
library(pec)
data(pbc)
f1  <- CSC(Hist(time,status)~sex+edema,cause=2,data=pbc)
f1a  <- CSC(Hist(time,status)~sex+edema,cause=2,data=pbc,survtype="surv")
f2  <- CSC(Hist(time,status)~sex,data=pbc,cause=2)
f3  <- FGR(Hist(time,status)~sex+edema,cause=2,data=pbc)
f4  <- FGR(Hist(time,status)~sex+edema,cause=2,data=pbc)
p1 <- pec(list(f1,f1a,f2,f3,f4),formula=Hist(time,status)~1,data=pbc,cause=2)
