rms (version 2.0-2)

calibrate: Resampling Model Calibration

Description

Uses bootstrapping or cross-validation to get bias-corrected (overfitting- corrected) estimates of predicted vs. observed values based on subsetting predictions into intervals (for survival models) or on nonparametric smoothers (for other models). There are calibration functions for Cox (cph), parametric survival models (psm), binary and ordinal logistic models (lrm) and ordinary least squares (ols). For survival models, "predicted" means predicted survival probability at a single time point, and "observed" refers to the corresponding Kaplan-Meier survival estimate, stratifying on intervals of predicted survival. For logistic and linear models, a nonparametric calibration curve is estimated over a sequence of predicted values. The fit must have specified x=TRUE, y=TRUE. The print and plot methods for lrm and ols models (which use calibrate.default) print the mean absolute error in predictions, the mean squared error, and the 0.9 quantile of the absolute error. Here, error refers to the difference between the predicted values and the corresponding bias-corrected calibrated values.

Below, the second, third, and fourth invocations of calibrate are, respectively, for ols and lrm, cph, and psm. The first and second plot invocation are respectively for lrm and ols fits or all other fits.

Usage

calibrate(fit, ...)
## S3 method for class 'default':
calibrate(fit, predy, 
  method=c("boot","crossvalidation",".632","randomization"),
  B=40, bw=FALSE, rule=c("aic","p"),
  type=c("residual","individual"),
  sls=.05, pr=FALSE, kint, smoother="lowess", ...)
## S3 method for class 'cph':
calibrate(fit, method="boot", u, m=150, cuts, B=40, 
  bw=FALSE, rule="aic", type="residual", sls=0.05, aics=0, 
  pr=FALSE, what="observed-predicted", tol=1e-12, \dots)
## S3 method for class 'psm':
calibrate(fit, method="boot", u, m=150, cuts, B=40,
  bw=FALSE,rule="aic",
  type="residual",sls=.05,aics=0,
  pr=FALSE,what="observed-predicted",tol=1e-12, maxiter=15, 
  rel.tolerance=1e-5, \dots)

## S3 method for class 'calibrate': print(x, \dots) ## S3 method for class 'calibrate.default': print(x, \dots)

## S3 method for class 'calibrate': plot(x, xlab, ylab, subtitles=TRUE, conf.int=TRUE, cex.subtitles=.75, \dots)

## S3 method for class 'calibrate.default': plot(x, xlab, ylab, xlim, ylim, legend=TRUE, subtitles=TRUE, scat1d.opts=NULL, \dots)

Arguments

fit
a fit from ols, lrm, cph or psm
x
an object created by calibrate
method
B
bw
rule
type
sls
aics
u
the time point for which to validate predictions for survival models. For cph fits, you must have specified surv=TRUE, time.inc=u, where u is the constant specifying the time to predict.
m
group predicted u-time units survival into intervals containing m subjects on the average (for survival models only)
cuts
actual cut points for predicted survival probabilities. You may specify only one of m and cuts (for survival models only)
pr
set to TRUE to print intermediate results for each re-sample
what
The default is "observed-predicted", meaning to estimate optimism in this difference. This is preferred as it accounts for skewed distributions of predicted probabilities in outer intervals. You can also specify "observed". This
tol
criterion for matrix singularity (default is 1e-12)
maxiter
for psm, this is passed to survreg.control (default is 15 iterations)
rel.tolerance
parameter passed to survreg.control for psm (default is 1e-5).
predy
a scalar or vector of predicted values to calibrate (for lrm, ols). Default is 50 equally spaced points between the 5th smallest and the 5th largest predicted values. For lrm the predicted values are probabilities
kint
For an ordinal logistic model the default predicted probability that $Y\geq$ the middle level. Specify kint to specify the intercept to use, e.g., kint=2 means to calibrate $Prob(Y\geq b)$, where $b$ is the second level of $Y$
smoother
a function in two variables which produces $x$- and $y$-coordinates by smoothing the input y. The default is to use lowess(x, y, iter=0).
...
other arguments to pass to predab.resample, such as group, cluster, and subset. Also, other arguments for plot.
xlab
defaults to "Predicted x-units Survival" or to a suitable label for other models
ylab
defaults to "Fraction Surviving x-units" or to a suitable label for other models
xlim
ylim
2-vectors specifying x- and y-axis limits, if not using defaults
subtitles
set to FALSE to suppress subtitles in plot describing method and for lrm and ols the mean absolute error and original sample size
conf.int
set to FALSE to suppress plotting 0.95 confidence intervals for Kaplan-Meier estimates
cex.subtitles
character size for plotting subtitles
legend
set to FALSE to suppress legends (for lrm, ols only) on the calibration plot, or specify a list with elements x and y containing the coordinates of the upper left corner of the legend. By d
scat1d.opts
a list containing additional arguments to scat1d

Value

  • matrix specifying mean predicted survival in each interval, the corresponding estimated bias-corrected Kaplan-Meier estimates, number of subjects, and other statistics. For linear and logistic models, the matrix instead has rows corresponding to the prediction points, and the vector of predicted values being validated is returned as an attribute. The returned object has class "calibrate" or "calibrate.default".

Side Effects

prints, and stores an object pred.obs or .orig.cal

concept

  • bootstrap
  • model validation
  • calibration
  • model reliability
  • predictive accuracy

Details

If the fit was created using penalized maximum likelihood estimation, the same penalty and penalty.scale parameters are used during validation.

See Also

validate, predab.resample, groupkm, errbar, scat1d, cph, psm, lowess

Examples

Run this code
set.seed(1)
d.time <- rexp(200)
x1 <- runif(200)
x2 <- factor(sample(c('a','b','c'),200,TRUE))
f <- cph(Surv(d.time) ~ pol(x1,2)*x2, x=TRUE, y=TRUE, surv=TRUE, time.inc=2)
#or f <- psm(S ~ \dots)
cal <- calibrate(f, u=2, m=50, B=20)  # usually B=200 or 300
plot(cal)


y <- sample(0:2, 200, TRUE)
x1 <- runif(200)
x2 <- runif(200)
x3 <- runif(200)
x4 <- runif(200)
f <- lrm(y ~ x1+x2+x3*x4, x=TRUE, y=TRUE)
cal <- calibrate(f, kint=2, predy=seq(.2,.8,length=60), 
                 group=y)
# group= does k-sample validation: make resamples have same 
# numbers of subjects in each level of y as original sample


plot(cal)
#See the example for the validate function for a method of validating
#continuation ratio ordinal logistic models.  You can do the same
#thing for calibrate

Run the code above in your browser using DataCamp Workspace