Resampling Validation of a Fitted Model's Indexes of Fit

The validate function, when used on an object created by one of the rms series of fitting functions, performs resampling validation of a regression model, with or without backward step-down variable deletion.

models, methods, regression, survival
# fit <- fitting.function(formula=response ~ terms, x=TRUE, y=TRUE)
validate(fit, method="boot", B=40,
         bw=FALSE, rule="aic", type="residual", sls=0.05, aics=0, 
         pr=FALSE, ...)
fit: a fit derived by e.g. lrm, cph, psm, ols. The options x=TRUE and y=TRUE must have been specified.
method: may be "crossvalidation", "boot" (the default), ".632", or "randomization". See predab.resample for details. Can be abbreviated, e.g. "cross", "b", ".6".
B: number of repetitions. For method="crossvalidation", this is the number of groups of omitted observations.
bw: TRUE to do fast step-down using the fastbw function, for both the overall model and for each repetition. fastbw keeps parameters together that represent the same factor.
rule: applies if bw=TRUE. "aic" to use Akaike's information criterion as a stopping rule (i.e., a factor is deleted if the $\chi^2$ falls below twice its degrees of freedom), or "p" to use $P$-values.
type: "residual" or "individual". The stopping rule is for the residual $\chi^2$ for all variables deleted ("residual") or for individual factors ("individual").
sls: significance level for a factor to be kept in a model, or for judging the residual $\chi^2$.
aics: cutoff on AIC when rule="aic".
pr: TRUE to print results of each repetition.
...: parameters for each specific validate function, and parameters to pass to predab.resample (note especially the group, cluster, and subset parameters). For psm, you can pass
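
The "aic" stopping rule described for the step-down arguments can be illustrated with a small base-R sketch. The helper name drop_by_aic is hypothetical and not part of rms. It assumes that the AIC of a factor is its $\chi^2$ minus twice its degrees of freedom and that the factor is deleted when this value falls below the aics cutoff.

```r
# Hypothetical helper (not part of rms): TRUE when a factor would be deleted
# under the "aic" rule, assuming AIC = chi-square - 2 * d.f. and deletion
# when the AIC falls below the cutoff `aics`.
drop_by_aic <- function(chisq, df, aics = 0) {
  (chisq - 2 * df) < aics
}

drop_by_aic(chisq = 3.1, df = 2)  # chi-square below 2 * d.f.: deleted
drop_by_aic(chisq = 9.4, df = 2)  # chi-square above 2 * d.f.: kept
```

With the default aics=0 this reduces to comparing the $\chi^2$ with twice its degrees of freedom.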

The function provides bias-corrected indexes that are specific to each type of model. For validate.cph and validate.psm, see validate.lrm, which is similar. Both validate.cph and validate.psm accept an extra argument dxy, which if TRUE causes the rcorr.cens function to be invoked to compute the Somers' $D_{xy}$ rank correlation at each resample (this takes a bit longer than the likelihood-based statistics). The values corresponding to the row $D_{xy}$ are equal to $2 (C - 0.5)$, where $C$ is the C-index or concordance probability. For validate.cph with dxy=TRUE, you must specify an argument u if the model is stratified, since survival curves can then cross and $X\beta$ is not 1-1 with predicted survival. There is also a validate method for tree, which only does cross-validation and which has a different list of arguments.
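
As a quick illustration of the relationship between Somers' $D_{xy}$ and the C-index stated above, the following base-R sketch converts in both directions. The function names c_to_dxy and dxy_to_c are made up for this example and are not part of rms.

```r
# Illustrative helpers (not part of rms) for Dxy = 2 * (C - 0.5).
c_to_dxy <- function(C) 2 * (C - 0.5)
dxy_to_c <- function(dxy) dxy / 2 + 0.5

c_to_dxy(0.75)  # 0.5
dxy_to_c(0.5)   # 0.75
c_to_dxy(0.5)   # 0, i.e. no better than chance concordance
```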


  • a matrix with rows corresponding to the statistical indexes and columns for the original index, resample estimates, indexes applied to the whole or omitted sample using the model derived from the resample, average optimism, corrected index, and number of successful resamples.
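
How the columns of the returned matrix relate to one another can be sketched with a hand-rolled optimism bootstrap in base R. This is only an illustration of the idea for the $R^2$ of a linear model fitted with lm on simulated data, not the rms implementation. The data, the helper rsq, and the labels in the final vector are assumptions chosen to echo the columns described above.

```r
# Base-R sketch of the optimism bootstrap for R^2 of a linear model.
# Illustration only; this is not how rms implements validate().
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 0.5 * d$x1 + rnorm(n)

# R^2 of the predictions from `fit`, evaluated on `data`
rsq <- function(fit, data) {
  pred <- predict(fit, newdata = data)
  1 - sum((data$y - pred)^2) / sum((data$y - mean(data$y))^2)
}

orig_fit   <- lm(y ~ x1 + x2, data = d)
index_orig <- rsq(orig_fit, d)          # apparent (original) index

B <- 40
training <- test <- numeric(B)
for (b in seq_len(B)) {
  boot  <- d[sample(n, replace = TRUE), ]
  fit_b <- lm(y ~ x1 + x2, data = boot)
  training[b] <- rsq(fit_b, boot)       # index in the resample
  test[b]     <- rsq(fit_b, d)          # resample model applied to whole sample
}

optimism  <- mean(training - test)      # average optimism
corrected <- index_orig - optimism      # bias-corrected index
round(c(index.orig = index_orig, training = mean(training),
        test = mean(test), optimism = optimism,
        index.corrected = corrected, n = B), 3)
```

The corrected index is the original index minus the average optimism, which is the difference between how well each resample model does on its own resample and on the whole sample.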

Side Effects

prints a summary, and optionally statistics for each re-fit


  • model validation
  • predictive accuracy
  • bootstrap

See Also

validate.ols, validate.cph, validate.lrm, validate.rpart, predab.resample, fastbw, rms, rms.trans, calibrate

  • validate
# See examples for validate.cph, validate.lrm, validate.ols
# Example of validating a parametric survival model:

library(rms)   # provides psm and validate
n <- 1000
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, TRUE))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
S <- Surv(dt,e)

f <- psm(S ~ age*sex, x=TRUE, y=TRUE)  # Weibull model
# Validate full model fit
validate(f, B=10)                # usually B=150

# Validate stepwise model with typical (not so good) stopping rule
# bw=TRUE does not preserve hierarchy of terms at present
validate(f, B=10, bw=TRUE, rule="p", sls=.1, type="individual")
Documentation reproduced from package rms, version 2.0-2, License: GPL (>= 2)
