validate
Resampling Validation of a Fitted Model's Indexes of Fit
The validate function, when used on an object created by one of the rms series, does resampling validation of a regression model, with or without backward step-down variable deletion.
- Keywords
- models, methods, regression, survival
Usage
# fit <- fitting.function(formula=response ~ terms, x=TRUE, y=TRUE)
validate(fit, method="boot", B=40,
         bw=FALSE, rule="aic", type="residual", sls=0.05, aics=0,
         force=NULL, estimates=TRUE, pr=FALSE, ...)
# S3 method for validate
print(x, digits=4, B=Inf, ...)
# S3 method for validate
latex(object, digits=4, B=Inf, file='', append=FALSE,
      title=first.word(deparse(substitute(x))),
      caption=NULL, table.env=FALSE,
      size='normalsize', extracolsize=size, ...)
# S3 method for validate
html(object, digits=4, B=Inf, caption=NULL, ...)
Arguments
- fit
a fit derived by e.g. lrm, cph, psm, ols. The options x=TRUE and y=TRUE must have been specified.
- method
may be "crossvalidation", "boot" (the default), ".632", or "randomization". See predab.resample for details. Can abbreviate, e.g. "cross", "b", ".6".
- B
number of repetitions. For method="crossvalidation", B is the number of groups of omitted observations. For print.validate, latex.validate, and html.validate, B is an upper limit on the number of resamples for which information is printed about which variables were selected in each model re-fit. Specify zero to suppress printing. Default is to print all re-samples.
- bw
TRUE to do fast step-down using the fastbw function, for both the overall model and for each repetition. fastbw keeps parameters together that represent the same factor.
- rule
Applies if bw=TRUE. "aic" to use Akaike's information criterion as a stopping rule (i.e., a factor is deleted if the \(\chi^2\) falls below twice its degrees of freedom), or "p" to use \(P\)-values.
- type
"residual" or "individual": whether the stopping rule applies to the residual \(\chi^2\) for all variables deleted or to individual factors.
- sls
significance level for a factor to be kept in a model, or for judging the residual \(\chi^2\).
- aics
cutoff on AIC when rule="aic".
- force
see fastbw
- estimates
see print.fastbw
- pr
TRUE to print results of each repetition
- ...
parameters for each specific validate function, and parameters to pass to predab.resample (note especially the group, cluster, and subset parameters). For latex, optional arguments to latex.default. Ignored for html.validate.
For psm, you can pass the maxiter parameter here (passed to survreg.control, default is 15 iterations) as well as a tol parameter for judging matrix singularity in solvet (default is 1e-12) and a rel.tolerance parameter that is passed to survreg.control (default is 1e-5).
For print.validate, ... is ignored.
- x, object
an object produced by one of the validate functions
- digits
number of decimal places to print
- file
file to write LaTeX output. Default is standard output.
- append
set to TRUE to append LaTeX output to an existing file
- title, caption, table.env, extracolsize
see latex.default. If table.env is FALSE and caption is given, the character string contained in caption will be placed before the table, centered.
- size
size of LaTeX output. Default is 'normalsize'. Must be a defined LaTeX size when prepended by double slash.
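The two stopping rules described for rule and sls can be illustrated with a small base-R sketch. The helper names delete_by_aic and delete_by_p are hypothetical and are not part of rms. The comparison of \(\chi^2 - 2 \times \mathrm{d.f.}\) against aics is an assumption about how the cutoff generalizes the "falls below twice its degrees of freedom" rule; consult fastbw for the actual behavior.

```r
# Hypothetical helpers illustrating the two stopping rules; not rms code.

# rule="aic": a factor is deleted if its chi-square falls below twice its d.f.
# Assumption: the aics cutoff is compared against chisq - 2*df (aics=0 default).
delete_by_aic <- function(chisq, df, aics = 0) {
  (chisq - 2 * df) < aics
}

# rule="p": a factor is deleted if its P-value exceeds sls, the
# significance level for a factor to be kept in the model.
delete_by_p <- function(chisq, df, sls = 0.05) {
  pchisq(chisq, df, lower.tail = FALSE) > sls
}

chisq <- c(1.2, 9.5, 3.9)   # made-up Wald chi-squares for three factors
df    <- c(1, 3, 2)         # their degrees of freedom

aic_del <- delete_by_aic(chisq, df)
p_del   <- delete_by_p(chisq, df)
print(data.frame(chisq, df, aic_del, p_del))
```

With these made-up numbers both rules would delete the first and third factors and keep the second; with other inputs the two rules can disagree.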
Details
validate provides bias-corrected indexes that are specific to each type of model. For validate.cph and validate.psm, see validate.lrm, which is similar.
For validate.cph and validate.psm, there is an extra argument dxy, which if TRUE causes the dxy.cens function to be invoked to compute the Somers' \(D_{xy}\) rank correlation at each resample. The values corresponding to the row \(D_{xy}\) are equal to \(2 \times (C - 0.5)\), where \(C\) is the C-index or concordance probability.
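The relationship between \(C\) and \(D_{xy}\) can be checked by hand on a tiny data set. The following is a minimal base-R sketch for uncensored outcomes only; the function c_index is a hypothetical helper written for illustration, not the algorithm used by dxy.cens, which must also account for censoring.

```r
# Hypothetical helper: concordance probability C for uncensored data.
# Among pairs with different outcomes, C is the fraction in which the higher
# prediction goes with the higher outcome (tied predictions count 1/2).
c_index <- function(pred, y) {
  idx <- combn(length(y), 2)                 # all unordered pairs
  dp  <- pred[idx[1, ]] - pred[idx[2, ]]     # difference in predictions
  dy  <- y[idx[1, ]] - y[idx[2, ]]           # difference in outcomes
  usable <- dy != 0                          # drop pairs tied on the outcome
  conc <- sign(dp[usable]) == sign(dy[usable])
  ties <- dp[usable] == 0
  (sum(conc & !ties) + 0.5 * sum(ties)) / sum(usable)
}

pred <- c(0.1, 0.4, 0.35, 0.8)   # made-up predictions
y    <- c(1, 2, 3, 4)            # made-up outcomes

C   <- c_index(pred, y)          # 5 of 6 pairs are concordant
Dxy <- 2 * (C - 0.5)
print(c(C = C, Dxy = Dxy))
```

Here \(C = 5/6\), so \(D_{xy} = 2/3\). A model with no discrimination has \(C = 0.5\) and \(D_{xy} = 0\).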
For validate.cph with dxy=TRUE, you must specify an argument u if the model is stratified, since survival curves can then cross and \(X\beta\) is not 1-1 with predicted survival.
There is also a validate method for tree, which only does cross-validation and which has a different list of arguments.
Value
a matrix with rows corresponding to the statistical indexes and columns for the original index, resample estimates, indexes applied to the whole or omitted sample using the model derived from the resample, average optimism, corrected index, and number of successful re-samples.
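How the columns relate to one another can be seen in a minimal base-R sketch of an optimism bootstrap for a single index (\(R^2\) of an ordinary linear model). This is an illustration under stated assumptions, not the code inside validate or predab.resample: the data are simulated, the helper r2 is hypothetical, and the element names in the final vector are chosen only to mirror the columns described above.

```r
set.seed(1)
n <- 100
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 0.5 * d$x1 + rnorm(n)             # x2 is pure noise

# Hypothetical helper: R^2 of predictions from `fit` evaluated on `data`
r2 <- function(fit, data) {
  1 - sum((data$y - predict(fit, data))^2) / sum((data$y - mean(data$y))^2)
}

full <- lm(y ~ x1 + x2, data = d)
index_orig <- r2(full, d)                    # apparent (original) index

B <- 40
training <- test <- numeric(B)
for (b in seq_len(B)) {
  boot <- d[sample(n, replace = TRUE), ]     # bootstrap resample
  fb   <- lm(y ~ x1 + x2, data = boot)       # re-fit in the resample
  training[b] <- r2(fb, boot)                # index in the resample itself
  test[b]     <- r2(fb, d)                   # same model applied to the whole sample
}

optimism        <- mean(training) - mean(test)   # average optimism
index_corrected <- index_orig - optimism         # bias-corrected index

print(round(c(index.orig = index_orig, training = mean(training),
              test = mean(test), optimism = optimism,
              index.corrected = index_corrected, n = B), 4))
```

The corrected index is the original index minus the average optimism, which is why a model that overfits shows a larger gap between the resample and whole-sample columns.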
Side Effects
prints a summary, and optionally statistics for each re-fit
See Also
validate.ols, validate.cph, validate.lrm, validate.rpart, predab.resample, fastbw, rms, rms.trans, calibrate, dxy.cens, survConcordance
Examples
# See examples for validate.cph, validate.lrm, validate.ols
# Example of validating a parametric survival model:
library(rms)  # psm, validate, Surv; label() and units() come from Hmisc
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, TRUE))
cens <- 15*runif(n)                            # censoring times
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))  # subject-specific hazard
dt <- -log(runif(n))/h                         # exponential event times
e <- ifelse(dt <= cens,1,0)                    # 1 = event observed, 0 = censored
dt <- pmin(dt, cens)                           # observed follow-up time
units(dt) <- "Year"
S <- Surv(dt,e)
f <- psm(S ~ age*sex, x=TRUE, y=TRUE) # Weibull model
# Validate full model fit
validate(f, B=10) # usually B=150
# Validate stepwise model with typical (not so good) stopping rule
# bw=TRUE does not preserve hierarchy of terms at present
validate(f, B=10, bw=TRUE, rule="p", sls=.1, type="individual")