predab.resample is a general-purpose function that is used by functions for
specific models. It computes estimates of optimism of, and bias-corrected
estimates of a vector of indexes of predictive accuracy, for a model with a
specified design matrix, with or without fast backward step-down of predictors.
If bw=TRUE, the design matrix x must have been created by ols, lrm, or cph.
If bw=TRUE, predab.resample prints a matrix of asterisks showing which
factors were selected at each repetition, along with a frequency distribution
of the number of factors retained across re-samples.

predab.resample(fit.orig, fit, measure,
                method=c("boot","crossvalidation",".632","randomization"),
                bw=FALSE, B=50, pr=FALSE,
                rule="aic", type="residual", sls=.05, aics=0,
                strata=FALSE, tol=1e-12, non.slopes.in.x=TRUE, kint=1,
                cluster, subset, group=NULL, ...)

fit.orig: [...] x=TRUE and y=TRUE options specified to the model fitting
function. This model should be the FULL model including all candidate
variables ever excluded because of poor asso[...]

fit: [...] x, y, iter, penalty, penalty.matrix, xcol, and other arguments
passed to [...]

measure: [...] method=".632" or method="crossval", it will make the most sense
for measure to compute only indexes that are independent of sample size.
The me[...]

method: [...] "boot" for ordinary bootstrapping (Efron, 1983, Eq. 2.10).
Use ".632" for Efron's .632 method (Efron, 1983, Section 6 and Eq. 6.10),
"crossvalidation" for grouped cross-validation, [...]

bw: [...] TRUE to do fast backward step-down for each training sample.
Default is FALSE.

B: [...] method="crossvalidation", this is also the number of groups the
original sample is split into.

pr: [...] TRUE to print results for each sample. Default is FALSE.

rule: [...] "aic" or "p". Default is "aic" to use Akaike's information
criterion.

type: [...] "residual" (the default) or "individual".

sls: [...] rule="p". Default is .05.

aics: [...] rule="aic". Stops deleting factors when chi-square - 2 times d.f.
falls below aics. Default is 0.

strata: [...] TRUE if fit.orig has an x element that contains a "strata"
attribute which is a vector that should be sampled the same way as the
observations in x and y.

tol: [...] fit and fastbw.

non.slopes.in.x: [...] FALSE if the design matrix x does not have columns for
intercepts and these columns are needed.

kint: [...] kint. This affects the linear predictor that is passed to measure.

cluster: [...] method="boot". If it is present, the bootstrap is done using
sampling with replacement from the clusters rather than from the original
records. If this vector is not the s[...]

subset: [...] measure function compute measures of accuracy on a subset of the
data. The whole dataset is still used for all model development.
For example, yo[...]

[...] fit and measure.

[...] measure, and the following columns:

[...] training-test except for method=".632" - is .632 times (index.orig - test)

[...] index.orig-optimism

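The relationships quoted above (optimism as training-test, and a corrected index as index.orig-optimism) can be illustrated with a small base R sketch. This is not the rms implementation: the data are simulated, the model is an ordinary lm fit, and the names r2, boot_indices, boot_optimism, and the label corrected are hypothetical choices made for this illustration. The optional cluster argument mimics sampling whole clusters with replacement instead of individual records.

```r
# Minimal sketch of ordinary bootstrap optimism correction (illustration only,
# not the rms code). All function and variable names are assumptions.
set.seed(1)
n <- 100
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 0.5 * d$x1 + rnorm(n)

# Accuracy index: squared correlation between predictions and outcomes
r2 <- function(fit, data) cor(predict(fit, newdata = data), data$y)^2

# Row indices for one bootstrap sample; whole clusters are drawn if given
boot_indices <- function(n, cluster = NULL) {
  if (is.null(cluster)) return(sample(n, replace = TRUE))
  u <- unique(cluster)
  ids <- u[sample(length(u), replace = TRUE)]
  unlist(lapply(ids, function(id) which(cluster == id)))
}

boot_optimism <- function(data, B = 50, cluster = NULL) {
  full <- lm(y ~ x1 + x2, data = data)
  index.orig <- r2(full, data)          # apparent accuracy of the full fit
  training <- test <- numeric(B)
  for (i in seq_len(B)) {
    b <- data[boot_indices(nrow(data), cluster), ]
    fit_b <- lm(y ~ x1 + x2, data = b)
    training[i] <- r2(fit_b, b)         # accuracy in the bootstrap sample
    test[i]     <- r2(fit_b, data)      # same fit applied to the original data
  }
  optimism <- mean(training - test)
  c(index.orig = index.orig, training = mean(training), test = mean(test),
    optimism = optimism, corrected = index.orig - optimism)
}

res <- boot_optimism(d, B = 50)
print(round(res, 3))
```

Passing, for example, cluster = rep(1:20, each = 5) resamples 20 clusters of 5 records each rather than 100 separate records.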
For method=".632", the program stops with an error unless every observation
is omitted from at least one bootstrap sample. Efron's ".632" method
was developed for measures that are formulated in terms of per-observation
contributions. In general, error measures (e.g., ROC areas) cannot be
written in this way, so this function uses a heuristic extension to
Efron's formulation in which it is assumed that the average error measure
omitting the ith observation is the same as the average error measure
omitting any other observation. Then weights are derived for each bootstrap
repetition and weighted averages over the B repetitions can easily be computed.

rms, validate, fastbw, lrm, ols, cph, bootcov

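The omission requirement for method=".632" can be checked in a few lines of base R. The sketch below only illustrates that check and the stated optimism formula, .632 times (index.orig - test). It does not reproduce the per-repetition weights that predab.resample derives, and the names draws, omitted, times_omitted, and optimism_632, as well as the values 0.80 and 0.70, are made up for this illustration.

```r
# Illustration only: omission check and the simple .632 optimism formula.
set.seed(2)
n <- 30   # number of observations
B <- 50   # number of bootstrap repetitions

# Each column holds the row indices drawn in one bootstrap repetition
draws <- replicate(B, sample(n, replace = TRUE))

# omitted[i, b] is TRUE when observation i does not appear in repetition b
omitted <- sapply(seq_len(B), function(b) !(seq_len(n) %in% draws[, b]))

# The .632 approach needs every observation to be left out at least once
times_omitted <- rowSums(omitted)
if (any(times_omitted == 0)) {
  stop("some observations were never omitted; increase B")
}

# Optimism for ".632": .632 times (index.orig - test)
optimism_632 <- function(index.orig, test) 0.632 * (index.orig - test)
optimism_632(index.orig = 0.80, test = 0.70)   # hypothetical index values
```

With small B relative to n, the check is more likely to fail, which is one reason to choose a larger B for this method.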
# See the code for validate.ols for an example of the use of
# predab.resample
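Since predab.resample is normally reached through a model-specific function, a typical call goes through validate rather than predab.resample itself. The sketch below assumes the rms package is installed and uses simulated data; the wrapper run_validate_sketch and the variables x1, x2, x3 are hypothetical names introduced only for this example.

```r
# Sketch only: assumes rms is installed; skips quietly if it is not.
run_validate_sketch <- function(B = 20) {
  if (!requireNamespace("rms", quietly = TRUE)) {
    message("rms is not installed; skipping")
    return(invisible(NULL))
  }
  suppressPackageStartupMessages(library(rms))
  set.seed(3)
  n <- 200
  d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
  d$y <- rbinom(n, 1, plogis(0.8 * d$x1 - 0.5 * d$x2))
  # x=TRUE and y=TRUE store the design matrix and response in the fit,
  # which resampling-based validation needs
  f <- lrm(y ~ x1 + x2 + x3, data = d, x = TRUE, y = TRUE)
  # Bootstrap validation with fast backward step-down in each resample
  validate(f, method = "boot", B = B, bw = TRUE, rule = "aic")
}

v <- run_validate_sketch()
if (!is.null(v)) print(v)
```

With bw=TRUE the call also prints which factors were retained in each repetition, as described at the top of this page.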