This is the version of the `validate` function specific to models fitted with `cph` or `psm`. Also included is a small function `dxy.cens` that retrieves \(D_{xy}\) and its standard error from the `survival` package's `survConcordance.fit` function. This allows for incredibly fast computation of \(D_{xy}\) or the c-index even for hundreds of thousands of observations. `dxy.cens` negates \(D_{xy}\) if log relative hazard is being predicted. If `y` is a left-censored `Surv` object, times are negated and a right-censored object is created, then \(D_{xy}\) is negated.

```
# fit <- cph(formula=Surv(ftime,event) ~ terms, x=TRUE, y=TRUE, ...)
# S3 method for cph
validate(fit, method="boot", B=40, bw=FALSE, rule="aic",
         type="residual", sls=.05, aics=0, force=NULL, estimates=TRUE,
         pr=FALSE, dxy=TRUE, u, tol=1e-9, ...)

# S3 method for psm
validate(fit, method="boot", B=40,
         bw=FALSE, rule="aic", type="residual", sls=.05, aics=0,
         force=NULL, estimates=TRUE, pr=FALSE,
         dxy=TRUE, tol=1e-12, rel.tolerance=1e-5, maxiter=15, ...)

dxy.cens(x, y, type=c('time','hazard'))
```

fit

a fit derived by `cph` (or by `psm` for the `psm` method). The options `x=TRUE` and `y=TRUE` must have been specified. If the model contains any stratification factors and `dxy=TRUE`, the options `surv=TRUE` and `time.inc=u` must also have been given, where `u` is the same value of `u` given to `validate`.

method

see `validate`

B

number of repetitions. For `method="crossvalidation"`, this is the number of groups of omitted observations.

rel.tolerance,maxiter,bw

Set `bw=TRUE` to do fast step-down using the `fastbw` function, for both the overall model and for each repetition. `fastbw` keeps parameters together that represent the same factor. `rel.tolerance` and `maxiter` appear only in the `psm` method.

rule

Applies if `bw=TRUE`. `"aic"` to use Akaike's information criterion as a stopping rule (i.e., a factor is deleted if the \(\chi^2\) falls below twice its degrees of freedom), or `"p"` to use \(P\)-values.

type

`"residual"` or `"individual"`: whether the stopping rule applies to the residual \(\chi^2\) for all variables deleted or to individual factors. For `dxy.cens`, specify `type="hazard"` if `x` is on the hazard or cumulative hazard (or their logs) scale, causing negation of the correlation index.

sls

significance level for a factor to be kept in a model, or for judging the residual \(\chi^2\).

aics

cutoff on AIC when `rule="aic"`.

force

see `fastbw`

estimates

see `print.fastbw`

pr

`TRUE` to print results of each repetition

tol,…

see `validate` or `predab.resample`

dxy

set to `TRUE` to validate Somers' \(D_{xy}\) using `dxy.cens`, which is fast for n up to about 500,000. Uses the `survival` package's `survConcordance.fit` service function for `survConcordance`.

u

must be specified if the model has any stratification factors and `dxy=TRUE`. In that case, strata are not included in \(X\beta\) and the survival curves may cross. Predictions at time `t=u` are correlated with observed survival times. Does not apply to `validate.psm`.

x

a numeric vector

y

a `Surv` object that may be uncensored or right-censored
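The two stopping rules described for `rule`, `sls`, and `aics` can be sketched in base R. This is a minimal illustration, not the `fastbw` implementation: `would_delete` is a hypothetical helper, and it assumes AIC is computed as \(\chi^2\) minus twice the degrees of freedom, consistent with the description of `rule="aic"`.

```r
# Hypothetical helper (not part of rms): would a factor be deleted under
# each stopping rule?  Assumes AIC = chi-square - 2 * d.f.
would_delete <- function(chisq, df, rule = c("aic", "p"),
                         sls = 0.05, aics = 0) {
  rule <- match.arg(rule)
  if (rule == "aic") {
    # delete when AIC falls below the cutoff (chi-square < 2 * d.f. if aics = 0)
    (chisq - 2 * df) < aics
  } else {
    # delete when the P-value exceeds the significance level for staying
    pchisq(chisq, df, lower.tail = FALSE) > sls
  }
}

would_delete(5, df = 2, rule = "aic")  # AIC = 1, not below the cutoff of 0
would_delete(5, df = 2, rule = "p")    # P is about 0.08, above sls = 0.05
```

The same factor can be retained under one rule and deleted under the other, which is why `rule` matters when `bw=TRUE`.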

matrix with rows corresponding to \(D_{xy}\), Slope, \(D\), \(U\), and \(Q\), and columns for the original index, resample estimates, indexes applied to whole or omitted sample using model derived from resample, average optimism, corrected index, and number of successful resamples.

The values corresponding to the row \(D_{xy}\) are equal to \(2 * (C - 0.5)\) where C is the C-index or concordance probability. If the user is correlating the linear predictor (predicted log hazard) with survival time, \(D_{xy}\) is automatically negated.
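The relation between \(D_{xy}\) and \(C\), and the reason for the sign flip on the hazard scale, can be checked on a toy data set. The sketch below is a brute-force illustration in base R, not the algorithm used by `dxy.cens`: `dxy_bruteforce` is a hypothetical helper, it treats a pair as usable only when the shorter observed time is an event, and it simply skips tied times.

```r
# Hypothetical helper (not part of rms): brute-force C and Dxy for
# right-censored data.  x is a prediction on the survival-time scale.
dxy_bruteforce <- function(x, ftime, event) {
  concordant <- 0; discordant <- 0; usable <- 0
  n <- length(x)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (ftime[i] == ftime[j]) next           # tied times skipped for simplicity
      short <- if (ftime[i] < ftime[j]) i else j
      long  <- if (short == i) j else i
      if (event[short] != 1) next              # shorter time censored: order unknown
      usable <- usable + 1
      if (x[long] > x[short]) {
        concordant <- concordant + 1
      } else if (x[long] < x[short]) {
        discordant <- discordant + 1
      }
    }
  }
  ties <- usable - concordant - discordant     # pairs tied on x count one half
  cindex <- (concordant + 0.5 * ties) / usable
  c(C = cindex, Dxy = 2 * (cindex - 0.5))
}

ftime <- c(2, 4, 6, 8, 10)
event <- c(1, 0, 1, 1, 0)
x     <- c(1, 5, 3, 9, 7)                      # larger x = longer predicted survival

dxy_bruteforce(x, ftime, event)                # C = 6/7, Dxy = 5/7
dxy_bruteforce(-x, ftime, event)               # hazard-like scale: Dxy = -5/7
```

Reversing the direction of the predictor reverses the sign of \(D_{xy}\), which is what the automatic negation compensates for.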

prints a summary, and optionally statistics for each re-fit (if `pr=TRUE`)

Statistics validated include the Nagelkerke \(R^2\), \(D_{xy}\), slope shrinkage, the discrimination index \(D\) [(model L.R. \(\chi^2\) - 1)/L], the unreliability index \(U\) = (difference in -2 log likelihood between uncalibrated \(X\beta\) and \(X\beta\) with overall slope calibrated to test sample) / L, and the overall quality index \(Q = D - U\). \(g\) is the \(g\)-index on the log relative hazard (linear predictor) scale. L is -2 log likelihood with beta=0. The "corrected" slope can be thought of as a shrinkage factor that takes overfitting into account. See `predab.resample` for the list of resampling methods.
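In symbols, the index definitions above may be written as follows. The notation is introduced here only for illustration and is an assumption, not part of the original text: \(\ell(\cdot)\) denotes the log likelihood evaluated on the test sample, \(\chi^2_{\mathrm{LR}}\) the model likelihood ratio statistic, and \(\hat{\gamma}\) the overall slope obtained by calibrating \(X\beta\) to the test sample.

\[
L = -2\,\ell(0), \qquad
D = \frac{\chi^2_{\mathrm{LR}} - 1}{L}, \qquad
U = \frac{-2\,\ell(X\beta) - \left[-2\,\ell(\hat{\gamma}\,X\beta)\right]}{L}, \qquad
Q = D - U
\]

Under this reading, \(U\) is zero when the uncalibrated linear predictor already has slope one in the test sample, and \(Q\) penalizes discrimination by the amount of miscalibration.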

`validate`, `predab.resample`, `fastbw`, `rms`, `rms.trans`, `calibrate`, `rcorr.cens`, `cph`, `survival-internal`, `gIndex`, `survConcordance`

```
library(rms)

n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, TRUE))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
S <- Surv(dt,e)

f <- cph(S ~ age*sex, x=TRUE, y=TRUE)
# Validate full model fit
validate(f, B=10)               # normally B=150

# Validate a model with stratification.  Dxy is the only
# discrimination measure for such models, but Dxy requires
# one to choose a single time at which to predict S(t|X)
f <- cph(S ~ rcs(age)*strat(sex),
         x=TRUE, y=TRUE, surv=TRUE, time.inc=2)
validate(f, u=2, B=10)          # normally B=150
# Note u=time.inc
```