survest.cph: Cox Survival Estimates

Description

Compute survival probabilities and optional confidence limits for Cox survival models. If x=TRUE, y=TRUE were specified to cph, confidence limits use the correct formula for any combination of predictors. Otherwise, if surv=TRUE was specified to cph, confidence limits are based only on standard errors of log(S(t)) at the mean value of \(X\beta\). If the model contained only stratification factors, or if predictions are being requested near the mean of each covariable, this approximation will be accurate. Unless times is given, at most one observation may be predicted.
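The role of the centered linear predictor can be sketched in base R. Under proportional hazards, survival at a centered linear predictor value lp (where lp = 0 corresponds to the mean of \(X\beta\)) equals the underlying survival estimate raised to the power exp(lp). The values in `s0` are made up for illustration and do not come from a fitted model, and the helper name `surv_at_lp` is hypothetical; this is not the code survest itself runs.

```r
# Illustrative underlying survival estimates at times 2, 4, 6 for a subject
# whose centered linear predictor is 0 (made-up numbers, not from a real fit)
times <- c(2, 4, 6)
s0 <- c(0.95, 0.90, 0.84)

# Hypothetical helper: proportional hazards gives S(t | lp) = S0(t)^exp(lp)
surv_at_lp <- function(s0, lp, times) {
  out <- outer(exp(lp), s0, function(r, s) s^r)
  # Rows are subjects (one per lp value), columns are times
  dimnames(out) <- list(paste0("lp=", lp), paste0("t=", times))
  out
}

lp <- c(0, 0.5, 1)
S <- surv_at_lp(s0, lp, times)
print(round(S, 3))
```

The first row reproduces `s0` because exp(0) = 1; larger linear predictor values give uniformly lower survival.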

Usage

survest(fit, ...)
# S3 method for cph
survest(fit, newdata, linear.predictors, x, times, 
        fun, loglog=FALSE, conf.int=0.95, type, vartype,
        conf.type=c("log", "log-log", "plain", "none"), se.fit=TRUE,
        what=c('survival','parallel'),
        individual=FALSE, ...)

Arguments

fit

a model fit from cph

newdata

a data frame containing predictor variable combinations for which predictions are desired

linear.predictors

a vector of linear predictor values (centered) for which predictions are desired. If the model is stratified, the "strata" attribute must be attached to this vector (see example).

x

a design matrix at which to compute estimates, with any strata attached as a "strata" attribute. Only one of newdata, linear.predictors, or x may be specified. If none is specified, but times is specified, you will get survival predictions at all subjects' linear predictor and strata values.

times

a vector of times at which to get predictions. If omitted, predictions are made at all unique failure times in the original input data.

loglog

set to TRUE to make the log-log transformation of survival estimates and confidence limits.

fun

any function to transform the estimates and confidence limits (loglog is a special case)

conf.int

set to FALSE or 0 to suppress confidence limits, or to a confidence level such as 0.95 to compute 0.95 confidence limits

type

see survfit.coxph

vartype

see survfit.coxph

conf.type

specifies the basis for computing confidence limits. "log" is the default as in the survival package.

se.fit

set to TRUE to get standard errors of log predicted survival (no matter what conf.type is). If FALSE, confidence limits are suppressed.

individual

set to TRUE to have survfit interpret newdata as specifying a covariable path for a single individual (represented by multiple records).

what

Normally use what="survival" to estimate survival probabilities at times that may not correspond to the subjects' own times. what="parallel" assumes that the length of times is the number of subjects (or one), and causes survest to estimate the ith subject's survival probability at the ith value of times (or at the scalar value of times). what="parallel" is used by val.surv for example.

...

unused
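How conf.int, conf.type, se.fit, and loglog relate can be sketched in base R. The formulas below follow the usual conventions for limits built from a standard error of log survival; that survest uses exactly these expressions, including truncation to [0, 1], is an assumption. The inputs are made-up numbers and `conf_limits` is a hypothetical helper, not part of rms.

```r
# Illustrative survival estimates and standard errors of log(S(t))
surv <- c(0.90, 0.75, 0.60)
se   <- c(0.02, 0.04, 0.06)

conf.int <- 0.95
z <- qnorm((1 + conf.int) / 2)   # two-sided normal critical value

# Hypothetical helper: limits on three different scales
conf_limits <- function(surv, se, z, conf.type = c("log", "log-log", "plain")) {
  conf.type <- match.arg(conf.type)
  switch(conf.type,
    "log" = list(lower = surv * exp(-z * se),
                 upper = pmin(1, surv * exp(z * se))),
    "log-log" = {
      # SE of log(-log S) is se / |log S|; w is negative because log(surv) < 0
      w <- z * se / log(surv)
      list(lower = surv^exp(-w), upper = surv^exp(w))
    },
    "plain" = list(lower = pmax(0, surv - z * surv * se),
                   upper = pmin(1, surv + z * surv * se)))
}

ci_log    <- conf_limits(surv, se, z, "log")
ci_loglog <- conf_limits(surv, se, z, "log-log")
ci_plain  <- conf_limits(surv, se, z, "plain")

# The log-log transformation of the estimates themselves
ll <- log(-log(surv))
```

The log-log limits stay inside (0, 1) without truncation, which is the usual reason for preferring that scale when survival is close to 0 or 1.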

Value

If times is omitted, returns a list with the elements time, n.risk, n.event, surv, call (calling statement), and optionally std.err, upper, lower, conf.type, conf.int. The estimates in this case correspond to one subject. If times is specified, the returned list has possible components time, surv, std.err, lower, and upper. These will be matrices (except for time) if more than one subject is being predicted, with rows representing subjects and columns representing times. If times has only one time, these are reduced to vectors with the number of elements equal to the number of subjects.
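The layout described above, and the idea behind what="parallel", can be illustrated with a plain matrix. The numbers are invented for illustration; the indexing only mimics the concept of evaluating the ith subject at the ith time and is not how survest computes it (in particular, survest does not require the subjects' own times to lie on a common grid).

```r
# Illustrative matrix of survival estimates: rows = subjects, columns = times
times <- c(2, 4, 6)
surv <- matrix(c(0.95, 0.90, 0.84,
                 0.92, 0.85, 0.77,
                 0.88, 0.78, 0.67),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("subject", 1:3), as.character(times)))

# With a single time, the result is a vector with one element per subject
surv_at_4 <- surv[, "4"]

# Concept behind what="parallel": subject i is evaluated at the i-th time only
own_times <- c(6, 2, 4)
parallel <- surv[cbind(seq_len(nrow(surv)), match(own_times, times))]
print(parallel)
```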

Details

The result is passed through naresid if newdata, linear.predictors, and x are not specified, to restore placeholders for NAs.
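The effect of naresid can be seen with base R's own na.exclude mechanism. This is only an analogy: a cph fit carries its own na.action object, and the estimates below are made-up numbers rather than output from survest.

```r
# Four subjects, one with a missing response
d <- data.frame(y = c(1.2, 3.4, NA, 2.8), x = c(10, 20, 30, 40))

# na.exclude drops the incomplete row but records which row was dropped
mf <- model.frame(y ~ x, data = d, na.action = na.exclude)
omit <- attr(mf, "na.action")

# Illustrative estimates for the three subjects actually used
est <- c(0.91, 0.85, 0.78)

# naresid pads the result back to one element per original subject
padded <- naresid(omit, est)
print(padded)
```

The padded vector lines up row for row with the original data, with NA marking the subject that was excluded.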

Examples

# NOT RUN {
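library(rms)       # cph, survest, datadist, strat, pol (also attaches Hmisc for label and units)
library(survival)  # coxph, survfit, strata, used in the comparison at the end

# Simulate data from a population model in which the log hazard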
# function is linear in age and there is no age x sex interaction
# Proportional hazards holds for both variables but we
# unnecessarily stratify on sex to see what happens
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, TRUE))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
dd <- datadist(age, sex)
options(datadist='dd')
Srv <- Surv(dt,e)


f <- cph(Srv ~ age*strat(sex), x=TRUE, y=TRUE)  # or surv=TRUE
survest(f, expand.grid(age=c(20,40,60),sex=c("Male","Female")),
	    times=c(2,4,6), conf.int=.9)
f <- update(f, surv=TRUE)
lp <- c(0, .5, 1)
f$strata   # check strata names
attr(lp,'strata') <- rep(1,3)  # or rep('sex=Female',3)
survest(f, linear.predictors=lp, times=c(2,4,6))

# Test survest by comparing to survfit.coxph for a more complex model
f <- cph(Srv ~ pol(age,2)*strat(sex), x=TRUE, y=TRUE)
survest(f, data.frame(age=median(age), sex=levels(sex)), times=6)

age2 <- age^2
f2 <- coxph(Srv ~ (age + age2)*strata(sex))
new <- data.frame(age=median(age), age2=median(age)^2, sex='Male')
summary(survfit(f2, new), times=6)
new$sex <- 'Female'
summary(survfit(f2, new), times=6)

options(datadist=NULL)
# }
