survmean: Compute mean survival times using extrapolation

Description

Compute mean survival times using extrapolation

Usage

survmean(data, surv.breaks = NULL, by.vars = NULL, pophaz = NULL, r = 1,
  agegr.w.breaks = NULL, agegr.w.weights = NULL, ext.breaks = NULL,
  subset = NULL, ...)

Arguments

data

a data set of split records; e.g. output of lexpand

surv.breaks

passed on to survtab; see that help

by.vars

a character vector of variable names; e.g. by.vars = "sex" will calculate mean survival separately for each unique combination of these variables

pophaz

a data set of appropriate population hazards as given to lexpand; will be used in extrapolation - see Details

r

a numeric of length one; multiplies population hazard in pophaz by this number; used e.g. r = 1.1 if excess hazard of 10 percent should persist in extrapolation

agegr.w.breaks

a numeric vector of fractional years as [a,b) breaks as in survtab; will be used to determine the age groups for standardization

agegr.w.weights

a numeric vector of weights as in survtab; will be used to standardize by age group

ext.breaks

advanced; a list of breaks (see lexpand); used as breaks for extrapolation; see Details

subset

a logical condition; e.g. subset = sex == 1; subsets the data before computations

...

any other arguments passed on to survtab such as surv.method = "lifetable" for actuarial estimates of observed survival

Details

survmean computes mean survival times. This is done using a) observed survival estimates computed with survtab and b) extrapolated survival probabilities, based on Ederer I method expected survivals, for subjects surviving beyond the roof of surv.breaks. By default the extrapolation runs up to 100 years forward but only up to the 125th birthday. The area under the resulting extrapolated curve, computed via trapezoidal integration, is the mean survival time.

For extrapolation, the user must supply a pophaz data set of population hazards. The extrapolation itself is essentially done by splitting the extrapolated observations and merging population hazards to those records using lexpand.

The user may compute age-standardized mean survival time estimates using the agegr.w.breaks and agegr.w.weights parameters, though this is also fairly simple to do by hand by using the by.vars argument and merging in the weights yourself.

Note that mean survival is based by default on hazard-based estimates of observed survival as outlined in survtab. Unlike with actuarial estimates, observed survival can never fall to zero using this method. However, the bias caused by this is likely to be small, and hazard-based estimation allows for e.g. period method estimates of mean survival time.

Extrapolation tweaks

One may tweak the accuracy and length of extrapolation by using ext.breaks. By default the survivals of any survivors beyond the roof of surv.breaks are extrapolated up to 100 years from the roof of surv.breaks or up to their 125th birthday, whichever comes first. The extrapolation is by default based on the assumption that population hazards supplied by pophaz are constant in time periods of length 1/12, 0.25, or 1 years: if ext.breaks = NULL, it is internally substituted by

list(fot = c(0:6/12, 0.75, 1:100), age = c(0, 125))

which is then supplied internally to a lexpand call. Hence, alternate specifications allow for longer/shorter and more/less accurate extrapolations. E.g.

ext.breaks = list(fot = seq(0,100,1/12), age = 0:125, per = 1900:2100)

would ensure a smooth extrapolation and full use of pophaz. This will probably not produce results much different from the default, though.
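The following base R sketch illustrates the idea described in Details on a single aggregate curve: an observed survival curve is extended beyond the roof with expected survival derived from piecewise-constant population hazards (multiplied by r), and the area under the combined curve is computed with the trapezoidal rule. It is only an illustration under stated assumptions and not the internal code of survmean, which extrapolates subject-level records via lexpand. All survival and hazard values, and the names t_obs, s_obs, pop_haz and trapezoid, are invented for the example.

```r
# Illustrative sketch only; NOT popEpi's internal implementation.
# All numbers below are invented for demonstration.

# Hypothetical observed survival on a yearly grid up to a 10-year roof
t_obs <- 0:10
s_obs <- c(1.00, 0.90, 0.83, 0.78, 0.74, 0.71, 0.68, 0.66, 0.64, 0.62, 0.60)

# Follow-up breaks quoted in Details as the default for extrapolation
fot <- c(0:6/12, 0.75, 1:100)
widths <- diff(fot)  # interval lengths: 1/12, 0.25 or 1 years

# Hypothetical population hazard, assumed constant within each interval
# and increasing with time since the roof
pop_haz <- 0.02 * exp(0.08 * head(fot, -1))
r <- 1.1  # population hazard multiplied by r, as with the r argument

# Expected survival conditional on being alive at the roof:
# exp(-cumulative hazard) under piecewise-constant hazards
cum_haz <- cumsum(r * pop_haz * widths)
s_ext <- c(1, exp(-cum_haz))

# Join the observed curve and the extrapolated tail
t_all <- c(t_obs, max(t_obs) + fot[-1])
s_all <- c(s_obs, tail(s_obs, 1) * s_ext[-1])

# Trapezoidal rule: sum of interval width times average of end-point heights
trapezoid <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

mean_surv <- trapezoid(t_all, s_all)
mean_surv
```

Increasing r makes the extrapolated tail fall faster and therefore lowers the resulting mean; a finer fot grid mainly changes how closely the piecewise-constant hazard follows the underlying rates.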

Examples


library(popEpi)

## take first 5000 subjects in sire data for demonstration
sr <- sire[1:5000, ]
sr$agegr <- cut(sr$dg_age, c(0,45,60,Inf), right=FALSE)
x <- lexpand(sr, breaks=list(fot=seq(0,10,1/12)), pophaz=popmort)
sm <- survmean(x, pophaz=popmort)
## for each level of "agegr" separately:
#sma<- survmean(x, pophaz=popmort, by.vars="agegr")
## automated age-standardised results:
#sms<- survmean(x, pophaz=popmort, agegr.w.breaks=c(0,45,60,Inf))

## visual inspection of how realistic extrapolation is for each stratum;
## grey vertical line points to start of extrapolation;
## solid lines are observed and extrapolated survivals;
## dashed lines are expected survivals
plot(sm)
# plot(sma)
# plot(sms) plots precisely the same as plot(sma)
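
Age-standardising by hand, as mentioned in Details, can be sketched as a weighted average of stratum-specific mean survival times. The sketch below does not call survmean: the data frame strata stands in for stratified results such as those obtained with by.vars = "agegr", and its column names (agegr, mean_surv), the weights and all values are hypothetical.

```r
# Illustrative sketch only; column names and values are hypothetical and
# do not reflect the actual structure of survmean output.

# Invented stratum-specific mean survival times (years)
strata <- data.frame(
  agegr = c("[0,45)", "[45,60)", "[60,Inf)"),
  mean_surv = c(30, 18, 8)
)

# Invented standard-population weights for the same age groups
std_weights <- data.frame(
  agegr = c("[0,45)", "[45,60)", "[60,Inf)"),
  w = c(0.2, 0.3, 0.5)
)

# Merge the weights onto the strata and rescale them to sum to one
merged <- merge(strata, std_weights, by = "agegr")
merged$w <- merged$w / sum(merged$w)

# Age-standardised mean survival: weighted average over age groups
std_mean <- sum(merged$w * merged$mean_surv)
std_mean
```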
