survtab: Survival tables

Description

Given a data set processed by lexpand, estimates various survival time functions as requested by the user.

Usage

survtab(data, surv.breaks = NULL, by.vars = NULL, event.values = NULL,
  surv.type = "surv.rel", surv.method = "hazard", relsurv.method = "e2",
  subset = NULL, agegr.w.breaks = NULL, agegr.w.weights = NULL,
  conf.level = 0.95, conf.type = "log-log", format = TRUE,
  verbose = FALSE)

Arguments

data

a dataset processed by lexpand

surv.breaks

breaks defining the survival intervals, given as left inclusive and right exclusive, e.g. [a,b).

by.vars

a character string vector; defines names of variables by which survivals are calculated separately; e.g. by.vars = c('sex', 'area') computes survivals separately for each combination of 'sex' and 'area'.

event.values

a vector of values present in the dataset's lex.Xst column; if NULL, uses all but the first level of as.factor(lex.Xst) as event.values; these values are events and all other values are considered t

surv.type

either 'surv.obs', 'surv.cause', 'surv.rel', 'cif.obs' or 'cif.rel'; defines what kind of survival time function(s) is/are estimated; see Details

surv.method

either 'lifetable' or 'hazard'; determines the method of calculating survival time functions

relsurv.method

either 'e2' or 'pp'; defines whether to compute relative survival using the EdererII method or using Pohar-Perme weighting; ignored if surv.type != "surv.rel"

subset

a logical condition; e.g. subset = sex == 1; subsets the data before computations

agegr.w.breaks

optional; if not NULL, given breaks will be used to define age groups in age group weighting; given as left inclusive and right exclusive, e.g. [a,b)

agegr.w.weights

optional; if agegr.w.breaks is not NULL, the user can define a vector of weights to give to each age group defined by agegr.w.breaks; if agegr.w.weights is NULL, internal weights (see E

conf.level

confidence level used in confidence intervals; e.g. 0.95 for 95 percent confidence intervals

conf.type

character string; must be one of "plain", "log-log" and "log"; defines the transformation used on the survival (and/or relative survival) function to yield confidence intervals via the delta method

format

logical; if TRUE, output is formatted into a neat table; otherwise you get all the raw results

verbose

logical; if TRUE, the function is chatty and returns some messages and timings along the process
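
The [a,b) convention used by surv.breaks and agegr.w.breaks can be illustrated with base R alone. The following is a minimal sketch, not survtab's internal code: the follow-up times are made up, and the use of right = FALSE in cut is an assumption chosen to match the left inclusive, right exclusive intervals described above.

```r
# Breaks 0:5 define five one-year survival intervals.
surv.breaks <- 0:5

# Hypothetical follow-up times in years (illustration only).
fot <- c(0, 0.5, 1, 2.25, 4.999)

# right = FALSE makes every interval closed on the left and open on the right,
# so a follow-up time of exactly 1 falls into [1,2) and not into [0,1).
surv.int <- cut(fot, breaks = surv.breaks, right = FALSE)
print(levels(surv.int))
print(table(surv.int))

# Interval lengths (deltas) follow directly from the breaks.
delta <- diff(surv.breaks)
print(delta)
```

A finite upper break matters here: with Inf as the last break the final interval would have no meaningful length.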

Value

Returns a table of survival time function values and other information with survival intervals as rows. The table contains some of the following estimates of survival time functions:
- surv.obs - observed (raw) survival
- CIF_k - cumulative incidence function for cause k
- CIF.rel - cumulative incidence function using excess cases
- r.e2 - relative survival, EdererII
- r.pp - relative survival, Pohar-Perme weighted
The suffix .as implies age group standardisation, and .lo and .hi imply lower and upper confidence limits, respectively. The prefix SE. stands for standard error.
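
To make the role of conf.type and the .lo / .hi columns concrete, here is a minimal base R sketch of the three textbook delta-method intervals. The estimate and standard error are hypothetical, and whether survtab computes its limits in exactly this way is an assumption; the formulas are only the standard ones for each transformation.

```r
# Hypothetical inputs: a survival estimate and its standard error.
surv <- 0.80
SE.surv <- 0.05
conf.level <- 0.95

# Two-sided normal quantile for the requested confidence level.
z <- qnorm(1 - (1 - conf.level) / 2)

# "plain": symmetric interval on the survival scale.
plain <- c(lo = surv - z * SE.surv, hi = surv + z * SE.surv)

# "log": interval for log(S), whose delta-method SE is SE / S.
log.ci <- c(lo = surv * exp(-z * SE.surv / surv),
            hi = surv * exp(z * SE.surv / surv))

# "log-log": interval for log(-log(S)), whose delta-method SE is
# SE / (S * |log(S)|); back-transforming gives S raised to exp(+/- w).
w <- z * SE.surv / (surv * abs(log(surv)))
loglog <- c(lo = surv^exp(w), hi = surv^exp(-w))

print(rbind(plain = plain, log = log.ci, "log-log" = loglog))
```

Of the three, only the log-log interval is guaranteed to stay inside (0, 1), which is one common reason for using it as a default.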

Details

Basics

survtab creates survival tables using data split with e.g. lexpand. We recommend using lexpand since it is well tested and one usually needs to merge in population hazards to compute relative survivals. By default survtab uses exactly the same breaks that were used in splitting (with e.g. lexpand), so it is not necessary to specify any surv.breaks. If specified, the surv.breaks must be a subset of the pertinent breaks given in lexpand. The function supplies surv.breaks to cut to create survival intervals in the data, e.g. surv.breaks = 0:5 -> [0,1), [1,2), ..., [4,5). Interval lengths (deltas) are also calculated based on surv.breaks. The upper limit of the breaks should be meaningful and never e.g. Inf.

If surv.type = 'surv.obs', only 'raw' observed survival is calculated over the chosen time intervals. With surv.type = 'surv.rel', relative survival estimates are supplied in addition to observed survival figures.

surv.type = 'cif.obs' requests cumulative incidence functions (CIF) to be estimated, where all unique event.values are seen as competing risk indicators (others are random censoring indicators); the CIF for each competing risk is computed based on a survival-interval-specific proportional hazards assumption as described by Chiang (1968) using the chosen surv.method. With surv.type = 'cif.rel', a CIF is estimated using excess cases as the "cause-specific" cases.

If surv.type = 'surv.cause', cause-specific survivals are estimated separately for each unique value of event.values.

Relative / net survival

When surv.type = 'surv.rel', the user can choose relsurv.method = 'pp', whereupon additional Pohar-Perme weighting is used to get closer to a true net survival measure. By default relsurv.method = 'e2'.

Age-standardised survival

The user can also apply age standardisation on top of everything else. In that case the requested survival figures are first calculated for each age group separately, and a weighted average of the age-group-specific survivals is then presented. The user must define the age-standardisation age groups and their weights with the agegr.w.breaks and agegr.w.weights arguments. The numbers of age groups and weights must match; e.g. with agegr.w.breaks = c(0,45,65,75,Inf) the weights vector must have 4 elements. The agegr.w.weights do not have to sum to one, as they are rescaled internally to do so.

To use one of the three integrated international standard weighting schemes, specify the scheme with e.g. agegr.w.weights = "ICSS1" and also supply the agegr.w.breaks to use. However, as the weights are available only for 5-year age groups, the agegr.w.breaks must all (except the last) be divisible by 5; e.g. agegr.w.breaks = c(0, 45, 65, 85, Inf). You can see the weights integrated into popEpi by typing ICSS into the console. See also ICSS.

Note that by.vars should not be confused with age-standardisation. by.vars simply determines the variables for whose unique combinations survivals are computed and output separately.

Period analysis / delayed entry

If one wishes to calculate period analysis / delayed entry estimates, one should limit the data accordingly when expanding the data; see lexpand.

Data requirements

This function requires the data to contain, at minimum, the variables lex.id, lex.dur, lex.Cst, lex.Xst, and fot; these are enough to calculate observed survivals. Relative survivals require additional information. EdererII relative survival requires the presence of a pop.haz variable in the data, and Pohar-Perme weighting requires pp (the inverse cumulative population survival). Both can be computed with lexpand.

You may take a look at the simulated cohort sire as an example of the minimum information required when processing data for calculating relative survival (in the Finnish context).
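
The age-standardisation step described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code survtab runs: the weights, ages and age-group-specific survival values are all hypothetical, and right = FALSE in cut is assumed in order to match the [a,b) convention.

```r
# Age group breaks as in the text: four groups, so four weights are needed.
agegr.w.breaks <- c(0, 45, 65, 75, Inf)
agegr.w.weights <- c(20, 30, 30, 20)  # hypothetical weights, need not sum to one

# The number of weights must equal the number of age groups.
stopifnot(length(agegr.w.weights) == length(agegr.w.breaks) - 1)

# Rescale the weights so that they sum to one.
w <- agegr.w.weights / sum(agegr.w.weights)

# Assign hypothetical ages to left inclusive, right exclusive age groups.
age <- c(30, 45, 64.9, 80)
agegr <- cut(age, breaks = agegr.w.breaks, right = FALSE)
print(data.frame(age = age, agegr = agegr))

# Hypothetical age-group-specific survival estimates for one survival interval.
surv.by.agegr <- c(0.90, 0.80, 0.70, 0.50)

# The age-standardised figure is the weighted average over age groups.
surv.as <- sum(w * surv.by.agegr)
print(surv.as)
```

Because the weights are rescaled before averaging, supplying c(20, 30, 30, 20) or c(0.2, 0.3, 0.3, 0.2) gives the same standardised value.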

References

Perme, Maja Pohar, Janez Stare, and Jacques Estève. "On estimation in relative survival." Biometrics 68.1 (2012): 113-120.

Hakulinen, Timo, Karri Seppa, and Paul C. Lambert. "Choosing the relative survival method for cancer survival estimation." European Journal of Cancer 47.14 (2011): 2202-2210.

Seppa, Karri, Timo Hakulinen, and Arun Pokhrel. "Choosing the net survival method for cancer survival estimation." European Journal of Cancer (2013).

Chiang, Chin Long. Introduction to stochastic processes in biostatistics. 1968.

Examples

## see more examples with explanations in vignette("survtab_examples")

## prepare data for e.g. 5-year "period analysis" for 2008-2012
## note: sire is a simulated cohort integrated into popEpi.
library(popEpi)
BL <- list(fot=seq(0, 5, by = 1/12),
           per = c("2008-01-01", "2013-01-01"))
x <- lexpand(sire, birth = bi_date, entry = dg_date, exit = ex_date,
             status = status %in% 1:2,
             breaks = BL,
             pophaz = popmort)

## calculate EdererII relative survivals (period method)
## using the fot breaks given in lexpand()
st <- survtab(x)
