etm: Computation of the empirical transition matrix

Description

This function computes the empirical transition matrix, also called the Aalen-Johansen estimator, of the transition probability matrix of any multistate model. The covariance matrix is also computed.

Usage

# S3 method for data.frame
etm(data, state.names, tra, cens.name, s, t = "last",
    covariance = TRUE, delta.na = TRUE, modif = FALSE,
    c = 1, alpha = NULL, strata, ...)

Value

est: Transition probability estimates. This is a three-dimensional array: the first dimension is the state from which transitions occur, the second is the state to which transitions occur, and the third indexes the event times.
cov: Estimated covariance matrix. Each cell of the matrix gives the covariance between the transition probabilities given by the rownames and the colnames, respectively.
time: Event times at which the transition probabilities are computed, that is, all the observed times in \((s, t]\).
s: Start of the time interval.
t: End of the time interval.
trans: A data.frame giving the possible transitions.
state.names: A character vector giving the state names.
cens.name: How the censored observations are coded in the data set.
n.risk: Matrix indicating the number of individuals at risk just before an event.
n.event: Array containing the number of transitions at each time.
delta.na: A 3d array containing the increments of the Nelson-Aalen estimator.
ind.n.risk: When modif is TRUE, risk set size for which the indicator function is 1.

If the analysis is stratified, a list of etm objects is returned.

Arguments

data

data.frame of the form data.frame(id, from, to, time) or data.frame(id, from, to, entry, exit)

id:

patient id

from:

the state from where the transition occurs

to:

the state to which a transition occurs

time:

time when a transition occurs

entry:

entry time in a state

exit:

exit time from a state

This data.frame is transition-oriented, i.e. it contains one row per transition, and possibly several rows per patient. Specifying an entry and an exit time makes it possible to take left-truncation into account.
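As a rough illustration of the transition-oriented layout described above, the two accepted forms might look as follows. The data, the state labels "0", "1", "2" and the censoring code "cens" are made up for this sketch and are not taken from the package.

```r
# Hypothetical toy data (illustrative only): an illness-death model with
# states "0" (healthy), "1" (ill), "2" (dead) and "cens" for censoring.

# Form 1: (id, from, to, time). Patient 1 contributes two rows.
toy_time <- data.frame(
  id   = c(1, 1, 2),
  from = c("0", "1", "0"),
  to   = c("1", "2", "cens"),
  time = c(3, 7, 5),
  stringsAsFactors = TRUE
)

# Form 2: (id, from, to, entry, exit). Patient 2 only comes under
# observation at time 2, i.e. the observation is left-truncated.
toy_entry_exit <- data.frame(
  id    = c(1, 1, 2),
  from  = c("0", "1", "0"),
  to    = c("1", "2", "cens"),
  entry = c(0, 3, 2),
  exit  = c(3, 7, 5),
  stringsAsFactors = TRUE
)

# Number of rows (transitions or censoring events) per patient
table(toy_time$id)
```

In the second form, each row covers the interval from entry into the state until the exit from it, so consecutive rows of one patient share a boundary time.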

state.names

A character vector giving the state names.

tra

A square matrix of logical values describing the possible transitions within the multistate model.

cens.name

A character string giving the code for censored observations in the column 'to' of data. If there are no censored observations in your data, use NULL.

s

Starting value for computing the transition probabilities.

t

Ending value. Default is "last", meaning that the transition probabilities are computed over \((s, t]\), \(t\) being the last time in the data set.

covariance

Logical. Whether to compute the covariance matrix. Setting it to FALSE may be useful for, say, simulations, as the variance computation takes some time. Default is TRUE.

delta.na

Logical. Whether to export the array containing the increments of the Nelson-Aalen estimator. Default is TRUE.

modif

Logical. Whether to apply the modification of Lai and Ying for small risk sets. Default is FALSE.

c

Constant for the Lai and Ying modification. Either c contains a single value that is used for all the states, or c should have the same length as state.names. Default is 1.

alpha

Constant for the Lai and Ying modification. If NULL (the default) then only c is used and the Lai and Ying modification discards the event times for which \(Y(t) \geq t\). Otherwise \(cn^\alpha\) is used. It is recommended to leave alpha equal to NULL for multistate models.

strata

Character vector giving variables on which to stratify the analysis.

...

Not used

Author

Arthur Allignol, arthur.allignol@gmail.com

Details

Data are considered to arise from a time-inhomogeneous Markovian multistate model with finite state space, and possibly subject to independent right-censoring and left-truncation.

The matrix of the transition probabilities is estimated by the Aalen-Johansen estimator / empirical transition matrix (Andersen et al., 1993), which is the product integral over the time period \((s, t]\) of the identity matrix \(I\) plus the matrix of the increments of the Nelson-Aalen estimates of the cumulative transition hazards. The \((i, j)\)-th entry of the empirical transition matrix estimates the transition probability of being in state \(j\) at time \(t\) given that one has been in state \(i\) at time \(s\).
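The product integral described above can be sketched in a few lines of base R. This is only an illustration under simplifying assumptions (data already in entry/exit form, no covariance, no Lai and Ying modification), not the package's implementation. The helper aj_sketch() and the toy data are hypothetical. At each observed transition time \(u\), the Nelson-Aalen increment from state \(h\) to state \(j\) is the number of observed \(h \to j\) transitions at \(u\) divided by the number at risk in \(h\) just before \(u\), and the estimate is the ordered matrix product of \(I\) plus these increment matrices.

```r
# Hypothetical toy data in (id, from, to, entry, exit) form; values are made up.
toy <- data.frame(
  id    = c(1, 1, 2, 3, 4, 4),
  from  = c("0", "1", "0", "0", "0", "1"),
  to    = c("1", "2", "2", "cens", "1", "cens"),
  entry = c(0, 2, 0, 0, 0, 6),
  exit  = c(2, 5, 3, 4, 6, 8)
)
states <- c("0", "1", "2")

# aj_sketch() is an illustrative helper, not a function of the etm package.
aj_sketch <- function(data, states, cens = "cens", s = 0, t = Inf) {
  k <- length(states)
  # Observed transition times in (s, t]; censoring times are excluded
  keep  <- data$to != cens & data$exit > s & data$exit <= t
  times <- sort(unique(data$exit[keep]))
  P <- diag(k)
  dimnames(P) <- list(states, states)
  for (u in times) {
    dA <- matrix(0, k, k, dimnames = list(states, states))
    for (h in states) {
      # Number of individuals at risk in state h just before time u
      Y <- sum(data$from == h & data$entry < u & data$exit >= u)
      if (Y == 0) next
      for (j in setdiff(states, h)) {
        # Nelson-Aalen increment for the h -> j transition at time u
        dA[h, j] <- sum(data$from == h & data$to == j & data$exit == u) / Y
      }
    }
    diag(dA) <- -rowSums(dA)   # each row of the increment matrix sums to zero
    P <- P %*% (diag(k) + dA)  # one factor of the product integral
  }
  P
}

P <- aj_sketch(toy, states)
P["0", ]  # estimated probabilities of each state, starting from "0" at time 0
```

Because every factor has rows summing to one, each row of the resulting matrix also sums to one, which is a quick sanity check on any such computation.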

The covariance matrix is computed using the recursion formula (4.4.19) in Andersen et al. (1993, p. 295). This estimator of the covariance matrix is an estimator of the Greenwood type.

If the multistate model is not Markov, but censorship is entirely random, the Aalen-Johansen estimator still consistently estimates the state occupation probabilities of being in state \(i\) at time \(t\) (Datta & Satten, 2001; Glidden, 2002).

Recent versions of R have changed the default of the stringsAsFactors argument of the data.frame function from TRUE to FALSE. etm currently depends on the states being factors, so the user should create the data with data.frame(..., stringsAsFactors=TRUE).
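For a data set that already exists with character state columns, one possible workaround is to convert the state columns to factors before the analysis. The data frame dat below is a made-up example, and whether this conversion is sufficient in every case is an assumption of this sketch.

```r
# Hypothetical data created with character columns (the default in recent R)
dat <- data.frame(
  id   = c(1, 1, 2),
  from = c("0", "1", "0"),
  to   = c("1", "2", "cens"),
  time = c(3, 7, 5),
  stringsAsFactors = FALSE
)

# Convert the state columns to factors explicitly
dat$from <- factor(dat$from)
dat$to   <- factor(dat$to)

str(dat)
```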

References

Beyersmann, J., Allignol, A. and Schumacher, M. (2012). Competing Risks and Multistate Models with R. Use R! Springer Verlag.

Allignol, A., Schumacher, M. and Beyersmann, J. (2011). Empirical Transition Matrix of Multi-State Models: The etm Package. Journal of Statistical Software, 38.

Andersen, P.K., Borgan, O., Gill, R.D. and Keiding, N. (1993). Statistical models based on counting processes. Springer Series in Statistics. New York, NY: Springer.

Aalen, O. and Johansen, S. (1978). An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scandinavian Journal of Statistics, 5: 141-150.

Gill, R.D. and Johansen, S. (1990). A survey of product-integration with a view towards application in survival analysis. Annals of Statistics, 18(4): 1501-1555.

Datta, S. and Satten, G.A. (2001). Validity of the Aalen-Johansen estimators of stage occupation probabilities and Nelson-Aalen estimators of integrated transition hazards for non-Markov models. Statistics and Probability Letters, 55(4): 403-411.

Glidden, D. (2002). Robust inference for event probabilities with non-Markov data. Biometrics, 58: 361-368.

Examples

data(sir.cont)

# Modification for patients entering and leaving a state
# at the same date
# Change on ventilation status is considered
# to happen before end of hospital stay
sir.cont <- sir.cont[order(sir.cont$id, sir.cont$time), ]
for (i in 2:nrow(sir.cont)) {
  if (sir.cont$id[i]==sir.cont$id[i-1]) {
    if (sir.cont$time[i]==sir.cont$time[i-1]) {
      sir.cont$time[i-1] <- sir.cont$time[i-1] - 0.5
    }
  }
}

### Computation of the transition probabilities
# Possible transitions.
tra <- matrix(ncol=3,nrow=3,FALSE)
tra[1, 2:3] <- TRUE
tra[2, c(1, 3)] <- TRUE

# etm
tr.prob <- etm(sir.cont, c("0", "1", "2"), tra, "cens", 1)

tr.prob
summary(tr.prob)

# plotting
if (require("lattice")) {
xyplot(tr.prob, tr.choice=c("0 0", "1 1", "0 1", "0 2", "1 0", "1 2"),
       layout=c(2, 3), strip=strip.custom(bg="white",
         factor.levels=
     c("0 to 0", "1 to 1", "0 to 1", "0 to 2", "1 to 0", "1 to 2")))
}

### example with left-truncation

data(abortion)

# Data set modification in order to be used by etm
names(abortion) <- c("id", "entry", "exit", "from", "to")
abortion$to <- abortion$to + 1

## computation of the matrix giving the possible transitions
tra <- matrix(FALSE, nrow = 5, ncol = 5)
tra[1:2, 3:5] <- TRUE

## etm
fit <- etm(abortion, as.character(0:4), tra, NULL, s = 0)

## plot
xyplot(fit, tr.choice = c("0 0", "1 1", "0 4", "1 4"),
       ci.fun = c("log-log", "log-log", "cloglog", "cloglog"),
       strip = strip.custom(factor.levels = c("P(T > t) -- control",
                                              "P(T > t) -- exposed",
                                 "CIF spontaneous abortion -- control",
                                 "CIF spontaneous abortion -- exposed")))
