dataLongMultiSpell: Data Long Transformation for multi spell analysis

Description

Transforms data from short format into long format for discrete multi-spell survival analysis with right censoring.

Usage

dataLongMultiSpell(dataSet, timeColumn, censColumn, 
idColumn, timeAsFactor=FALSE)

Arguments

dataSet

Original data in short format. Must be of class "data.frame".

timeColumn

Character giving the column name of the observed times. It is required that the observed times are discrete (integer).

censColumn

Character giving the column name of the event status. The event can take multiple values on a discrete scale (0, 1, 2, ...) and repetition of events is allowed. It is assumed that zero corresponds to censoring and all numbers > 0 represent the observed states between transitions.

idColumn

Character giving the column name of the person identification numbers.

timeAsFactor

Logical. Should the time intervals be coded as a factor? The default is FALSE, in which case the column is coded as numeric.

Value

Original data.frame with the following additional columns:

obj: Index of persons as integer vector.
timeInt: Index of time intervals (formatted as factor or integer).
e0: Response in long format as binary vector. Event "e0" is assumed to correspond to censoring. If "e0" is coded one in the last observed time interval "timeInt" of a person, then this observation was censored.
e1: Response in long format as binary vector. The event "e1" is the first of the set of possible states "1, 2, 3, ..., X".
...: Responses in long format as binary vectors. These events correspond to the following states "e2, e3, ...".
eX: Response in long format as binary vector. The event "eX" is the last state out of the set of possible states "1, 2, 3, ..., X".

Details

If the data has continuous survival times, the response may be transformed to discrete intervals using the function contToDisc. The discrete time variable needs to be strictly increasing for each person, because otherwise the order of the events cannot be determined. Here is an example data structure in short format prior to augmentation with three possible states:

idColumn = 1, 1, ... , 1, 2, 2, ... , n
timeColumn = t_ID1_1 < t_ID1_2 < ... < t_ID1_k, t_ID2_1 < t_ID2_2 < ... < t_ID2_k, ...
censColumn = 0, 1, ... , 2, 1, 0, ... , 0
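The requirement that times are strictly increasing within each person can be checked before the transformation. The following is a minimal base-R sketch. The helper isStrictlyIncreasing and the data frames okData and badData are hypothetical names introduced only for illustration; they are not part of discSurv.

```r
# Hypothetical helper (not part of discSurv): returns TRUE if the discrete
# times are strictly increasing within each person, FALSE otherwise.
isStrictlyIncreasing <- function(dataSet, timeColumn, idColumn) {
  # One vector of observed times per person, in row order
  timesById <- split(dataSet[[timeColumn]], dataSet[[idColumn]])
  # diff() > 0 for all consecutive pairs means strictly increasing;
  # a person with a single row passes trivially
  all(vapply(timesById, function(tt) all(diff(tt) > 0), logical(1)))
}

# Short format with strictly increasing times per person
okData <- data.frame(ID    = c(1, 1, 1, 2, 2),
                     time  = c(2, 5, 6, 1, 6),
                     state = c(2, 1, 0, 1, 0))

# Person 1 has a repeated time point (5, 5), so the order of
# the two events at that time is not distinguishable
badData <- data.frame(ID    = c(1, 1, 1, 2, 2),
                      time  = c(2, 5, 5, 1, 6),
                      state = c(2, 1, 0, 1, 0))

isStrictlyIncreasing(okData, "time", "ID")
isStrictlyIncreasing(badData, "time", "ID")
```

The check assumes that the rows of each person are already in chronological order. It only reports whether the data satisfy the requirement and does not reorder or repair them.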

References

Gerhard Tutz and Matthias Schmid, (2016), Modeling discrete time-to-event data, Springer Series in Statistics, DOI: 10.1007/978-3-319-28158-2

Ludwig Fahrmeir, (1997), Discrete failure time models, LMU Sonderforschungsbereich 386, Paper 91, https://epub.ub.uni-muenchen.de/

W. A. Thompson Jr., (1977), On the Treatment of Grouped Observations in Life Studies, Biometrics, Vol. 33, No. 3

Examples


# NOT RUN {
############################
# Example of artificial data

# Seed specification
set.seed(-2578)

# Three possible states (0, 1, 2) including censoring
# Discrete time intervals (1, 2, ... , 10)

datFrame <- data.frame(
ID=c(rep(1, 5), rep(2, 3), rep(3, 2), rep(4, 1), rep(5, 3)), 
time=c(c(2, 5, 6, 8, 10), c(1, 6, 7), c(9, 10), c(6), c(2, 3, 4)), 
state=c(c(0, 0, 2, 1, 0), c(1, 2, 2), c(0, 1), c(2), c(0, 2, 1)), 
x=rnorm(n=5+3+2+1+3) )

# Transformation to long format
datFrameLong <- dataLongMultiSpell(dataSet=datFrame, timeColumn="time", 
censColumn="state", idColumn="ID")
head(datFrameLong, 25)

# Fit multi state model without autoregressive terms
library(VGAM)
cRm <- vglm(cbind(e0, e1, e2) ~ timeInt + x, data=datFrameLong, 
family="multinomial")
summary(cRm)
# -> There is no significant effect of x (as expected).

# }
