preptdc: Prepare Survival Data With Time-Dependent Covariates

Description

This function prepares a counting-process style survival data set for analyses with time-dependent covariates. It merges baseline and longitudinal data, fills missing covariate values by last observation carried forward (LOCF), optionally keeps only the time points at which a covariate changes, and constructs the tstart, tstop, and event variables used by survival models.

Usage

preptdc(
  adsl,
  adtdc,
  id = "SUBJID",
  randdt = "RANDDT",
  trtsdt = "TRTSDT",
  pddt = "PDDT",
  xodt = "XODT",
  osdt = "OSDT",
  died = "DIED",
  dcutdt = "DCUTDT",
  adt = "ADT",
  paramcd = "PARAMCD",
  aval = "AVAL",
  nodup = TRUE,
  offset = TRUE
)

Value

A data set with one row per subject and time interval, including:

tstart, tstop — interval start and stop times (days from randomization).
event — event indicator (0/1).
Time-dependent covariates in wide format, one column per parameter code.
Auxiliary variables such as progression indicator (pd), treatment switch indicator (swtrt), and administrative censoring time.
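
The tstart, tstop, and event columns follow the counting-process layout accepted by Surv() in the survival package. The sketch below is illustrative only: it assumes the survival package is installed, uses a hand-made data frame in place of real preptdc output, and the covariate column ecog is a hypothetical name.

```r
library(survival)

# Hand-made stand-in for preptdc() output: one row per subject and interval.
# The covariate column `ecog` is hypothetical; all values are made up.
tdc <- data.frame(
  SUBJID = c(1, 1, 2, 3, 3, 4, 5, 5, 6),
  tstart = c(0, 30, 0, 0, 45, 0, 0, 20, 0),
  tstop  = c(30, 90, 60, 45, 120, 80, 20, 70, 100),
  event  = c(0, 1, 1, 0, 0, 1, 0, 1, 0),
  ecog   = c(0, 1, 1, 0, 1, 0, 0, 1, 0)
)

# Counting-process response: each row covers (tstart, tstop],
# with event = 1 if the event occurs at tstop.
y <- with(tdc, Surv(tstart, tstop, event))

# Cox model in which ecog may change value between a subject's intervals.
fit <- coxph(Surv(tstart, tstop, event) ~ ecog, data = tdc)
```

Because each subject contributes several rows, the covariate value in a row applies only to that row's interval.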

Arguments

adsl: A data set containing baseline subject-level information. It should include, at a minimum, subject ID (id), randomization date (randdt), treatment start date (trtsdt), survival outcome (osdt, died), progression date (pddt), treatment switch date (xodt), and data cut-off date (dcutdt).
adtdc: A data set containing longitudinal time-dependent covariate data, with subject ID (id), parameter code (paramcd), analysis date (adt), and covariate value (aval).
id: Character string specifying the column name for subject ID.
randdt: Character string specifying the column name for randomization date.
trtsdt: Character string specifying the column name for treatment start date.
pddt: Character string specifying the column name for progression date.
xodt: Character string specifying the column name for treatment crossover/switch date.
osdt: Character string specifying the column name for overall survival date (death date or last known alive date).
died: Character string specifying the column name for death indicator (0 = alive/censored, 1 = died).
dcutdt: Character string specifying the column name for data cut-off date.
adt: Character string specifying the column name for analysis date in the time-dependent covariate data set.
paramcd: Character string specifying the column name for parameter code (identifying different covariates).
aval: Character string specifying the column name for analysis value (covariate values).
nodup: Logical; if TRUE (default), retain only the first (baseline) row per subject and the rows in which at least one covariate differs from the subject's previous row.
offset: Logical; if TRUE (default), add a 1-day offset when computing analysis day variables (ady, osdy, etc.).
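
To make the effect of offset concrete, here is a minimal sketch. It assumes that an analysis day is the number of days between a date and the randomization date, plus 1 when offset = TRUE; the helper study_day is hypothetical and is not part of the function's interface.

```r
randdt <- as.Date("2024-01-10")
adt    <- as.Date(c("2024-01-10", "2024-01-11", "2024-02-09"))

# Hypothetical helper (assumed form): days since the reference date,
# plus 1 when offset = TRUE.
study_day <- function(date, ref, offset = TRUE) {
  as.numeric(date - ref) + as.integer(offset)
}

ady_offset    <- study_day(adt, randdt, offset = TRUE)   # 1, 2, 31
ady_no_offset <- study_day(adt, randdt, offset = FALSE)  # 0, 1, 30
```

Under this assumption, offset = TRUE counts the randomization date itself as day 1 rather than day 0.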

Details

The function performs the following steps:

1. Merge adsl and adtdc to attach the randomization date and treatment start date to each covariate record.
2. Define adt2 as adt if adt > trtsdt, and as randdt if adt <= trtsdt (the baseline time point). Combined with the next step, this makes the baseline covariate value the last non-missing value at or before the treatment start date. Post-baseline covariate values are anchored at their actual analysis dates.
3. Keep the last record within each combination of subject, adt2, and paramcd.
4. Construct a complete skeleton so that every covariate is present for each subject and time point.
5. Fill missing covariate values using LOCF.
6. Pivot to wide format with one row per subject and time point.
7. If nodup = TRUE, drop rows in which no covariate changed from the previous row.
8. Merge survival outcomes from adsl.
9. Compute time-to-event variables (ady, osdy, etc.), as well as the counting-process style variables tstart, tstop, and event.
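
The covariate-handling steps above (from the merge through the optional de-duplication) can be sketched in base R as follows. This is an illustration under stated assumptions, not the function's actual implementation: the data are made up, the parameter codes ECOG and LDH are hypothetical, and the final merge of survival outcomes and the tstart/tstop/event computation are left out.

```r
# Toy inputs in the default column layout (all values are made up).
adtdc <- data.frame(
  SUBJID  = c(1, 1, 1, 1, 1, 2, 2, 2),
  PARAMCD = c("ECOG", "ECOG", "ECOG", "LDH", "LDH", "ECOG", "ECOG", "LDH"),
  ADT     = as.Date(c("2024-01-02", "2024-01-08", "2024-02-01", "2024-01-05",
                      "2024-03-01", "2024-01-12", "2024-02-10", "2024-01-12")),
  AVAL    = c(0, 1, 1, 200, 250, 0, 2, 180),
  stringsAsFactors = FALSE
)
adsl <- data.frame(
  SUBJID = c(1, 2),
  RANDDT = as.Date(c("2024-01-09", "2024-01-11")),
  TRTSDT = as.Date(c("2024-01-10", "2024-01-12"))
)

# Attach randomization and treatment start dates to each covariate record.
d <- merge(adtdc, adsl, by = "SUBJID")

# Records at or before treatment start are moved to the randomization date.
d$ADT2 <- d$ADT
pre <- d$ADT <= d$TRTSDT
d$ADT2[pre] <- d$RANDDT[pre]

# Keep the last record per subject, ADT2, and PARAMCD.
d <- d[order(d$SUBJID, d$PARAMCD, d$ADT), ]
key <- paste(d$SUBJID, d$ADT2, d$PARAMCD)
d <- d[!duplicated(key, fromLast = TRUE), ]

# Skeleton: every parameter at every time point observed for a subject.
times <- unique(d[, c("SUBJID", "ADT2")])
skel  <- merge(times, data.frame(PARAMCD = unique(d$PARAMCD),
                                 stringsAsFactors = FALSE), by = NULL)
full  <- merge(skel, d[, c("SUBJID", "ADT2", "PARAMCD", "AVAL")],
               by = c("SUBJID", "ADT2", "PARAMCD"), all.x = TRUE)
full  <- full[order(full$SUBJID, full$PARAMCD, full$ADT2), ]

# LOCF within subject and parameter; leading missing values stay missing.
locf <- function(x) {
  idx <- cumsum(!is.na(x))
  c(NA, x[!is.na(x)])[idx + 1]
}
full$AVAL <- ave(full$AVAL, full$SUBJID, full$PARAMCD, FUN = locf)

# Pivot to wide format: one column per parameter code.
covs <- unique(full$PARAMCD)
wide <- unique(full[, c("SUBJID", "ADT2")])
for (p in covs) {
  sub <- full[full$PARAMCD == p, c("SUBJID", "ADT2", "AVAL")]
  names(sub)[3] <- p
  wide <- merge(wide, sub, by = c("SUBJID", "ADT2"))
}
wide <- wide[order(wide$SUBJID, wide$ADT2), ]

# nodup = TRUE analogue: keep each subject's first row plus rows that
# differ from the previous row in at least one covariate.
sig     <- do.call(paste, wide[covs])
first   <- !duplicated(wide$SUBJID)
changed <- c(TRUE, sig[-1] != sig[-length(sig)])
wide    <- wide[first | changed, ]
```

In this toy data, subject 1 has three time points, but the middle one (2024-02-01) repeats the baseline values for both covariates and is dropped, leaving two rows per subject.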

Examples
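
The example call expects adsl and adtdc to exist already. A minimal sketch of inputs in the default column layout is given here; every value is made up, the parameter codes ECOG and LDH are hypothetical, and real data may need additional columns.

```r
# Subject-level data: one row per subject, default column names.
adsl <- data.frame(
  SUBJID = c(1, 2),
  RANDDT = as.Date(c("2024-01-09", "2024-01-11")),
  TRTSDT = as.Date(c("2024-01-10", "2024-01-12")),
  PDDT   = as.Date(c("2024-03-15", NA)),   # NA: no progression recorded
  XODT   = as.Date(c("2024-03-20", NA)),   # NA: no treatment switch
  OSDT   = as.Date(c("2024-06-01", "2024-07-15")),
  DIED   = c(1, 0),                        # 1 = died, 0 = alive/censored
  DCUTDT = as.Date(c("2024-07-15", "2024-07-15"))
)

# Longitudinal covariates in long format: one row per subject,
# parameter code, and analysis date.
adtdc <- data.frame(
  SUBJID  = c(1, 1, 1, 2, 2),
  PARAMCD = c("ECOG", "ECOG", "LDH", "ECOG", "LDH"),
  ADT     = as.Date(c("2024-01-08", "2024-02-01", "2024-01-05",
                      "2024-01-12", "2024-01-12")),
  AVAL    = c(1, 2, 200, 0, 180)
)
```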

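# adsl: subject-level data; adtdc: long-format time-dependent covariates
surv_data <- preptdc(adsl, adtdc, nodup = TRUE)
# Inspect the first counting-process rows
head(surv_data)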
