This function prepares a counting-process style survival dataset
for analyses with time-dependent covariates. It merges baseline
and longitudinal data, fills in missing covariate values using
last-observation-carried-forward (LOCF), optionally restricts to
time points where covariates change, and constructs the tstart,
tstop, and event variables suitable for use in survival models.
preptdc(
  adsl,
  adtdc,
  id = "SUBJID",
  randdt = "RANDDT",
  trtsdt = "TRTSDT",
  pddt = "PDDT",
  xodt = "XODT",
  osdt = "OSDT",
  died = "DIED",
  dcutdt = "DCUTDT",
  adt = "ADT",
  paramcd = "PARAMCD",
  aval = "AVAL",
  nodup = TRUE,
  offset = TRUE
)
A data set with one row per subject and time interval, including:
tstart, tstop: interval start and stop times (days from randomization).
event: event indicator (0/1).
Covariates expanded to wide format.
Auxiliary variables such as progression indicator (pd), treatment switch indicator (swtrt), and administrative censoring time.
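As a rough illustration of this layout, the sketch below builds a small hand-made data frame in the same shape and runs a few sanity checks that counting-process data usually satisfies. The values, the `ecog` covariate column, and the checks themselves are assumptions for illustration; they are not taken from the function's own output or validation logic.

```r
# Hypothetical output for two subjects; ecog stands in for a wide-format covariate.
tdc <- data.frame(
  SUBJID = c("A", "A", "B", "B"),
  tstart = c(0, 40, 0, 24),
  tstop  = c(40, 60, 24, 45),
  event  = c(0, 1, 0, 0),
  ecog   = c(1, 1, 0, 2)
)

# Every interval has positive length.
ok_length <- all(tdc$tstart < tdc$tstop)

# Within a subject, each interval starts where the previous one stopped.
same_id   <- tdc$SUBJID[-1] == tdc$SUBJID[-nrow(tdc)]
ok_contig <- all(tdc$tstart[-1][same_id] == tdc$tstop[-nrow(tdc)][same_id])

# An event is only recorded on a subject's last interval.
is_last  <- !duplicated(tdc$SUBJID, fromLast = TRUE)
ok_event <- all(tdc$event[!is_last] == 0)

print(c(length = ok_length, contiguous = ok_contig, event = ok_event))
```

Checks of this kind can be run on real output before model fitting, adjusted to whatever conventions the function actually follows.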
A data set containing baseline subject-level information.
It should include, at a minimum, subject ID (id),
randomization date (randdt), treatment start date (trtsdt),
survival outcome (osdt, died), progression date (pddt),
treatment switch date (xodt), and data cut-off date (dcutdt).
A data set containing longitudinal time-dependent covariate data,
with subject ID (id), parameter code (paramcd), analysis date (adt),
and covariate value (aval).
Character string specifying the column name for subject ID.
Character string specifying the column name for randomization date.
Character string specifying the column name for treatment start date.
Character string specifying the column name for progression date.
Character string specifying the column name for treatment crossover/switch date.
Character string specifying the column name for overall survival date (death date or last known alive date).
Character string specifying the column name for death indicator (0 = alive/censored, 1 = died).
Character string specifying the column name for data cut-off date.
Character string specifying the column name for analysis date in the time-dependent covariate dataset.
Character string specifying the column name for parameter code (identifying different covariates).
Character string specifying the column name for analysis value (covariate values).
Logical; if TRUE (default), only rows where at least
one covariate changes compared to the previous row (within each subject)
are retained, along with the first row per subject (baseline).
Logical; if TRUE (default), a 1-day offset is added when
computing analysis day variables (ady, osdy, etc.).
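The effect of the offset argument can be sketched with plain date arithmetic. The snippet assumes the offset is applied as "date minus randomization date, plus 1", which is the common study-day convention; the dates and the variable names `ady_offset` and `ady_no_offset` are made up for illustration.

```r
# Hypothetical dates for one subject.
randdt <- as.Date("2024-01-01")
adt    <- as.Date(c("2024-01-01", "2024-01-15"))

# With the offset, the randomization date itself is day 1 (assumed convention).
ady_offset <- as.numeric(adt - randdt) + 1

# Without the offset, the randomization date is day 0.
ady_no_offset <- as.numeric(adt - randdt)

print(data.frame(adt, ady_offset, ady_no_offset))
```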
The function performs the following steps:
Merge adsl and adtdc to obtain randomization date and treatment start date.
Define adt2 as adt if adt > trtsdt, and as randdt if adt <= trtsdt (i.e., baseline time point). This ensures that the baseline covariate value is the last non-missing value at or before the treatment start date. Post-baseline covariate values are anchored at their actual analysis dates.
Keep the last record per subject, adt2, and paramcd.
Construct a complete skeleton so all covariates are present for each subject and time point.
Fill missing covariate values using LOCF.
Pivot to wide format with one row per subject and time point.
Optionally drop rows without covariate changes (nodup = TRUE).
Merge survival outcomes from adsl.
Compute time-to-event variables (ady, osdy, etc.), as well as counting-process style variables tstart, tstop, and event.
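The steps above can be sketched in base R on toy data. This is a minimal illustration under stated assumptions, not the function's actual implementation: the input values are made up, the skeleton and pivot are done in one pass before LOCF, no day offset is applied, tstop is assumed to be the next row's tstart (or the survival day on a subject's last row), and auxiliary variables such as pd and swtrt are not computed.

```r
# Toy inputs; all values are made up for illustration.
adsl <- data.frame(
  SUBJID = c("A", "B"),
  RANDDT = as.Date(c("2024-01-01", "2024-01-01")),
  TRTSDT = as.Date(c("2024-01-03", "2024-01-02")),
  OSDT   = as.Date(c("2024-03-01", "2024-02-15")),
  DIED   = c(1, 0)
)
adtdc <- data.frame(
  SUBJID  = c("A", "A", "A", "A", "A", "B", "B", "B"),
  PARAMCD = c("ECOG", "ECOG", "LDH", "ECOG", "LDH", "ECOG", "LDH", "ECOG"),
  ADT     = as.Date(c("2023-12-28", "2024-01-02", "2024-01-02", "2024-01-20",
                      "2024-02-10", "2024-01-01", "2024-01-01", "2024-01-25")),
  AVAL    = c(0, 1, 200, 1, 250, 0, 180, 2)
)

# Bring randomization and treatment start dates onto the longitudinal records.
d <- merge(adtdc, adsl[, c("SUBJID", "RANDDT", "TRTSDT")], by = "SUBJID")

# adt2: records at or before treatment start are moved to the randomization date.
pre <- d$ADT <= d$TRTSDT
d$ADT2 <- d$ADT
d$ADT2[pre] <- d$RANDDT[pre]

# Keep the latest record per subject, parameter, and adt2.
d <- d[order(d$SUBJID, d$PARAMCD, d$ADT), ]
d <- d[!duplicated(paste(d$SUBJID, d$PARAMCD, d$ADT2), fromLast = TRUE), ]

# Skeleton of subject/time points, widened to one column per parameter
# (NA where a parameter was not measured at that time point).
wide <- d[!duplicated(paste(d$SUBJID, d$ADT2)), c("SUBJID", "ADT2")]
wide <- wide[order(wide$SUBJID, wide$ADT2), ]
covs <- unique(d$PARAMCD)
for (p in covs) {
  dp <- d[d$PARAMCD == p, ]
  wide[[p]] <- dp$AVAL[match(paste(wide$SUBJID, wide$ADT2),
                             paste(dp$SUBJID, dp$ADT2))]
}

# LOCF within subject: carry the last non-missing value forward.
locf <- function(x) {
  idx <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  c(NA, x)[idx + 1]
}
for (p in covs) wide[[p]] <- ave(wide[[p]], wide$SUBJID, FUN = locf)

# nodup: keep each subject's first row plus rows where a covariate changed.
same <- function(a, b) (is.na(a) & is.na(b)) | (!is.na(a) & !is.na(b) & a == b)
n <- nrow(wide)
prev <- wide[c(1, seq_len(n - 1)), covs, drop = FALSE]
changed <- rep(FALSE, n)
for (p in covs) changed <- changed | !same(wide[[p]], prev[[p]])
wide <- wide[!duplicated(wide$SUBJID) | changed, ]

# Merge outcomes and build the counting-process variables (no day offset here).
out <- merge(wide, adsl, by = "SUBJID")
out <- out[order(out$SUBJID, out$ADT2), ]
last <- !duplicated(out$SUBJID, fromLast = TRUE)
out$tstart <- as.numeric(out$ADT2 - out$RANDDT)
out$tstop <- c(out$tstart[-1], NA)
out$tstop[last] <- as.numeric(out$OSDT - out$RANDDT)[last]
out$event <- ifelse(last, out$DIED, 0)

print(out[, c("SUBJID", "tstart", "tstop", "event", covs)])
```

In this toy data, subject A's visit on 2024-01-20 repeats the baseline values after LOCF, so it is dropped by the nodup step, leaving two intervals per subject.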
surv_data <- preptdc(adsl, adtdc, nodup = TRUE)
head(surv_data)
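Data in this shape is typically passed to a Cox model with a start/stop response. The sketch below shows one way that might look, assuming the survival package is installed. Because adsl and adtdc are not defined here, it uses the heart data set shipped with survival as a stand-in, renaming its start and stop columns to tstart and tstop; the covariates transplant and age belong to that data set, not to preptdc output.

```r
# Runs only if the survival package is available.
if (requireNamespace("survival", quietly = TRUE)) {
  library(survival)

  # Stand-in counting-process data (Stanford heart transplant study).
  d <- heart
  names(d)[match(c("start", "stop"), names(d))] <- c("tstart", "tstop")

  # Cox model with a time-dependent covariate (transplant status).
  fit <- coxph(Surv(tstart, tstop, event) ~ transplant + age, data = d)
  print(fit)
}
```

With real preptdc output, the right-hand side of the formula would list the wide-format covariate columns instead.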