tsegest: The Two-Stage Estimation (TSE) Method Using g-estimation for Treatment Switching

Description

Obtains the causal parameter estimate of the logistic regression switching model and the hazard ratio estimate of the Cox model to account for treatment switching.

Usage

tsegest(
  data,
  id = "id",
  stratum = "",
  tstart = "tstart",
  tstop = "tstop",
  event = "event",
  treat = "treat",
  censor_time = "censor_time",
  pd = "pd",
  pd_time = "pd_time",
  swtrt = "swtrt",
  swtrt_time = "swtrt_time",
  swtrt_time_upper = "",
  base_cov = "",
  conf_cov = "",
  low_psi = -3,
  hi_psi = 3,
  strata_main_effect_only = TRUE,
  firth = FALSE,
  flic = FALSE,
  recensor = TRUE,
  admin_recensor_only = FALSE,
  swtrt_control_only = TRUE,
  alpha = 0.05,
  ties = "efron",
  tol = 1e-06,
  boot = TRUE,
  n_boot = 1000,
  seed = NA
)

Value

A list with the following components:

psi: The estimated causal parameter for the control group.
psi_CI: The confidence interval for psi.
psi_CI_type: The type of confidence interval for psi, i.e., "logistic model" or "bootstrap".
logrank_pvalue: The two-sided p-value of the log-rank test based on the treatment policy strategy.
cox_pvalue: The two-sided p-value for treatment effect based on the Cox model.
hr: The estimated hazard ratio from the Cox model.
hr_CI: The confidence interval for hazard ratio.
hr_CI_type: The type of confidence interval for hazard ratio, either "Cox model" or "bootstrap".
settings: A list with the following components:
- low_psi: The lower limit of the causal parameter.
- hi_psi: The upper limit of the causal parameter.
- strata_main_effect_only: Whether to only include the strata main effects in the logistic regression switching model.
- firth: Whether Firth's penalized likelihood is used.
- flic: Whether to apply intercept correction.
- recensor: Whether to apply recensoring to counterfactual survival times.
- admin_recensor_only: Whether to apply recensoring to administrative censoring time only.
- swtrt_control_only: Whether treatment switching occurred only in the control group.
- alpha: The significance level to calculate confidence intervals.
- ties: The method for handling ties in the Cox model.
- tol: The desired accuracy (convergence tolerance).
- boot: Whether to use bootstrap to obtain the confidence interval for hazard ratio.
- n_boot: The number of bootstrap samples.
- seed: The seed to reproduce the bootstrap results.
psi_trt: The estimated causal parameter for the treatment group if swtrt_control_only is FALSE.
psi_trt_CI: The confidence interval for psi_trt if swtrt_control_only is FALSE.
hr_boots: The bootstrap hazard ratio estimates if boot is TRUE.
psi_boots: The bootstrap psi estimates if boot is TRUE.
psi_trt_boots: The bootstrap psi_trt estimates if boot is TRUE and swtrt_control_only is FALSE.
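As a rough illustration of how a vector such as hr_boots can be summarized, the sketch below computes two common bootstrap intervals in base R. The draws are simulated and the point estimate hr_hat is invented; this page does not state which rule tsegest uses to build hr_CI from the bootstrap samples, so both intervals should be read as assumptions rather than as the package's method.

```r
set.seed(123)

# Hypothetical bootstrap hazard ratios; in practice these would be the
# hr_boots component of a fit obtained with boot = TRUE.
hr_boots <- exp(rnorm(1000, mean = log(0.7), sd = 0.15))
hr_hat   <- 0.7    # hypothetical point estimate
alpha    <- 0.05

# Percentile interval: empirical alpha/2 and 1 - alpha/2 quantiles
ci_percentile <- quantile(hr_boots,
                          probs = c(alpha / 2, 1 - alpha / 2),
                          names = FALSE)

# Normal approximation on the log scale, using the bootstrap standard
# deviation of log(HR) as the standard error
se_loghr  <- sd(log(hr_boots))
ci_normal <- exp(log(hr_hat) + c(-1, 1) * qnorm(1 - alpha / 2) * se_loghr)

ci_percentile
ci_normal
```

The two intervals tend to agree when the bootstrap distribution of log(HR) is close to symmetric, and can diverge when it is skewed.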

Arguments

data

The input data frame that contains the following variables:

id: The id to identify observations belonging to the same subject for counting process data with time-dependent covariates.
stratum: The stratum.
tstart: The starting time of the time interval for counting-process data with time-dependent covariates.
tstop: The stopping time of the time interval for counting-process data with time-dependent covariates.
event: The event indicator, 1=event, 0=no event.
treat: The randomized treatment indicator, 1=treatment, 0=control.
censor_time: The administrative censoring time. It should be provided for all subjects including those who had events.
pd: The disease progression indicator, 1=PD, 0=no PD.
pd_time: The time from randomization to PD.
swtrt: The treatment switch indicator, 1=switch, 0=no switch.
swtrt_time: The time from randomization to treatment switch.
swtrt_time_upper: The upper bound of treatment switching time.
base_cov: The values of baseline covariates (excluding treat).
conf_cov: The values of confounding variables for predicting treatment switching (excluding treat).
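To make the counting-process layout described above concrete, the following sketch builds a tiny data frame in base R and runs a few sanity checks on it. The subjects, the values, the use of NA for subjects without progression or switching, and the time-dependent covariate x_lag are all invented for illustration; only the column roles follow the list above.

```r
# Hypothetical toy data: two control subjects, one row per (tstart, tstop] interval.
# Subject 1 progresses at day 30, switches at day 60, and dies at day 75.
# Subject 2 never progresses and drops out at day 50.
toy <- data.frame(
  id          = c(1, 1, 1, 2, 2),
  tstart      = c(0, 30, 60, 0, 30),
  tstop       = c(30, 60, 75, 30, 50),
  event       = c(0, 0, 1, 0, 0),          # 1 only on the interval with the event
  treat       = c(0, 0, 0, 0, 0),          # randomized arm, constant within subject
  censor_time = c(100, 100, 100, 90, 90),  # given even for the subject with an event
  pd          = c(1, 1, 1, 0, 0),
  pd_time     = c(30, 30, 30, NA, NA),
  swtrt       = c(1, 1, 1, 0, 0),
  swtrt_time  = c(60, 60, 60, NA, NA),
  x_lag       = c(0, 1, 1, 0, 0)           # hypothetical time-dependent confounder
)

# Every interval must have positive length
intervals_ok <- all(toy$tstart < toy$tstop)

# Subject-level variables should not vary within id
const_within_id <- function(x, id) {
  all(tapply(x, id, function(v) length(unique(v)) == 1))
}
subject_level_ok <- const_within_id(toy$treat, toy$id) &&
  const_within_id(toy$censor_time, toy$id)

# Each interval should start where the subject's previous interval stopped
contiguous_ok <- all(unlist(lapply(split(toy, toy$id), function(d) {
  all(d$tstart[-1] == d$tstop[-nrow(d)])
})))

c(intervals_ok, subject_level_ok, contiguous_ok)
```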

id

The name of the id variable in the input data.

stratum

The name(s) of the stratum variable(s) in the input data.

tstart

The name of the tstart variable in the input data.

tstop

The name of the tstop variable in the input data.

event

The name of the event variable in the input data.

treat

The name of the treatment variable in the input data.

censor_time

The name of the censor_time variable in the input data.

pd

The name of the pd variable in the input data.

pd_time

The name of the pd_time variable in the input data.

swtrt

The name of the swtrt variable in the input data.

swtrt_time

The name of the swtrt_time variable in the input data.

swtrt_time_upper

The name of the swtrt_time_upper variable in the input data.

base_cov

The vector of names of base_cov variables (excluding treat) in the input data for the Cox model.

conf_cov

The vector of the names of conf_cov variables (excluding treat) in the input data for the logistic regression switching model.

low_psi

The lower limit of the causal parameter.

hi_psi

The upper limit of the causal parameter.

strata_main_effect_only

Whether to only include the strata main effects in the logistic regression switching model. Defaults to TRUE; otherwise, all possible strata combinations are considered in the switching model.

firth

Whether Firth's bias-reducing penalized likelihood should be used. The default is FALSE.

flic

Whether to apply intercept correction to obtain more accurate predicted probabilities. The default is FALSE.

recensor

Whether to apply recensoring to counterfactual survival times. Defaults to TRUE.

admin_recensor_only

Whether to apply recensoring to administrative censoring time only. Defaults to FALSE, in which case, recensoring will be applied to the actual censoring time for dropouts.

swtrt_control_only

Whether treatment switching occurred only in the control group. Defaults to TRUE.

alpha

The significance level to calculate confidence intervals.

ties

The method for handling ties in the Cox model, either "breslow" or "efron" (default).
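For readers unfamiliar with the two tie-handling options, the sketch below fits the same Cox model with each method using the survival package and its lung dataset. This is an independent illustration and an assumption on our part: it does not call tsegest, and nothing here implies that tsegest relies on survival::coxph internally.

```r
library(survival)

# survival::lung contains tied event times, so the two methods can give
# slightly different estimates
fit_efron   <- coxph(Surv(time, status) ~ sex, data = lung, ties = "efron")
fit_breslow <- coxph(Surv(time, status) ~ sex, data = lung, ties = "breslow")

hr_efron   <- unname(exp(coef(fit_efron)))
hr_breslow <- unname(exp(coef(fit_breslow)))

c(efron = hr_efron, breslow = hr_breslow)
```

With few ties relative to the number of events, the two estimates are typically very close; the Efron approximation is generally regarded as the more accurate of the two when ties are frequent.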

tol

The desired accuracy (convergence tolerance) for psi.
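To show how low_psi, hi_psi, and tol can work together, here is a minimal sketch of a bracketing root search in base R. The function alpha_hat is a made-up stand-in for the estimate of $\alpha$ in the switching model at a given $\psi$; the search actually used inside tsegest is not described on this page and may differ.

```r
# Hypothetical stand-in: a smooth, monotone function of psi whose root
# plays the role of the g-estimate. In a real analysis this would be the
# fitted coefficient of the counterfactual-time term at each psi.
alpha_hat <- function(psi) tanh(psi + 0.4)

low_psi <- -3
hi_psi  <- 3
tol     <- 1e-6

# A bracketing search needs a sign change over [low_psi, hi_psi]
stopifnot(alpha_hat(low_psi) * alpha_hat(hi_psi) < 0)

# uniroot() narrows the bracket until the root is located to within tol
fit <- uniroot(alpha_hat, lower = low_psi, upper = hi_psi, tol = tol)
psi_hat <- fit$root
psi_hat
```

In this sketch, if alpha_hat did not change sign on the interval, the search could not locate a root, which is one reason to consider widening low_psi and hi_psi when an estimate lands on a boundary.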

boot

Whether to use bootstrap to obtain the confidence interval for hazard ratio. Defaults to TRUE.

n_boot

The number of bootstrap samples.

seed

The seed to reproduce the bootstrap results. The seed from the environment will be used if left unspecified.

Author

Kaifeng Lu, kaifenglu@gmail.com

Details

We use the following steps to obtain the hazard ratio estimate and confidence interval had there been no treatment switching:

1. Use a pooled logistic regression switching model to estimate the causal parameter $\psi$ based on the patients in the control group who had disease progression: $$\textrm{logit}(p(E_{ik})) = \alpha U_{i,\psi} + \sum_{j} \beta_j x_{ijk}$$ where $E_{ik}$ is the observed switch status for individual $i$ at observation $k$, $$U_{i,\psi} = T_{C_i} + e^{\psi}T_{E_i}$$ is the counterfactual survival time for individual $i$ given a specific value for $\psi$, and $x_{ijk}$ are the confounders for individual $i$ at observation $k$. The visit-specific intercepts can be modeled using a natural cubic spline with specified degrees of freedom. The boundary knots and inner knots can be based on the range and percentiles of treatment switching times. When applied from a secondary baseline, $U_{i,\psi}$ refers to post-secondary baseline counterfactual survival, where $T_{C_i}$ refers to the time spent after the secondary baseline on control treatment, and $T_{E_i}$ refers to the time spent after the secondary baseline on the experimental treatment.

   In the presence of censoring, let $U_{i,\psi} = T_{C_i} + e^{\psi} T_{E_i}$ and $V_{i,\psi} = \min(\tau_i, e^{\psi}\tau_i)$, where $\tau_i$ is the administrative censoring time for the subject. In addition, let $\Delta_i$ denote the observed event indicator, and let $W_{i,\psi} = \min(U_{i,\psi}, V_{i,\psi})$ and $\Delta_{i,\psi} = \Delta_i I(U_{i,\psi} \leq V_{i,\psi})$ be the recensored survival time and event indicator. Fit a null Cox model to $(W_{i,\psi}, \Delta_{i,\psi})$ for control patients with disease progression, and use the martingale residuals in place of the counterfactual survival times $U_{i,\psi}$ in the pooled logistic regression switching model.
2. Search for $\psi$ such that the estimate of $\alpha$ is close to zero. This is the estimate of the causal parameter. The confidence limits for $\psi$ can be obtained as the values of $\psi$ at which the two-sided p-value for testing $H_0:\alpha = 0$ in the switching model equals the nominal significance level.
3. Derive the counterfactual survival times for control patients had there been no treatment switching. The counterfactual survival times are relative to randomization.
4. Fit the Cox model to the observed survival times on the treatment arm and the counterfactual untreated survival times on the control arm to obtain the hazard ratio estimate.
5. Use the bootstrap to construct the p-value and confidence interval for the hazard ratio.
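The counterfactual-time and recensoring formulas from the first step above can be checked numerically with a few lines of base R. The helper counterfactual and the toy values are hypothetical and are not part of the package; the sketch only evaluates $U_{i,\psi}$, $V_{i,\psi}$, $W_{i,\psi}$, and $\Delta_{i,\psi}$ for one fixed value of $\psi$, assuming all times are on the same scale.

```r
# Hypothetical helper implementing the formulas above for a fixed psi.
#   t_c:   time spent on control treatment
#   t_e:   time spent on experimental treatment
#   tau:   administrative censoring time
#   delta: observed event indicator (1 = event, 0 = censored)
counterfactual <- function(t_c, t_e, tau, delta, psi) {
  u <- t_c + exp(psi) * t_e          # U = T_C + exp(psi) * T_E
  v <- pmin(tau, exp(psi) * tau)     # V = min(tau, exp(psi) * tau)
  w <- pmin(u, v)                    # recensored time W = min(U, V)
  d <- delta * as.integer(u <= v)    # recensored indicator Delta * I(U <= V)
  data.frame(u = u, v = v, w = w, delta_psi = d)
}

# Invented values for three subjects who all had observed events
toy <- data.frame(
  t_c   = c(100, 50, 200),
  t_e   = c(0, 150, 300),
  tau   = c(400, 400, 520),
  delta = c(1, 1, 1)
)

res <- counterfactual(toy$t_c, toy$t_e, toy$tau, toy$delta, psi = log(0.5))
res
```

With $\psi = \log(0.5)$, the third subject has $U = 350$ and $V = 260$, so the event is recensored at 260 even though it was observed. This shows how recensoring can turn observed events into censored observations.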

References

NR Latimer, IR White, K Tilling, and U Siebert. Improved two-stage estimation to adjust for treatment switching in randomised trials: g-estimation to address time-dependent confounding. Statistical Methods in Medical Research. 2020;29(10):2900-2918.

Examples

sim1 <- tsegestsim(
  n = 500, allocation1 = 2, allocation2 = 1, pbprog = 0.5, 
  trtlghr = -0.5, bprogsl = 0.3, shape1 = 1.8, 
  scale1 = 0.000025, shape2 = 1.7, scale2 = 0.000015, 
  pmix = 0.5, admin = 5000, pcatnotrtbprog = 0.5, 
  pcattrtbprog = 0.25, pcatnotrt = 0.2, pcattrt = 0.1, 
  catmult = 0.5, tdxo = 1, ppoor = 0.1, pgood = 0.04, 
  ppoormet = 0.4, pgoodmet = 0.2, xomult = 1.4188308, 
  milestone = 546, swtrt_control_only = TRUE,
  outputRawDataset = 1, seed = 2000)
  
fit1 <- tsegest(
  data = sim1$paneldata, id = "id", 
  tstart = "tstart", tstop = "tstop", event = "died", 
  treat = "trtrand", censor_time = "censor_time", 
  pd = "progressed", pd_time = "timePFSobs", swtrt = "xo", 
  swtrt_time = "xotime", swtrt_time_upper = "xotime_upper",
  base_cov = "bprog", conf_cov = "bprog*catlag", 
  low_psi = -3, hi_psi = 3, strata_main_effect_only = TRUE,
  recensor = TRUE, admin_recensor_only = FALSE, 
  swtrt_control_only = TRUE, alpha = 0.05, ties = "efron", 
  tol = 1.0e-6, boot = FALSE)
  
c(fit1$hr, fit1$hr_CI)

sim2 <- tsegestsim(
  n = 500, allocation1 = 2, allocation2 = 1, pbprog = 0.5, 
  trtlghr = -0.5, bprogsl = 0.3, shape1 = 1.8, 
  scale1 = 0.000025, shape2 = 1.7, scale2 = 0.000015, 
  pmix = 0.5, admin = 5000, pcatnotrtbprog = 0.5, 
  pcattrtbprog = 0.25, pcatnotrt = 0.2, pcattrt = 0.1, 
  catmult = 0.5, tdxo = 1, ppoor = 0.1, pgood = 0.04, 
  ppoormet = 0.4, pgoodmet = 0.2, xomult = 1.4188308, 
  milestone = 546, swtrt_control_only = FALSE,
  outputRawDataset = 1, seed = 2000)
  
fit2 <- tsegest(
  data = sim2$paneldata, id = "id", 
  tstart = "tstart", tstop = "tstop", event = "died", 
  treat = "trtrand", censor_time = "censor_time", 
  pd = "progressed", pd_time = "timePFSobs", swtrt = "xo", 
  swtrt_time = "xotime", swtrt_time_upper = "xotime_upper",
  base_cov = "bprog", conf_cov = "bprog*catlag", 
  low_psi = -3, hi_psi = 3, strata_main_effect_only = TRUE,
  recensor = TRUE, admin_recensor_only = FALSE, 
  swtrt_control_only = FALSE, alpha = 0.05, ties = "efron", 
  tol = 1.0e-6, boot = FALSE)
  
c(fit2$hr, fit2$hr_CI)
