Surrogate (version 3.0)

ica_SurvSurv_sens: Sensitivity analysis for individual causal association

Description

The ica_SurvSurv_sens() function performs the sensitivity analysis for the individual causal association (ICA) as described by Stijven et al. (2022).

Usage

ica_SurvSurv_sens(
  fitted_model,
  n_sim,
  n_prec,
  minfo_prec = 0,
  restr = TRUE,
  copula_family2,
  ncores = 1,
  get_marg_tau = FALSE,
  cond_ind = FALSE
)

Value

A data frame is returned. Each row represents one replication in the sensitivity analysis. The returned data frame always contains the following columns:

  • kendall, sp_rho, minfo: ICA as quantified by Kendall's \(\tau\), Spearman's \(\rho\), and the mutual information, respectively.

  • c23, c13_2, c24_3, c14_23: sampled copula parameters of the unidentifiable copulas in the D-vine copula. The parameters correspond to the parameterization of the copula_family2 copula as in the copula R-package.

  • r23, r13_2, r24_3, r14_23: sampled rotation parameters of the unidentifiable copulas in the D-vine copula. These values are constant for the Gaussian copula family since that copula is invariant to rotations.

The returned data frame also contains the following columns when get_marg_tau is TRUE:

  • sp_s0s1, sp_s0t0, sp_s0t1, sp_s1t0, sp_s1t1, sp_t0t1: Spearman's \(\rho\) between the corresponding potential outcomes. Note that these associations refer to the potential time-to-composite events and/or time-to-true endpoint event. In contrast, the estimated association parameters from fit_model_SurvSurv() refer to associations between the time-to-surrogate event and time-to-true endpoint event.

  • prop_harmed, prop_protected, prop_always, prop_never: proportions of the corresponding population strata in each replication. These are defined in Nevo and Gorfine (2022).

Arguments

fitted_model

Returned value from fit_model_SurvSurv(). This object contains the estimated identifiable part of the joint distribution for the potential outcomes.

n_sim

Number of replications in the sensitivity analysis. This value should be large enough to sufficiently explore all possible values of the ICA. The minimally sufficient number depends to a large extent on which inequality assumptions are subsequently imposed (see Additional Assumptions).

n_prec

Number of Monte-Carlo samples for the numerical approximation of the ICA in each replication of the sensitivity analysis.

minfo_prec

Number of quasi Monte-Carlo samples for the numerical integration to obtain the mutual information. If this value is 0 (default), the mutual information is not computed and NA is returned for that column.

restr

Default value should not be modified by the user.

copula_family2

Parametric family of the unidentifiable copulas in the D-vine copula. One of the following parametric copula families: "clayton", "frank", "gaussian", or "gumbel".

ncores

Number of cores used in the sensitivity analysis. The computations are heavy, and using multiple cores can speed them up considerably.

get_marg_tau

Boolean.

  • TRUE: Return marginal association measures in each replication in terms of Spearman's rho. The proportion of harmed, protected, never diseased, and always diseased is also returned. See also Value.

  • FALSE (default): No additional measures are returned.

cond_ind

Boolean.

  • TRUE: Assume conditional independence (see Additional Assumptions).

  • FALSE (default): Conditional independence is not assumed.

Quantifying Surrogacy

In the causal-inference framework to evaluate surrogate endpoints, the ICA is the measure of primary interest. This measure quantifies the association between the individual causal treatment effects on the surrogate (\(\Delta S\)) and on the true endpoint (\(\Delta T\)). Stijven et al. (2022) proposed to quantify this association through the squared informational coefficient of correlation (SICC or \(R^2_H\)), which is based on information-theoretic principles. Indeed, \(R^2_H\) is a transformation of the mutual information between \(\Delta S\) and \(\Delta T\), $$R^2_H = 1 - e^{-2 \cdot I(\Delta S; \Delta T)}.$$ By virtue of that transformation, \(R^2_H\) is restricted to the unit interval, where 0 indicates independence and 1 indicates a functional relationship between \(\Delta S\) and \(\Delta T\). The mutual information is returned by ica_SurvSurv_sens() if a non-zero value is specified for minfo_prec (see Arguments).
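The transformation from mutual information to \(R^2_H\) can be sketched in a few lines of R. This is only an illustration: the helper name sicc() and the input values are hypothetical and are not part of the Surrogate package. In practice the input would be the minfo column of the returned data frame.

```r
# Hypothetical mutual-information values, standing in for the `minfo` column
# of the data frame returned by ica_SurvSurv_sens() (made-up numbers).
minfo <- c(0, 0.25, 1, 3)

# Illustrative helper (not part of the Surrogate package):
# R^2_H = 1 - exp(-2 * I), where I is the mutual information.
sicc <- function(mutual_info) {
  1 - exp(-2 * mutual_info)
}

r2_h <- sicc(minfo)
print(round(r2_h, 3))
```

A mutual information of 0 maps to \(R^2_H = 0\), and \(R^2_H\) approaches 1 as the mutual information grows.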

The association between \(\Delta S\) and \(\Delta T\) can also be quantified by Spearman's \(\rho\) (or Kendall's \(\tau\)). These quantities require appreciably less computing time than the mutual information and are therefore always returned for every replication of the sensitivity analysis.

Sensitivity Analysis

Because \(S_0\) and \(S_1\) are never simultaneously observed in the same patient, \(\Delta S\) is not observable, and analogously for \(\Delta T\). Consequently, the ICA is unidentifiable. This is solved by considering a (partly identifiable) model for the full vector of potential outcomes, \((T_0, S_0, S_1, T_1)'\). The identifiable parameters are estimated. The unidentifiable parameters are sampled from their parameter space in each replication of a sensitivity analysis. If the number of replications (n_sim) is sufficiently large, the entire parameter space for the unidentifiable parameters will be explored/sampled. In each replication, all model parameters are "known" (either estimated or sampled). Consequently, the ICA can be computed in each replication of the sensitivity analysis.

The sensitivity analysis thus results in a set of values for the ICA. This set can be interpreted as all values for the ICA that are compatible with the observed data. However, the range of this set is often quite broad; this means there remains too much uncertainty to make judgements regarding the worth of the surrogate. To address this unwieldy uncertainty, additional assumptions can be used that restrict the parameter space of the unidentifiable parameters. This in turn reduces the uncertainty regarding the ICA.
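As a rough sketch of how the resulting set of ICA values might be summarized, the snippet below computes the range and a few empirical quantiles of Spearman's \(\rho\). The data frame sens_results is a hypothetical stand-in with made-up values. In a real analysis it would be the output of ica_SurvSurv_sens() with a much larger n_sim.

```r
# Hypothetical stand-in for the data frame returned by ica_SurvSurv_sens();
# the values are made up for illustration only.
sens_results <- data.frame(
  kendall = c(0.41, 0.62, 0.78, 0.55, 0.83),
  sp_rho  = c(0.58, 0.80, 0.93, 0.74, 0.96)
)

# Range of ICA values (Spearman's rho) across replications
ica_range <- range(sens_results$sp_rho)

# Empirical quantiles give a less extreme summary of the same set
ica_quantiles <- quantile(sens_results$sp_rho, probs = c(0.05, 0.5, 0.95))

print(ica_range)
print(ica_quantiles)
```

Which summary to report (full range or quantiles) is a judgement call that this page does not prescribe.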

Additional Assumptions

There are two possible types of assumptions that restrict the parameter space of the unidentifiable parameters: (i) equality type of assumptions, and (ii) inequality type of assumptions. These are discussed in turn in the next two paragraphs.

The equality assumptions have to be incorporated into the sensitivity analysis itself. Only one type of equality assumption has been implemented; this is the conditional independence assumption which can be specified to ica_SurvSurv_sens() through the cond_ind argument: $$\tilde{S}_0 \perp \!\!\! \perp T_1 | \tilde{S}_1 \; \text{and} \; \tilde{S}_1 \perp \!\!\! \perp T_0 | \tilde{S}_0 .$$ This can informally be interpreted as "what the control treatment does to the surrogate does not provide information on the survival time under experimental treatment if we already know what the experimental treatment does to the surrogate", and analogously when control and experimental treatment are interchanged.

The inequality type of assumptions have to be imposed on the data frame that is returned by the ica_SurvSurv_sens() function; those assumptions are thus imposed after running the sensitivity analysis. If get_marg_tau is set to TRUE, the returned data frame contains two types of additional unverifiable quantities that differ across replications of the sensitivity analysis: (i) the unconditional Spearman's \(\rho\) for all pairs of potential outcomes, and (ii) the proportions of the population strata as defined by Nevo and Gorfine (2022). More details on the interpretation and use of these assumptions can be found in Stijven et al. (2022).
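Imposing inequality assumptions after the fact amounts to subsetting the returned data frame. The sketch below uses a hypothetical data frame with made-up values in place of real output obtained with get_marg_tau = TRUE. The thresholds (0.2 for the Spearman correlations, 0.1 for the proportion harmed) are arbitrary illustrative assumptions, not recommendations from the package authors.

```r
# Hypothetical stand-in for output of ica_SurvSurv_sens(get_marg_tau = TRUE);
# column names follow the Value section, values are made up.
sens_results <- data.frame(
  sp_rho      = c(0.58, 0.80, 0.93, 0.74, 0.96),
  sp_s0s1     = c(-0.10, 0.35, 0.60, 0.15, 0.70),
  sp_t0t1     = c(0.05, 0.40, 0.55, -0.20, 0.65),
  prop_harmed = c(0.30, 0.08, 0.02, 0.25, 0.01)
)

# Keep only replications that satisfy the (assumed) inequality assumptions:
# positive association between potential outcomes and few harmed patients.
restricted <- subset(
  sens_results,
  sp_s0s1 > 0.2 & sp_t0t1 > 0.2 & prop_harmed < 0.1
)

# Range of the ICA before and after imposing the assumptions
print(range(sens_results$sp_rho))
print(range(restricted$sp_rho))
```

With these made-up numbers, three of the five replications remain, and the range of the ICA narrows accordingly.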

References

Stijven, F., Alonso, A., Molenberghs, G., Van Der Elst, W., Van Keilegom, I. (2022). An information-theoretic approach to the evaluation of time-to-event surrogates for time-to-event true endpoints based on causal inference.

Nevo, D., & Gorfine, M. (2022). Causal inference for semi-competing risks data. Biostatistics, 23(4), 1115-1132.

Examples

library(Surrogate)
data("Ovarian")
# For simplicity, data is not recoded to semi-competing risks format, but the
# data are left in the composite event format.
data = data.frame(
  Ovarian$Pfs,
  Ovarian$Surv,
  Ovarian$Treat,
  Ovarian$PfsInd,
  Ovarian$SurvInd
)
ovarian_fitted =
    fit_model_SurvSurv(data = data,
                       copula_family = "clayton",
                       nknots = 1)
# Illustration with small number of replications and low precision
ica_SurvSurv_sens(ovarian_fitted,
                  n_sim = 5,
                  n_prec = 2000,
                  copula_family2 = "clayton")

