fit_copula_OrdCont() fits the ordinal-continuous vine copula model. See
Details for more information about this model.
fit_copula_OrdCont(
data,
copula_family,
marginal_S0,
marginal_S1,
K_T,
start_copula,
method = "BFGS",
...
)
Returns an S3 object that can be used to perform the sensitivity
analysis with sensitivity_analysis_copula().
data: Data frame with three columns in the following order: surrogate
endpoint, true endpoint, and treatment indicator (0/1 coding). Ordinal endpoints
should be integers starting from 1.
copula_family: One of the following parametric copula families:
"clayton", "frank", "gaussian", or "gumbel". The first element in
copula_family corresponds to the control group, the second to the
experimental group.
marginal_S0, marginal_S1: List with the following five elements (in order):
  1. Density function with first argument x and second argument para, the
     parameter vector for this distribution.
  2. Distribution function with first argument x and second argument para, the
     parameter vector for this distribution.
  3. Inverse distribution function with first argument p and second argument
     para, the parameter vector for this distribution.
  4. The number of elements in para.
  5. A vector of starting values for para.
K_T: Number of categories in the true endpoint.
start_copula: Starting value for the copula parameter.
method: Optimization algorithm for maximizing the objective function.
For all options, see ?maxLik::maxLik. Defaults to "BFGS".
...: Arguments passed on to fit_copula_submodel_OrdCont
  names_XY: Names for X and Y, respectively.
  twostep: (boolean) If TRUE, the marginal distribution parameters are fixed
  at their starting values and only the copula parameter is estimated.
  start_Y: Starting values for the marginal distribution parameters for Y.
  X: First variable (ordinal with \(K\) categories).
  Y: Second variable (continuous).
  K: Number of categories in X.
  marginal_Y: List with the following five elements (in order):
    1. Density function with first argument x and second argument para, the
       parameter vector for this distribution.
    2. Distribution function with first argument x and second argument para.
    3. Inverse distribution function with first argument p and second argument para.
    4. The number of elements in para.
    5. Starting values for para.
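The argument descriptions above can be made concrete with a small sketch in base R. It builds a toy data frame in the documented column order and a five-element marginal list for a normally distributed surrogate. The object names (toy_data, normal_marginal), the simulated values, and the mean/standard-deviation parameterization of para are illustrative assumptions, not taken from the package.

```r
# Hypothetical toy data; the column order matters: surrogate endpoint,
# true endpoint, treatment indicator.
set.seed(1)
n <- 200
toy_data <- data.frame(
  surrogate = rnorm(n),                            # continuous endpoint
  true_endpoint = sample(1:3, n, replace = TRUE),  # ordinal, integers from 1
  treatment = rep(c(0L, 1L), each = n / 2)         # 0/1 coding
)

# Hypothetical normal marginal with para = c(mean, sd) (an assumed
# parameterization). The five elements follow the documented order.
normal_marginal <- list(
  function(x, para) dnorm(x, mean = para[1], sd = para[2]),  # density
  function(x, para) pnorm(x, mean = para[1], sd = para[2]),  # distribution function
  function(p, para) qnorm(p, mean = para[1], sd = para[2]),  # inverse distribution function
  2,                                                         # number of elements in para
  c(mean(toy_data$surrogate), sd(toy_data$surrogate))        # starting values for para
)
```

With such objects, marginal_S0 and marginal_S1 could both be set to normal_marginal and K_T to 3; whether a normal marginal is appropriate depends on the data at hand.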
Florian Stijven
Following the Neyman-Rubin potential outcomes framework, we assume that each
patient has four potential outcomes, two for each arm, represented by
\(\boldsymbol{Y} = (T_0, S_0, S_1, T_1)'\). Here, \(\boldsymbol{Y_z} =
(S_z, T_z)'\) are the potential surrogate and true endpoints under treatment
\(Z = z\). We will further assume that \(T\) is ordinal and \(S\) is
continuous; consequently, the function argument X corresponds to \(T\) and
Y to \(S\). (The roles of \(S\) and \(T\) can be interchanged without
loss of generality.)
We introduce latent variables to model \(\boldsymbol{Y}\). Latent variables
will be denoted by a tilde. For instance, if \(T_z\) is ordinal with \(K_T\)
categories, then \(T_z\) is a function of the latent
\(\tilde{T}_z \sim N(0, 1)\) as follows:
$$
T_z = g_{T_z}(\tilde{T}_z; \boldsymbol{c}^{T_z}) =
\begin{cases}
1 & \text{ if } -\infty = c_0^{T_z} < \tilde{T}_z \le c_1^{T_z} \\
\vdots \\
k & \text{ if } c_{k - 1}^{T_z} < \tilde{T}_z \le c_k^{T_z} \\
\vdots \\
K_T & \text{ if } c_{K_T - 1}^{T_z} < \tilde{T}_z \le c_{K_T}^{T_z} = \infty,
\end{cases}
$$
where \(\boldsymbol{c}^{T_z} = (c_1^{T_z}, \cdots, c_{K_T - 1}^{T_z})\).
The latent counterpart of \(\boldsymbol{Y}\) is again denoted by a tilde; for
example, \(\tilde{\boldsymbol{Y}} = (\tilde{T}_0, S_0, S_1, \tilde{T}_1)'\) if
\(T_z\) is ordinal and \(S_z\) is continuous.
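The thresholding function \(g_{T_z}\) can be sketched in base R as follows. The helper name latent_to_ordinal and the cutpoint values are hypothetical and chosen only for illustration; the intervals are left-open and right-closed, as in the definition above.

```r
# Map latent values to ordinal categories 1, ..., K_T.
# `cutpoints` holds the K_T - 1 finite thresholds c_1 < ... < c_{K_T - 1};
# c_0 = -Inf and c_{K_T} = Inf are implicit.
latent_to_ordinal <- function(latent, cutpoints) {
  stopifnot(!is.unsorted(cutpoints, strictly = TRUE))
  # left.open = TRUE gives intervals (c_{k-1}, c_k], matching the definition
  findInterval(latent, cutpoints, left.open = TRUE) + 1L
}

set.seed(123)
cutpoints <- c(-1, 0, 1)               # illustrative values, so K_T = 4
latent_T <- rnorm(1000)                # latent variable is standard normal
ordinal_T <- latent_to_ordinal(latent_T, cutpoints)

# Category probabilities implied by the cutpoints under the N(0, 1) latent
expected_prob <- diff(pnorm(c(-Inf, cutpoints, Inf)))
```

Comparing prop.table(table(ordinal_T)) with expected_prob shows how the cutpoints alone determine the marginal distribution of the ordinal endpoint.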
The vector of latent potential outcomes \(\tilde{\boldsymbol{Y}}\) is modeled
with a D-vine copula as follows:
$$
f_{\tilde{\boldsymbol{Y}}} = f_{\tilde{T}_0} \, f_{S_0} \, f_{S_1} \, f_{\tilde{T}_1}
\cdot c_{\tilde{T}_0, S_0 } \, c_{S_0, S_1} \, c_{S_1, \tilde{T}_1}
\cdot c_{\tilde{T}_0, S_1; S_0} \, c_{S_0, \tilde{T}_1; S_1}
\cdot c_{\tilde{T}_0, \tilde{T}_1; S_0, S_1},
$$
where (i) \(f_{\tilde{T}_0}\), \(f_{S_0}\), \(f_{S_1}\), and
\(f_{\tilde{T}_1}\) are univariate density functions, (ii)
\(c_{\tilde{T}_0, S_0}\), \(c_{S_0, S_1}\), and \(c_{S_1, \tilde{T}_1}\) are
unconditional bivariate copula densities, and (iii)
\(c_{\tilde{T}_0, S_1; S_0}\), \(c_{S_0, \tilde{T}_1; S_1}\), and
\(c_{\tilde{T}_0, \tilde{T}_1; S_0, S_1}\) are conditional bivariate copula
densities (e.g., \(c_{\tilde{T}_0, S_1; S_0}\) is the copula density of
\((\tilde{T}_0, S_1)' \mid S_0\)). We also make the simplifying assumption for
all copulas; that is, the conditional copulas do not depend on the values of
the conditioning variables.
In practice, we only observe \((S_0, T_0)'\) or \((S_1, T_1)'\). Hence, to
estimate the (identifiable) parameters of the D-vine copula model, we need
to derive the observed-data likelihood. With \(\boldsymbol{\beta}\) denoting
the identifiable parameters, the observed-data density for \((S_z, T_z)'\)
evaluated at \((s, t)\) is as follows:
$$
f_{\boldsymbol{Y_z}}(s, t; \boldsymbol{\beta}) =
\int_{c^{T_z}_{t - 1}}^{+ \infty} f_{\boldsymbol{\tilde{Y}_z}}(s, x; \boldsymbol{\beta}) \, dx - \int_{c^{T_z}_{t}}^{+ \infty} f_{\boldsymbol{\tilde{Y}_z}}(s, x; \boldsymbol{\beta}) \, dx.
$$
The above expression is used in ordinal_continuous_loglik() to compute the
loglikelihood for the observed values for \(Z = 0\) or \(Z = 1\). In this
function, X and Y correspond to \(T_z\) and \(S_z\) if \(T_z\) is
ordinal and \(S_z\) continuous. Otherwise, X and Y correspond to
\(S_z\) and \(T_z\).
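For a Gaussian copula, the difference of the two integrals above has a closed form, which the following base R sketch evaluates. This is an illustration under stated assumptions, not the implementation of ordinal_continuous_loglik(): the function name observed_density_gaussian and its arguments are hypothetical, the latent \(\tilde{T}_z\) is standard normal, and the copula between \(S_z\) and \(\tilde{T}_z\) is Gaussian with correlation rho.

```r
# Observed-data density f(s, t) for a continuous S and an ordinal T with a
# standard normal latent variable, assuming a Gaussian copula (correlation rho).
# pdf_S and cdf_S are the density and distribution function of S.
observed_density_gaussian <- function(s, t, cutpoints, rho, pdf_S, cdf_S) {
  bounds <- c(-Inf, cutpoints, Inf)   # c_0, c_1, ..., c_{K_T}
  z_s <- qnorm(cdf_S(s))              # S mapped to the standard normal scale
  # P(latent T <= c | S = s) under the Gaussian copula
  cond_cdf <- function(c) pnorm((c - rho * z_s) / sqrt(1 - rho^2))
  # Equals the integral from c_{t-1} to Inf minus the integral from c_t to Inf
  pdf_S(s) * (cond_cdf(bounds[t + 1]) - cond_cdf(bounds[t]))
}

# Illustrative evaluation with a standard normal S and K_T = 4
cutpoints <- c(-1, 0, 1)
contributions <- observed_density_gaussian(
  s = 0.5, t = 1:4, cutpoints = cutpoints, rho = 0.6,
  pdf_S = dnorm, cdf_S = pnorm
)
# Log-likelihood contribution of one observation with s = 0.5 and t = 3
loglik_one <- log(contributions[3])
```

Summing the contributions over all categories recovers the marginal density of \(S_z\) at s, which is a quick sanity check for any implementation of this expression. Other copula families require their own conditional distribution functions in place of cond_cdf.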
sensitivity_analysis_copula(), print.vine_copula_fit(),
plot.vine_copula_fit()