frailtyCmprsk: Fit a Weibull Competing Risks Model with Optional Shared Frailty

Description

Fit a Weibull competing risks model with shared gamma frailty between all transitions (0->1,0->2,...,0->k). Handles left-truncated and right-censored data. The model considers transitions from an initial state (0) to (k) competing absorbing states.

Usage

frailtyCmprsk(formulas, data, maxit = 300,
init.B, init.Theta, init.hazard.weib,
LIMparam = 1e-3, LIMlogl = 1e-3, LIMderiv = 1e-3,
x0, print.info = FALSE, print.result = TRUE,
partialH, blinding = TRUE)

Value

An object of class 'frailtyCmprsk' containing:

b: Vector of the estimated parameters. Order is: (scale(0->1), shape(0->1), scale(0->2), shape(0->2), ..., scale(0->k), shape(0->k),$\hat {\theta}$ (if frailty), $\hat{\beta}_1$, $\hat{\beta}_2$,...,$\hat{\beta}_k$).
call: The matched function call.
coef: Vector of estimated regression coefficients.
loglik: The marginal log-likelihood value at the final parameter estimates.
grad: Gradient vector of the log-likelihood at the final parameter estimates.
n: The number of observations used in the fit.
n.events: Vector containing the number of observed events: count for 0->1, count for 0->2,..., count for 0->k, count for censoring.
n.iter: Number of iterations.
vcov: Variance-covariance matrix for the parameters listed in b.
npar: Total number of estimated parameters.
nvar: Total number of regression coefficients.
shape.weib: Vector of estimated Weibull baseline shape parameters (shape(0->1),shape(0->2),...,shape(0->k)).
scale.weib: Vector of estimated Weibull baseline scale parameters (scale(0->1),scale(0->2),...,scale(0->k)).
crit: Convergence status code: 1=converged, 2=maximum iterations reached, 3=converged using partial Hessian, 4=the algorithm encountered a problem in the loglikelihood computation.
Frailty: Logical. TRUE if a model with shared frailty (cluster(.)) was fitted.
beta_p.value: Vector of p-values from Wald tests for the regression coefficients in coef.
AIC: Akaike Information Criterion, calculated as $AIC=\frac{1}{n}(np - l(.))$, where np is the number of parameters and l is the log-likelihood.
x0: List of numeric vectors, where each vector contains the time points used for calculating the baseline functions for each transition.
lam0: List of matrices, where each matrix contains baseline hazard estimates and 95% confidence intervals at the corresponding time points in x0, for each transition.
surv0: List of matrices, where each matrix contains baseline survival estimates and 95% confidence intervals at the corresponding time points in x0, for each transition.
medians: List of matrices, where each matrix contains the estimated median baseline survival time and its 95% confidence interval for each transition.
linear.pred: List of numeric vectors, where each vector contains to the linear predictors for transition 0->l ($l=1,...,k$). For non-frailty models, each element is of the form $\hat{\beta}_l^{t}X_l$. For frailty models, it includes the estimated log-frailty: $\hat{\beta}_l^{t}X_l + \log(\hat{\omega}_i)$.
names.factor: List of character vectors, where each vector contains the factor covariates included in the model for the corresponding transition.
global_chisq: List of numeric vectors, where each vector contains the chi-squared statistics from global Wald tests for factor variables for the corresponding transition.
dof_chisq: List of integer vectors, where each vector contains the degrees of freedom for the global Wald tests for the corresponding transition.
p.global_chisq: List of numeric vectors, where each vector contains the p-values for the global Wald tests for the corresponding transition.
global_chisq.test: Binary vector of length k indicating whether any global factor tests were performed for each transition.

If Frailty is TRUE, the following components related to frailty are also included:

groups: The number of unique groups specified by cluster(.).
theta: The estimated variance ($\hat{\theta}$) of the Gamma frailty distribution.
theta_p.value: The p-value from a Wald test for the null hypothesis $H_0: \theta=0$.
VarTheta: The estimated variance of the frailty variance estimator: $\hat{Var}(\hat{\theta})$.
frailty.pred: Vector containing the empirical Bayes predictions of the frailty term for each group.
frailty.var: Vector containing the variances of the empirical Bayes frailty predictions.
frailty.sd: Vector containing the standard errors of the empirical Bayes frailty predictions.

Arguments

formulas: A list of formula objects. The first formula must include a response on the left-hand side of a ~ operator. The response must be a survival object as returned by the Surv function (e.g., survival::Surv). The argument type = "mstate" must be specified within the Surv function. The status indicator in the Surv object should be: 0 for right-censoring, 1 for the first competing event, 2 for the second, ..., and k for the kth competing event. Covariates for the transition from 0 to 1 are specified on the right-hand side (RHS) of the first formula. The remaining elements of the list should be one-sided formulas (e.g., ~ var1 + var2), used only to specify the covariates. The second formula corresponds to the transition 0 -> 2, the third to 0 -> 3, and so on, up to the kth formula for transition 0 -> k. Left-truncation is supported and should be specified using the three-argument Surv(time1, time2, status) notation. Shared frailty can be specified via cluster(group_variable) on the RHS of the first formula only. It should not be included in the other formulas.
data: A data.frame containing the variables named in formulas.
maxit: Maximum number of iterations for the Marquardt optimization algorithm. Default is 300.
init.B: Optional. A vector of initial values for the regression coefficients. It should contain the coefficients for each transition from 0->1, ..., up to 0->k (i.e., $\beta_1$, $\beta_2$, ..., $\beta_k$), concatenated in order. The total length must match the total number of covariates specified across all formulas. If omitted, the regression coefficients are initialized from default values. These defaults are obtained by fitting k independent Weibull proportional hazards models (without frailty), one for each transition.
init.Theta: Optional. Initial value for the frailty variance $\theta$. Default is 0.1. This parameter is only used if cluster() is present in formulas.
init.hazard.weib: Optional. A vector of initial values for the Weibull baseline hazard parameters. It must be of length 2 * k, with the values ordered as: scale(0->1), shape(0->1), scale(0->2), shape(0->2), ..., scale(0->k), shape(0->k). If omitted, the baseline hazard parameters are initialized from default values. These defaults are obtained by fitting k independent Weibull proportional hazards models (without frailty), one for each transition.
LIMparam: Convergence threshold for the parameters based on the maximum absolute difference between successive iterations ($10^{-3}$ by default).
LIMlogl: Convergence threshold for the log-likelihood based on the absolute difference between successive iterations ($10^{-3}$ by default).
LIMderiv: Convergence threshold based on the relative distance to the optimum (related to gradient and Hessian) ($10^{-3}$ by default). See Details.
x0: Optional. A list of numeric vectors, where each vector specifies the time points at which to compute the baseline hazard and survival functions for a given transition. The order must follow the transitions: 0->1, 0->2, ..., 0->k. If not provided, defaults to a list where each element is a sequence of 99 time points from 0 to the maximum observed time for the corresponding transition.
print.info: Logical. If TRUE, prints information at each iteration of the optimization algorithm. Default is FALSE.
print.result: Logical. If TRUE, prints a formatted summary of the results. Default is TRUE.
partialH: Optional. Integer vector specifying the indices of parameters to exclude from the Hessian matrix when calculating the relative distance convergence criterion (LIMderiv). This is only considered if the first two criteria (LIMparam, LIMlogl) are met and the full Hessian is problematic (e.g., not invertible). Default is NULL.
blinding: Logical. If TRUE, the algorithm attempts to continue even if the log-likelihood calculation produces non-finite values (e.g., Inf, NaN) at some iteration. Setting to FALSE may cause the algorithm to stop earlier in such cases. Default is TRUE.

Details

Let $T$ be the time to event and $L \in \{1,2,...,k\}$ the indicator of the cause of the event.

The cause-specific hazard rate for cause $l \in \{1,2,...,k\}$ is: $$ \lambda_{l}(t) = \lim_{\Delta t \to 0^+} \frac{\mathbb{P}(t \leq T \leq t + \Delta t, L=l \mid T \geq t)}{\Delta t} $$

A proportional hazards model with a shared frailty term $\omega_i$ is assumed for each transition within group $i$. For the $j^{th}$ subject ($j=1,...,n_i$) in the $i^{th}$ group ($i=1,...,G$), the $l^{th}$ ($l=1,...,k$) transition intensity is defined as follows:

$$ \lambda_{l}^{ij}(t |\omega_i,X_{l}^{ij}) = \lambda_{0l}(t) \omega_i \exp(\beta_l^{T} X_{l}^{ij}) $$ where $\omega_i \sim\Gamma(\frac{1}{\theta},\frac{1}{\theta})$ with $\bold{E}(\omega_i)=1$ and $\bold{Var}(\omega_i)=\theta$.

$\omega_i$ is the frailty term for the $i^{th}$ group. For subject-specific frailties, use cluster(id) where id is unique ($n_i=1$). $\beta_l$ ($l=1,...,k$) is the vector of time fixed regression coefficients for the transition 0->l. $X_{l}^{ij}$ ($l=1,...,k$) is the vector of time fixed covariates for the $j^{th}$ subject in the $i^{th}$ group for the transition 0->l. $\lambda_{0l}(.)$ ($l=1,...,k$) is the baseline hazard function for the transition 0->l.

The Weibull baseline hazard parameterization is: $$\lambda(t) = \frac{\gamma}{\lambda^\gamma} \cdot t^{\gamma - 1}$$ where $\lambda$ is the scale parameter and $\gamma$ the shape parameter

References

Marquardt, D. W. (1963). An algorithm for least-squares estimation of nonlinear parameters. SIAM Journal on Applied Mathematics, 11(2), 431-441.

Liquet, B., Timsit, J. F., & Rondeau, V. (2012). Investigating hospital heterogeneity with a multi-state frailty model: application to nosocomial pneumonia disease in intensive care units. BMC Medical Research Methodology, 12(1), 1-14.

Examples

Run this code

# \donttest{


data(CPRSKbmtcrr)

##-- Weibull competing risks model with 
##   group frailty shared between transitions --##

ModCmprsk_Group <- frailtyCmprsk(
  formulas = list(
    Surv(observed_time, Status, type = "mstate") ~ cluster(group) + Sex,
    ~ Sex
  ),
  data        = CPRSKbmtcrr,
  print.info  = FALSE,
  maxit       = 100
)

##-- Weibull competing risks model with subject-specific 
##   frailty shared between transitions --##

ModCmprsk_Subject <- frailtyCmprsk(
  formulas = list(
    Surv(observed_time, Status, type = "mstate") ~ cluster(id) + Sex,
    ~ Sex
  ),
  data        = CPRSKbmtcrr,
  print.info  = FALSE,
  maxit       = 100
)

##--- Simple Weibull competing risks model with left truncation ---##

ModCmprsk_LeftTrunc <- frailtyCmprsk(
  formulas = list(
    Surv(Age, observed_time, Status, type = "mstate") ~ Source,
    ~ Sex
  ),
  data       = CPRSKbmtcrr,
  print.info = FALSE
)

##--- Simple Weibull competing risks model with a factor and 
##    no covariates for the first competing event (left truncation) ---##

ModCmprsk_Factor_LeftTrunc <- frailtyCmprsk(
  formulas = list(
    Surv(Age, observed_time, Status, type = "mstate") ~ Source,
    ~ factor(Phase)
  ),
  data       = CPRSKbmtcrr,
  print.info = FALSE
)  # }

Run the code above in your browser using DataLab