BayesID: The function to fit parametric and semi-parametric hierarchical models to semi-competing risks data.

Description

Independent (not cluster-correlated) semi-competing risks data can be analyzed using hierarchical models. The prior of baseline hazard functions can be specified by either parametric (Weibull) model or non-parametric mixture of piecewise exponential models (PEM). The conditional hazard function for time to the terminal event given time to non-terminal can be modeled as Markov (it does not depend on the timing of the non-terminal event) or semi-Markov (it does depend on the timing).

Usage

BayesID(survData, nCov, hyperParams, startValues, mcmcParams, nGam_save, numReps, 
thin, path, burninPerc = 0.5, hz.type = "Weibull", model = "Markov", nChain = 1)

Arguments

survData

The data frame containing semi-competing risks data with covariate matrices from n subjects. It is of dimension $n\times(4+p_1+p_2+p_3)$, where $p_1$,$p_2$,$p_3$ are the number of covariates included in the conditional hazard functions $h_1$,

nCov

a numeric vector that contains dimensions of covariate space: c($p_1$,$p_2$,$p_3$)

hyperParams

a vector containing hyperparameter values for hierarchical model. For PEM model (hz.type="PEM"), it is of length 14; c($a_1$, $a_2$, $a_3$, $b_1$, $b_2$, $b_3$, $\alpha_1$, $\alpha_2$, $\alpha_3$, $c_{\lambda1}$, $c_{\lambda2}$, $c_{\lambda3}

startValues

a list containing vectors of starting values of parameters. The length of list must be equal to the number of chains (nChain). For PEM model (hz.type="PEM"), each of list components is of length $p_1+p_2+p_3+10+n+2(J_1+1)+2(J_2+1

mcmcParams

a vector containing variables required for Metropolis-Hastings-Green (MHG) algorithm. For PEM model (hz.type="PEM"), it is of length $(18+num\_s\_propBI1+num\_s\_propBI2+num\_s\_propBI3+nTime\_lambda1+nTime\_lambda2+nTime\_lambda3$); c($C_1$

nGam_save

the number of $\gamma$ to be stored

numReps

total number of scans

thin

extent of thinning

path

the name of directory where the results are saved

burninPerc

the proportion of burn-in

hz.type

prior specification for baseline hazard functions : either "Weibull" or "PEM"

model

assumptions on $h_{03}$: either "Markov" or "semi-Markov"

nChain

the number of chains

Value

BayesID returns an object of class BayesID. names(object$chain1) shows the list of posterior samples of model parameters, the number of acceptance in MHG algorithm, etc.

Details

We view the semi-competing risks data as arising from an underlying illness-death model system in which individuals may undergo one or more of three transitions: 1) no event to non-terminal event, 2) no event to terminal event, 3) non-terminal event to terminal event. Let $t_{1i}$, $t_{2i}$ denote time to non-terminal and terminal event from subject $i=1,...,n$. The system of transitions is modeled via the specification of three hazard functions: $$h_1(t_{1i} | \gamma_i, x_{1i}) = \gamma_i h_{01}(t_{1i})\exp(\beta_1 x_{1i}), t_{1i}>0,$$ $$h_2(t_{2i} | \gamma_i, x_{2i}) = \gamma_i h_{02}(t_{2i})\exp(\beta_2 x_{2i}), t_{2i}>0,$$ $$h_3(t_{2i} | t_{1i}, \gamma_i, x_{3i}) = \gamma_i h_{03}(t_{2i})\exp(\beta_3 x_{3i}), 0

where $I(\cdot)$ is the indicator function and $s_{g,0} \equiv 0$. Note, this specification is general in that the partitions of the time axes differ across the three hazard functions. In our proposed Bayesian framework, our prior choices are, for $g\in{1,2,3}$:

$$\pi(\beta_g) \propto 1,$$ $$\lambda_g | J_g, \mu_{\lambda_g}, \sigma_{\lambda_g}^2 \sim \mathcal{N}_{J_g+1}(\mu_{\lambda_g}1, \sigma_{\lambda_g}^2\Sigma_{\lambda_g}),$$ $$J_g \sim \mathcal{P}(\alpha_g),$$ $$\pi(s_g | J_g) \propto \frac{(2J_g+1)! \prod_{j=1}^{J_g+1}(s_{g,j}-s_{g,j-1})}{(s_{g,J_g+1})^{(2J_g+1)}},$$ $$\pi(\mu_{\lambda_g}) \propto1,$$ $$\sigma_{\lambda_g}^{-2} \sim \mathcal{G}(a_g, b_g),$$ $$\gamma_i|\theta \sim \mathcal{G}(\theta^{-1}, \theta^{-1}),$$ $$\theta^{-1} \sim \mathcal{G}(\psi, \omega).$$ Note that $J_g$ and $s_g$ are treated as random and the priors for $J_g$ and $s_g$ jointly form a time-homogeneous Poisson process prior for the partition. The number of time splits and their positions are therefore updated within our computational scheme using reversible jump MCMC. For parametric Weibull prior specification for baseline hazard functions, $h_{0g}(t) = \alpha_g \kappa_g t^{\alpha_g-1}$. In our Bayesian framework, our prior choices are, for $g\in{1,2,3}$: $$\pi(\beta_g) \propto 1,$$ $$\pi(\alpha_g) \sim \mathcal{G}(a_g, b_g),$$ $$\pi(\kappa_g) \sim \mathcal{G}(c_g, d_g),$$ $$\gamma_i|\theta \sim \mathcal{G}(\theta^{-1}, \theta^{-1}),$$ $$\theta^{-1} \sim \mathcal{G}(\psi, \omega).$$

References

Lee, K. H., Haneuse, S., Schrag, D., and Dominici, F. (2014), Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis, Journal of the Royal Statistical Society: Series C, in press.

Examples

Run this code

# loading a data set
data(scrData)
	
n = dim(scrData)[1]	
p1 = 2	
p2 = 2
p3 = 2
nCov <- c(p1, p2, p3) 	

###############################
# setting hyperparameter values for semi-parametric analysis
	
a1		<- 0.7	# prior parameters for 1/sigma_1^2
b1		<- 0.7
a2		<- 0.7	# prior parameters for 1/sigma_2^2
b2		<- 0.7
a3		<- 0.7	# prior parameters for 1/sigma_3^2
b3		<- 0.7
alpha1	<- 10	# prior parameter for J1
alpha2	<- 10	# prior parameter for J2
alpha3	<- 10	# prior parameter for J3
c_lam1 	<- 1	# prior parameter for MVN-ICAR specification
c_lam2 	<- 1	
c_lam3 	<- 1
psi	<- 0.7	# prior parameters for 1/theta
omega	<- 0.7	

nGam_save <- 5  # the number of subjects whose gamma parameters are being saved

hyperParams <- c(a1, b1, a2, b2, a3, b3, alpha1, alpha2, alpha3, c_lam1, c_lam2, c_lam3, 
psi, omega)

#########################
# setting starting values
			
s1_max		<- max(scrData$time1[scrData$event1 == 1])
s2_max		<- max(scrData$time2[scrData$event1 == 0 & scrData$event2 == 1])
s3_max		<- max(scrData$time2[scrData$event1 == 1 & scrData$event2 == 1])

beta1		<- c(0.1, 0.1)
beta2		<- c(0.1, 0.1)
beta3		<- c(0.1, 0.1)
s1		<- c(seq(4, 36, 4), s1_max)
s2		<- c(seq(4, 36, 4), s2_max)
s3		<- c(seq(4, 36, 4), s3_max)
J1		<- length(s1) - 1
J2		<- length(s2) - 1
J3		<- length(s3) - 1
lambda1		<- runif(J1+1, -2, -1)
lambda2		<- runif(J2+1, -2, -1)
lambda3		<- runif(J3+1, -2, -1)

sigSq_lam1	<- var(lambda1)
sigSq_lam2	<- var(lambda2)
sigSq_lam3	<- var(lambda3)
mu_lam1		<- mean(lambda1)
mu_lam2		<- mean(lambda2)
mu_lam3		<- mean(lambda3)
theta		<- 3.3
gamma		<- rgamma(n, 1/theta, 1/theta)

# chain 1

startValues <- list()
startValues[[1]] <- as.vector(c(beta1, beta2, beta3, J1, J2, J3, mu_lam1, mu_lam2, mu_lam3,
 sigSq_lam1, sigSq_lam2, sigSq_lam3, theta, gamma, lambda1, lambda2, lambda3, s1, s2, s3))

beta1		<- c(0.2, 0.3)
beta2		<- c(0.2, 0.3)
beta3		<- c(0.2, 0.3)
lambda1		<- runif(J1+1, -2, -1)
lambda2		<- runif(J2+1, -2, -1)
lambda3		<- runif(J3+1, -2, -1)

# chain 2

startValues[[2]] <- as.vector(c(beta1, beta2, beta3, J1, J2, J3, mu_lam1, mu_lam2, mu_lam3, 
sigSq_lam1, sigSq_lam2, sigSq_lam3, theta, gamma, lambda1, lambda2, lambda3, s1, s2, s3))

##################################################
# setting variable values needed for MHG algorithm

C1		<- 0.20
C2		<- 0.20
C3		<- 0.20
delPert1	<- 0.5
delPert2	<- 0.5
delPert3	<- 0.5
s_propBI1	<- floor(sort(unique(scrData$time1[scrData$event1 == 1])))
s_propBI1	<- unique(s_propBI1[s_propBI1 < s1_max])
s_propBI2	<- floor(sort(unique(scrData$time2[scrData$event1 == 0 & scrData$event2 == 1])))
s_propBI2	<- unique(s_propBI2[s_propBI2 < s2_max])
s_propBI3	<- floor(sort(unique(scrData$time2[scrData$event1 == 1 & scrData$event2 == 1])))
s_propBI3	<- unique(s_propBI3[s_propBI3 < s3_max])

num_s_propBI1	<- length(s_propBI1)
num_s_propBI2	<- length(s_propBI2)
num_s_propBI3	<- length(s_propBI3)
J1_max 		<- 50
J2_max 		<- 50
J3_max 		<- 50
time_lambda1	<- 1:36
time_lambda2	<- 1:36
time_lambda3	<- 1:36
nTime_lambda1	<- length(time_lambda1)
nTime_lambda2	<- length(time_lambda2)
nTime_lambda3	<- length(time_lambda3)

mhProp_theta_var <- 0.05

mcmcParams <- c(C1, C2, C3, delPert1, delPert2, delPert3, num_s_propBI1, num_s_propBI2, 
num_s_propBI3, J1_max, J2_max, J3_max, s1_max, s2_max, s3_max, nTime_lambda1, 
nTime_lambda2, nTime_lambda3, s_propBI1, s_propBI2, s_propBI3, time_lambda1, 
time_lambda2, time_lambda3, mhProp_theta_var)

##################################################
# number of chains

numReps		= 2e6
thin		= 20
burninPerc	= 0.5
path1		= "outcome/"
hz.type 	= "PEM"
model 		= "Markov"
nChain		= 2

# fitting Bayesian semi-parametric regression model to semi-competing risks data	

fitScr <- BayesID(scrData, nCov, hyperParams, startValues, mcmcParams, nGam_save, 
numReps, thin, path1, burninPerc, hz.type, model, nChain)
		
print(fitScr)
summary(fitScr)

## plot for estimates of baseline hazard functions
plot(fitScr)	
	
## calculating conditional explanatory hazard ratio		

vals <- ehr(c(1.5, 0), c(2, 1), fitScr)
plot(vals)

Run the code above in your browser using DataLab