BayesIDcor: The function to fit parametric and semi-parametric hierarchical models to cluster-correlated semi-competing risks data.

Description

Cluster-correlated semi-competing risks data can be analyzed using hierarchical models. The prior of baseline hazard functions can be specified by either parametric (Weibull) model or non-parametric mixture of piecewise exponential models (PEM). The option to choose between parametric multivariate normal (MVN) and non-parametric Dirichlet process mixture of multivariate normals (DPM) is available for the prior of cluster-specific random effects distribution. The conditional hazard function for time to the terminal event given time to non-terminal can be modeled as Markov (it does not depend on the timing of the non-terminal event) or semi-Markov (it does depend on the timing)

Usage

BayesIDcor(survData, nCov, hyperParams, startValues, mcmcParams, nGam_save, numReps, 
thin, path = "results/", burninPerc = 0.5, hz.type = "Weibull", re.type = "MVN",
model = "Markov", storeV= rep(TRUE, 3), nChain = 1)

Arguments

survData

The data frame containing semi-competing risks data with cluster information and covariate matrices from n subjects. It is of dimension $n\times(5+p_1+p_2+p_3)$, where $p_1$,$p_2$,$p_3$ are the number of covariates included in the conditional

nCov

a numeric vector that contains dimensions of covariate space: c($p_1$,$p_2$,$p_3$)

hyperParams

a vector containing hyperparameter values for hierarchical model. For Weibull-MVN, (hz.type="Weibull", re.type="MVN"), it is of length 24; c($a_1$, $b_1$, $a_2$, $b_2$, $a_3$, $b_3$, $c_1$, $d_1$, $c_2$, $d_2$, $c_3$, $d_3$,$\psi

startValues

a list containing vectors of starting values of parameters. The length of list must be equal to the number of chains (nChain). For Weibull-MVN, (hz.type="Weibull", re.type="MVN"), each of list components is of lengt

mcmcParams

a vector containing variables required for Metropolis-Hastings-Green (MHG) algorithm. For Weibull models (Weibull-MVN and Weibull-DPM: hz.type="Weibull"), it is of length 7; c(mhProp_alpha1_var, mhProp_alpha2_var, mhProp_alpha3_var, mhProp_t

nGam_save

the number of $\gamma$ to be stored

numReps

total number of scans

thin

extent of thinning

path

the name of directory where the results are saved

burninPerc

the proportion of burn-in

hz.type

prior specification for baseline hazard functions : either "Weibull" or "PEM"

re.type

prior specification for cluster-specific random effects distribution: either "MVN" or "DPM"

model

assumptions on $h_{03}$: either "Markov" or "semi-Markov"

storeV

logical values to determine whether all the posterior samples of $V_1$, $V_2$, $V_3$ are to be stored.

nChain

the number of chains

Value

BayesIDcor returns an object of class BayesIDcor. names(object$chain1) shows the list of posterior samples of model parameters, the number of acceptance in MHG algorithm, etc.

Details

We view the semi-competing risks data as arising from an underlying illness-death model system in which individuals may undergo one or more of three transitions: 1) no event to non-terminal event, 2) no event to terminal event, 3) non-terminal event to terminal event. Let $t_{ji1}$, $t_{ji2}$ denote time to non-terminal and terminal event from subject $i=1,...,n_j$ in cluster $1,...,J$. The system of transitions is modeled via the specification of three hazard functions: $$h_1(t_{ji1} | \gamma_{ji}, V_{j1}, x_{ji1}) = \gamma_{ji} h_{01}(t_{ji1})\exp(\beta_1 x_{ji1}+V_{j1}), t_{ji1}>0,$$ $$h_2(t_{ji2} | \gamma_{ji}, V_{j2}, x_{ji2}) = \gamma_{ji} h_{02}(t_{ji2})\exp(\beta_2 x_{ji2}+V_{j2}), t_{ji2}>0,$$ $$h_3(t_{ji2} | t_{ji1}, \gamma_{ji}, V_{j3}, x_{ji3}) = \gamma_{ji} h_{03}(t_{ji2})\exp(\beta_3 x_{ji3}+V_{j3}), 0

References

Lee, K. H., Haneuse, S., Schrag, D., and Dominici, F. (2014), Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis, Journal of the Royal Statistical Society: Series C, in press. Lee, K. H., Dominici, F., Schrag, D., and Haneuse, S., Hierarchical models for cluster-correlated semicompeting risks data with application to quality of end-of-life care for pancreatic cancer, arXiv:1502.00526; submitted.

Examples

Run this code

# loading a data set
data(scrCorData)
    
scrData <- scrCorData
	
n = dim(scrData)[1]
p1 = 3
p2 = 3
p3 = 3
nCov <- c(p1, p2, p3)
J = length(unique(scrData[,5]))

###################################################################################
# analysis of clustered semi-competing risks data using semi-Markov PEM-MVN model #
###################################################################################
	
a1		<- 0.7	# prior parameters for 1/sigma_1^2
b1		<- 0.7
a2		<- 0.7	# prior parameters for 1/sigma_2^2
b2		<- 0.7
a3		<- 0.7	# prior parameters for 1/sigma_3^2
b3		<- 0.7
alpha1	<- 10	# prior parameter for K1
alpha2	<- 10	# prior parameter for K2
alpha3	<- 10	# prior parameter for K3
c_lam1 	<- 1	# prior parameter for MVN-ICAR specification
c_lam2 	<- 1
c_lam3 	<- 1
psi		<- 0.7	# prior parameters for 1/theta
omega 	<- 0.7

rho_v 	<- 100
Psi_v	<- diag(1, 3)
    
nGam_save <- 10 # the number of subjects whose gamma parameters are being saved
    
hyperParams <- c(a1, b1, a2, b2, a3, b3, alpha1, alpha2, alpha3, c_lam1, c_lam2, c_lam3,
psi, omega, rho_v, Psi_v)


#########################
# setting starting values
			
s1_max		<- max(scrData$time1[scrData$event1 == 1])
s2_max		<- max(scrData$time2[scrData$event1 == 0 & scrData$event2 == 1])
s3_max		<- max(scrData$time2[scrData$event1 == 1 & scrData$event2 == 1])
    
grid1 <- s1_max/50
grid2 <- s2_max/50
grid3 <- s3_max/50

# chain 1    
    
beta1		<- rep(0.1, p1)
beta2		<- rep(0.1, p2)
beta3		<- rep(0.1, p3)
    
s1			<- unique(sort(c(sample(1:s1_max, 10), s1_max)))
s2			<- unique(sort(c(sample(1:s2_max, 10), s2_max)))
s3			<- unique(sort(c(sample(1:s3_max, 10), s3_max)))
K1			<- length(s1) - 1
K2			<- length(s2) - 1
K3			<- length(s3) - 1
lambda1		<- runif(K1+1, -3, -2)
lambda2		<- runif(K2+1, -3, -2)
lambda3		<- runif(K3+1, -3, -2)
    
sigSq_lam1	<- var(lambda1)
sigSq_lam2	<- var(lambda2)
sigSq_lam3	<- var(lambda3)
mu_lam1		<- mean(lambda1)
mu_lam2		<- mean(lambda2)
mu_lam3		<- mean(lambda3)
theta		<- 1
gamma		<- rep(1, n)
    
V1	<- rep(0, J)
V2	<- rep(0, J)
V3	<- rep(0, J)
    
Sigma_V <- diag(1, 3)

startValues <- list()
startValues[[1]] <- as.vector(c(beta1, beta2, beta3, K1, K2, K3, mu_lam1, mu_lam2, mu_lam3,
sigSq_lam1, sigSq_lam2, sigSq_lam3, theta, gamma, lambda1, lambda2, lambda3, s1, s2, s3,
V1, V2, V3, Sigma_V))
    
# chain 2    

beta1		<- rep(-0.1, p1)
beta2		<- rep(-0.1, p2)
beta3		<- rep(-0.1, p3)
    
K1			<- length(s1) - 1
K2			<- length(s2) - 1
K3			<- length(s3) - 1
lambda1		<- runif(K1+1, -3, -2)
lambda2		<- runif(K2+1, -3, -2)
lambda3		<- runif(K3+1, -3, -2)
    
sigSq_lam1	<- var(lambda1)
sigSq_lam2	<- var(lambda2)
sigSq_lam3	<- var(lambda3)
mu_lam1		<- mean(lambda1)
mu_lam2		<- mean(lambda2)
mu_lam3		<- mean(lambda3)
theta		<- 0.5
gamma		<- rep(1.1, n)
    
V1	<- rep(0.1, J)
V2	<- rep(0.1, J)
V3	<- rep(0.1, J)
    
Sigma_V <- diag(1.1, 3)
    
startValues[[2]] <- as.vector(c(beta1, beta2, beta3, K1, K2, K3, mu_lam1, mu_lam2, mu_lam3,
sigSq_lam1, sigSq_lam2, sigSq_lam3, theta, gamma, lambda1, lambda2, lambda3, s1, s2, s3,
V1, V2, V3, Sigma_V))
    
    
    
    
    

##################################################
# setting variable values needed for MHG algorithm

C1				<- 0.20
C2				<- 0.20
C3				<- 0.20
delPert1		<- 0.5
delPert2		<- 0.5
delPert3		<- 0.5
s_propBI1		<- seq(1, s1_max, 1)
s_propBI1		<- s_propBI1[s_propBI1 < s1_max]
s_propBI2		<- seq(1, s2_max, 1)
s_propBI2		<- s_propBI2[s_propBI2 < s2_max]
s_propBI3		<- seq(1, s3_max, 1)
s_propBI3		<- s_propBI3[s_propBI3 < s3_max]
    
num_s_propBI1	<- length(s_propBI1)
num_s_propBI2	<- length(s_propBI2)
num_s_propBI3	<- length(s_propBI3)
K1_max 			<- 50
K2_max 			<- 50
K3_max 			<- 50
    
time_lambda1	<- seq(grid1, s1_max, grid1)
time_lambda2	<- seq(grid2, s2_max, grid2)
time_lambda3	<- seq(grid3, s3_max, grid3)
nTime_lambda1	<- length(time_lambda1)
nTime_lambda2	<- length(time_lambda2)
nTime_lambda3	<- length(time_lambda3)
    
mhProp_theta_var <- 0.05
mhProp_V1_var <- 0.05
mhProp_V2_var <- 0.05
mhProp_V3_var <- 0.05
    
mcmcParams <- c(C1, C2, C3, delPert1, delPert2, delPert3, num_s_propBI1, num_s_propBI2,
num_s_propBI3, K1_max, K2_max, K3_max, s1_max, s2_max, s3_max, nTime_lambda1, nTime_lambda2,
nTime_lambda3, s_propBI1, s_propBI2, s_propBI3, time_lambda1, time_lambda2, time_lambda3,
mhProp_theta_var,  mhProp_V1_var, mhProp_V2_var, mhProp_V3_var)
    
    


##################################################
# number of chains

numReps		= 2e6
thin		= 20
burninPerc	= 0.5
path        = "results/"
hz.type 	= "PEM"
re.type     = "MVN"
model 		= "semi-Markov"
storeV      = rep(TRUE, 3)
nChain		= 2

# fitting semi-Markov PEM-MVN model to clustered semi-competing risks data	

fit <- BayesIDcor(scrData, nCov, hyperParams, startValues, mcmcParams, nGam_save, numReps,
    thin, path, burninPerc, hz.type, re.type, model, storeV, nChain)
		
print(fit)
summary(fit)

## plot for estimates of baseline hazard functions
plot(fit)

Run the code above in your browser using DataLab