comire.gibbs: Gibbs sampler for CoMiRe model

Description

Posterior inference via Gibbs sampler for CoMiRe model

Usage

comire.gibbs(y, x, z = NULL, family = 'continuous', 
       grid = NULL, mcmc, prior, 
       state = NULL, seed, max.x = max(x), z.val = NULL, verbose = TRUE)

Value

An object of the class classCoMiRe, i.e. a list of arguments for generating posterior output. It contains:

callthe model formula
post.means a list containing the posterior mean density beta over the grid, of all the mixture parameters and, if family = "continuous" and z = NULL, of \(f_0\) and \(f_{inf}\) over the y.grid.
ci a list containing the 95% credible intervals for all the quantities stored in post.means.
mcmc a list containing all the MCMC chains.
z the same of the input
z.val the same of the input
f0,f1 MCMC replicates of the density in the two extremes (only if family = 'continuous')
nrep,nb the same values of the list mcmc in the input arguments
bin logical, equal to TRUE if family = 'binary'
univariate logical, equal to TRUE if z is null or a vector

Arguments

y

numeric vector for the response: when family="continuous" y must be a numeric vector; if family="binary" y must assume values 0 or 1.

x

numeric vector for the covariate relative to the dose of exposure.

z

numeric vector for the confunders; a vector if there is only one confounder or a matrix for two or more confunders

family

type of y. This can be "continuous" or "binary". Default "continuous".

grid

a list giving the parameters for plotting the posterior mean density and the posterior mean \(\beta(x)\) over finite grids if family="continuous" and z=NULL. It must include the following values:

grids, logical value (if TRUE the provided grids are used, otherwise standard grids are automatically used);
xgrid and ygrid, numerical vectors with the actual values of the grid for y and x.

mcmc

a list giving the MCMC parameters. It must include the following integers: nb giving the number of burn-in iterations, nrep giving the total number of iterations, thin giving the thinning interval, ndisplay giving the multiple of iterations to be displayed on screen while the algorithm is running (a message will be printed every ndisplay iterations).

prior

a list containing the values of the hyperparameters.

If family = "continuous", it must include the following values:

mu.theta, the prior mean \(\mu_\theta\) for each location parameter \(\theta_{0h}\) and \(\theta_1\),
k.theta, the prior variance \(k_\theta\) for each location paramter \(\theta_{0h}\) and \(\theta_1\),
mu.gamma (if p confounding covariates are included in the model) a p-dimentional vector of prior means \(\mu_\gamma\) of the parameters \(\gamma\) corresponding to the confounders,
k.gamma, the prior variance \(k_\gamma\) for parameter corresponding to the confounding covariate (if p=1) or sigma.gamma (if p>1), that is the covariance matrix \(\Sigma_\gamma\) for the parameters corresponding to the p confounding covariates; this must be a symmetric positive definite matrix.
eta, numeric vector of size J for the Dirichlet prior on the beta basis weights,
alpha, prior for the mixture weights,
a and b, prior scale and shape parameter for the gamma distribution of each precision parameter,
J, parameter controlling the number of elements of the I-spline basis,
H, total number of components in the mixture at \(x_0\).

If family="binary" it must include the following values:

eta, numeric vector of size J for the Dirichlet prior on the beta basis weights,
a.pi0 and b.pi0, the prior parameters of the prior beta distribution for \(\pi_0\),
J, parameter controlling the number of elements of the Ispline basis.

state

if family="continuous", a list giving the current value of the parameters. This list is used if the current analysis is the continuation of a previous analysis or if we want to start the MCMC algorithm from some particular value of the parameters.

seed

seed for random initialization.

max.x

maximum value allowed for x.

z.val

optional numeric vector containing a fixed value of interest for each of the confounding covariates to be used for the plots. Default value is mean(z) for numeric covariates or the mode for factorial covariates.

verbose

logical, if TRUE a message on the status of the MCMC algorithm is printed to the console. Default is TRUE.

Author

Antonio Canale [aut, cre], Daniele Durante [ctb], Arianna Falcioni [aut], Luisa Galtarossa [aut], Tommaso Rigon [ctb]

Details

The function fit a convex mixture regression (CoMiRe) model (Canale, Durante, Dunson, 2018) via Gibbs sampler. For continuous outcome \(y \in \mathcal{Y}\), adverse esposure level \(x \in \mathcal{X}\) and no confunding variables, one can set family = 'continuous' and z = NULL and fit model
\( f_x(y) = \{1-\beta(x)\} \sum_{h=1}^{H}\nu_{0h} \phi(y; \theta_{0h}, \tau_{0h}^{-1}) + \beta(x) \phi(y; \theta_{\infty}, \tau_{\infty}^{-1})\) ;

where \(\beta(x) = \sum_{j=1}^{J} \omega_j \psi_j(x), x\ge0,\) is a a monotone nondecreasing interpolation function, constrained between 0 and 1 and \(\psi_1,...,\psi_J\) are monotone nondecreasing I-splines basis.
If \(p \ge 1\) confounding covariates \(z \in \mathcal{Z}\) are available, passing the argument z the function fits model

\(f(y; x,z) = \{1-\beta(x)\} f_0(y;z) + \beta(x) f_\infty(y;z)\) ;

where:
\(f_0(y;z)= \sum_{h=1}^{H} \nu_{0h} \phi(y;\theta_{0h}+z^\mathsf{T}\gamma,\tau_{0h}^{-1})\), and \(f_\infty(y;z)= \phi(y;\theta_\infty+ z^\mathsf{T}\gamma,\tau_{\infty}^{-1})\).

Finally, if \(y\) is a binary response, one can set family = 'binary' and fit model

\(p_x(y) = (\pi_x)^y (1 - \pi_x)^{1-y}\) ;

where \(\pi_x = P(Y=1 | x)\) is \(\pi_x = \{1-\beta(x)\} \pi_0 + \beta(x) \pi_\infty\).

References

Canale, A., Durante, D., and Dunson, D. (2018), Convex Mixture Regression for Quantitative Risk Assessment, Biometrics, 74, 1331-1340

Galtarossa, L., Canale, A., (2019), A Convex Mixture Model for Binomial Regression, Book of Short Papers SIS 2019

Examples

Run this code

{

data(CPP)
attach(CPP)

n <- NROW(CPP)
J <- H <- 10

premature <- as.numeric(gestage<=37)

mcmc <- list(nrep=5000, nb=2000, thin=5, ndisplay=4)

## too few iterations to be meaningful. see below for safer and more comprehensive results

mcmc <- list(nrep=10, nb=2, thin=1, ndisplay=4) 


prior <- list(mu.theta=mean(gestage), k.theta=10, eta=rep(1, J)/J, 
              alpha=rep(1,H)/H, a=2, b=2, J=J, H=H)
              
fit.dummy <- comire.gibbs(gestage, dde, family="continuous", 
                     mcmc=mcmc, prior=prior, seed=1, max.x=180)
                     
summary(fit.dummy)
 

# \donttest{
## safer procedure with more iterations (it may take some time)

mcmc <- list(nrep=5000, nb=2000, thin=5, ndisplay=4) 

## 1. binary case ##

prior <- list(pi0=mean(gestage), eta=rep(1, J)/J, 
             a.pi0=27, b.pi0=360, J=J)
             
fit_binary<- comire.gibbs(premature, dde, family="binary", 
                          mcmc=mcmc, prior=prior, seed=5, max.x=180)
                          
                          
## 2. continuous case ##

prior <- list(mu.theta=mean(gestage), k.theta=10, eta=rep(1, J)/J, 
              alpha=rep(1,H)/H, a=2, b=2, J=J, H=H)
              
fit1 <- comire.gibbs(gestage, dde, family="continuous", 
                     mcmc=mcmc, prior=prior, seed=5, max.x=180)


## 2.2 One confunder ##

mage_std <- scale(mage, center = TRUE, scale = TRUE) 

prior <- list(mu.theta=mean(gestage), k.theta=10, mu.gamma=0, k.gamma=10, 
              eta=rep(1, J)/J, alpha=1/H, a=2, b=2, H=H, J=J)
              
fit2 <- comire.gibbs(gestage, dde, mage_std, family="continuous", 
              mcmc=mcmc, prior=prior, seed=5, max.x=180)


## 2.3 More confunders ##

Z <- cbind(mage, mbmi, sei)
Z <- scale(Z, center = TRUE, scale = TRUE)
Z <- as.matrix(cbind(Z, CPP$smoke))
colnames(Z) <- c("age", "BMI", "sei", "smoke")

mod <- lm(gestage ~ dde + Z)
prior <- list(mu.theta = mod$coefficients[1], k.theta = 10,
              mu.gamma = mod$coefficients[-c(1, 2)], sigma.gamma = diag(rep(10, 4)),
              eta = rep(1, J)/J, alpha = 1/H, a = 2, b = 2, H = H, J = J)
              
fit3 <- comire.gibbs(y = gestage, x = dde, z = Z, family = "continuous", mcmc = mcmc, 
                     prior = prior, seed = 5)

# } 
}

Run the code above in your browser using DataLab