Fits the Bayesian kernel machine regression (BKMR) model using Markov chain Monte Carlo (MCMC) methods.
kmbayes(
y,
Z,
X = NULL,
iter = 1000,
family = "gaussian",
id = NULL,
verbose = TRUE,
Znew = NULL,
starting.values = NULL,
control.params = NULL,
varsel = FALSE,
groups = NULL,
knots = NULL,
ztest = NULL,
rmethod = "varying",
est.h = FALSE
)
y: a vector of outcome data of length n.
Z: an n-by-M matrix of predictor variables to be included in the h function. Each row represents an observation and each column represents a predictor.
X: an n-by-K matrix of covariate data where each row represents an observation and each column represents a covariate. Should not contain an intercept column.
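As a minimal sketch of the input shapes described above, the following base R code builds toy inputs and checks their dimensions before they would be passed to kmbayes. The sizes n, M, and K and the simulated values are illustrative assumptions, not package defaults.

```r
# Toy inputs illustrating the expected shapes (values are arbitrary).
set.seed(1)
n <- 50  # number of observations
M <- 4   # number of predictors entering h
K <- 2   # number of covariates

Z <- matrix(rnorm(n * M), nrow = n, ncol = M)  # n-by-M predictor matrix
X <- matrix(rnorm(n * K), nrow = n, ncol = K)  # n-by-K covariates, no intercept column
y <- rnorm(n)                                  # outcome vector of length n

# Basic consistency checks before model fitting
stopifnot(length(y) == nrow(Z), nrow(X) == nrow(Z))
# A constant column in X would act as an intercept, so confirm none is present
stopifnot(all(apply(X, 2, sd) > 0))
```

Checks like these are cheap and catch mismatched row counts before the sampler starts.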
iter: number of iterations to run the sampler.
family: a description of the error distribution and link function to be used in the model. Currently implemented for the gaussian and binomial families.
id: optional vector (of length n) of grouping factors for fitting a model with a random intercept. If NULL, no random intercept is included.
verbose: TRUE or FALSE; flag indicating whether to print intermediate diagnostic information during model fitting.
Znew: optional matrix of new predictor values at which to predict h, where each row represents a new observation. Supplying this slows down model fitting; the same predictions can instead be obtained as a post-processing step using SamplePred.
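The post-processing route mentioned above can be sketched as follows. This is an illustration under assumptions, not the package's prescribed workflow: the new profile (column medians of Z) and the choice to fix the covariate at 0 are hypothetical, and the Xnew argument of SamplePred is not described on this page.

```r
library(bkmr)

set.seed(111)
dat <- SimData(n = 50, M = 4)

# Fit without Znew so the sampler itself stays fast
fitkm <- kmbayes(y = dat$y, Z = dat$Z, X = dat$X, iter = 100, verbose = FALSE)

# Hypothetical new observation: every predictor set to its median
Znew <- matrix(apply(dat$Z, 2, median), nrow = 1)

# Draw predictions at Znew after fitting, holding the single covariate at 0
preds <- SamplePred(fitkm, Znew = Znew, Xnew = cbind(0))
head(preds)
```

With only 100 iterations the draws are not suitable for inference; the short run just keeps the sketch quick.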
starting.values: list of starting values for each parameter. If not specified, default values are chosen.
control.params: list of parameters specifying the prior distributions and tuning parameters for the MCMC algorithm. If not specified, default values are chosen.
varsel: TRUE or FALSE; indicator for whether to conduct variable selection on the Z variables in h.
groups: optional vector (of length M) of group indicators for fitting hierarchical variable selection if varsel=TRUE. If varsel=TRUE and no groups are specified, component-wise variable selection is performed.
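A minimal sketch of hierarchical variable selection, assuming the four simulated predictors are split into three groups. The grouping c(1, 1, 2, 3) is an arbitrary illustration, and ExtractPIPs is a bkmr function not described on this page that is used here only to inspect the result.

```r
library(bkmr)

set.seed(111)
dat <- SimData(n = 50, M = 4)

# Hypothetical grouping: predictors 1 and 2 share a group, 3 and 4 stand alone
groups <- c(1, 1, 2, 3)

fit_hier <- kmbayes(y = dat$y, Z = dat$Z, X = dat$X, iter = 100,
                    verbose = FALSE, varsel = TRUE, groups = groups)

# Posterior inclusion probabilities, one row per predictor
pips <- ExtractPIPs(fit_hier)
print(pips)
```

Omitting groups while keeping varsel = TRUE would instead select each predictor individually.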
knots: optional matrix of knot locations for implementing the Gaussian predictive process of Banerjee et al. (2008). Currently only implemented for models without a random intercept.
ztest: optional vector indicating on which variables in Z to conduct variable selection (the remaining variables will be forced into the model).
rmethod: for those predictors being forced into the h function, the method for sampling the r[m] values. Takes the value 'varying' to allow a separate r[m] for each predictor; 'equal' to force the same r[m] for each predictor; or 'fixed' to fix the r[m] to their starting values.
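As a hedged illustration of the rmethod options, the sketch below fits a model without variable selection, so that all predictors are forced into h, and constrains them to share a single r[m]. The simulated data and the short run length are assumptions made to keep the example quick.

```r
library(bkmr)

set.seed(111)
dat <- SimData(n = 50, M = 4)

# varsel = FALSE forces every predictor into h; 'equal' ties their r[m] together
fit_equal <- kmbayes(y = dat$y, Z = dat$Z, X = dat$X, iter = 100,
                     verbose = FALSE, varsel = FALSE, rmethod = "equal")
```

Switching rmethod to "varying" (the default in the usage above) would let each predictor have its own r[m].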
est.h: TRUE or FALSE; indicator for whether to sample from the posterior distribution of the subject-specific effects h_i within the main sampler. This will slow down the model fitting.
an object of class "bkmrfit" (containing the posterior samples from the model fit), which has the associated methods:
print (i.e., print.bkmrfit)
summary (i.e., summary.bkmrfit)
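The two methods listed above can be exercised on a fitted object as in this short sketch, which assumes a quick fit on simulated data purely for illustration.

```r
library(bkmr)

set.seed(111)
dat <- SimData(n = 50, M = 4)
fitkm <- kmbayes(y = dat$y, Z = dat$Z, X = dat$X, iter = 100,
                 verbose = FALSE, varsel = TRUE)

class(fitkm)    # the fitted object carries class "bkmrfit"
print(fitkm)    # dispatches to print.bkmrfit
summary(fitkm)  # dispatches to summary.bkmrfit
```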
Bobb JF, Valeri L, Claus Henn B, Christiani DC, Wright RO, Mazumdar M, Godleski JJ, Coull BA (2015). Bayesian kernel machine regression for estimating the health effects of multi-pollutant mixtures. Biostatistics, 16(3), 493-508.
Banerjee S, Gelfand AE, Finley AO, Sang H (2008). Gaussian predictive process models for large spatial data sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(4), 825-848.
For guided examples, go to https://jenfb.github.io/bkmr/overview.html
# NOT RUN {
library(bkmr)
## First generate a dataset
set.seed(111)
dat <- SimData(n = 50, M = 4)
y <- dat$y
Z <- dat$Z
X <- dat$X
## Fit model with component-wise variable selection
## Using only 100 iterations to make example run quickly
## For inference, a much larger number of iterations is typically needed
set.seed(111)
fitkm <- kmbayes(y = y, Z = Z, X = X, iter = 100, verbose = FALSE, varsel = TRUE)
# }