The JointAI package performs simultaneous imputation and inference for incomplete or complete data under the Bayesian framework. Models for incomplete covariates, conditional on other covariates, are specified automatically and modelled jointly with the analysis model. MCMC sampling is performed in 'JAGS' via the R package rjags.
JointAI provides the following main functions that facilitate analysis with different models:
lm_imp: for linear regression
glm_imp: for generalized linear regression
betareg_imp: for regression using a beta distribution
lognorm_imp: for regression using a log-normal distribution
clm_imp: for (ordinal) cumulative logit models
mlogit_imp: for multinomial models
betamm_imp: for mixed models using a beta distribution
lognormmm_imp: for mixed models using a log-normal distribution
clmm_imp: for (ordinal) cumulative logit mixed models
survreg_imp: for parametric (Weibull) survival models
coxph_imp: for (Cox) proportional hazards models
JM_imp: for joint models of longitudinal and survival data
As far as possible, the specification of these functions is analogous to the specification of widely used functions for the analysis of complete data, such as lm, glm, lme (from the package nlme), survreg (from the package survival) and coxph (from the package survival).
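To illustrate the analogy with complete-data functions, the following is a minimal sketch that fits the same formula once with lm() and once with lm_imp(). It assumes that JointAI and JAGS are installed and uses the NHANES example data shipped with the package; the chosen covariates, the number of iterations and the seed are arbitrary illustration values, not recommendations.

```r
library(JointAI)

# Complete-case analysis with lm(): rows containing missing values are dropped.
fit_cc <- lm(SBP ~ gender + age + WC + alc, data = NHANES)

# Same formula and data arguments with lm_imp(): missing values are imputed
# as part of the joint model. n.iter sets the number of MCMC iterations per
# chain (kept small here so that the sketch runs quickly).
fit_imp <- lm_imp(SBP ~ gender + age + WC + alc, data = NHANES,
                  n.iter = 200, seed = 2020, progress.bar = "none")

coef(fit_cc)   # least-squares estimates from the complete cases
coef(fit_imp)  # posterior means from the Bayesian model
```

The two sets of coefficients will generally differ, because lm() silently discards incomplete rows while lm_imp() uses all observations.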
Computations can be performed in parallel to reduce computational time, using the packages future (and doFuture). The argument shrinkage allows the user to impose a penalty on the regression coefficients of some or all models involved, and hyper-parameters can be changed via the argument hyperpars.
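A sketch of how these three options might be combined is given below. It assumes that the future package is installed, that two local worker processes are available, and that ridge shrinkage is wanted for all models; the exact names of the elements in the hyper-parameter list may differ between package versions, so the list is only inspected here and passed on unchanged.

```r
library(JointAI)

# Register two parallel workers; JointAI can then sample the chains in parallel.
future::plan(future::multisession, workers = 2)

# Default hyper-parameters as a list; edit individual elements of this list
# before passing it on if different prior settings are needed.
hyp <- default_hyperpars()
str(hyp, max.level = 1)

# Ridge penalty on the regression coefficients of all models involved.
fit_ridge <- lm_imp(SBP ~ gender + age + WC + alc, data = NHANES,
                    n.chains = 2, n.iter = 200, seed = 2020,
                    shrinkage = "ridge", hyperpars = hyp)

# Return to sequential processing afterwards.
future::plan(future::sequential)
```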
To obtain summaries of the results, the functions summary(), coef() and confint() are available, and results can be visualized with the help of traceplot() or densplot().
The function predict() allows prediction (including credible intervals) from JointAI models.
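The following sketch shows how these functions might be applied to a fitted model. It assumes the NHANES example data from the package and a short MCMC run chosen only for speed. The call to predDF() with a formula in the argument vars follows the interface of recent package versions and should be checked against the installed version.

```r
library(JointAI)

fit <- lm_imp(SBP ~ gender + age + WC + alc, data = NHANES,
              n.iter = 200, seed = 2020, progress.bar = "none")

# Numerical summaries of the posterior sample
summary(fit)
coef(fit)
confint(fit)

# Visual checks: one panel per monitored parameter
traceplot(fit)
densplot(fit)

# Prediction: build a data frame in which age varies and the other
# covariates are held at reference values, then predict from it.
new_data <- predDF(fit, vars = ~ age)
pred <- predict(fit, newdata = new_data)
str(pred, max.level = 1)
```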
Two criteria for evaluation of convergence and precision of the posterior estimate are available:
GR_crit: implements the Gelman-Rubin criterion ('potential scale reduction factor') for convergence
MC_error: calculates the Monte Carlo error to evaluate the precision of the MCMC sample
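As a minimal sketch, both criteria can be computed directly from a fitted model. The example assumes the NHANES data from the package and uses the default of several chains, since the Gelman-Rubin criterion compares chains with each other; the iteration count is an arbitrary illustration value.

```r
library(JointAI)

fit <- lm_imp(SBP ~ gender + age + WC + alc, data = NHANES,
              n.iter = 500, seed = 2020, progress.bar = "none")

# Gelman-Rubin criterion: values close to 1 suggest that the chains
# have converged to the same distribution.
gr <- GR_crit(fit)
gr

# Monte Carlo error: a small error relative to the posterior standard
# deviation suggests that the MCMC sample is large enough.
mce <- MC_error(fit)
mce
```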
Imputed data can be extracted (and exported to SPSS) using get_MIdat().
The function plot_imp_distr() allows visual comparison of the distribution of observed and imputed values.
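A sketch of this workflow is shown below. It assumes that the imputed values were monitored when the model was fitted (here via monitor_params = c(imps = TRUE)), that five imputed datasets are wanted, and that ggplot2 is available for plotting; these choices are illustrative.

```r
library(JointAI)

# Imputed values can only be extracted if they were monitored during sampling.
fit <- lm_imp(SBP ~ gender + age + WC + alc, data = NHANES,
              n.iter = 200, seed = 2020, progress.bar = "none",
              monitor_params = c(imps = TRUE))

# Extract m = 5 completed datasets in long format.
imp_data <- get_MIdat(fit, m = 5, seed = 2020)
head(imp_data)

# Compare the distributions of observed and imputed values.
plot_imp_distr(imp_data, nrow = 1)
```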
The functions parameters and list_models give insight into the specified model.
The functions plot_all and md_pattern visualize the distribution of the data and the missing data pattern.
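These helper functions might be used as in the following sketch, which again assumes the NHANES example data. The missing data pattern is returned as a matrix here (pattern = TRUE, plot = FALSE) so that the example does not depend on a plotting package; by default md_pattern draws a plot instead.

```r
library(JointAI)

# Explore the data before modelling
plot_all(NHANES)                                        # histograms and bar plots
pattern <- md_pattern(NHANES, pattern = TRUE, plot = FALSE)
pattern

# Inspect the model structure that JointAI has set up
fit <- lm_imp(SBP ~ gender + age + WC + alc, data = NHANES,
              n.iter = 100, seed = 2020, progress.bar = "none")
parameters(fit)    # parameters (nodes) that are monitored
list_models(fit)   # analysis model and the models for incomplete covariates
```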
The following vignettes are available:
Minimal Example:
A minimal example demonstrating the use of lm_imp, summary.JointAI, traceplot and densplot.
Visualizing Incomplete Data:
Demonstrations of the options in plot_all (plotting histograms and bar plots for all variables in the data) and md_pattern (plotting or printing the missing data pattern).
Model Specification:
Explanation and demonstration of all parameters that are required or optional to specify the model structure in lm_imp, glm_imp and lme_imp. Among others, the functions parameters, list_models and set_refcat are used.
Parameter Selection:
Examples of how to select the parameters/variables/nodes to follow using the argument monitor_params, and the parameters/variables/nodes displayed in the summary, traceplot, densplot or when using GR_crit or MC_error.
MCMC Settings:
Examples demonstrating how to set the arguments that control the MCMC sampling, i.e., n.adapt, n.iter, n.chains, thin and inits.
After Fitting:
Examples on the use of functions to be applied after the model has been fitted, including traceplot, densplot, summary, GR_crit, MC_error, predict, predDF and get_MIdat.
Theoretical Background:
Explanation of the statistical method implemented in JointAI.
Nicole S. Erler, Dimitris Rizopoulos and Emmanuel M.E.H. Lesaffre (2019). JointAI: Joint Analysis and Imputation of Incomplete Data in R. arXiv e-prints, arXiv:1907.10867. URL https://arxiv.org/abs/1907.10867.
Erler, N.S., Rizopoulos, D., Rosmalen, J., Jaddoe, V.W.V., Franco, O.H., & Lesaffre, E.M.E.H. (2016). Dealing with missing covariates in epidemiologic studies: A comparison between multiple imputation and a full Bayesian approach. Statistics in Medicine, 35(17), 2955-2974. doi:10.1002/sim.6944
Erler, N.S., Rizopoulos, D., Jaddoe, V.W.V., Franco, O.H., & Lesaffre, E.M.E.H. (2019). Bayesian imputation of time-varying covariates in linear mixed models. Statistical Methods in Medical Research, 28(2), 555-568. doi:10.1177/0962280217730851