The pool() function pools estimates from n repeated data analyses. The typical sequence of steps for a matching procedure on imputed datasets is:
1. Impute the missing values with the mice() function (from the mice package) or the amelia() function (from the Amelia package), resulting in a multiply imputed dataset (an object of the mids or amelia class);
2. Match each imputed dataset using a matching model with the matchthem() function, resulting in an object of the mimids class;
3. Check the extent of balance of covariates across the matched datasets;
4. Fit the statistical model of interest on each matched dataset with the with() function, resulting in an object of the mira class; and
5. Pool the estimates from each model into a single set of estimates and standard errors, resulting in an object of the mipo class.
pool(object, dfcom = NULL)
object: This argument specifies an object of the mira class (produced by a previous call to the with() function).
dfcom: This argument specifies a positive number representing the degrees of freedom in the data analysis. The default is NULL, which means that this information is extracted from the fitted model with the lowest number of observations or from the first fitted model. When that fails, the warning "The function cannot extract the dfcom from the datasets, hence, large sample is assumed." is printed and the parameter is set to 999999.
This function returns an object of the mipo class (multiple imputation pooled outcome).
The pool() function averages the estimates of the model and computes the total variance over the repeated analyses by Rubin's rules.
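For reference, Rubin's rules for a single scalar parameter can be written as below. The notation is an assumption made here for illustration and is not taken from the package documentation: $m$ is the number of repeated analyses, $\hat{Q}_i$ is the estimate from the $i$-th analysis, and $U_i$ is its estimated variance (the squared standard error).

```latex
\begin{aligned}
\bar{Q} &= \frac{1}{m} \sum_{i=1}^{m} \hat{Q}_i
  && \text{(pooled estimate)} \\
\bar{U} &= \frac{1}{m} \sum_{i=1}^{m} U_i
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m - 1} \sum_{i=1}^{m} \left( \hat{Q}_i - \bar{Q} \right)^2
  && \text{(between-imputation variance)} \\
T &= \bar{U} + \left( 1 + \frac{1}{m} \right) B
  && \text{(total variance)}
\end{aligned}
```

The pooled standard error is then $\sqrt{T}$. The factor $1 + 1/m$ inflates the between-imputation component to account for using a finite number of imputations.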
Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. https://www.jstatsoft.org/v45/i03/
# NOT RUN {
# Loading the packages (mice provides mice(); MatchThem provides the rest)
library(mice)
library(MatchThem)

# Loading the dataset
data(osteoarthritis)

# Multiply imputing the missing values
imputed.datasets <- mice(osteoarthritis, m = 5, maxit = 10,
                         method = c("", "", "mean", "polyreg",
                                    "logreg", "logreg", "logreg"))

# Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK, imputed.datasets,
                              approach = 'within', method = 'nearest')

# Analyzing the matched datasets
models <- with(data = matched.datasets,
               exp = glm(KOA ~ OSP, family = binomial))

# Pooling the results obtained from analyzing the datasets
results <- pool(models)
# }