pool: Combines Estimates by Rubin’s Rules

Description

The pool() function combines the estimates from n repeated complete-data analyses. The typical sequence of steps for matching on multiply imputed datasets is:

1. Impute the missing values with the mice() function (from the mice package) or the amelia() function (from the Amelia package), resulting in a multiply imputed dataset (an object of the mids or amelia class);
2. Match each imputed dataset using a matching model with the matchthem() function, resulting in an object of the mimids class;
3. Fit the statistical model of interest on each matched dataset with the with() function, resulting in an object of the mira class;
4. Pool the estimates from each model into a single set of estimates and standard errors with the pool() function, resulting in an object of the mipo class.

Usage

pool(object, dfcom = NULL)

Arguments

object

This argument specifies an object of the mira class (produced by a previous call to the with() function) or a list of model fits.

dfcom

This argument specifies a positive number representing the degrees of freedom in the complete-data analysis. The default is NULL, in which case the value is extracted from the first fitted model or from the fitted model with the fewest observations. If that fails, the warning "Large sample assumed" is printed and the parameter is set to 999999.
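
To illustrate why dfcom matters, the sketch below computes small-sample adjusted degrees of freedom with the Barnard-Rubin formula. It assumes pool() follows the same adjustment that mice::pool() documents; the helper name barnard_rubin_df and the input values are hypothetical and are not part of the package.

```r
# Hypothetical helper: Barnard-Rubin adjusted degrees of freedom.
# m     = number of imputed datasets
# b     = between-imputation variance of the estimate
# total = total variance of the pooled estimate
# dfcom = complete-data degrees of freedom (999999 mimics the large-sample fallback)
barnard_rubin_df <- function(m, b, total, dfcom = 999999) {
  lambda <- (1 + 1 / m) * b / total   # share of variance attributable to missingness
  df_old <- (m - 1) / lambda^2        # large-sample degrees of freedom
  df_obs <- (dfcom + 1) / (dfcom + 3) * dfcom * (1 - lambda)  # observed-data df
  df_old * df_obs / (df_old + df_obs) # combined, adjusted degrees of freedom
}

# Illustrative inputs (made-up numbers, not package output)
df_small <- barnard_rubin_df(m = 5, b = 0.00092, total = 0.041104, dfcom = 50)
df_large <- barnard_rubin_df(m = 5, b = 0.00092, total = 0.041104)
print(c(dfcom_50 = df_small, dfcom_default = df_large))
```

Under this formula the adjusted degrees of freedom never exceed dfcom, so supplying a realistic value for small samples should give wider, more conservative confidence intervals than the large-sample fallback.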

Value

This function returns an object of the mipo class (multiple imputation pooled outcome).

Details

The pool() function averages the estimates of the complete-data model and computes the total variance over the repeated analyses by Rubin’s rules.
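
As a minimal sketch of the arithmetic behind Rubin’s rules for a single coefficient, the base R code below pools five estimates and their variances by hand. The input numbers and variable names are made up for illustration. This is not the package’s implementation, which also handles multiple coefficients and degrees of freedom.

```r
# Illustrative per-imputation results (made-up numbers, not package output)
estimates <- c(0.52, 0.47, 0.55, 0.49, 0.51)       # coefficient from each analysis
variances <- c(0.040, 0.038, 0.042, 0.039, 0.041)  # squared standard errors

m     <- length(estimates)        # number of imputed datasets
qbar  <- mean(estimates)          # pooled point estimate
ubar  <- mean(variances)          # within-imputation variance
b     <- var(estimates)           # between-imputation variance
total <- ubar + (1 + 1 / m) * b   # total variance by Rubin's rules

pooled_se <- sqrt(total)          # pooled standard error
print(c(estimate = qbar, std.error = pooled_se))
```

The factor (1 + 1 / m) inflates the between-imputation variance to account for using a finite number of imputations.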

References

Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. https://www.jstatsoft.org/v45/i03/

Examples


# NOT RUN {
#Loading the packages and the dataset
library(mice)
library(MatchThem)
data(osteoarthritis)

#Multiply imputing the missing values
imputed.datasets <- mice(osteoarthritis, m = 5, maxit = 10,
                         method = c("", "", "mean", "polyreg", "logreg", "logreg", "logreg"))

#Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK, imputed.datasets,
                              approach = 'within', method = 'nearest')

#Analyzing the matched datasets
models <- with(data = matched.datasets,
               expr = glm(KOA ~ OSP, family = binomial))

#Pooling the results obtained from analyzing the matched datasets
pool(models)
# }
