MICE: Multivariate Imputation by Chained Equations

The mice package implements a method to deal with missing data. The package creates multiple imputations (replacement values) for multivariate missing data. The method is based on Fully Conditional Specification, where each incomplete variable is imputed by a separate model. The MICE algorithm can impute mixes of continuous, binary, unordered categorical and ordered categorical data. In addition, MICE can impute continuous two-level data, and maintain consistency between imputations by means of passive imputation. Many diagnostic plots are implemented to inspect the quality of the imputations.
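As a minimal sketch of what "a separate model per incomplete variable" and passive imputation can look like in practice, the example below uses the boys data shipped with the package. The choice of columns, the number of imputations, and the seed are illustrative assumptions, not recommendations from the package authors.

```r
library(mice)

# Illustrative subset of the bundled boys data: bmi is derived from hgt and wgt
dat <- boys[, c("age", "hgt", "wgt", "bmi")]

# Dry run (maxit = 0) to obtain the default method for each variable
ini  <- mice(dat, maxit = 0, printFlag = FALSE)
meth <- ini$method           # one imputation method per column
pred <- ini$predictorMatrix  # rows: imputed variable, columns: predictors

# Passive imputation: recompute bmi from the imputed hgt (cm) and wgt (kg)
meth["bmi"] <- "~ I(wgt / (hgt / 100)^2)"

# Do not use bmi as a predictor for hgt and wgt, to avoid circularity
pred[c("hgt", "wgt"), "bmi"] <- 0

imp <- mice(dat, method = meth, predictorMatrix = pred,
            m = 3, seed = 2017, printFlag = FALSE)

# In a completed data set, the imputed bmi values follow the formula
cmp <- complete(imp, 1)
idx <- is.na(dat$bmi)
all.equal(cmp$bmi[idx], (cmp$wgt / (cmp$hgt / 100)^2)[idx])
```

Only the rows where bmi was originally missing are compared, because the observed bmi values in the data are rounded.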

Installation

The mice package can be installed from CRAN as follows:

install.packages("mice")

The latest version can be installed from GitHub as follows:

install.packages("devtools")
devtools::install_github(repo = "stefvanbuuren/mice")

Overview

The mice package contains functions to

  • Inspect the missing data pattern
  • Impute the missing data m times, resulting in m completed data sets
  • Diagnose the quality of the imputed values
  • Analyze each completed data set
  • Pool the results of the repeated analyses
  • Store and export the imputed data in various formats
  • Generate simulated incomplete data
  • Incorporate custom imputation methods
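The first steps in this list can be tried on the small nhanes data set that ships with the package. This is a sketch for orientation only; the seed is an arbitrary assumption, and the plots shown are just two of the available diagnostics.

```r
library(mice)

# Inspect the missing data pattern (rows are patterns, columns are variables)
md.pattern(nhanes)

# Pairwise counts: rr = both observed, rm/mr = one missing, mm = both missing
pr <- md.pairs(nhanes)
pr$rr

# Number of complete and incomplete cases
n_complete   <- ncc(nhanes)
n_incomplete <- nic(nhanes)

# Impute with the defaults, then look at two diagnostic plots
imp <- mice(nhanes, seed = 1, printFlag = FALSE)
plot(imp)       # trace lines of the algorithm across iterations
stripplot(imp)  # observed versus imputed values per imputation
```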

Main functions

The main functions in the mice package are:

Function name   Description
mice()          Impute the missing data m times
with()          Analyze completed data sets
pool()          Combine parameter estimates
complete()      Export imputed data
ampute()        Generate missing data
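A minimal sketch of how the first four functions fit together, using the bundled nhanes data. The regression model, the number of imputations, and the seed are assumptions chosen only for illustration.

```r
library(mice)

# 1. Impute: create m = 5 completed versions of the incomplete data
imp <- mice(nhanes, m = 5, seed = 1, printFlag = FALSE)

# 2. Analyze: fit the same model to each completed data set
fit <- with(imp, lm(bmi ~ age + chl))

# 3. Pool: combine the five sets of estimates into one
pooled <- pool(fit)
summary(pooled)

# 4. Export: extract the first completed data set, or all of them stacked
first <- complete(imp, 1)
long  <- complete(imp, "long")
```

Here imp is a mids object, fit is a mira object, and pooled is a mipo object, matching the class names listed in the function index.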

Further reading

The mice software was published in the Journal of Statistical Software (Buuren and Groothuis-Oudshoorn 2011). See https://www.jstatsoft.org/article/view/v045i03. The first application of the method concerned missing blood pressure data (Buuren, Boshuizen, and Knook 1999). The term Fully Conditional Specification was introduced in 2006 to describe a general class of methods that specify the imputation model for multivariate data as a set of conditional distributions (Buuren et al. 2006). Further details and applications can be found in the book Flexible Imputation of Missing Data (Buuren 2012).

References

Buuren, S. van. 2012. Flexible Imputation of Missing Data. Boca Raton, FL: Chapman & Hall/CRC Press.

Buuren, S. van, and K. Groothuis-Oudshoorn. 2011. “MICE: Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software 45 (3): 1–67.

Buuren, S. van, H. C. Boshuizen, and D. L. Knook. 1999. “Multiple Imputation of Missing Blood Pressure Covariates in Survival Analysis.” Statistics in Medicine 18 (6): 681–94.

Buuren, S. van, J. P. L. Brand, C. G. M. Groothuis-Oudshoorn, and D. B. Rubin. 2006. “Fully Conditional Specification in Multivariate Imputation.” Journal of Statistical Computation and Simulation 76 (12): 1049–64.

Monthly Downloads: 60,676
Version: 2.30
License: GPL-2 | GPL-3
Last Published: February 18th, 2017

Functions in mice (2.30)

  • ampute.default.patterns: Default patterns in ampute
  • ampute.discrete: Multivariate Amputation Based On Discrete Probability Functions
  • ampute.mcar: Multivariate Amputation In A MCAR Manner
  • ampute.default.odds: Default odds in ampute()
  • ampute.default.weights: Default weights in ampute
  • ampute.continuous: Multivariate Amputation Based On Continuous Probability Functions
  • ampute.default.type: Default type in ampute()
  • appendbreak: Appends specified break to the data
  • ampute.default.freq: Default freq in ampute
  • ampute: Generate Missing Data for Simulation Purposes
  • bwplot.mads: Box-and-whisker plot of amputed and non-amputed data
  • densityplot.mids: Density plot of observed and imputed data
  • as.mira: Create a mira object from repeated analyses
  • cci: Complete case indicator
  • as.mids: Converts a multiply imputed dataset (long format) into a mids object
  • cbind.mids: Columnwise combination of a mids object.
  • boys: Growth of Dutch boys
  • cc: Select complete cases
  • complete: Creates imputed data sets from a mids object
  • bwplot.mids: Box-and-whisker plot of observed and imputed data
  • ibind: Combine imputations fitted to the same data
  • extractBS: Extract broken stick estimates from a lmer object
  • getfit: Extracts fit objects from mira object
  • fdd: SE Fireworks disaster data
  • fico: Fraction of incomplete cases among cases with observed
  • ic: Select incomplete cases
  • glm.mids: Generalized linear model for mids object
  • fdgs: Fifth Dutch growth study 2009
  • fluxplot: Fluxplot of the missing data pattern
  • flux: Influx and outflux of multivariate missing data patterns
  • is.mids: Check for mids object
  • leiden85: Leiden 85+ study
  • lm.mids: Linear regression for mids object
  • ifdo: Conditional imputation helper
  • is.mipo: Check for mipo object
  • is.mira: Check for mira object
  • ici: Incomplete case indicator
  • long2mids: Conversion of an imputed data set (long form) to a mids object
  • is.mads: Check for mads object
  • mads-class: Multivariate Amputed Data Set (mads)
  • mice.impute.2l.norm: Imputation by a two-level normal model
  • mice.impute.2lonly.pmm: Imputation at level 2 by predictive mean matching
  • mice.impute.2lonly.norm: Imputation at level 2 by Bayesian linear regression
  • mice.impute.2l.pan: Imputation by a two-level normal model using pan
  • mice.impute.2lonly.mean: Imputation of the mean within the class
  • md.pattern: Missing data pattern
  • md.pairs: Missing data pattern by variable pairs
  • mammalsleep: Mammal sleep data
  • mdc: Graphical parameter for missing data plots.
  • mice.impute.cart: Imputation by classification and regression trees
  • mice.impute.mean: Imputation by the mean
  • mice.impute.norm.nob: Imputation by linear regression (non Bayesian)
  • mice.impute.norm.boot: Imputation by linear regression, bootstrap method
  • mice.impute.midastouch: Predictive Mean Matching with distance aided selection of donors
  • mice.impute.norm: Imputation by Bayesian linear regression
  • mice.impute.logreg: Imputation by logistic regression
  • mice.impute.norm.predict: Imputation by linear regression, prediction method
  • mice.impute.logreg.boot: Imputation by logistic regression using the bootstrap
  • mice.impute.lda: Imputation by linear discriminant analysis
  • mice.impute.fastpmm: Imputation by fast predictive mean matching
  • mice.impute.passive: Passive imputation
  • mice.impute.polr: Imputation by polytomous regression - ordered
  • mice.impute.sample: Imputation by simple random sampling
  • mice.impute.quadratic: Imputation of quadratic terms
  • mice.impute.rf: Imputation by random forests
  • mice.impute.ri: Imputation by the random indicator method for nonignorable data
  • mice.mids: Multivariate Imputation by Chained Equations (Iteration Step)
  • mice.impute.polyreg: Imputation by polytomous regression - unordered
  • mice.impute.pmm: Imputation by predictive mean matching
  • mice: Multivariate Imputation by Chained Equations (MICE)
  • mipo-class: Multiply imputed pooled analysis (mipo)
  • mice.theme: Set the theme for the plotting Trellis functions
  • mids-class: Multiply imputed data set (mids)
  • nhanes2: NHANES example - mixed numerical and discrete variables
  • mira-class: Multiply imputed repeated analyses (mira)
  • nelsonaalen: Cumulative hazard rate or Nelson-Aalen estimator
  • ncc: Number of complete cases
  • nhanes: NHANES example - all variables numerical
  • mids2spss: Export mids object to SPSS
  • mids2mplus: Export mids object to Mplus
  • .pmm.match: Finds an imputed value from matches in the predictive metric
  • pool.scalar: Multiple imputation pooling: univariate version
  • pattern: Datasets with various missing data patterns
  • norm.draw: Draws values of beta and sigma by Bayesian linear regression
  • nic: Number of incomplete cases
  • pool.r.squared: Pooling: R squared
  • pool.compare: Compare two nested models fitted to imputed data
  • plot.mids: Plot the trace lines of the MICE algorithm
  • pool: Multiple imputation pooling
  • popmis: Hox pupil popularity data with missing popularity scores
  • print.mads: Print a mads object
  • potthoffroy: Potthoff-Roy data
  • summary.mira: Summary of a mira object
  • print.mids: Print a mids object
  • selfreport: Self-reported and measured BMI
  • stripplot.mids: Stripplot of observed and imputed data
  • quickpred: Quick selection of predictors from the data
  • squeeze: Squeeze the imputed values to be within specified boundaries.
  • rbind.mids: Rowwise combination of a mids object.
  • pops: Project on preterm and small for gestational age infants (POPS)
  • walking: Walking disability data
  • xyplot.mads: Scatterplot of amputed and non-amputed data against weighted sum scores
  • tbc: Terneuzen birth cohort
  • version: Echoes the package version number
  • supports.transparent: Supports semi-transparent foreground colors?
  • windspeed: Subset of Irish wind speed data
  • with.mids: Evaluate an expression in multiple imputed datasets
  • xyplot.mids: Scatterplot of observed and imputed data