mice

Multivariate Imputation by Chained Equations

The mice package implements a method to deal with missing data. The package creates multiple imputations (replacement values) for multivariate missing data. The method is based on Fully Conditional Specification, where each incomplete variable is imputed by a separate model. The MICE algorithm can impute mixes of continuous, binary, unordered categorical and ordered categorical data. In addition, MICE can impute continuous two-level data, and maintain consistency between imputations by means of passive imputation. Many diagnostic plots are implemented to inspect the quality of the imputations.
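As a small illustration of "each incomplete variable is imputed by a separate model", the sketch below asks mice which default univariate method it would assign to each column of the bundled nhanes2 data (a mix of factors and numeric variables). It assumes mice is already installed; the object name meth is only illustrative.

```r
library(mice, warn.conflicts = FALSE)

# nhanes2 ships with mice: age and hyp are factors, bmi and chl are numeric
str(nhanes2)

# Default imputation method per variable; an empty string marks a
# variable that has no missing values and therefore needs no model
meth <- make.method(nhanes2)
print(meth)
```

The numeric columns receive predictive mean matching and the binary factor receives logistic regression, which is how a single call can handle mixed data types.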

Installation

The mice package can be installed from CRAN as follows:

install.packages("mice")

The latest version can be installed from GitHub as follows:

install.packages("devtools")
devtools::install_github(repo = "amices/mice")

Minimal example

library(mice, warn.conflicts = FALSE)

# show the missing data pattern
md.pattern(nhanes)
#>    age hyp bmi chl   
#> 13   1   1   1   1  0
#> 3    1   1   1   0  1
#> 1    1   1   0   1  1
#> 1    1   0   0   1  2
#> 7    1   0   0   0  3
#>      0   8   9  10 27

The table and the graph summarize where the missing data occur in the nhanes dataset.
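The counts in the pattern table can be cross-checked with a few lines of base R plus the ncc() and nic() helpers from mice. This is only a sketch for reading the table; the variable name n_missing is illustrative.

```r
library(mice, warn.conflicts = FALSE)

# Missing values per variable (compare with the bottom row of md.pattern)
n_missing <- colSums(is.na(nhanes))
print(n_missing)

# Number of complete and incomplete rows
print(c(complete = ncc(nhanes), incomplete = nic(nhanes)))
```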

# multiply impute the missing values
imp <- mice(nhanes, maxit = 2, m = 2, seed = 1)
#> 
#>  iter imp variable
#>   1   1  bmi  hyp  chl
#>   1   2  bmi  hyp  chl
#>   2   1  bmi  hyp  chl
#>   2   2  bmi  hyp  chl

# inspect quality of imputations
stripplot(imp, chl, pch = 19, xlab = "Imputation number")

In general, we would like the imputations to be plausible, i.e., values that could have been observed if they had not been missing.
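One rough numerical check of plausibility, alongside the plot, is to compare the imputed values with the observed range. The sketch below repeats the imputation call from above with printFlag = FALSE and assumes the default method for numeric columns (predictive mean matching), which draws imputations from observed donor values. The object names are illustrative.

```r
library(mice, warn.conflicts = FALSE)

imp <- mice(nhanes, maxit = 2, m = 2, seed = 1, printFlag = FALSE)

# Imputed chl values: one row per missing cell, one column per imputation
chl_imputed <- imp$imp$chl
print(chl_imputed)

# Check that every imputed value lies within the observed range of chl
observed_range <- range(nhanes$chl, na.rm = TRUE)
all_in_range <- all(unlist(chl_imputed) >= observed_range[1] &
                    unlist(chl_imputed) <= observed_range[2])
print(all_in_range)

# First completed dataset: observed values plus imputation number 1
completed <- complete(imp, 1)
print(anyNA(completed))
```

A range check is a weak test on its own; it flags impossible values but says nothing about whether the distribution of the imputations is reasonable, which is what the diagnostic plots are for.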

# fit complete-data model
fit <- with(imp, lm(chl ~ age + bmi))

# pool and summarize the results
summary(pool(fit))
#>          term estimate std.error statistic    df p.value
#> 1 (Intercept)     9.08     73.09     0.124  4.50  0.9065
#> 2         age    35.23     17.46     2.017  1.36  0.2377
#> 3         bmi     4.69      1.94     2.417 15.25  0.0286

The complete-data model is fit to each imputed dataset, and the results are combined to arrive at estimates that properly account for the missing data.
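To make the combining step concrete, the sketch below recomputes the pooled estimates and standard errors by hand with Rubin's rules and compares them with the output of pool(). It assumes the analyses are stored in fit$analyses and that the pooled table exposes the columns estimate and t (total variance), as in recent mice versions. The helper names est, se2, and manual are illustrative.

```r
library(mice, warn.conflicts = FALSE)

imp <- mice(nhanes, maxit = 2, m = 2, seed = 1, printFlag = FALSE)
fit <- with(imp, lm(chl ~ age + bmi))
m <- imp$m

# Per-imputation coefficients and squared standard errors (terms x m)
est <- sapply(fit$analyses, coef)
se2 <- sapply(fit$analyses, function(a) diag(vcov(a)))

qbar  <- rowMeans(est)             # pooled estimate: mean over imputations
ubar  <- rowMeans(se2)             # within-imputation variance
b     <- apply(est, 1, var)        # between-imputation variance
total <- ubar + (1 + 1 / m) * b    # total variance

manual <- data.frame(estimate = qbar, std.error = sqrt(total))
print(manual)

# Compare with the table that pool() computes
pooled <- pool(fit)$pooled
print(pooled[, c("term", "estimate", "t")])
```

The between-imputation term b is what carries the extra uncertainty caused by the missing data; analysing a single completed dataset would ignore it.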

mice 3.0

Version 3.0 represents a major update that implements the following features:

  1. blocks: The main algorithm iterates over blocks. A block is simply a collection of variables. In the common MICE algorithm each block was equivalent to one variable, which is still the default. The blocks argument allows mixing univariate imputation methods with multivariate imputation methods. The blocks feature bridges two seemingly disparate approaches, joint modeling and fully conditional specification, into one framework;

  2. where: The where argument is a logical matrix of the same size as data that specifies which cells should be imputed. This opens up some new analytic possibilities;

  3. Multivariate tests: There are new functions D1(), D2(), D3() and anova() that perform multivariate parameter tests on the repeated analyses of multiply-imputed data;

  4. formulas: The old form argument has been redesigned and is now renamed to formulas. This provides an alternative way to specify imputation models that exploits the full power of R's native formulas;

  5. Better integration with the tidyverse framework, especially for packages dplyr, tibble and broom;

  6. Improved numerical algorithms for low-level imputation functions. Better handling of duplicate variables;

  7. Last but not least: A brand new edition AND online version of Flexible Imputation of Missing Data. Second Edition.
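The blocks, where, formulas, and multivariate test features listed above can be tried on the bundled nhanes data. The following is a minimal sketch, not a recommended analysis: the particular block grouping, the "all" setting for where, the imputation formulas, and the nested models are all arbitrary choices made for illustration, and the object names are hypothetical.

```r
library(mice, warn.conflicts = FALSE)

# blocks: put bmi and chl in one block, hyp in its own block
blocks <- make.blocks(list(c("bmi", "chl"), "hyp"))
imp_blocks <- mice(nhanes, blocks = blocks, m = 2, maxit = 2,
                   seed = 1, printFlag = FALSE)

# where: logical matrix of cells to impute; "all" marks every cell,
# so observed cells are imputed as well
where_all <- make.where(nhanes, "all")
imp_where <- mice(nhanes, where = where_all, m = 2, maxit = 2,
                  seed = 1, printFlag = FALSE)
print(dim(imp_where$imp$bmi))   # rows = cells imputed, columns = m

# formulas: one imputation model per incomplete variable
forms <- list(bmi = bmi ~ age + hyp + chl,
              hyp = hyp ~ age + bmi + chl,
              chl = chl ~ age + bmi + hyp)
imp_forms <- mice(nhanes, formulas = forms, m = 5, maxit = 2,
                  seed = 1, printFlag = FALSE)

# Multivariate test: does bmi improve a model that already contains age?
fit1 <- with(imp_forms, lm(chl ~ age + bmi))
fit0 <- with(imp_forms, lm(chl ~ age))
test_d1 <- D1(fit1, fit0)
print(test_d1)
```

When a block with several variables is given a univariate method, as here, mice applies that method to each variable in the block in turn; a truly joint model requires a multivariate method such as those listed in the function index.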

See MICE: Multivariate Imputation by Chained Equations for more resources.

I’ll be happy to take feedback and discuss suggestions. Please submit these through GitHub’s issues facility.

Resources

Books

  1. Van Buuren, S. (2018). Flexible Imputation of Missing Data. Second Edition. Chapman & Hall/CRC. Boca Raton, FL.

Course materials

  1. Handling Missing Data in R with mice
  2. Statistical Methods for combined data sets

Vignettes

  1. Ad hoc methods and the MICE algorithm
  2. Convergence and pooling
  3. Inspecting how the observed data and missingness are related
  4. Passive imputation and post-processing
  5. Imputing multilevel data
  6. Sensitivity analysis with mice
  7. Generate missing values with ampute
  8. futuremice: Wrapper for parallel MICE imputation through futures

Code from publications

  1. Flexible Imputation of Missing Data. Second edition.

Acknowledgement

The cute mice sticker was designed by Jaden M. Walters. Thanks Jaden!

Code of Conduct

Please note that the mice project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Version: 3.19.0

License: GPL (>= 2)

Maintainer: Stef van Buuren

Last Published: December 10th, 2025

Monthly Downloads: 107,017

Functions in mice (3.19.0)

appendbreak

Appends specified break to the data
bwplot.mads

Box-and-whisker plot of amputed and non-amputed data
brandsma

Brandsma school data used by Snijders and Bosker (2012)
boys

Growth of Dutch boys
complete.mids

Extracts the completed data from a mids object
estimice

Computes least squares parameters
employee

Employee selection data
densityplot.mids

Density plot of observed and imputed data
convergence

Computes convergence diagnostics for a mids object
cci

Complete case indicator
cc

Select complete cases
construct.blocks

Construct blocks from formulas and predictorMatrix
extend.formula

Extends a formula with predictors
bwplot.mids

Box-and-whisker plot of observed and imputed data
extend.formulas

Extends formulas with predictor matrix settings
fdgs

Fifth Dutch growth study 2009
fico

Fraction of incomplete cases among cases with observed
flux

Influx and outflux of multivariate missing data patterns
cbind

Combine R objects by rows and columns
getqbar

Extract estimate from mipo object
ic

Select incomplete cases
ici

Incomplete case indicator
glm.mids

Generalized linear model for mids object
fluxplot

Fluxplot of the missing data pattern
ibind

Enlarge number of imputations by combining mids objects
make.blocks

Creates a blocks argument
make.blots

Creates a blots argument
mads

Multivariate amputed data set (mads)
is.mads

Check for mads object
leiden85

Leiden 85+ study
extractBS

Extract broken stick estimates from a lmer object
make.calltype

Create calltype of the imputation model
md.pairs

Missing data pattern by variable pairs
lm.mids

Linear regression for mids object
mcar

Jamshidian and Jalal's Non-Parametric MCAR Test
fdd

SE Fireworks disaster data
ifdo

Conditional imputation helper
make.formulas

Creates a formulas argument
futuremice

Wrapper function that runs MICE in parallel
glance.mipo

Glance method to extract information from a mipo object
make.visitSequence

Creates a visitSequence argument
make.where

Creates a where argument
make.method

Creates a method argument
is.mipo

Check for mipo object
matchindex

Find index of matched donor units
is.mids

Check for mids object
mammalsleep

Mammal sleep data
getfit

Extract list of fitted models
mice.impute.2l.pan

Imputation by a two-level normal model using pan
md.pattern

Missing data pattern
mice.impute.2lonly.mean

Imputation of most likely value within the class
mice.impute.2l.bin

Imputation by a two-level logistic model using glmer
mdc

Graphical parameter for missing data plots
mice

mice: Multivariate Imputation by Chained Equations
mice.impute.2l.lmer

Imputation by a two-level normal model using lmer
mice.impute.2l.norm

Imputation by a two-level normal model
mice.impute.2lonly.norm

Imputation at level 2 by Bayesian linear regression
mice.impute.lasso.select.norm

Imputation by indirect use of lasso linear regression
mice.impute.lda

Imputation by linear discriminant analysis
mice.impute.logreg

Imputation by logistic regression
is.mira

Check for mira object
fix.coef

Fix coefficients and update model
filter.mids

Subset rows of a mids object
mice.impute.2lonly.pmm

Imputation at level 2 by predictive mean matching
is.mitml.result

Check for mitml.result object
mice.impute.lasso.norm

Imputation by direct use of lasso linear regression
mice.impute.lasso.logreg

Imputation by direct use of lasso logistic regression
mice.impute.lasso.select.logreg

Imputation by indirect use of lasso logistic regression
mice.impute.jomoImpute

Multivariate multilevel imputation using jomo
mice.impute.cart

Imputation by classification and regression trees
mice.impute.passive

Passive imputation
mice.impute.norm.predict

Imputation by linear regression through prediction
mice.impute.pmm

Imputation by predictive mean matching
mice.impute.panImpute

Impute multilevel missing data using pan
mice.impute.norm

Imputation by Bayesian linear regression
make.predictorMatrix

Creates a predictorMatrix argument
make.post

Creates a post argument
mice.impute.mpmm

Imputation by multivariate predictive mean matching
mice.impute.norm.boot

Imputation by linear regression, bootstrap method
mice.impute.norm.nob

Imputation by linear regression without parameter uncertainty
mice.impute.mean

Imputation by the mean
mice.impute.logreg.boot

Imputation by logistic regression using the bootstrap
mice.theme

Set the theme for the plotting Trellis functions
mice.impute.quadratic

Imputation of quadratic terms
mice.impute.rf

Imputation by random forests
mice.mids

Multivariate Imputation by Chained Equations (Iteration Step)
mice.impute.midastouch

Imputation by predictive mean matching with distance aided donor selection
mice.impute.mnar.logreg

Imputation under MNAR mechanism by NARFCS
mids2mplus

Export mids object to Mplus
ncc

Number of complete cases
mice.impute.polr

Imputation of ordered data by polytomous regression
mids

Multiply imputed data set (mids)
nelsonaalen

Cumulative hazard rate or Nelson-Aalen estimator
mnar_demo_data

MNAR demo data
mice.impute.polyreg

Imputation of unordered data by polytomous regression
mids2spss

Export mids object to SPSS
mira

Create an object of class "mira"
mipo

mipo: Multiple imputation pooled object
pool.scalar

Multiple imputation pooling: univariate version
pool.r.squared

Pools R^2 of m models fitted to multiply-imputed data
nimp

Number of imputations per block
nic

Number of incomplete cases
mice.impute.ri

Imputation by the random indicator method for nonignorable data
name.blocks

Name imputation blocks
name.formulas

Name formula list elements
pool

Combine estimates by pooling rules
pool.table

Combines estimates from a tidy table
.pmm.match

Finds an imputed value from matches in the predictive metric (deprecated)
mice.impute.sample

Imputation by simple random sampling
pattern

Datasets with various missing data patterns
nhanes2

NHANES example - mixed numerical and discrete variables
popmis

Hox pupil popularity data with missing popularity scores
nhanes

NHANES example - all variables numerical
pool.compare

Compare two nested models fitted to imputed data
potthoffroy

Potthoff-Roy data
pops

Project on preterm and small for gestational age infants (POPS)
selfreport

Self-reported and measured BMI
parlmice

Wrapper function that runs MICE in parallel
norm.draw

Draws values of beta and sigma by Bayesian linear regression
walking

Walking disability data
print.mira

Print a mira object
predict_mi

Predict method for linear models with multiply imputed data
toenail2

Toenail data
windspeed

Subset of Irish wind speed data
quickpred

Quick selection of predictors from the data
reexports

Objects exported from other packages
supports.transparent

Supports semi-transparent foreground colors?
tbc

Terneuzen birth cohort
squeeze

Squeeze the imputed values to be within specified boundaries.
xyplot.mids

Scatterplot of observed and imputed data
toenail

Toenail data
tidy.mipo

Tidy method to extract results from a mipo object
summary.mira

Summary of a mira object
xyplot.mads

Scatterplot of amputed and non-amputed data against weighted sum scores
version

Echoes the package version number
with.mids

Evaluate an expression in multiple imputed datasets
stripplot.mids

Stripplot of observed and imputed data
D1

Compare two nested models using D1-statistic
ampute.default.weights

Default weights in ampute
ampute.continuous

Multivariate amputation based on continuous probability functions
D2

Compare two nested models using D2-statistic
ampute

Generate missing data for simulation purposes
D3

Compare two nested models using D3-statistic
as.mitml.result

Converts into a mitml.result object
as.mira

Create a mira object from repeated analyses
as.mids

Converts an imputed dataset (long format) into a mids object
ampute.default.odds

Default odds in ampute()
ampute.default.freq

Default freq in ampute
ampute.discrete

Multivariate amputation based on discrete probability functions
ampute.default.patterns

Default patterns in ampute
ampute.default.type

Default type in ampute()
ampute.mcar

Multivariate amputation under a MCAR mechanism
anova.mira

Compare several nested models