

missRanger

The missRanger package uses the ranger package for fast missing value imputation by chained random forests. As such, it serves as an alternative implementation of the beautiful 'MissForest' algorithm (see the vignette).

missRanger offers the option to combine random forest imputation with predictive mean matching. First, this avoids generating values that are not present in the original data (such as 0.3334 in a 0-1 coded variable). Second, it tends to raise the variance of the resulting conditional distributions to a realistic level, which is crucial when applying multiple imputation frameworks.
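To make the predictive mean matching step described above more concrete, here is a minimal base R sketch. It is not missRanger's actual implementation: `pmm_sketch` and its arguments are hypothetical names introduced only for illustration, and it assumes a numeric response. For each prediction on a missing cell, it looks up the k closest predictions among the observed cases and returns the observed value of one randomly drawn "donor".

```r
# Hypothetical helper for illustration only; not part of missRanger.
pmm_sketch <- function(pred_train, pred_missing, y_train, k = 3, seed = 1) {
  set.seed(seed)  # note: this resets the global RNG state
  vapply(pred_missing, function(p) {
    # Indices of the (at most) k training predictions closest to p
    nearest <- order(abs(pred_train - p))[seq_len(min(k, length(pred_train)))]
    # Draw one donor at random and return its observed value
    y_train[nearest[sample.int(length(nearest), 1)]]
  }, FUN.VALUE = numeric(1))
}

# A 0-1 coded variable: raw predictions are fractional, matched values are not
y_train      <- c(0, 0, 1, 1, 1)                 # observed values
pred_train   <- c(0.10, 0.25, 0.70, 0.80, 0.95)  # predictions for observed cases
pred_missing <- c(0.3334, 0.9)                   # predictions for missing cases

imputed <- pmm_sketch(pred_train, pred_missing, y_train, k = 3)
imputed
```

Because every returned value is copied from an observed case, the result can only contain values that already occur in `y_train`. Drawing among several donors, rather than always taking the single closest one, is what adds variability to the imputed values.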

Installation

From CRAN:

install.packages("missRanger")

Latest version from GitHub:

library(devtools)
install_github("mayer79/missRanger", subdir = "release/missRanger")

Examples

We first generate a data set with about 10% missing values in each column. Then those gaps are filled by missRanger. In the end, the resulting data frame is displayed.

library(missRanger)
 
# Generate data with missing values in all columns
irisWithNA <- generateNA(iris, seed = 347)
 
# Impute missing values with missRanger
irisImputed <- missRanger(irisWithNA, pmm.k = 3, num.trees = 100)
 
# Check results
head(irisImputed)
head(irisWithNA)
head(iris)

# With extra trees algorithm
irisImputed_et <- missRanger(irisWithNA, pmm.k = 3, splitrule = "extratrees", num.trees = 100)

# With `dplyr` syntax
library(dplyr)

iris %>% 
  generateNA() %>% 
  missRanger(verbose = 0) %>% 
  head()
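Since the missing values in the examples above were generated artificially, the true values are still available in `iris`, so the imputations can be checked against them. The following sketch is one possible way to do that. The choice of root mean squared error and accuracy as summaries, and the `seed = 1` setting, are assumptions made for illustration and are not part of the original examples.

```r
library(missRanger)

# Same setup as above, with fixed seeds and without progress output
irisWithNA  <- generateNA(iris, seed = 347)
irisImputed <- missRanger(irisWithNA, pmm.k = 3, num.trees = 100,
                          verbose = 0, seed = 1)

# No missing values should remain after imputation
sum(is.na(irisImputed))

# Numeric column: root mean squared error on the cells that were missing
miss_sl <- is.na(irisWithNA$Sepal.Length)
sqrt(mean((irisImputed$Sepal.Length[miss_sl] - iris$Sepal.Length[miss_sl])^2))

# Factor column: share of correctly imputed species labels
miss_sp <- is.na(irisWithNA$Species)
mean(irisImputed$Species[miss_sp] == iris$Species[miss_sp])
```

Such a comparison is only possible in simulations like this one, where the gaps were introduced on purpose. With real missing data the true values are unknown.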


Monthly Downloads: 2,436
Version: 2.1.1
License: GPL (>= 2)
Maintainer: Michael Mayer
Last Published: March 20th, 2021

Functions in missRanger (2.1.1)

revert: Revert conversion.
convert: Conversion of non-factor/non-numeric variables.
imputeUnivariate: Univariate Imputation
pmm: Predictive Mean Matching
missRanger: Fast Imputation of Missing Values by Chained Random Forests
typeof2: A version of typeof internally used by missRanger.
generateNA: Adds Missing Values to a Vector, Matrix or Data Frame
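Two of the helper functions listed above can also be used on their own. The short sketch below applies `generateNA` and `imputeUnivariate` to a plain vector. It assumes the argument names `p` and `seed` as documented for this package version; the toy vector is made up for illustration.

```r
library(missRanger)

# Blank out roughly 20% of the values of a toy vector
x <- generateNA(1:20, p = 0.2, seed = 1)
sum(is.na(x))

# Fill the gaps by sampling from the non-missing values of the same vector
x_filled <- imputeUnivariate(x, seed = 1)
anyNA(x_filled)
```

Unlike `missRanger`, `imputeUnivariate` ignores all other variables, so it is best seen as a simple baseline or starting point rather than a full imputation method.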