# rfImpute

##### Missing Value Imputations by randomForest

Impute missing values in predictor data using proximity from randomForest.

- Keywords
- regression, classif, tree

##### Usage

```
# S3 method for default
rfImpute(x, y, iter=5, ntree=300, ...)
# S3 method for formula
rfImpute(x, data, ..., subset)
```

##### Arguments

- x
A data frame or matrix of predictors, some containing `NA`s, or a formula.

- y
Response vector (`NA`s not allowed).

- data
A data frame containing the predictors and response.

- iter
Number of iterations to run the imputation.

- ntree
Number of trees to grow in each iteration of randomForest.

- ...
Other arguments to be passed to `randomForest`.

- subset
A logical vector indicating which observations to use.
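
As a rough illustration of how these arguments fit together in the default method, the sketch below passes predictors and response separately. The data preparation (which column is blanked out, the seed, and the reduced `iter` and `ntree` values) is chosen here only for the example.

```r
library(randomForest)

data(iris)
x <- iris[, 1:4]      # predictors only
y <- iris$Species     # response; must not contain NAs

# Illustrative setup: blank out ten values in one predictor.
set.seed(1)
x[sample(nrow(x), 10), "Sepal.Length"] <- NA

# Default method: x and y are given separately.
# iter and ntree are lowered from their defaults to keep the run short.
imputed <- rfImpute(x, y, iter = 3, ntree = 100)

# The result has the response as its first column, then the completed predictors.
dim(imputed)
```

With the formula method, the same call would be written as `rfImpute(Species ~ ., data)` on a single data frame holding both predictors and response.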

##### Details

The algorithm starts by imputing `NA`s using `na.roughfix`. Then `randomForest` is called with the completed data. The proximity matrix from the randomForest is used to update the imputation of the `NA`s. For continuous predictors, the imputed value is the weighted average of the non-missing observations, where the weights are the proximities. For categorical predictors, the imputed value is the category with the largest average proximity. This process is iterated `iter` times.
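
The update step described above can be sketched in base R. This is an illustration of the weighting rule only, not the package's source code; the proximity matrix `prox` and the toy vectors are invented for the example, standing in for the `$proximity` component that `randomForest(..., proximity = TRUE)` returns.

```r
# Hypothetical symmetric proximity matrix for four cases.
prox <- matrix(c(1.0, 0.5, 0.2, 0.1,
                 0.5, 1.0, 0.6, 0.2,
                 0.2, 0.6, 1.0, 0.2,
                 0.1, 0.2, 0.2, 1.0), nrow = 4, byrow = TRUE)

# Continuous predictor with a missing value in row 3.
x <- c(1.0, 2.0, NA, 4.0)
miss <- which(is.na(x))
obs  <- setdiff(seq_along(x), miss)

# Proximity-weighted average of the observed values.
w <- prox[miss, obs, drop = FALSE]
x_new <- x
x_new[miss] <- as.vector(w %*% x[obs]) / rowSums(w)

# Categorical predictor with a missing value in the same row.
f <- factor(c("a", "b", NA, "b"))

# Average proximity to the missing case within each observed category,
# then pick the category with the largest average.
avg <- tapply(prox[miss, obs], f[obs], mean)
f_new <- f
f_new[miss] <- names(avg)[which.max(avg)]
```

Here the weights for row 3 are 0.2, 0.6 and 0.2, so the continuous value becomes 2.2, and category `b` wins with an average proximity of 0.4 against 0.2 for `a`. In the full procedure these updates would be followed by a fresh forest fit on the updated data.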

Note: Imputation has not (yet) been implemented for the unsupervised case. Also, Breiman (2003) notes that the OOB estimate of error from randomForest tends to be optimistic when run on the data matrix with imputed values.

##### Value

A data frame or matrix containing the completed data matrix, where `NA`s are imputed using proximity from randomForest. The first column contains the response.
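
Because the response sits in the first column, it may be convenient to split the result back into response and predictors. A minimal sketch, assuming the formula method on `iris` with one predictor partly blanked out (the seed, the column chosen, and the small `iter` and `ntree` values are for illustration only):

```r
library(randomForest)

data(iris)
iris.na <- iris
set.seed(42)
iris.na[sample(150, 15), "Petal.Width"] <- NA

imputed <- rfImpute(Species ~ ., iris.na, iter = 2, ntree = 50)

names(imputed)[1]        # name of the response column
y_out <- imputed[[1]]    # response
x_out <- imputed[, -1]   # completed predictors
```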

##### References

Leo Breiman (2003). Manual for Setting Up, Using, and Understanding Random Forest V4.0. https://www.stat.berkeley.edu/~breiman/Using_random_forests_v4.0.pdf

##### See Also

`na.roughfix`

##### Examples

```
library(randomForest)

data(iris)
iris.na <- iris
set.seed(111)
## artificially drop some data values.
for (i in 1:4) iris.na[sample(150, sample(20)), i] <- NA
set.seed(222)
## impute the missing predictor values; Species is the response.
iris.imputed <- rfImpute(Species ~ ., iris.na)
set.seed(333)
## fit a forest on the completed data.
iris.rf <- randomForest(Species ~ ., iris.imputed)
print(iris.rf)
```

*Documentation reproduced from package randomForest, version 4.6-14, License: GPL (>= 2)*