Please note that functions with names starting with "mice.impute" are exported so that they are visible to the mice sampler functions. Please do not call these functions directly unless you know exactly what you are doing.
For continuous variables only.
This function implements the RfPred.Norm multiple imputation method as an adapter for mice samplers.
In the mice() function, set method = "rfpred.norm" to call it.
The function performs multiple imputation under a normality assumption, using the out-of-bag mean squared error as the estimate of the variance.
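The idea described above can be sketched outside of mice as follows. This is only an illustration and not the package's implementation: it assumes the ranger package is installed, picks Ozone in airquality as the variable to impute, and uses made-up variable names (oobMse, predMis, impVal).

```r
library(ranger)

set.seed(2024)
data("airquality")

# Impute Ozone from the fully observed columns (an illustrative choice).
y  <- airquality$Ozone
x  <- as.matrix(airquality[, c("Wind", "Temp", "Month", "Day")])
ry <- !is.na(y)  # TRUE where y is observed

# Fit a small regression forest on the observed cases only.
rf <- ranger(x = x[ry, , drop = FALSE], y = y[ry], num.trees = 10)

# For regression forests, ranger stores the out-of-bag MSE in prediction.error.
oobMse <- rf$prediction.error

# Predict the missing cases, then draw from a normal distribution centred on
# each prediction with variance equal to the out-of-bag MSE.
predMis <- predict(rf, data = x[!ry, , drop = FALSE])$predictions
impVal  <- rnorm(length(predMis), mean = predMis, sd = sqrt(oobMse))
```

Each call to rnorm gives a different set of draws, which is what makes repeated imputations differ from one another.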
mice.impute.rfpred.norm(
  y,
  ry,
  x,
  wy = NULL,
  num.trees.cont = 10,
  norm.err.cont = TRUE,
  alpha.oob = 0,
  pre.boot = TRUE,
  num.threads = NULL,
  ...
)
Vector with imputed data, same type as y, and of length sum(wy).
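As a small illustration of the return length, the toy vectors below show how ry and wy relate. The sketch assumes the usual mice convention that wy is taken as !ry when it is left as NULL; the data values are invented.

```r
# Toy target variable with two missing entries.
y  <- c(2.1, NA, 3.4, NA, 5.0)

ry <- !is.na(y)  # observed entries, used to fit the imputation model
wy <- !ry        # entries to impute (assumed default when wy = NULL)

# The imputation function is expected to return this many values.
nImp <- sum(wy)
```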
y: Vector to be imputed.
ry: Logical vector of length length(y) indicating the subset y[ry] of elements in y to which the imputation model is fitted. The ry generally distinguishes the observed (TRUE) and missing values (FALSE) in y.
x: Numeric design matrix with length(y) rows with predictors for y. Matrix x must not contain missing values.
wy: Logical vector of length length(y). A TRUE value indicates locations in y for which imputations are created.
num.trees.cont: Number of trees to build for continuous variables. The default is num.trees.cont = 10.
norm.err.cont: Whether to assume normality for the prediction errors of random forests. The default is norm.err.cont = TRUE, in which case the prediction errors are assumed to be normally distributed, with variance equal to the overall out-of-bag prediction error, i.e. the out-of-bag mean squared error (see Shah et al. 2014). If FALSE, the predictions of the random forest are used.
alpha.oob: The "significance level" for individual out-of-bag prediction errors used in calculating the out-of-bag mean squared error, useful in the presence of extreme values. For example, set alpha.oob = 0.05 to use a 95% confidence level. The default is alpha.oob = 0.0, which keeps all the individual out-of-bag prediction errors intact.
pre.boot: If TRUE, bootstrapping is performed prior to imputation to achieve 'proper' multiple imputation, accommodating sampling variation in estimating population regression parameters (see Shah et al. 2014). Note that if TRUE, this option is in effect even if the number of trees is set to one.
num.threads: Number of threads for parallel computing. The default is num.threads = NULL, in which case all available processors can be used.
...: Other arguments to pass down.
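To make the alpha.oob and pre.boot options more concrete, here is a base R sketch. The trimming rule shown (dropping errors whose absolute value exceeds the 1 - alpha quantile) is an assumption made for illustration, not a statement of how the package computes it, and all variable names and data are hypothetical.

```r
set.seed(1)

# Hypothetical out-of-bag errors (observed minus OOB prediction), one outlier.
oobErr <- c(rnorm(99), 25)

alphaOob <- 0.05  # plays the role of alpha.oob

# Assumed rule: keep errors whose absolute value is within the
# (1 - alphaOob) quantile, then recompute the mean squared error.
cutoff  <- quantile(abs(oobErr), probs = 1 - alphaOob, names = FALSE)
keep    <- abs(oobErr) <= cutoff
mseAll  <- mean(oobErr^2)        # what alpha.oob = 0 would use
mseTrim <- mean(oobErr[keep]^2)  # less sensitive to the outlier

# pre.boot: resample the observed cases with replacement before fitting,
# so that each imputation reflects sampling variation in the fitted model.
nObs    <- 20
bootIdx <- sample.int(nObs, size = nObs, replace = TRUE)
```

A single extreme error can dominate the untrimmed mean squared error, which is the situation alpha.oob is described as addressing.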
Shangzhi Hong
RfPred.Norm imputation sampler.
Shah, Anoop D., et al. "Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study." American journal of epidemiology 179.6 (2014): 764-774.
# Users can set method = "rfpred.norm" in call to mice to use this method
library(mice)
library(RfEmpImp)
data("airquality")
impObj <- mice(airquality, method = "rfpred.norm", m = 5,
               maxit = 5, maxcor = 1.0, eps = 0,
               remove.collinear = FALSE, remove.constant = FALSE,
               printFlag = FALSE)
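Once imputations have been generated, the result can be handled like any other mice object. The sketch below assumes both mice and RfEmpImp are installed; the regression model is an arbitrary example and not part of the original documentation.

```r
library(mice)
library(RfEmpImp)

set.seed(2024)
data("airquality")
impObj <- mice(airquality, method = "rfpred.norm", m = 5,
               maxit = 5, printFlag = FALSE)

# Extract the first completed data set.
completed1 <- complete(impObj, 1)

# Fit the same model on each imputed data set and pool the estimates
# using Rubin's rules.
fit    <- with(impObj, lm(Ozone ~ Wind + Temp))
pooled <- pool(fit)
summary(pooled)
```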