Impute missing values on newdata based on an object of class "missRanger".
For multivariate imputation, use missRanger(..., keep_forests = TRUE).
For univariate imputation, no forests are required.
This can be enforced by predict(..., iter = 0) or via missRanger(. ~ 1, ...).
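
A minimal usage sketch of both modes follows. It assumes the missRanger package is installed; the object names (iris_na, fit, imp_multi, imp_uni) and the argument values are illustrative choices, not recommendations.

```r
library(missRanger)

# Training data with artificial missing values (generateNA() ships with missRanger)
iris_na <- generateNA(iris, p = c(Sepal.Length = 0.2, Species = 0.1), seed = 20)

# Keep the forests so that multivariate out-of-sample imputation is possible
fit <- missRanger(
  iris_na, pmm.k = 5, num.trees = 100, keep_forests = TRUE, seed = 2, verbose = 0
)

# "New" rows to impute; here simply the first rows of the incomplete data
newdata <- head(iris_na, 10)

# Multivariate imputation, using the stored forests
imp_multi <- predict(fit, newdata, seed = 3, verbose = 0)

# Univariate imputation only, enforced by iter = 0
imp_uni <- predict(fit, newdata, iter = 0, seed = 3, verbose = 0)
```

Both calls return a copy of newdata with the missing values filled in; only the first call uses the random forests.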
Note that out-of-sample imputation works best for rows in newdata with only one
missing value (counting only missings in variables used as covariates
in random forests). We call this the "easy case". In the "hard case",
even multiple iterations (set by iter) can lead to unsatisfactory results.
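
To see which rows of newdata fall into which case before imputing, the missing values per row can be counted over the covariate columns. The helper classify_rows() below is hypothetical (it is not part of missRanger) and uses base R only; the "complete" label for rows without missing values is likewise an illustrative addition.

```r
# Hypothetical helper: label each row by its number of missing covariate values
classify_rows <- function(newdata, covariates) {
  n_missing <- unname(rowSums(is.na(newdata[, covariates, drop = FALSE])))
  factor(
    ifelse(n_missing == 0, "complete", ifelse(n_missing == 1, "easy", "hard")),
    levels = c("complete", "easy", "hard")
  )
}

newdata <- data.frame(
  x = c(1, NA, NA, 4),
  y = c("a", "b", NA, NA),
  z = c(TRUE, FALSE, TRUE, NA)
)

case <- classify_rows(newdata, covariates = c("x", "y", "z"))
table(case)
```

Rows labelled "hard" are the ones for which the iter argument matters.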
pmm.k
Number of candidate predictions of the original dataset
for predictive mean matching (PMM). By default the same value as during fitting.
iter
Number of iterations for "hard case" rows. 0 for univariate imputation.
num.threads
Number of threads used by ranger's predict function.
The default NULL uses all threads.
seed
Integer seed used for initial univariate imputation and PMM.
verbose
Should info be printed? (1 = yes, the default; 0 = no).
...
Passed to the predict function of ranger.
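
For intuition about the number of PMM candidates described above, the following base R sketch shows the general idea of predictive mean matching: each new prediction is replaced by an observed value from the original data whose own prediction is among the k closest. The function pmm_sketch() and its arguments are hypothetical and only illustrate the principle; this is not the code missRanger runs internally.

```r
# Hypothetical helper illustrating predictive mean matching (PMM)
# pred_train: predictions for the observed cases of the original data
# y_train:    the corresponding observed values
# pred_new:   predictions for the values to be imputed
pmm_sketch <- function(pred_train, y_train, pred_new, k = 3L, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  k <- min(k, length(y_train))
  vapply(pred_new, function(p) {
    # Indices of the k observed cases with the closest predictions
    nearest <- order(abs(pred_train - p))[seq_len(k)]
    # Draw one of the k candidates and return its observed value
    y_train[nearest[sample.int(k, 1L)]]
  }, FUN.VALUE = numeric(1))
}

y_train <- c(10, 20, 30)
pred_train <- c(11, 19, 32)
pred_new <- c(12, 29)

pmm_sketch(pred_train, y_train, pred_new, k = 1L)            # nearest candidate only
pmm_sketch(pred_train, y_train, pred_new, k = 3L, seed = 1)  # random among 3 candidates
```

With PMM, imputed values are always values that were actually observed in the original data, which is why a larger k adds more randomness but never produces values outside the observed range.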
Details
The out-of-sample algorithm works as follows:
1. Impute univariately all relevant columns by randomly drawing values
   from the original unimputed data. This step will only impact "hard case" rows.
2. Replace univariate imputations by predictions of random forests. This is done
   sequentially over variables, where the variables are sorted to minimize the impact
   of univariate imputations. Optionally, this is followed by predictive mean matching (PMM).
3. Repeat Step 2 for "hard case" rows multiple times.
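
The steps above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: linear models (lm) stand in for the random forests, only numeric columns are handled, PMM is omitted, and variables are processed in increasing order of their number of missing values, which is one plausible reading of "sorted to minimize the impact of univariate imputations". The function impute_oos_sketch() and all object names are hypothetical.

```r
# Hypothetical sketch of the out-of-sample algorithm (lm instead of random forests)
impute_oos_sketch <- function(models, train, newdata, iter = 4L, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  vars <- names(models)
  missing <- is.na(newdata[, vars, drop = FALSE])  # remember the original NAs
  hard <- rowSums(missing) > 1L                    # rows with several missing values

  # Step 1: univariate imputation by drawing from observed training values
  for (v in vars) {
    pool <- train[[v]][!is.na(train[[v]])]
    idx <- which(missing[, v])
    newdata[[v]][idx] <- pool[sample.int(length(pool), length(idx), replace = TRUE)]
  }
  if (iter < 1L) return(newdata)

  # Step 2: replace the draws by model predictions, variable by variable
  # (assumption: variables with the fewest missing values come first)
  ord <- vars[order(colSums(missing))]
  for (i in seq_len(iter)) {
    # Step 3: from the second pass on, only "hard case" rows are updated
    rows <- if (i == 1L) rep(TRUE, nrow(newdata)) else hard
    for (v in ord) {
      idx <- which(missing[, v] & rows)
      if (length(idx) > 0L) {
        newdata[[v]][idx] <- predict(models[[v]], newdata[idx, , drop = FALSE])
      }
    }
    if (!any(hard)) break
  }
  newdata
}

# One stand-in model per variable, each using all other variables as covariates
train <- iris[1:100, 1:4]
vars <- names(train)
models <- lapply(setNames(vars, vars), function(v) {
  lm(reformulate(setdiff(vars, v), response = v), data = train)
})

newdata <- iris[101:105, 1:4]
newdata$Sepal.Length[1] <- NA                        # easy case: one missing value
newdata[2, c("Sepal.Width", "Petal.Length")] <- NA   # hard case: two missing values

imputed <- impute_oos_sketch(models, train, newdata, iter = 4L, seed = 1)
```

In the easy-case row, the random draw from Step 1 is overwritten by a prediction that depends only on observed values, so the result does not depend on the draw. In the hard-case row, each prediction initially depends on a random draw for the other missing value, which is why repeated passes are used.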