Usage

mice.impute.pmm(y, ry, x, donors = 5, type = 1, ridge = 1e-05, version = "", ...)

Arguments

y: Vector to be imputed.

ry: Logical vector giving the response pattern of y (TRUE=observed, FALSE=missing).

x: Design matrix with length(y) rows and p columns containing complete covariates.

donors: The size of the donor pool from which a draw is made. The default is donors = 5. Setting donors = 1 always selects the closest match. Values between 3 and 10 provide the best results. Note: The default was changed from 3 to 5 in version 2.19, based on simulation work by Tim Morris.

type: Type of matching distance. The default type = 1 calculates the distance between the predicted value of yobs and the drawn values of ymis. Other choices are type = 0 (distance between predicted values) and type = 2 (distance between drawn values). The current version supports only type = 1.

ridge: The ridge penalty applied in .norm.draw() to prevent problems with multicollinearity. The default is ridge = 1e-05, which means that 0.001 percent of the diagonal is added to the cross-product. Larger ridges may result in more biased estimates. For highly noisy data (e.g. many junk variables), set ridge = 1e-06 or even lower to reduce bias. For highly collinear data, set ridge = 1e-04 or higher.

version: Character string selecting the matching function. Setting version = "2.21" calls .pmm.match() instead of the default matcher() function.

...: Other named arguments.

Value

Vector of length sum(!ry) with imputations.
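The effect of the ridge argument can be illustrated in base R. This is a minimal sketch, assuming the penalty is formed as ridge times the diagonal of the cross-product matrix, which is how the description of ridge reads; it does not reproduce the internals of .norm.draw(), and the toy data and variable names are made up for illustration.

```r
# Toy design matrix with two nearly collinear columns (made-up data).
set.seed(1)
x1 <- rnorm(50)
x <- cbind(1, x1, x1 + rnorm(50, sd = 1e-4))

xtx <- crossprod(x)   # cross-product X'X
ridge <- 1e-05

# Assumed form of the penalty: ridge times the diagonal of X'X.
pen <- ridge * diag(xtx)
xtx_ridge <- xtx + diag(pen)

# Each diagonal element grows by the fraction `ridge` (0.001 percent).
rel_increase <- (diag(xtx_ridge) - diag(xtx)) / diag(xtx)

# The penalty lowers the condition number of the nearly singular matrix,
# which makes the subsequent inversion more stable.
kappa_plain <- kappa(xtx, exact = TRUE)
kappa_ridge <- kappa(xtx_ridge, exact = TRUE)
```

Comparing kappa_plain with kappa_ridge shows why a larger ridge helps with collinear covariates, at the price of shrinking the estimates slightly.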
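Details

Imputation of y by predictive mean matching, based on Rubin (1987, p. 168, formulas a and b). The procedure is as follows:

1. Estimate beta and sigma by linear regression on the observed cases.
2. Draw beta* and sigma* from the proper posterior.
3. Compute predicted values for yobs using beta and for ymis using beta*.
4. For each ymis, find the observed cases with the closest predicted values (donors of them), randomly sample one of these, and take its observed value in y as the imputation.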
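The procedure can be sketched in base R as follows. This is an illustrative sketch under stated assumptions, not the package's implementation: pmm_sketch is a hypothetical helper, the posterior draw uses the standard normal linear model with a ridge-stabilised cross-product, and the donor search is a plain sort rather than matcher() or .pmm.match().

```r
# Hypothetical helper, not part of mice: a bare-bones type-1 PMM.
pmm_sketch <- function(y, ry, x, donors = 5, ridge = 1e-05) {
  x <- cbind(1, as.matrix(x))            # add an intercept column
  xobs <- x[ry, , drop = FALSE]
  xmis <- x[!ry, , drop = FALSE]
  yobs <- y[ry]
  p <- ncol(xobs)

  # Estimate beta by least squares on a ridge-stabilised cross-product.
  xtx <- crossprod(xobs)
  v <- solve(xtx + diag(ridge * diag(xtx), nrow = p))
  beta <- v %*% crossprod(xobs, yobs)
  resid <- yobs - xobs %*% beta
  df <- max(length(yobs) - p, 1)

  # Draw sigma* and beta* from the posterior of the normal linear model.
  sigma_star <- sqrt(sum(resid^2) / rchisq(1, df))
  beta_star <- beta + t(chol((v + t(v)) / 2)) %*% rnorm(p) * sigma_star

  # Predicted values: beta for observed cases, beta* for missing cases.
  yhat_obs <- as.vector(xobs %*% beta)
  yhat_mis <- as.vector(xmis %*% beta_star)

  # For each missing case, sample one of the closest observed cases
  # and take its observed y as the imputation.
  k <- min(donors, length(yobs))
  vapply(yhat_mis, function(target) {
    pool <- order(abs(yhat_obs - target))[seq_len(k)]
    yobs[pool[sample.int(k, 1)]]
  }, numeric(1))
}

# Made-up example data with roughly 30 percent missing values in y.
set.seed(123)
n <- 100
x <- matrix(rnorm(n * 2), ncol = 2)
y <- as.vector(1 + x %*% c(2, -1) + rnorm(n))
ry <- runif(n) > 0.3
y[!ry] <- NA

imp <- pmm_sketch(y, ry, x, donors = 5)
```

Because every imputation is copied from an observed case, imp contains only values that already occur in y[ry], which is the defining property of predictive mean matching.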
Note

The matching is done on predicted y, NOT on observed y.

References

Rubin, D.B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.

Van Buuren, S., Brand, J.P.L., Groothuis-Oudshoorn, C.G.M., Rubin, D.B. (2006). Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76(12), 1049-1064.

Van Buuren, S., Groothuis-Oudshoorn, K. (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. http://www.jstatsoft.org/v45/i03/