Missing values are iteratively updated via an EM algorithm.
imputeEM(data, impute.ncomps = 2, pca.ncomps = 2, CV = TRUE, Init = "mean",
scale = TRUE, iters = 25, tol = .Machine$double.eps^0.25)
imputeEM returns a list containing the following components:
A list of imputed data frames across impute.ncomps
A list of imputed values, at each EM iteration, across impute.ncomps
Cross-validation results across impute.ncomps
impute.ncomps
data: a dataset with missing values.
impute.ncomps: integer corresponding to the minimum number of components to test.
pca.ncomps: minimum number of components to use in the imputation.
CV: use cross-validation in determining the optimal number of components to retain for the final imputation.
Init: for continuous variables, impute either the mean or median.
scale: scale variables to unit variance.
iters: maximum number of EM iterations.
tol: the threshold for assessing convergence.
Nelson Lee Afanador (nelson.afanador@mvdalab.com), Thanh Tran (thanh.tran@mvdalab.com)
A completed data frame is returned that mirrors a model.matrix. NAs are replaced with convergence values as obtained via EM. If object contains no NAs, it is returned unaltered.
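To make the iterative update concrete, here is a minimal base R sketch of EM-style PCA imputation for a purely numeric matrix. It is an illustration under stated assumptions, not mvdalab's implementation: the helper name em_pca_impute is hypothetical, it handles numeric columns only, it skips scaling and cross-validation, and it starts from column means.

```r
# Hypothetical helper, not part of mvdalab: EM-style PCA imputation
# for a numeric matrix containing NAs.
em_pca_impute <- function(X, ncomp = 2, iters = 25,
                          tol = .Machine$double.eps^0.25) {
  X <- as.matrix(X)
  miss <- is.na(X)
  # Initialise each missing cell with its column mean.
  col_means <- colMeans(X, na.rm = TRUE)
  X[miss] <- col_means[col(X)[miss]]
  for (i in seq_len(iters)) {
    # Fit a rank-ncomp PCA model to the currently completed data.
    mu <- colMeans(X)
    Xc <- sweep(X, 2, mu)
    s <- svd(Xc, nu = ncomp, nv = ncomp)
    fitted <- s$u %*% diag(s$d[seq_len(ncomp)], ncomp) %*% t(s$v)
    fitted <- sweep(fitted, 2, mu, "+")
    # Overwrite only the missing cells with the model reconstruction.
    change <- sum((X[miss] - fitted[miss])^2)
    X[miss] <- fitted[miss]
    # Stop once the imputed values have stabilised.
    if (change < tol) break
  }
  X
}

# Usage on the numeric columns of iris with 60 cells set to NA.
set.seed(1)
X_na <- as.matrix(iris[, 1:4])
X_na[sample(length(X_na), 60)] <- NA
X_hat <- em_pca_impute(X_na, ncomp = 2)
```

Observed cells are never modified; only the cells that were NA are re-estimated at each pass, which is why the result settles on "convergence values" for those cells.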
B. Walczak, D.L. Massart. Dealing with missing data, Part I. Chemom. Intell. Lab. Syst. 58 (2001), 15-27.
library(mvdalab)
# Introduce 25% missing values into iris, then impute them via EM.
dat <- introNAs(iris, percent = 25)
imputeEM(dat)
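Because the returned list can be large, it may be easier to inspect its top-level structure than to print it. The sketch below assumes mvdalab is installed and uses only the default arguments shown in the usage above; the variable name res is illustrative.

```r
# Runs only if mvdalab is available; res is an illustrative name.
if (requireNamespace("mvdalab", quietly = TRUE)) {
  library(mvdalab)
  dat <- introNAs(iris, percent = 25)
  res <- imputeEM(dat)
  # Show the top-level components without printing every imputed value.
  str(res, max.level = 1)
}
```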