MIPCA: Multiple Imputation with PCA

Description

MIPCA performs multiple imputation with a PCA model. It can be used as a preliminary step to multiple imputation in PCA.

Usage

MIPCA(X, ncp = 2, scale = TRUE, method = "Regularized", 
   threshold = 1e-04, nboot = 100)

Arguments

X

a data.frame with continuous variables containing missing values

ncp

integer corresponding to the number of components used to reconstruct data with the PCA reconstruction formulae

scale

boolean. TRUE by default, which gives the same weight to each variable

method

"Regularized" by default or "EM"

threshold

the threshold for the convergence criterion

nboot

the number of imputed datasets
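
The rank-ncp reconstruction that the ncp argument refers to can be sketched in base R. This is only an illustration on a complete toy matrix, not the code MIPCA runs internally: the data, the variable names and the use of scale() (which standardizes with the n - 1 denominator) are assumptions made for the example.

```r
# Toy complete data: 6 rows, 3 continuous variables (made up for illustration)
set.seed(1)
X <- matrix(rnorm(18), nrow = 6, ncol = 3)
ncp <- 2

# Centre and scale each column (analogous to scale = TRUE)
Xs <- scale(X, center = TRUE, scale = TRUE)
s <- svd(Xs)

# Rank-ncp reconstruction: keep the first ncp singular triplets
Xs_hat <- s$u[, 1:ncp, drop = FALSE] %*%
  diag(s$d[1:ncp], nrow = ncp) %*%
  t(s$v[, 1:ncp, drop = FALSE])

# Undo the scaling and centring to return to the original units
X_hat <- sweep(Xs_hat, 2, attr(Xs, "scaled:scale"), "*")
X_hat <- sweep(X_hat, 2, attr(Xs, "scaled:center"), "+")
```

A larger ncp reproduces the data more closely; with ncp equal to the number of columns the reconstruction is exact.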

Value

res.imputePCA

a matrix corresponding to the imputed dataset obtained with the function imputePCA (the completed dataset)

res.MI

an array corresponding to the nboot imputed datasets. The dimensions of the array are the number of rows of X, the number of columns of X, and nboot

call

the matched call

Details

MIPCA generates nboot imputed datasets from a PCA model. The observed values are identical in every imputed dataset, whereas the imputed values change from one dataset to another. The variation among the imputed values reflects the uncertainty with which the missing values can be predicted. The multiple imputation is proper in the sense of Little and Rubin (2002) since it takes into account the variability of the parameters.
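
The structure described here can be illustrated with a toy array of the same shape as res.MI (rows of X, columns of X, nboot). In this sketch the missing cells are filled with arbitrary random draws purely as a stand-in; it does not reproduce the PCA-based imputation, and the names res_MI, miss and cell_sd are hypothetical.

```r
set.seed(42)

# Toy data with two missing cells
X <- matrix(c(1, 2, NA, 4, 5, 6, 7, NA, 9), nrow = 3)
nboot <- 5
miss <- is.na(X)

# Stand-in for res.MI: observed values are copied unchanged,
# missing cells get a different random draw in each dataset
res_MI <- array(NA_real_, dim = c(nrow(X), ncol(X), nboot))
for (b in seq_len(nboot)) {
  Xb <- X
  Xb[miss] <- rnorm(sum(miss), mean = 5, sd = 1)
  res_MI[, , b] <- Xb
}

# Standard deviation of each cell across the imputed datasets
cell_sd <- apply(res_MI, c(1, 2), sd)
```

Here cell_sd is zero for every observed cell and positive for the imputed ones, which is the pattern to expect when the same summary is computed on the res.MI array.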

References

Josse, J., Husson, F. (2010). Multiple Imputation in PCA.

Examples

library(missMDA)
data(orange)
## First the number of components has to be chosen 
##   (for the reconstruction step)
## nb <- estim_ncpPCA(orange,ncp.max=5) ## Time consuming, nb = 2

## Multiple Imputation
resMI <- MIPCA(orange,ncp=2)

## Visualization on the PCA map
plot(resMI)
