
missMDA (version 1.6)

MIPCA: Multiple Imputation with PCA

Description

MIPCA performs multiple imputation with a PCA model. It can be used as a preliminary step to perform multiple imputation in PCA.

Usage

MIPCA(X, ncp = 2, scale = TRUE, method = "Regularized", 
   threshold = 1e-04, nboot = 100)

Arguments

X
a data.frame with continuous variables containing missing values
ncp
integer: the number of components used to reconstruct the data with the PCA reconstruction formula
scale
boolean. TRUE by default, which gives the same weight to each variable
method
the estimation method: "Regularized" (the default) or "EM"
threshold
the threshold used to assess convergence
nboot
the number of imputed datasets
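
To make the role of ncp and scale more concrete, here is a minimal base R sketch of a rank-ncp PCA reconstruction on a small complete matrix. It is not the package's internal code: reconstruct_pca is a hypothetical helper and the data are simulated, and MIPCA itself works on data with missing values. The sketch only illustrates what "reconstructing the data with ncp components" means.

```r
# Toy complete data (no missing values), used only to illustrate the formula.
set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)

# Hypothetical helper: rank-ncp PCA reconstruction of a complete matrix.
reconstruct_pca <- function(X, ncp = 2, scale = TRUE) {
  # Centre the columns and, if scale = TRUE, standardise them
  Z <- scale(X, center = TRUE, scale = scale)
  s <- svd(Z)
  k <- seq_len(ncp)
  # Keep the first ncp singular triplets: Z_hat = U_k D_k V_k'
  Z_hat <- s$u[, k, drop = FALSE] %*%
    diag(s$d[k], nrow = ncp) %*%
    t(s$v[, k, drop = FALSE])
  # Map the fitted values back to the original units
  if (scale) Z_hat <- sweep(Z_hat, 2, attr(Z, "scaled:scale"), "*")
  sweep(Z_hat, 2, attr(Z, "scaled:center"), "+")
}

X_hat2 <- reconstruct_pca(X, ncp = 2)   # low-rank approximation
X_hat4 <- reconstruct_pca(X, ncp = 4)   # all components: reproduces X

err2 <- sum((X - X_hat2)^2)
err4 <- sum((X - X_hat4)^2)
```

A smaller ncp gives a smoother, lower-rank approximation. Using all components reproduces the data exactly, which is why ncp has to be chosen before imputation.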

Value

  • res.imputePCA: A matrix corresponding to the imputed dataset obtained with the function imputePCA (the completed dataset)
  • res.MI: An array corresponding to the nboot imputed datasets. The dimensions of the array are the number of rows of X, the number of columns of X, and nboot
  • call: the matched call

Details

MIPCA generates nboot imputed datasets from a PCA model. The observed values are the same from one dataset to the others whereas the imputed values change. The variation among the imputed values reflects the variability with which missing values can be predicted. The multiple imputation is proper in the sense of Little and Rubin (2002) since it takes into account the variability of the parameters.
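
The structure described above can be illustrated with a toy array in base R. In this sketch res_MI is a stand-in for the res.MI component: the missing cells are filled with arbitrary random draws purely to mimic the shape of the output, whereas MIPCA draws them from a PCA model. The data and variable names are assumptions made for illustration.

```r
set.seed(42)

# Toy data with two missing cells (illustrative, not the orange dataset)
X <- matrix(c(1, 2, NA, 4,
              5, NA, 7, 8,
              9, 10, 11, 12), nrow = 4, ncol = 3)
miss <- is.na(X)
nboot <- 5

# Stand-in for res.MI: an n x p x nboot array of completed datasets.
res_MI <- array(NA_real_, dim = c(nrow(X), ncol(X), nboot))
for (b in seq_len(nboot)) {
  Xb <- X
  # Placeholder draws; MIPCA would generate these from the PCA model
  Xb[miss] <- rnorm(sum(miss), mean = mean(X, na.rm = TRUE))
  res_MI[, , b] <- Xb
}

third <- res_MI[, , 3]                  # the third imputed dataset
cell_sd <- apply(res_MI, c(1, 2), sd)   # spread of each cell across datasets

observed_constant <- all(cell_sd[!miss] == 0)  # observed cells do not vary
imputed_vary <- all(cell_sd[miss] > 0)         # imputed cells do
```

With a real fit, the same indexing and the same check should apply to resMI$res.MI, assuming it is the array described under Value for this version of the package.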

References

Josse, J., Husson, F. (2010). Multiple Imputation in PCA.

See Also

imputePCA, plot.MIPCA

Examples

library(missMDA)
data(orange)
## First the number of components has to be chosen 
##   (for the reconstruction step)
## nb <- estim_ncpPCA(orange,ncp.max=5) ## Time consuming, nb = 2

## Multiple Imputation
resMI <- MIPCA(orange,ncp=2)

## Visualization on the PCA map
plot(resMI)
