missMDA (version 1.6)

imputeMCA: Impute missing values in categorical variables with Multiple Correspondence Analysis

Description

Impute the missing values of a categorical dataset (in the indicator matrix) using Multiple Correspondence Analysis.

Usage

imputeMCA(don, ncp=2, row.w=NULL, coeff.ridge=1, threshold=1e-06, seed=NULL, maxiter=1000)

Arguments

don
a data.frame with categorical variables containing missing values
ncp
integer corresponding to the number of dimensions used to reconstruct data with the reconstruction formulae
row.w
an optional vector of row weights (by default, a vector of 1 over the number of rows, giving uniform row weights)
coeff.ridge
a positive coefficient that allows the eigenvalues to be shrunk more than by the mean of the last eigenvalues (by default, 1: the eigenvalues are shrunk by the mean of the last eigenvalues; a coefficient between 1 and 2 is required)
threshold
the threshold for assessing convergence
seed
an integer specifying the seed for the initialization of the regularized iterative MCA algorithm (if seed = NULL, the initialization step imputes the proportion of each category)
maxiter
integer, maximum number of iterations for the regularized iterative MCA algorithm

Value

  • tab.disj: The imputed indicator matrix; the imputed values are real numbers and may be seen as degrees of membership to the corresponding category.
  • completeObs: The completed data.frame
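
To illustrate what "degree of membership" means for tab.disj, the sketch below uses a small hand-made fuzzy indicator matrix (toy values, not output from imputeMCA) and picks the most plausible category within each variable's block of columns. The helper most_plausible and the column names are hypothetical; the page does not state how completeObs is actually derived, so treat this only as one way to read the matrix.

```r
# Toy fuzzy indicator matrix: two variables (colour, size), two categories each.
# Rows 2 and 3 contain non-binary entries, as an imputed row might.
tab.disj <- matrix(
  c(1.0, 0.0, 0.0, 1.0,
    0.3, 0.7, 1.0, 0.0,
    0.0, 1.0, 0.6, 0.4),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("colour_blue", "colour_red", "size_L", "size_S"))
)

# Column indices belonging to each variable (assumed known for this toy case).
blocks <- list(colour = 1:2, size = 3:4)

# Hypothetical helper: for each row, return the column name with the
# largest membership value inside one variable's block.
most_plausible <- function(tab, cols) {
  sub <- tab[, cols, drop = FALSE]
  colnames(sub)[max.col(sub, ties.method = "first")]
}

completed <- as.data.frame(
  lapply(blocks, function(cols) most_plausible(tab.disj, cols))
)
print(completed)
```

Within each block the values of a row sum to 1 in this toy matrix, which is what makes the membership reading natural.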

Details

Use a Regularized Iterative Multiple Correspondence Analysis to impute missing values. The regularized iterative MCA algorithm first imputes the missing values in the indicator matrix with initial values (the proportion of each category), then performs MCA on the completed dataset, imputes the missing values with the reconstruction formulae of order ncp and iterates until convergence. If ncp=0, the Average method (imputation with the proportion) is performed.
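
The initialization step described above (which is also the whole of the Average method when ncp=0) can be sketched in base R. This is a minimal illustration on toy data, assuming uniform row weights and proportions computed over the observed rows of each variable; the helpers indicator_block and impute_block are hypothetical and are not part of missMDA.

```r
# Toy data: two categorical variables with missing values.
don <- data.frame(
  colour = factor(c("red", "blue", NA, "red")),
  size   = factor(c("S", NA, "L", "L"))
)

# Build the indicator (disjunctive) block of one factor.
# Rows where the factor is NA stay NA in every column of the block.
indicator_block <- function(f, name) {
  lev <- levels(f)
  block <- vapply(lev, function(l) as.numeric(f == l), numeric(length(f)))
  colnames(block) <- paste(name, lev, sep = "_")
  block
}

# Replace the missing rows of a block with the observed category proportions.
impute_block <- function(block) {
  props <- colMeans(block, na.rm = TRUE)
  miss <- is.na(block[, 1])
  if (any(miss)) {
    block[miss, ] <- matrix(props, nrow = sum(miss), ncol = ncol(block),
                            byrow = TRUE)
  }
  block
}

# Imputed indicator matrix: one block per variable, bound column-wise.
tab_imp <- do.call(cbind, unname(Map(
  function(f, name) impute_block(indicator_block(f, name)),
  don, names(don)
)))
print(round(tab_imp, 3))
```

With ncp greater than 0, the algorithm would go on to alternate MCA and reconstruction starting from a matrix like tab_imp; that iterative part is not reproduced here.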

References

Josse, J., Chavent, M., Liquet, B. and Husson, F. (2010). Handling missing values with Regularized Iterative Multiple Correspondence Analysis.

See Also

estim_ncpMCA, imputePCA

Examples

library(missMDA)
library(FactoMineR)  ## provides MCA()
data(vnf)
## First, the number of components has to be chosen
##   (for the reconstruction step)
## nb <- estim_ncpMCA(vnf, ncp.max=5) ## Time-consuming, nb = 4

## Impute the indicator matrix and perform an MCA
tab.disj.impute <- imputeMCA(vnf, ncp=4)
res.mca <- MCA(vnf, tab.disj=tab.disj.impute$tab.disj)
