
missMDA (version 1.6)

imputeMFA: Impute dataset with MFA

Description

Impute the missing values of a dataset with the Multiple Factor Analysis model. Can be used as a preliminary step before performing an MFA on an incomplete dataset with continuous variables.

Usage

imputeMFA(X, group, ncp = 2, type=rep("s",length(group)), method = "Regularized", 
       row.w = NULL,coeff.ridge=1,threshold = 1e-06, seed = NULL, maxiter = 1000,...)

Arguments

X
a data.frame with continuous variables containing missing values
group
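a vector giving the number of variables in each group, in column order (for example, c(5,3) for a first group of 5 variables followed by a second group of 3)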
ncp
integer corresponding to the number of components used to reconstruct data with the PCA reconstruction formulae
type
the type of variables in each group; three possibilities: "c" or "s" for quantitative variables (with "s", the variables are also scaled to unit variance), "n" for categorical variables; by default, all variables are quantitative and scaled to unit variance
method
"Regularized" by default or "EM"
row.w
optional row weights (by default, a vector of 1 over the number of rows, for uniform row weights)
coeff.ridge
a positive coefficient that shrinks the eigenvalues more than the mean of the last eigenvalues alone would (by default 1: the eigenvalues are shrunk by the mean of the last eigenvalues; a coefficient between 1 and 2 is required)
threshold
the threshold for assessing convergence
seed
a single value, interpreted as an integer for the set.seed function (if seed = NULL, missing values are initially imputed by the mean of each variable)
maxiter
integer, maximum number of iterations for the algorithm
...
further arguments passed to or from other methods
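
The role of coeff.ridge can be illustrated with a small base R sketch. This is not missMDA's code: shrink_singular_values is a hypothetical helper, the noise level is taken as the plain mean of the trailing eigenvalues (the package may apply further corrections), and the cap that keeps the shrunk values non-negative is an assumption made for this illustration.

```r
# Hypothetical helper, not part of missMDA: shrink the ncp leading singular values.
# d: singular values in decreasing order; eigenvalues are taken as d^2.
shrink_singular_values <- function(d, ncp, coeff.ridge = 1) {
  stopifnot(ncp >= 1, ncp < length(d))
  lead <- seq_len(ncp)
  # Noise level: mean of the last (trailing) eigenvalues
  sigma2 <- mean(d[-lead]^2)
  # coeff.ridge scales the noise level; the cap (an assumption of this
  # sketch) keeps every shrunk value non-negative
  sigma2 <- min(coeff.ridge * sigma2, d[ncp + 1]^2)
  (d[lead]^2 - sigma2) / d[lead]
}

d <- c(4, 3, 2, 1, 1)
shrink_default <- shrink_singular_values(d, ncp = 2)
shrink_more    <- shrink_singular_values(d, ncp = 2, coeff.ridge = 1.5)
print(rbind(shrink_default, shrink_more))
```

A larger coeff.ridge removes more from each retained singular value, which pulls the reconstruction, and therefore the imputed values, further toward the variable means.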

Value

  • completeObs: the imputed dataset; the observed values for non-missing entries and the imputed values for missing entries
  • tab.disj: the imputed matrix in which categorical variables are coded as dummy variables. In the dummy variables, the imputed values are real numbers and may be seen as degrees of membership to the corresponding category.
  • call: the matched call
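
To make the "degree of membership" reading of tab.disj concrete, here is a toy base R sketch. The data are hypothetical, and filling the missing row with the observed category proportions is only an illustrative stand-in for the values imputeMFA would actually produce.

```r
# Toy categorical variable with one missing entry (hypothetical data)
colour <- factor(c("red", "blue", NA, "red"))

# Disjunctive (dummy) coding: one 0/1 column per category, NA where unknown
lev <- levels(colour)
tab <- sapply(lev, function(l) as.numeric(colour == l))
rownames(tab) <- seq_along(colour)

# Illustrative fill of the missing row with the observed category proportions
props <- colMeans(tab, na.rm = TRUE)
tab[is.na(colour), ] <- rep(props, each = sum(is.na(colour)))

# A fuzzy row can be read back as its most plausible category
best <- lev[apply(tab, 1, which.max)]
print(tab)
print(best)
```

Observed rows keep a crisp 0/1 coding, while the filled row holds real numbers between 0 and 1, one per category.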

Details

If all the groups are quantitative, the function imputes the missing entries of a data frame using the iterative MFA algorithm (EM) or a regularized iterative MFA algorithm. The iterative MFA algorithm first imputes the missing values with initial values (the mean of each variable), then performs MFA on the completed dataset, imputes the missing values with the reconstruction formulae of order ncp, and iterates until convergence. The regularized version helps to avoid overfitting, which is especially important when there are many missing values.

If some groups are qualitative, the function imputes the missing entries of the disjunctive tables for the qualitative groups and the missing entries of the quantitative variables. The output can be used as an input to the MFA function through the argument tab.comp.
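
The iterative scheme described above can be sketched in base R. This is a deliberately simplified illustration and not missMDA's implementation: iterative_pca_impute is a hypothetical helper that runs an unregularized (EM-style) loop with a plain PCA, so it ignores the group weighting that MFA applies, the scaling implied by type, the row weights, and the regularization.

```r
# Hypothetical helper, not missMDA's code: EM-style iterative PCA imputation
iterative_pca_impute <- function(X, ncp = 2, threshold = 1e-6, maxiter = 1000) {
  X <- as.matrix(X)
  miss <- is.na(X)
  # Step 1: initial fill with the mean of the observed values of each variable
  col_means <- colMeans(X, na.rm = TRUE)
  X[miss] <- col_means[col(X)[miss]]
  k <- seq_len(ncp)
  for (iter in seq_len(maxiter)) {
    # Step 2: PCA of the completed, centred table through its SVD
    mu <- colMeans(X)
    s <- svd(sweep(X, 2, mu))
    # Step 3: reconstruction of order ncp, then add the means back
    fitted <- s$u[, k, drop = FALSE] %*% diag(s$d[k], ncp) %*%
      t(s$v[, k, drop = FALSE])
    fitted <- sweep(fitted, 2, mu, "+")
    # Step 4: update the missing cells only and check convergence
    change <- sum((X[miss] - fitted[miss])^2)
    X[miss] <- fitted[miss]
    if (change < threshold) break
  }
  X
}

# Toy table of exact rank 1; the cell removed below has true value 6
X_true <- outer(1:6, c(1, 2, 3))
X_miss <- X_true
X_miss[2, 3] <- NA
res <- iterative_pca_impute(X_miss, ncp = 1)
print(res[2, 3])
```

Only the missing cells are ever overwritten, so the observed values in the result are identical to the input. On this noise-free toy table the imputed cell should end up close to its true value; on real data the unregularized loop can overfit, which is the motivation for the regularized method.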

References

F. Husson, J. Josse (2013) Handling missing values in multiple factor analysis. PhD thesis of J. Josse or HDR of F. Husson

See Also

imputePCA

Examples

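library(missMDA)     # imputeMFA and the orange and vnf datasets
library(FactoMineR)  # MFA

## Two quantitative groups (5 and 3 variables), scaled to unit variance
data(orange)
res.comp <- imputeMFA(orange, group = c(5, 3), type = rep("s", 2), ncp = 2)
## Note that MFA is performed on the completed matrix
res.mfa <- MFA(res.comp$completeObs, group = c(5, 3), type = rep("s", 2))

## Three categorical groups (6, 5 and 3 variables)
data(vnf)
res.comp <- imputeMFA(vnf, group = c(6, 5, 3), type = c("n", "n", "n"), ncp = 2)
## MFA on the incomplete data, with the imputation passed through tab.comp
res.mfa <- MFA(vnf, group = c(6, 5, 3), type = c("n", "n", "n"), tab.comp = res.comp)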
