VarSelCluster: Variable selection and maximum likelihood estimation of the Latent Class Model

Description

This function performs variable selection and maximum likelihood estimation of the Latent Class Model.

Usage

VarSelCluster(x, g, vbleSelec = TRUE, crit.varsel = "BIC", initModel = 50,
  nbcores = 1, discrim = rep(1, ncol(x)), nbSmall = 250, iterSmall = 20,
  nbKeep = 50, iterKeep = 1000, tolKeep = 10^(-6))

Arguments

x

data.frame. Rows correspond to observations and columns correspond to variables. Continuous variables must be "numeric", count variables must be "integer" and categorical variables must be "factor".

g

numeric. It defines the number of components.

vbleSelec

logical. It indicates whether variable selection is performed (TRUE: yes, FALSE: no; default is TRUE).

crit.varsel

character. It defines the information criterion used for the variable selection ("AIC", "BIC" or "MICL"; only used if vbleSelec=TRUE; default is "BIC").

initModel

numeric. It gives the number of initializations of the alternated algorithm maximizing the MICL criterion (only used if crit.varsel="MICL"; default is 50).

nbcores

numeric. It defines the number of cores used by the algorithm (default is 1).

discrim

numeric. It indicates whether each variable is discriminative (1) or irrelevant (0) (only used if vbleSelec=FALSE; default is rep(1,ncol(x))).

nbSmall

numeric. It indicates the number of SmallEM algorithms performed for the ML inference (default is 250).

iterSmall

numeric. It indicates the number of iterations for each SmallEM algorithm (default is 20).

nbKeep

numeric. It indicates the number of chains used for the final EM algorithm (default is 50).

iterKeep

numeric. It indicates the maximal number of iterations for each EM algorithm (default is 1000).

tolKeep

numeric. It indicates the maximal gap between two successive iterations of the EM algorithm at which the algorithm stops (default is 10^(-6)).
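
The type requirements on x and the layout of discrim can be illustrated with a small sketch. The data frame raw and its column names (height, visits, group) are hypothetical and serve only as an illustration; the sketch uses base R only and does not run VarSelCluster itself.

```r
# Hypothetical raw data: every column arrives with the wrong class.
raw <- data.frame(
  height = c("1.72", "1.65", "1.80", "1.58"),
  visits = c(3, 0, 5, 2),
  group  = c("a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# Coerce each column to the class expected for its kind of variable.
x <- data.frame(
  height = as.numeric(raw$height),  # continuous  -> "numeric"
  visits = as.integer(raw$visits),  # count       -> "integer"
  group  = factor(raw$group)        # categorical -> "factor"
)

# Inspect the resulting classes before passing x to VarSelCluster.
col_classes <- vapply(x, function(col) class(col)[1], character(1))
print(col_classes)

# discrim has one entry per column of x: 1 = discriminative, 0 = irrelevant.
# Here the (hypothetical) 'visits' column is declared irrelevant.
discrim <- rep(1, ncol(x))
discrim[names(x) == "visits"] <- 0
print(discrim)

# On a real data set, such a vector would be supplied as, for instance:
# VarSelCluster(x, g = 2, vbleSelec = FALSE, discrim = discrim)
```

Since discrim is only used when vbleSelec is FALSE, the commented call disables variable selection explicitly.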

Value

Returns an instance of VSLCMresultsMixed.

Examples

# NOT RUN {
require(VarSelLCM)

# Data loading:
# x contains the observed variables
# z contains the known status (1: absence, 2: presence of heart disease)
data(heart)
z <- heart[,"Class"]
x <- heart[,-13]

# Cluster analysis without variable selection
res_without <- VarSelCluster(x, 2, vbleSelec = FALSE)

# Cluster analysis with variable selection (with parallelisation)
res_with <- VarSelCluster(x, 2, nbcores = 2, initModel=40)

# Confusion matrices: variable selection decreases the misclassification error rate
print(table(z, res_without@partitions@zMAP))
print(table(z, res_with@partitions@zMAP))

# Summary of the best model
summary(res_with)

# Parameters of the best model
print(res_with)

# Plot of the best model
plot(res_with)
# }
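
A confusion matrix such as the ones printed above can be turned into a misclassification error rate. The following is a minimal sketch for the two-cluster case, assuming that cluster labels are arbitrary and may be swapped relative to the known classes. The vectors z_true and z_hat are hypothetical stand-ins for z and a fitted partition; they are not output of VarSelCluster.

```r
# Hypothetical labels: z_true plays the role of the known classes,
# z_hat the role of an estimated partition.
z_true <- c(1, 1, 1, 2, 2, 2, 2, 1)
z_hat  <- c(2, 2, 2, 1, 1, 2, 1, 2)

conf <- table(z_true, z_hat)
print(conf)

# With two clusters there are only two possible label matchings:
# the direct one (diagonal) and the swapped one (anti-diagonal).
agree_direct  <- sum(diag(conf))
agree_swapped <- sum(conf[cbind(1:2, 2:1)])

# The error rate is taken under the more favourable matching.
error_rate <- 1 - max(agree_direct, agree_swapped) / sum(conf)
print(error_rate)
```

With more than two clusters the same idea applies, but all label permutations would have to be compared rather than just two.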
