ClustMMDD (version 1.0.4)

selectK.R: Selection of the number $K$ of clusters.

Description

Selects the number $K$ of clusters for a given subset $S$ of clustering variables, exploring every value of $K$ from Kmin to Kmax.

Usage

selectK.R(xdata, S, Kmax, ploidy = 1, Kmin = 1,
          emOptions = list(epsi = 1e-05, nberSmallEM = 20,
                           nberIterations = 15, nberMaxIterations = 5000,
                           typeSmallEM = 0, typeEM = 0,
                           putThreshold = FALSE),
          cte = 1, project = deparse(substitute(xdata)))

Arguments

xdata
A dataset in which the data for each variable occupy $ploidy$ column(s).
S
A subset of clustering variables, given as a logical vector of length P, where P is the number of variables in xdata.
Kmax
The maximum number of clusters to be explored.
ploidy
The number of occurrences of each variable in the data. For example, $ploidy = 2$ for genotype data.
Kmin
The minimum number of clusters to be explored. The default value is set to 1.
emOptions
A list of EM options (see EmOptions and setEmOptions).
cte
A double used by the selection criterion named CteDim, whose penalty function is $pen(K,S) = cte \times dim$, where $dim$ is the number of free parameters of the model.
project
The name of the project. The default value is the name of the dataset.
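
The role of cte in the CteDim penalty can be illustrated with a small base-R sketch. It does not call ClustMMDD; the log-likelihoods, dimensions, and the helper name cte_dim_criterion are made-up placeholders, and the convention "criterion = -logLik + pen(K,S), smaller is better" is an assumption rather than something stated on this page.

```r
# Hypothetical inputs: maximized log-likelihoods and numbers of free
# parameters (dim) for K = 1..4. These are illustrative numbers only.
K      <- 1:4
loglik <- c(-1250, -1180, -1165, -1160)
dim_KS <- c(20, 41, 62, 83)

# Assumed form of a penalized criterion with pen(K, S) = cte * dim.
# The sign convention (minimize -logLik + penalty) is an assumption.
cte_dim_criterion <- function(loglik, dim, cte = 1) {
  -loglik + cte * dim
}

crit  <- cte_dim_criterion(loglik, dim_KS, cte = 1)

# K with the smallest penalized criterion among the explored values.
K_hat <- K[which.min(crit)]
print(data.frame(K = K, dim = dim_KS, criterion = crit))
print(K_hat)
```

A larger cte penalizes each free parameter more heavily and so tends to favor smaller values of $K$.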

Value

A list of estimated parameters for each selection criterion.

See Also

backward.explorer for further exploration of the space of competing models, dimJump.R for data-driven calibration of the penalty function, and model.selection.R for model selection.

Examples

data(genotype1)
head(genotype1)
genotype2 = cutEachCol(genotype1[, -11], ploidy = 2)
head(genotype2)
S = c(rep(TRUE, 8), rep(FALSE, 2))
## Not run: 
# outPut = selectK.R(genotype2, S, Kmax = 6, ploidy = 2, Kmin=1)
# outPut[["BIC"]]
#
# file.remove("genotype2_ExploredModels.txt")
## End(Not run)
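
Before calling selectK.R, it can help to check that S matches the layout of xdata described under Arguments: P variables, each stored in ploidy columns. The sketch below uses only base R on a placeholder data frame (not the genotype1 data), and it assumes that the ploidy columns of a variable are adjacent, which this page does not state.

```r
# Placeholder dataset with P = 10 variables and ploidy = 2 (20 columns).
# The cell values are dummies, not real genotypes.
ploidy <- 2
P      <- 10
xdata  <- as.data.frame(matrix("101", nrow = 5, ncol = P * ploidy))

# Same subset as in the example: the first 8 variables are selected.
S <- c(rep(TRUE, 8), rep(FALSE, 2))

# S must be logical with one entry per variable, not one per column.
stopifnot(is.logical(S),
          ncol(xdata) %% ploidy == 0,
          length(S) == ncol(xdata) / ploidy)

# Column indices of the selected variables, ASSUMING the ploidy columns
# of each variable sit next to each other in xdata.
selected_cols <- which(rep(S, each = ploidy))
print(selected_cols)
```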
