
RMixtComp (version 4.1.4)

slopeHeuristic: Slope heuristic

Description

Criterion for choosing the number of clusters.

Usage

slopeHeuristic(object, K0 = floor(max(object$nClass) * 0.4))

Value

The values of the slope heuristic criterion.

Arguments

object

output of mixtCompLearn

K0

Number of classes above which the constant C is estimated (see Details).

Author

Quentin Grimonprez

Details

The slope heuristic criterion is LL_k - 2 * C * D_k, where LL_k is the log-likelihood for k classes, D_k is the number of free parameters for k classes, and C is the slope of the linear regression of LL_k on D_k, fitted over the models with k > K0.
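
The formula above can be illustrated with a minimal base-R sketch. It is not the RMixtComp implementation: the helper `slope_heuristic_sketch` and its arguments `nClass`, `logLik` and `nFreeParam` are hypothetical names, and the toy values are made up so that the log-likelihood is exactly linear in D_k for k > K0. The sketch also assumes that larger criterion values are better, as for a penalized log-likelihood.

```r
# Hypothetical helper (not part of RMixtComp): computes LL_k - 2 * C * D_k,
# where C is the slope of the regression of LL_k on D_k for k > K0.
slope_heuristic_sketch <- function(nClass, logLik, nFreeParam, K0) {
  keep <- nClass > K0
  fit <- lm(ll ~ d, data = data.frame(ll = logLik[keep], d = nFreeParam[keep]))
  C <- unname(coef(fit)["d"])
  crit <- logLik - 2 * C * nFreeParam
  names(crit) <- nClass
  crit
}

# Made-up toy values: 5 free parameters per class, a large gain in
# log-likelihood up to 4 classes, then a purely linear increase (slope 3).
nClass <- 2:10
nFreeParam <- 5 * nClass
logLik <- -1000 + 3 * nFreeParam - 200 * pmax(0, 4 - nClass)

crit <- slope_heuristic_sketch(nClass, logLik, nFreeParam, K0 = 4)

# Number of classes with the largest criterion value
# (assumption: the criterion is to be maximized)
best <- names(which.max(crit))
```

With real results, the log-likelihoods and numbers of free parameters would come from the fitted models; here they are supplied by hand so that the regression step is easy to follow.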

References

Cathy Maugis, Bertrand Michel. Slope heuristics for variable selection and clustering via Gaussian mixtures. [Research Report] RR-6550, INRIA. 2008. inria-00284620v2

Jean-Patrick Baudry, Cathy Maugis, Bertrand Michel. Slope Heuristics: Overview and Implementation. 2010. hal-00461639

Examples

# \donttest{
data(titanic)

## Use the MixtComp format
dat <- titanic

# refactor categorical data: survived, sex, embarked and pclass
dat$sex <- refactorCategorical(dat$sex, c("male", "female", NA), c(1, 2, "?"))
dat$embarked <- refactorCategorical(dat$embarked, c("C", "Q", "S", NA), c(1, 2, 3, "?"))
dat$survived <- refactorCategorical(dat$survived, c(0, 1, NA), c(1, 2, "?"))
dat$pclass <- refactorCategorical(dat$pclass, c("1st", "2nd", "3rd"), c(1, 2, 3))

# replace all NA by ?
dat[is.na(dat)] <- "?"

# create model
model <- list(
  pclass = "Multinomial",
  survived = "Multinomial",
  sex = "Multinomial",
  age = "Gaussian",
  sibsp = "Poisson",
  parch = "Poisson",
  fare = "Gaussian",
  embarked = "Multinomial"
)

# create algo
algo <- createAlgo()

# run clustering
resLearn <- mixtCompLearn(dat, model, algo, nClass = 2:25, criterion = "ICL", nRun = 3, nCore = 1)

out <- slopeHeuristic(resLearn, K0 = 6)
# }
