MachineLearning (version 0.1.4)

Clustering: A simple and powerful function to create clusters with KMeans

Description

This is a modified k-means clustering technique that automates the choice of the number of groups or clusters into which the sample is partitioned. Several techniques are used to obtain the best number of clusters.

Usage

Clustering(
  data,
  n = "auto",
  n_max = 10,
  iter.max = 10,
  auto_criterion = c("explainwss", "db", "ratkowsky", "ball", "friedman"),
  confidenceWSS = 0.9,
  agregate_method = median
)

Arguments

data

Data frame with numeric variables.

n

Number of clusters. By default, n="auto", which lets the function choose the number of clusters automatically.

n_max

maximum number of clusters, between 2 and (number of objects - 1), greater than or equal to n_min. By default, n_max=10.

iter.max

the maximum number of iterations allowed.

auto_criterion

the available criteria are: "explainwss", "db", "ratkowsky", "ball" and "friedman".

confidenceWSS

a confidence level for the "explainwss" (WSS) criterion. By default, confidenceWSS=0.9.

agregate_method

a function to aggregate the results of the different methods. By default, agregate_method=median.
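As a rough illustration of what aggregating the results of several criteria could look like, the sketch below combines one recommended number of clusters per criterion into a single value. The recommendations are made-up values, the helper name aggregate_k is hypothetical, and the rounding step is an assumption; none of this is taken from the package source.

```r
# Hypothetical recommendations, one per criterion (illustrative values only,
# not the output of any real run).
k_by_criterion <- c(explainwss = 4, db = 3, ratkowsky = 3, ball = 3, friedman = 5)

# aggregate_k is a hypothetical helper, not part of the MachineLearning package.
# It applies the aggregation function and rounds, because some aggregators
# (for example mean) can return a non-integer number of clusters.
aggregate_k <- function(recommendations, agregate_method = median) {
  as.integer(round(agregate_method(recommendations)))
}

k_median <- aggregate_k(k_by_criterion)        # default aggregation: median
k_max    <- aggregate_k(k_by_criterion, max)   # alternative: largest recommendation
print(c(median = k_median, max = k_max))
```

A median is robust to a single criterion that recommends an unusually small or large number of clusters, which may be why it is the default.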

Details

Several methods are available to obtain the best number of clusters:

explainwss = Within-cluster Sum of Squares
db = Davies–Bouldin index (DBI), Davies and Bouldin (1979)
ratkowsky = Ratkowsky and Lance (1978)
ball = Ball and Hall (1965)
friedman = Friedman and Rubin (1967)
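To make the within-cluster sum of squares idea concrete, here is a minimal sketch using only base R's kmeans. It assumes that the criterion picks the smallest number of clusters whose partition explains at least a given share (here 0.9) of the total sum of squares. That reading, the helper name choose_k_by_wss, and the use of the built-in iris data are assumptions for illustration, not the package's actual implementation.

```r
# choose_k_by_wss is a hypothetical helper, not part of the MachineLearning package.
# It fits k-means for k = 2..n_max and returns the smallest k whose
# between-cluster sum of squares reaches `confidence` of the total.
choose_k_by_wss <- function(data, n_max = 10, iter.max = 10, confidence = 0.9) {
  x <- as.matrix(data)
  ks <- 2:n_max
  explained <- sapply(ks, function(k) {
    fit <- kmeans(x, centers = k, iter.max = iter.max, nstart = 5)
    fit$betweenss / fit$totss   # share of total sum of squares explained
  })
  names(explained) <- ks
  reached <- which(explained >= confidence)
  # Fall back to n_max when no k reaches the requested share
  k <- if (length(reached) > 0) ks[reached[1]] else n_max
  list(k = k, explained = explained)
}

set.seed(42)                      # k-means starts from random centers
result <- choose_k_by_wss(iris[, 1:4], n_max = 10)
print(result$k)
print(round(result$explained, 3))
```

Because k-means depends on its random starting centers, the selected value can vary between runs unless a seed is set.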

Value

An MLA object of subclass Clustering

Examples

library(MachineLearning)

## Load a Dataset
data(EGATUR)

## Cluster on two numeric variables; n defaults to "auto"
modelFit <- Clustering(data=EGATUR[,c("A13","gastototal")])
