kml: ~ Algorithm: KmL, K-means for Longitudinal data ~

Description

KmL is a non-parametric algorithm for clustering longitudinal data. This page describes the algorithm. For an overview of the package, see kml-package.

Usage

kml(.Object, nbClusters = 2:6, nbRedrawing = 20, maxIt = 200, print = "calinski", distance = "euclidean")

Arguments

.Object

[ClusterizLongData]: contains the trajectories to cluster, as well as any pre-existing clusterizations.

nbClusters

[numeric]: vector containing the numbers of clusters with which KmL must work. By default, nbClusters is 2:6, which indicates that KmL must search for partitions with 2, then 3, ... up to 6 clusters.

nbRedrawing

[numeric]: sets the number of redrawings, that is, the number of times the algorithm is rerun for each number of clusters.

maxIt

[numeric]: sets a limit on the number of iterations allowed before convergence.

print

[character]: can take two kinds of values: "all" forces the display of the algorithm's progress as it runs. Any other value suppresses this display (faster).

distance

[character]: method used to measure the distance between trajectories (only "euclidean" is available for now).
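
As an illustration of what a Euclidean distance between two trajectories could look like, here is a minimal base R sketch. It assumes both trajectories are measured at the same time points and contain no missing values. trajDist, trajA and trajB are hypothetical names introduced for this example only; they are not part of kml, and the package's internal computation may differ.

```r
# Hypothetical helper (not part of kml): Euclidean distance between two
# trajectories observed at the same time points, with no missing values.
trajDist <- function(x, y) {
  stopifnot(length(x) == length(y))
  sqrt(sum((x - y)^2))
}

# Two toy trajectories with four measurements each
trajA <- c(1, 2, 3, 4)
trajB <- c(2, 2, 5, 4)

# Differences are -1, 0, -2, 0, so the distance is sqrt(1 + 0 + 4 + 0) = sqrt(5)
trajDist(trajA, trajB)

# The same value obtained with base R's dist(), whose default method is "euclidean"
as.numeric(dist(rbind(trajA, trajB)))
```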

Value

An object of class ClusterizLongData, to which the newly found Clusterization objects have been added.

Author(s)

Christophe Genolini PSIGIAM: Paris Sud Innovation Group in Adolescent Mental Health INSERM U669 / Maison de Solenn / Paris Responsable :

English translation

Raphaël Ricaud Laboratoire "Sport & Culture" / "Sports & Culture" Laboratory University of Paris 10 / Nanterre

Details

kml works on objects of class ClusterizLongData. For each number included in nbClusters, kml looks for a clusterization, then stores it in the slot clusterizList according to its number of clusters. The algorithm starts over as many times as specified by nbRedrawing. By default, it is executed 20 times each for 2, 3, 4, 5 and 6 clusters, that is, 100 times in total.

clusterizList stores all the partitions found, grouped by their number of clusters. Clusterizations with the same number of clusters are sorted from the largest Calinski criterion to the smallest, so the best ones are stored first. Clusterizations are saved as the algorithm proceeds: if the user interrupts the execution of kml, the results found so far are not lost.
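
The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: stats::kmeans stands in for KmL's own k-means step, the toy matrix traj replaces a ClusterizLongData object, the plain list results plays the role of clusterizList, and the helper calinski computes the usual Calinski-Harabasz ratio, which is assumed here to match the criterion kml uses for sorting. All of these names are hypothetical.

```r
set.seed(1)

# Toy data: 30 trajectories (rows) measured at 5 time points (columns),
# drawn from three groups with different shapes.
times <- 1:5
traj <- rbind(
  t(replicate(10, 0 * times + rnorm(5, sd = 0.3))),
  t(replicate(10, 1 * times + rnorm(5, sd = 0.3))),
  t(replicate(10, 10 - 2 * times + rnorm(5, sd = 0.3)))
)

# Hypothetical helper: Calinski-Harabasz criterion of a stats::kmeans fit,
# (between-cluster SS / (k - 1)) / (within-cluster SS / (n - k)).
calinski <- function(fit, n) {
  k <- length(fit$size)
  (fit$betweenss / (k - 1)) / (fit$tot.withinss / (n - k))
}

nbClusters  <- 2:4   # numbers of clusters to try
nbRedrawing <- 3     # restarts for each number of clusters
maxIt       <- 200   # iteration limit for each run

results <- list()    # stand-in for the clusterizList slot
for (k in nbClusters) {
  # One k-means run per redrawing, each from a fresh random start
  runs <- lapply(seq_len(nbRedrawing), function(i) {
    fit <- kmeans(traj, centers = k, iter.max = maxIt, nstart = 1)
    list(clusters = fit$cluster, criterion = calinski(fit, nrow(traj)))
  })
  # Sort the runs from the largest criterion to the smallest
  crit <- vapply(runs, function(r) r$criterion, numeric(1))
  results[[paste0("c", k)]] <- runs[order(crit, decreasing = TRUE)]
}

# Total number of runs: length(nbClusters) * nbRedrawing
sum(lengths(results))

# Best criterion found for each number of clusters
sapply(results, function(runs) runs[[1]]$criterion)
```

With the defaults documented on this page (nbClusters = 2:6 and nbRedrawing = 20), the same loop would perform 5 * 20 = 100 runs.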

Examples

### Generation of some data
cld1 <- as.cld(generateArtificialLongData())

### We suspect between 2 and 6 clusters, and we want 3 redrawings.
#     We also want to "see" what happens (so print="all")
#kml(cld1, 2:6, 3, print = "all")

### 3 seems to be the best. To be sure, we run more redrawings with 3 or 6 clusters only.
#     We don't need the display again; we want the result as fast as possible.
#kml(cld1, c(3, 6), 10)
