kml is an implementation of k-means for longitudinal data (or trajectories). The algorithm can deal with missing values and provides an easy way to rerun the algorithm several times, varying the starting conditions and/or the number of clusters sought.
Here is the description of the algorithm. For an overview of the package, see kml-package.

kml(object,nbClusters=2:6,nbRedrawing=20,toPlot="none",parAlgo=parALGO())

object [ClusterLongData]: the trajectories to cluster, possibly along with some Partition objects already found.
nbClusters: the numbers of clusters with which kml must work. By default, nbClusters is 2:6, which indicates that kml must search partitions with respectively 2, then 3, 4, 5 and 6 clusters.
nbRedrawing: the number of times the algorithm is rerun for each number of clusters (20 by default).
toPlot [character]: either 'traj' for plotting trajectories alone, 'criterion' for plotting the criterion alone, 'both' for plotting both, or 'none' for not displaying anything (faster).
parAlgo [ParKml]: parameters used to run the algorithm. They can be changed using the function parKml. Options are mainly 'saveFreq', 'maxIt' and 'imputationMethod', among others.

The result is the ClusterLongData object, after some Partition objects have been added to it.

When distance is set to "euclidean" and toPlot is set to 'none' or 'criterion', kml calls a compiled (optimized) C procedure. When toPlot is set to 'traj' or 'both', kml uses non-compiled R code. Both settings are used in the Example section.
If, for a specific use, you need a different distance, feel free to contact the author.

kml works on objects of class ClusterLongData.
For each number included in nbClusters, kml computes a Partition, then stores it in the field cX of the ClusterLongData object, where 'X' is the number of clusters. The algorithm starts over as many times as specified by nbRedrawing. By default, it is executed for 2, 3, 4, 5 and 6 clusters, 20 times each, namely 100 times.
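
The loop described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: stats::kmeans stands in for kml's own k-means, the trajectories are made-up complete data (one row per individual, one column per time point), and the names traj and results are hypothetical.

```r
# Illustrative sketch only: stats::kmeans stands in for kml's own algorithm.
set.seed(1)
# Hypothetical data: 60 trajectories observed at 5 time points,
# built from three underlying mean levels plus noise.
traj <- matrix(rnorm(60 * 5), nrow = 60) + rep(c(0, 3, 6), each = 20)

nbClusters  <- 2:6
nbRedrawing <- 20

# One sublist per number of clusters, named c2, c3, ..., c6.
results <- list()
for (k in nbClusters) {
  slot <- paste0("c", k)
  # Each redrawing restarts k-means from new random centers.
  results[[slot]] <- lapply(seq_len(nbRedrawing), function(i) {
    kmeans(traj, centers = k)$cluster
  })
}

names(results)          # "c2" "c3" "c4" "c5" "c6"
sum(lengths(results))   # 5 numbers of clusters x 20 redrawings = 100
```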
When a Partition has been found, it is added to the corresponding slot c1, c2, c3, ... or c26. The sublist cX stores all the Partition objects with X clusters. Inside a sublist, the Partition objects can be sorted from the biggest quality criterion to the smallest (the best are stored first, using ordered,ListPartition), or not.
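
As a rough illustration of the "best stored first" ordering, here is a base R sketch. It rests on assumptions that are not the package's own: the sublist c3 is a plain list, stats::kmeans produces the partitions, and the ratio of between-cluster to total sum of squares is used as a stand-in quality criterion.

```r
# Illustrative sketch only: the criterion below is a stand-in, not kml's.
set.seed(2)
traj <- matrix(rnorm(60 * 5), nrow = 60) + rep(c(0, 3, 6), each = 20)

# Hypothetical sublist c3: five partitions with 3 clusters each,
# stored together with a quality score (higher is better).
c3 <- lapply(1:5, function(i) {
  fit <- kmeans(traj, centers = 3)
  list(clusters = fit$cluster, criterion = fit$betweenss / fit$totss)
})

# Sort from the biggest criterion to the smallest, so the best comes first.
scores <- vapply(c3, function(p) p$criterion, numeric(1))
c3 <- c3[order(scores, decreasing = TRUE)]

vapply(c3, function(p) p$criterion, numeric(1))  # non-increasing values
```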
Note that Partition objects are saved throughout the algorithm. If the user interrupts the execution of kml, the result is not lost. If the user runs kml on an object, then running kml again on the same object will add some new Partition objects to the ones already found.
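
The accumulating behaviour can be pictured with a small base R sketch. It is only an analogy: a plain list stands in for the ClusterLongData object, stats::kmeans stands in for kml's algorithm, and addPartitions is a hypothetical helper, not a function of the package.

```r
# Illustrative sketch only: a plain list stands in for the ClusterLongData object.
set.seed(3)
traj <- matrix(rnorm(60 * 5), nrow = 60) + rep(c(0, 3, 6), each = 20)

# Hypothetical helper: run k-means n times and append the new partitions
# to whatever is already stored for k clusters.
addPartitions <- function(store, traj, k, n) {
  slot <- paste0("c", k)
  new <- lapply(seq_len(n), function(i) kmeans(traj, centers = k)$cluster)
  store[[slot]] <- c(store[[slot]], new)
  store
}

store <- list()
store <- addPartitions(store, traj, k = 4, n = 3)
length(store$c4)   # 3
store <- addPartitions(store, traj, k = 4, n = 7)
length(store$c4)   # 10: earlier partitions are kept, new ones are added
```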
The possible starting conditions are defined in initializePartition.

Overview : kml-package
Classes : ClusterLongData, Partition
Methods : clusterLongData, choice

### Generation of some data
cld1 <- generateArtificialLongData(25)
### We suspect 3, 4 or 6 clusters, we want 3 redrawings.
### We want to "see" what happens (so toPlot is 'both')
kml(cld1,c(3,4,6),3,toPlot='both')
### 4 seems to be the best. We want 7 more redrawing.
### We don't want to see again, we want to get the result as fast as possible.
kml(cld1,4,10)