trajClusters: Classify the Longitudinal Data Based on the Measures.

Description

Classifies the trajectories by applying a nonparametric clustering algorithm to the measures computed by trajMeasures().

Usage

trajClusters(
  Measures,
  select = NULL,
  fuzzy = FALSE,
  nclusters = NULL,
  nstart = 50
)
# S3 method for trajClusters
print(x, ...)
# S3 method for trajClusters
summary(object, ...)

Value

An object of class trajClusters; a list containing the result of the clustering, as well as a curated form of the arguments. If nclusters is set to NULL, clustering is carried out for each number \(k\) of clusters between 2 and (up to) 8 and a plot is produced representing the value of three internal cluster validity indices (C-index, Calinski-Harabasz, Wemmert-Gancarski) as a function of \(k\). As in the 'KmL' package of Genolini et al., these validity indices are presented on a scale from 0 to 1, with 1 corresponding to the highest validity score and 0 corresponding to the lowest. From this, a "best" value of \(k\) is determined using a ranked voting system.
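The rescaling and voting described above can be illustrated with a rough sketch. This is not the package's internal code: the index values are made up for illustration, the helper rescale01() is hypothetical, and the Borda-style rank sum is only one plausible reading of "ranked voting system". The sketch assumes that a lower raw C-index is better, so that index is negated before rescaling.

```r
# Illustrative sketch only; not the code used inside trajClusters().
# Made-up index values for k = 2..5 (assumption, for demonstration).
k <- 2:5
indices <- data.frame(
  c_index           = c(0.30, 0.12, 0.18, 0.25),  # lower raw value is better
  calinski_harabasz = c(40, 95, 80, 60),          # higher is better
  wemmert_gancarski = c(0.45, 0.70, 0.72, 0.50)   # higher is better
)

# Hypothetical helper: min-max rescale so that 0 is the lowest and 1 the highest value.
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

# Put every index on a 0-1 scale where 1 means "most valid".
scores <- data.frame(
  c_index           = rescale01(-indices$c_index),  # flip sign: lower is better
  calinski_harabasz = rescale01(indices$calinski_harabasz),
  wemmert_gancarski = rescale01(indices$wemmert_gancarski)
)

# Borda-style vote (assumed scheme): rank the candidate k within each index
# (highest score gets the highest rank), then sum the ranks across indices.
ranks  <- sapply(scores, rank)
votes  <- rowSums(ranks)
best_k <- k[which.max(votes)]
```

With these made-up values, two of the three indices favour the same k, and the rank sum picks it even though the third index marginally prefers a neighbouring value.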

Arguments

Measures: object of class trajMeasures as returned by the function trajMeasures().
select: an optional vector of positive integers corresponding to the measures to use in the clustering. Defaults to NULL, which uses all the measures contained in Measures.
fuzzy: logical. If FALSE, each trajectory is assigned to a unique group. If TRUE, each trajectory is assigned a "degree of membership" to each group. Defaults to FALSE.
nclusters: The desired number of clusters. If NULL, clustering is carried out for every number of clusters between 2 and (up to) 8 and the "best" number of clusters is used, as judged by the combination of three internal cluster validity indices. See section 'Value' for more details. Defaults to NULL.
nstart: The number of random starts. Defaults to 50.
x: object of class trajClusters.
...: further arguments passed to or from other methods.
object: object of class trajClusters.

Details

The function implements the spectral clustering algorithm presented in Meila (2005). The similarity matrix \(S\) is built from a binary K-nearest-neighbors similarity function: \(S=(W+W^T)/2\), where \(W_{ij}=1\) if data point \(j\) is among the \(K\) nearest neighbors of data point \(i\) and \(W_{ij}=0\) otherwise.
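As a minimal sketch of how such a similarity matrix can be built in base R: the helper knn_similarity() below is hypothetical and not part of the package, and it assumes Euclidean distance and a user-supplied K, neither of which is stated in this page.

```r
# Hypothetical helper (not part of the 'traj' package): builds the symmetrised
# binary K-nearest-neighbours similarity matrix S = (W + t(W)) / 2.
# Assumes Euclidean distance between the rows of X.
knn_similarity <- function(X, K) {
  X <- as.matrix(X)
  n <- nrow(X)
  stopifnot(K >= 1, K < n)
  D <- as.matrix(dist(X))   # pairwise Euclidean distances
  diag(D) <- Inf            # a point is not counted as its own neighbour
  W <- matrix(0, n, n)
  for (i in seq_len(n)) {
    nn <- order(D[i, ])[seq_len(K)]  # indices of the K points closest to i
    W[i, nn] <- 1                    # W[i, j] = 1 if j is a neighbour of i
  }
  (W + t(W)) / 2
}

# Toy data: four points on a line, one nearest neighbour each.
X <- matrix(c(0, 1, 3, 10), ncol = 1)
S <- knn_similarity(X, K = 1)
```

Entries of S equal 1 when two points are each other's neighbours, 0.5 when the relation holds in one direction only, and 0 otherwise, so S is symmetric even though W generally is not.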

References

Genolini, C. et al., kml: K-Means for Longitudinal Data, https://CRAN.R-project.org/package=kml

Meila, M., Spectral Clustering. Handbook of Cluster Analysis, Chapter 7, Chapman and Hall/CRC, 2005.

Examples

if (FALSE) {
data("trajdata")
trajdata.noGrp <- trajdata[, -which(colnames(trajdata) == "Group")] # remove the Group column

m <- trajMeasures(trajdata.noGrp, ID = TRUE, measures = 1:19)

s2.3 <- trajClusters(m, nclusters = 3)
plot(s2.3)

s2.4 <- trajClusters(m, nclusters = 4)
plot(s2.4)

s2.5 <- trajClusters(m, nclusters = 5)
plot(s2.5)

# extract the group memberships from the 4-cluster solution
groups <- s2.4$partition
}