computeUnSupervised: Unsupervised clustering

Description

Perform unsupervised clustering, with the number of clusters K either set by the user or estimated automatically.

Usage

computeUnSupervised(
  data.sample,
  K = 0,
  method.name = "K-means",
  pca = FALSE,
  pca.nb.dims = 0,
  spec = FALSE,
  use.sampling = FALSE,
  sampling.size.max = 0,
  scaling = FALSE,
  RclusTool.env = initParameters(),
  echo = FALSE
)

Value

data.sample: list containing features, profiles and updated clustering results (with vector of labels and cluster summaries).

Arguments

data.sample: list containing features, profiles and clustering results.
K: number of clusters. If K=0 (default), this number is estimated automatically using the Elbow method.
method.name: character vector specifying the clustering algorithm to use. Must be 'K-means' (default), 'EM' (Expectation-Maximization), 'Spectral', 'HC' (Hierarchical Clustering) or 'PAM' (Partitioning Around Medoids).
pca: boolean: if TRUE, Principal Components Analysis is applied to reduce the data space.
pca.nb.dims: number of principal components kept. If pca.nb.dims=0, this number is computed automatically.
spec: boolean: if TRUE, spectral embedding is applied to reduce the data space.
use.sampling: boolean: if FALSE (default), data sampling is not used.
sampling.size.max: numeric: maximal size of the sampling set.
scaling: boolean: if TRUE, scaling is applied.
RclusTool.env: environment in which all global parameters, raw data and results are stored.
echo: boolean: if FALSE (default), no description is printed to the console.
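
The K argument mentions the Elbow method without saying how the elbow is located. The following is a minimal sketch of one common variant in base R, not RclusTool's internal code: it computes the total within-cluster sum of squares for a range of K and picks the K lying furthest below the straight line joining the first and last points of that curve. The bound k.max, the chord rule and the variable names are assumptions made for illustration.

```r
# Illustrative elbow heuristic in base R (assumed rule, not RclusTool's own code).
set.seed(42)

# Three well-separated 2-D Gaussian clusters, 50 points each
dat <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 2, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 4, sd = 0.3), ncol = 2))

k.max <- 8  # hypothetical upper bound on the number of clusters

# Total within-cluster sum of squares for K = 1, ..., k.max
wss <- sapply(seq_len(k.max), function(k) {
  kmeans(dat, centers = k, nstart = 20)$tot.withinss
})

# Straight line (chord) joining the first and last points of the WSS curve
chord <- wss[1] + (wss[k.max] - wss[1]) * (seq_len(k.max) - 1) / (k.max - 1)

# Elbow: the K whose WSS lies furthest below the chord
k.elbow <- which.max(chord - wss)
print(k.elbow)
```

Plotting wss against seq_len(k.max) gives a visual check of the selected K.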

Details

computeUnSupervised performs unsupervised clustering. The number of clusters K is either set by the user or, when K=0, estimated automatically.
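
To make the scaling and pca options more concrete, here is a hedged sketch of the same kind of preprocessing in base R: features are standardised, projected onto principal components, and the reduced data are then clustered. It illustrates the idea only. The source does not state how pca.nb.dims is chosen automatically, so the 90% cumulative variance threshold is an assumption, and the feature matrix feat is hypothetical.

```r
# Illustrative preprocessing pipeline in base R (not RclusTool's internal code).
set.seed(1)

# Hypothetical feature matrix: 60 observations, 4 features on very different scales
feat <- cbind(rnorm(60, mean = 0,  sd = 1),
              rnorm(60, mean = 0,  sd = 100),
              rnorm(60, mean = 5,  sd = 0.1),
              rnorm(60, mean = -3, sd = 10))

# Scaling: centre each feature to mean 0 and rescale to standard deviation 1
feat.scaled <- scale(feat)

# PCA on the scaled features
pc <- prcomp(feat.scaled)

# Assumed rule: keep the smallest number of components explaining >= 90% of variance
var.explained <- cumsum(pc$sdev^2) / sum(pc$sdev^2)
nb.dims <- which(var.explained >= 0.9)[1]

# Reduced data space used for clustering
reduced <- pc$x[, seq_len(nb.dims), drop = FALSE]

# Cluster in the reduced space (K fixed to 2 here purely for illustration)
res <- kmeans(reduced, centers = 2, nstart = 10)
table(res$cluster)
```

Without scaling, the feature with the largest variance would dominate both the principal components and the Euclidean distances used by K-means.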

Examples

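# Simulate three well-separated 2-D Gaussian clusters (50 points each)
dat <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2), 
             matrix(rnorm(100, mean = 2, sd = 0.3), ncol = 2), 
             matrix(rnorm(100, mean = 4, sd = 0.3), ncol = 2))

# Write the features to a temporary file and import them as a data sample
tf <- tempfile()
write.table(dat, tf, sep=",", dec=".")
x <- importSample(file.features=tf)

# K=0: the number of clusters is estimated automatically; PCA is applied first
x <- computeUnSupervised(x, K=0, pca=TRUE, echo=TRUE)

# Retrieve the labels stored under the "K-means_pca" entry and plot them
label <- x$clustering[["K-means_pca"]]$label
plot(dat[,1], dat[,2], type = "p", xlab = "x", ylab = "y", 
    col = label, main = "K-means clustering")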