clustatis_FreeSort: Perform a cluster analysis of free sorting data

Description

Hierarchical clustering of free sorting data followed by a partitioning algorithm (consolidation). Each cluster of blocks is associated with a compromise computed by the STATIS method. Moreover, a noise cluster can be set up.

Usage

clustatis_FreeSort(Data, NameSub=NULL, Noise_cluster=FALSE,Itermax=30,
                           Graph_dend=TRUE, Graph_bar=TRUE, printlevel=FALSE,
                           gpmax=min(6, ncol(Data)-1), Testonlyoneclust=TRUE,
                           alpha=0.05, nperm=50)

Arguments

Data

data frame or matrix. Contains the subjects' results: each column corresponds to a subject and gives the group to which each product (row) is assigned
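As an illustration of the expected layout, the following base R sketch builds a small, hypothetical free sorting table (the object `toy` and the helper `cooccurrence` are invented for this example and are not part of the package). It also shows the usual way of turning one subject's partition into a product-by-product co-occurrence matrix; whether the function uses exactly this internal encoding is an assumption.

```r
# Hypothetical toy data: 5 products (rows) sorted by 3 subjects (columns).
# Each entry is the label of the group the subject put the product in.
toy <- data.frame(
  S1 = c(1, 1, 2, 2, 3),
  S2 = c(1, 2, 2, 3, 3),
  S3 = c(1, 1, 1, 2, 2),
  row.names = paste0("P", 1:5)
)

# Co-occurrence matrix of one subject: entry (i, j) is 1 when products
# i and j were placed in the same group, 0 otherwise.
# (Illustrative helper, not a function of the package.)
cooccurrence <- function(groups, labels = NULL) {
  m <- outer(groups, groups, "==") * 1
  dimnames(m) <- list(labels, labels)
  m
}

# One block (co-occurrence matrix) per subject
blocks <- lapply(toy, cooccurrence, labels = rownames(toy))
print(blocks$S1)
```

Group labels only need to be consistent within a column: subject S1's group 1 has no relation to subject S2's group 1.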

NameSub

string vector. Name of each subject. Length must be equal to the number of columns of Data. If NULL, the names are S1,...Sm. Default: NULL

Noise_cluster

logical. Should a noise cluster be computed? Default: FALSE

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_dend

logical. Should the dendrogram be plotted? Default: TRUE

Graph_bar

logical. Should the barplot of the variation of the criterion and the barplot of the overall homogeneity at each merging be plotted? Default: TRUE

printlevel

logical. Print the number of remaining levels during the hierarchical clustering algorithm? Default: FALSE

gpmax

numerical. What is the maximum number of clusters to consider? Default: min(6, ncol(Data)-1)

Testonlyoneclust

logical. Test if there is more than one cluster? Default: TRUE

alpha

numerical between 0 and 1. What is the threshold to test if there is more than one cluster? Default: 0.05

nperm

numerical. How many permutations are required to test if there is more than one cluster? Default: 50

Value

A list with one element partitionK for each number of clusters K=1 to gpmax. Each partitionK is itself a list with:

group: the clustering partition of subjects after consolidation. If Noise_cluster=TRUE, some subjects could be in the noise cluster ("K+1")
rho: the threshold for the noise cluster
homogeneity: homogeneity index (
rv_with_compromise: RV coefficient of each block with its cluster compromise
weights: weight associated with each subject in its cluster
comp_RV: RV coefficient between the compromises associated with the various clusters
compromise: the W compromise of each cluster
coord: the coordinates of objects of each cluster
inertia: percentage of total variance explained by each axis for each cluster
rv_all_cluster: the RV coefficient between each subject and each cluster compromise
criterion: the CLUSTATIS criterion error
param: parameters called in the consolidation
type: parameter passed to other functions
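Several of the components above are RV coefficients. As a reminder of what this index measures, here is a minimal base R sketch of the standard RV coefficient between two symmetric matrices, applied to two hypothetical subjects' co-occurrence matrices. The helper `rv_coef` and the toy partitions are invented for this example; any centering or scaling the package may apply to the blocks beforehand is not reproduced here.

```r
# RV coefficient between two symmetric matrices W1 and W2:
# trace(W1 %*% W2) / sqrt(trace(W1 %*% W1) * trace(W2 %*% W2)).
# For symmetric matrices, trace(A %*% B) equals sum(A * B).
# (Illustrative helper, not a function of the package.)
rv_coef <- function(W1, W2) {
  sum(W1 * W2) / sqrt(sum(W1 * W1) * sum(W2 * W2))
}

# Hypothetical partitions of 5 products by two subjects
g1 <- c(1, 1, 2, 2, 3)
g2 <- c(1, 2, 2, 3, 3)

# Co-occurrence matrices: 1 if two products share a group, 0 otherwise
W1 <- outer(g1, g1, "==") * 1
W2 <- outer(g2, g2, "==") * 1

rv_same <- rv_coef(W1, W1)  # a matrix compared with itself
rv_diff <- rv_coef(W1, W2)  # two different partitions
print(c(same = rv_same, different = rv_diff))
```

With these non-negative matrices the index lies between 0 and 1, and it equals 1 when the two matrices are proportional, so a subject whose sorting agrees closely with a cluster compromise gets a value near 1.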

There is also at the end of the list:

dend: The CLUSTATIS dendrogram
cutree_k: the partition obtained by cutting the dendrogram for K clusters (before consolidation).
overall_homogeneity_ng: percentage of overall homogeneity by number of clusters before consolidation (and after if there is no noise cluster)
diff_crit_ng: variation of criterion when a merging is done before consolidation (and after if there is no noise cluster)
test_one_cluster: decision and p-value of the test of whether there is more than one cluster
param: parameters called
type: parameter passed to other functions
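The sketch below shows one way the returned list might be inspected. It assumes the ClustBlock package (which provides this function and the `choc` data) is installed, that the per-K elements are named `partition1`, `partition2`, and so on as the description above suggests, and that `choc` has enough subjects for a three-cluster partition to exist; all of these are assumptions to check against your installed version.

```r
# Runs only if the ClustBlock package is available
if (requireNamespace("ClustBlock", quietly = TRUE)) {
  library(ClustBlock)
  data(choc)

  # Suppress the plots so the call can be used in a script
  res <- clustatis_FreeSort(choc, Graph_dend = FALSE, Graph_bar = FALSE)

  # Results for the partition into K = 3 clusters
  # (element name assumed from the "partitionK" naming above)
  part3 <- res$partition3
  print(part3$group)        # cluster of each subject after consolidation
  print(part3$homogeneity)  # homogeneity index

  # Components stored at the end of the list
  print(res$overall_homogeneity_ng)
  print(res$test_one_cluster)
}
```

Comparing `overall_homogeneity_ng` across numbers of clusters is one way to choose K before looking at a specific `partitionK`.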

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2018). Analysis and clustering of multiblock datasets by means of the STATIS and CLUSTATIS methods. Application to sensometrics. Food Quality and Preference, in Press.

Llobell, F., Vigneau, E., & Qannari, E. M. (2019). Clustering datasets by means of CLUSTATIS with identification of atypical datasets. Application to sensometrics. Food Quality and Preference, 75, 97-104.

Examples

# NOT RUN {
data(choc)
res.clu=clustatis_FreeSort(choc)
plot(res.clu, Graph_dend=FALSE)
summary(res.clu)

# }