cluscata_kmeans: Compute the CLUSCATA partitioning algorithm on different blocks from a CATA experiment

Description

Partitions the binary blocks (subjects) of a CATA experiment into clusters. Each cluster is associated with a compromise computed by the CATATIS method. The algorithm can be run with a multi-start strategy or from an initial partition provided by the user. A noise cluster can also be set up.

Usage

cluscata_kmeans(Data, nblo, clust, nstart=100, rho=0, NameBlocks=NULL, NameVar=NULL,
               Itermax=30, Graph_groups=TRUE, print_attempt=FALSE, Warnings=FALSE)

Value

a list with:

group: the clustering partition. If rho>0, some subjects could be in the noise cluster ("K+1")
rho: the threshold for the noise cluster
homogeneity: percentage of homogeneity of the subjects in each cluster and the overall homogeneity
s_with_compromise: Similarity coefficient of each subject with its cluster compromise
weights: weight associated with each subject in its cluster
compromise: The compromise of each cluster
CA: The correspondence analysis results on each cluster compromise (coordinates, contributions...)
inertia: percentage of total variance explained by each axis of the CA for each cluster
s_all_cluster: the similarity coefficient between each subject and each cluster compromise
param: parameters called
criterion: the CLUSCATA criterion error
type: parameter passed to other functions
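
As a rough illustration of how the returned list might be inspected, the sketch below runs the function on the first 40 subjects of the straw data and looks at a few of the components listed above. It assumes the ClustBlock package is installed (the call is skipped otherwise). The value nstart = 20 is only chosen here to shorten the run, and the exact structure of each component is not assumed, which is why generic tools such as str() and unlist() are used.

```r
# Names of the list components as documented in the Value section.
documented <- c("group", "rho", "homogeneity", "s_with_compromise", "weights",
                "compromise", "CA", "inertia", "s_all_cluster", "param",
                "criterion", "type")

# Assumption: ClustBlock is installed; otherwise this part is skipped.
if (requireNamespace("ClustBlock", quietly = TRUE)) {
  data(straw, package = "ClustBlock")
  res <- ClustBlock::cluscata_kmeans(Data = straw[, 1:(16 * 40)], nblo = 40,
                                     clust = 3, nstart = 20,
                                     Graph_groups = FALSE)

  # Documented components that are missing from the result, if any.
  print(setdiff(documented, names(res)))

  # Cluster sizes; unlist() copes with a vector or a one-column data frame.
  print(table(unlist(res$group)))

  # Compact overview of a few components without assuming their layout.
  str(res[c("homogeneity", "criterion", "rho")], max.level = 1)
}
```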

Arguments

Data: data frame or matrix where the blocks of binary variables are merged horizontally. If you have a different format, see change_cata_format
nblo: numerical. Number of blocks (subjects).
clust: numerical vector or integer. Initial partition (numerical vector) or number of clusters (integer). If a numerical vector is given, its values must be 1,2,3,...,number of clusters
nstart: numerical. Number of starting partitions. Default: 100
rho: numerical or vector between 0 and 1. Threshold for the noise cluster. Default: 0. If you want a different threshold for each cluster, you can provide a vector.
NameBlocks: string vector. Name of each block. Length must be equal to the number of blocks. If NULL, the names are S1,...,Sm. Default: NULL
NameVar: string vector. Name of each variable (attribute, the same names for each subject). Length must be equal to the number of attributes. If NULL, the column names of the first block are used. Default: NULL
Itermax: numerical. Maximum number of iterations of the partitioning algorithm. Default: 30
Graph_groups: logical. Should each cluster compromise graphical representation be plotted? Default: TRUE
print_attempt: logical. Print the number of remaining attempts in multi-start case? Default: FALSE
Warnings: logical. Display a warning when none of the subjects in a cluster checked a given attribute or product? Default: FALSE
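
The layout expected for Data, nblo and clust can be sketched in base R as follows. This is only an illustration on simulated data: the sizes, the object names (blocks, block_cols, init) and the choice of products as rows are assumptions made for the example, not part of the package.

```r
set.seed(42)

# Hypothetical sizes: 6 products, 4 attributes, 10 subjects.
n_prod <- 6
n_attr <- 4
n_subj <- 10

# One binary products x attributes block per subject (1 = attribute checked).
one_block <- function(i) {
  m <- matrix(rbinom(n_prod * n_attr, size = 1, prob = 0.4),
              nrow = n_prod, ncol = n_attr)
  dimnames(m) <- list(paste0("P", seq_len(n_prod)),
                      paste0("A", seq_len(n_attr)))
  m
}
blocks <- lapply(seq_len(n_subj), one_block)

# Merge the blocks horizontally: Data has n_attr * n_subj columns.
Data <- do.call(cbind, blocks)
stopifnot(ncol(Data) == n_attr * n_subj)

# Columns of block k are contiguous: ((k - 1) * n_attr + 1):(k * n_attr).
block_cols <- function(k) ((k - 1) * n_attr + 1):(k * n_attr)
stopifnot(identical(unname(Data[, block_cols(3)]), unname(blocks[[3]])))

# An initial partition for clust: one label per subject, labels 1..K.
init <- rep(1:2, length.out = n_subj)
stopifnot(length(init) == n_subj, identical(sort(unique(init)), 1:2))
```

With this layout, Data, n_subj and init would play the roles of the Data, nblo and clust arguments.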

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2019). A new approach for the analysis of data and the clustering of subjects in a CATA experiment. Food Quality and Preference, 72, 31-39.
Llobell, F., Giacalone, D., Labenne, A., & Qannari, E. M. (2019). Assessment of the agreement and cluster analysis of the respondents in a CATA experiment. Food Quality and Preference, 77, 184-190.

Examples

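library(ClustBlock)

# Load the strawberry CATA data shipped with the package
data(straw)

# Cluster the first 40 subjects (16 attributes each) into 3 clusters
cl_km <- cluscata_kmeans(Data = straw[, 1:(16 * 40)], nblo = 40, clust = 3)

# plot(cl_km, Graph_groups = FALSE, Graph_weights = TRUE)
summary(cl_km)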
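
A variant of the call, starting from a user-supplied initial partition and using a noise cluster, might look like the sketch below. The partition init and the thresholds in rho_by_cluster are arbitrary illustrative values. It is assumed here that a vector rho has one entry per cluster and that the ClustBlock package is installed (the call is skipped otherwise).

```r
n_attr <- 16   # attributes per subject in the straw subset used above
n_subj <- 40   # number of subjects (blocks) kept

# Hypothetical initial partition: labels 1, 2, 3 assigned in turn.
init <- rep(1:3, length.out = n_subj)

# Illustrative noise thresholds, one per cluster (assumed length = K).
rho_by_cluster <- c(0.1, 0.1, 0.2)

# Assumption: ClustBlock is installed; otherwise this part is skipped.
if (requireNamespace("ClustBlock", quietly = TRUE)) {
  data(straw, package = "ClustBlock")
  X <- straw[, 1:(n_attr * n_subj)]

  res_init <- ClustBlock::cluscata_kmeans(Data = X, nblo = n_subj,
                                          clust = init,
                                          rho = rho_by_cluster,
                                          Graph_groups = FALSE)

  # Cluster sizes; with rho > 0 some subjects may fall in the noise cluster.
  print(table(unlist(res_init$group)))
}
```

Because clust is a partition here rather than an integer, the run starts from that partition instead of using the multi-start strategy.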