cluscata_kmeans: Compute the CLUSCATA partitioning algorithm on different blocks of binary variables from a CATA experiment. The algorithm can be run with a multi-start strategy or from an initial partition provided by the user.

Description

Partitioning of binary blocks from a CATA (check-all-that-apply) experiment. Each cluster is associated with a compromise computed by the CATATIS method. A noise cluster can also be set up (see rho).

Usage

cluscata_kmeans(Data, nblo, clust, nstart=100, rho=0, NameBlocks=NULL, NameVar=NULL,
                Itermax=30, Graph_groups=TRUE, print_attempt=FALSE, Warnings=FALSE)

Arguments

Data

data frame or matrix in which the blocks of binary variables are merged horizontally (one block per subject, side by side). If your data are in a different format, see change_cata_format.

nblo

numerical. Number of blocks (subjects).

clust

numerical vector or integer. If an integer, the number of clusters; the algorithm is then run with a multi-start strategy (see nstart). If a numerical vector, the initial partition, with one entry per block and values in 1,2,3,...,number of clusters.

nstart

numerical. Number of starting partitions in the multi-start case. Default: 100

rho

numerical between 0 and 1. Threshold for the noise cluster. If 0, there is no noise cluster. Default: 0

NameBlocks

string vector. Name of each block. Length must be equal to the number of blocks. If NULL, the names are S1,...,Sm. Default: NULL

NameVar

string vector. Name of each variable (attribute, the same names for each subject). Length must be equal to the number of attributes. If NULL, the colnames of the first block are taken. Default: NULL

Itermax

numerical. Maximum number of iterations for each run of the partitioning algorithm. Default: 30

Graph_groups

logical. Should the graphical representation of each cluster compromise be plotted? Default: TRUE

print_attempt

logical. Print the number of remaining attempts in the multi-start case? Default: FALSE

Warnings

logical. Display warnings when none of the subjects in a cluster checked a given attribute or product? Default: FALSE
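
The layout expected for Data and the form of an initial partition for clust can be illustrated with simulated data. This is only a sketch in base R: the sizes, the object names (blocks, block_cols, init) and the assumption that rows correspond to products are illustrative and do not come from the package.

```r
set.seed(1)
n_products <- 6   # rows (assumed here to be products; hypothetical size)
n_attr     <- 4   # attributes, the same for every subject
nblo       <- 5   # number of blocks (subjects)

# One binary block per subject: 1 = attribute checked, 0 = not checked
blocks <- lapply(seq_len(nblo), function(i) {
  matrix(rbinom(n_products * n_attr, size = 1, prob = 0.4),
         nrow = n_products, ncol = n_attr,
         dimnames = list(paste0("P", seq_len(n_products)),
                         paste0("A", seq_len(n_attr))))
})

# Merge the blocks horizontally: n_products rows, nblo * n_attr columns
Data <- do.call(cbind, blocks)

# Column indices of subject j in the merged table
block_cols <- function(j) ((j - 1) * n_attr + 1):(j * n_attr)

# An initial partition for clust: one label per block, values in 1..K (here K = 2)
init <- rep(1:2, length.out = nblo)
```

With this layout, Data[, block_cols(j)] gives back the block of subject j, and init has exactly nblo entries.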

Value

a list with:

group: the clustering partition. If rho>0, some subjects could be in the noise cluster ("K+1")
rho: the threshold for the noise cluster
homogeneity: percentage of homogeneity of the subjects in each cluster and the overall homogeneity
s_with_compromise: Similarity coefficient of each subject with its cluster compromise
weights: weight associated with each subject in its cluster
compromise: The compromise of each cluster
CA: The correspondence analysis results on each cluster compromise (coordinates, contributions...)
inertia: percentage of total variance explained by each axis of the CA for each cluster
s_all_cluster: the similarity coefficient between each subject and each cluster compromise
param: parameters called
criterion: the CLUSCATA criterion error
type: parameter passed to other functions
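
As a sketch of how the returned list might be inspected, the following uses the straw data from the example on this page and only the element names listed above. The call is guarded so that it is skipped when the ClustBlock package is not installed; nstart=10 is an arbitrary choice to keep the run short.

```r
if (requireNamespace("ClustBlock", quietly = TRUE)) {
  library(ClustBlock)
  data(straw)
  cl_km <- cluscata_kmeans(Data = straw[, 1:(16 * 40)], nblo = 40, clust = 3,
                           nstart = 10, Graph_groups = FALSE)

  # Overview of the elements of the returned list
  str(cl_km, max.level = 1)

  # Cluster membership of each subject
  print(cl_km$group)

  # Homogeneity of each cluster and overall homogeneity
  print(cl_km$homogeneity)

  # Value of the CLUSCATA criterion for the retained partition
  print(cl_km$criterion)
}
```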

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2019). A new approach for the analysis of data and the clustering of subjects in a CATA experiment. Food Quality and Preference, 72, 31-39.

Llobell, F., Giacalone, D., Labenne, A., & Qannari, E. M. (2019). Assessment of the agreement and cluster analysis of the respondents in a CATA experiment. Food Quality and Preference, 77, 184-190.

Examples

# NOT RUN {
data(straw)
cl_km=cluscata_kmeans(Data=straw[,1:(16*40)], nblo=40, clust=3)
plot(cl_km, Graph_groups=FALSE, Graph_weights = TRUE)
summary(cl_km)

# }
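
The two other ways of calling the function described in the arguments, starting from a user-supplied partition and setting up a noise cluster, could look like the sketch below. The initial partition init and the threshold rho=0.2 are arbitrary values chosen for illustration, not recommendations, and the code is skipped when ClustBlock is not installed.

```r
if (requireNamespace("ClustBlock", quietly = TRUE)) {
  library(ClustBlock)
  data(straw)
  X <- straw[, 1:(16 * 40)]   # 40 blocks (subjects) of 16 attributes

  # Start from a user-supplied partition: one label per subject, values in 1..3
  init <- rep(1:3, length.out = 40)   # arbitrary starting partition
  cl_init <- cluscata_kmeans(Data = X, nblo = 40, clust = init,
                             Graph_groups = FALSE)

  # Multi-start run with a noise cluster; rho = 0.2 is an arbitrary threshold
  cl_noise <- cluscata_kmeans(Data = X, nblo = 40, clust = 3, nstart = 10,
                              rho = 0.2, Graph_groups = FALSE)

  # With rho > 0, some subjects may be assigned to the noise cluster ("K+1")
  print(cl_noise$group)
}
```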