K-means algorithm for the clustering of variables. Directional or local groups may be defined. Each group of variables is associated with a latent component. External information collected on the observations or on the variables may also be introduced.
CLV_kmeans(
X,
Xu = NULL,
Xr = NULL,
method,
sX = TRUE,
sXr = FALSE,
sXu = FALSE,
clust,
iter.max = 20,
nstart = 100,
strategy = "none",
rho = 0.3
)
X: The matrix of the variables to be clustered.
Xu: The external variables associated with the columns of X.
Xr: The external variables associated with the rows of X.
method: The criterion to use in the cluster analysis. 1 or "directional": the squared covariance is used as a measure of proximity (directional groups). 2 or "local": the covariance is used as a measure of proximity (local groups).
sX: TRUE/FALSE: whether or not to standardize the columns of X (TRUE by default). Column-centering of X is predefined (cX = TRUE).
sXr: TRUE/FALSE: whether or not to standardize the columns of Xr (FALSE by default). Column-centering of Xr is predefined (cXr = TRUE).
sXu: TRUE/FALSE: whether or not to standardize the columns of Xu (FALSE by default). No centering is predefined (cXu = FALSE), Xu being considered as a weight matrix.
clust: Either a number, i.e. the size K of the partition, or a vector of integers, i.e. the group membership of each variable in the initial partition (integers between 1 and K).
iter.max: Maximal number of iterations for the consolidation (20 by default).
nstart: Number of random initializations in the case where clust is a number (100 by default).
strategy: "none" (by default), or "kplusone" (an additional cluster for the noise variables), or "sparselv" (zero loadings for the noise variables).
rho: A correlation threshold between 0 and 1 (0.3 by default).
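The two accepted forms of clust can be illustrated as follows. This is a minimal sketch that assumes the ClustVarLV package is installed and reuses the apples_sh data from the example on this page; the object names (res_random, init_partition, res_user) and the choice of three clusters are illustrative only.

```r
library(ClustVarLV)
data(apples_sh)
X <- apples_sh$senso   # variables to cluster (illustrative choice)

# (a) clust given as a number: K = 3 clusters, with 20 random initializations
res_random <- CLV_kmeans(X = X, method = "directional", sX = TRUE,
                         clust = 3, nstart = 20)

# (b) clust given as an initial partition: one integer in 1..K per column of X
set.seed(1)  # only to make the illustrative partition reproducible
init_partition <- sample(rep_len(1:3, ncol(X)))
res_user <- CLV_kmeans(X = X, method = "directional", sX = TRUE,
                       clust = init_partition)
```

In form (b) the algorithm starts from the supplied partition, so nstart plays no role.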
The value of the clustering criterion at convergence, the percentage of the initial criterion value that is explained, and the number of iterations in the partitioning algorithm.
The group membership of each variable
The latent components of the clusters
If external variables Xr or Xu are supplied: the loadings of the external variables
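To make the returned quantities more concrete, here is a sketch in base R of an alternating scheme for directional groups without external variables. It is not the package's implementation: the function names (latent_components, clv_directional_sketch), the returned element names and the simulated data are all hypothetical, and the sketch assumes that the latent component of a cluster is the standardized first principal component of its variables, as in the directional criterion of Vigneau and Qannari (2003).

```r
# Hypothetical helpers, not part of ClustVarLV.
# Latent component of each cluster: standardized first principal component.
latent_components <- function(X, groups, K) {
  sapply(seq_len(K), function(k) {
    Xk <- X[, groups == k, drop = FALSE]
    as.numeric(scale(Xk %*% svd(Xk)$v[, 1]))
  })
}

clv_directional_sketch <- function(X, init, iter.max = 20) {
  X <- scale(X)              # centre and standardize the columns
  K <- max(init)
  groups <- init
  for (it in seq_len(iter.max)) {
    comp <- latent_components(X, groups, K)
    prox <- cov(X, comp)^2   # squared covariance: variables x components
    new_groups <- max.col(prox, ties.method = "first")
    # Stop at convergence, or if a cluster would become empty
    if (all(new_groups == groups) || length(unique(new_groups)) < K) break
    groups <- new_groups
  }
  comp <- latent_components(X, groups, K)
  prox <- cov(X, comp)^2
  list(criterion = sum(prox[cbind(seq_along(groups), groups)]),
       clusters = groups, comp = comp, iterations = it)
}

# Simulated data: two blocks of four variables, each built around one factor
set.seed(42)
n <- 50
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- cbind(sapply(1:4, function(i) f1 + rnorm(n, sd = 0.3)),
           sapply(1:4, function(i) f2 + rnorm(n, sd = 0.3)))
init <- c(1, 1, 1, 2, 1, 2, 2, 2)   # starting partition with two misplaced variables
res <- clv_directional_sketch(X, init)
res$clusters
```

Each pass recomputes the components and then reassigns every variable to the component with which it has the largest squared covariance, which is why the sign of a variable's relationship with the component does not matter for directional groups.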
The initialization can be made at random, repeated several times (see nstart), or defined by the user (see clust).
The parameter "strategy" makes it possible to choose a strategy for setting aside variables that do not fit into the pattern of any cluster.
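The effect of the strategy argument can be explored by fitting the same data with and without it. The sketch below assumes the ClustVarLV package is installed and that its get_partition() accessor applies to the result of CLV_kmeans; the object names, the number of clusters and the use of apples_sh$senso are illustrative choices, not recommendations.

```r
library(ClustVarLV)
data(apples_sh)
X <- apples_sh$senso

# Baseline: every variable is assigned to one of the K = 3 clusters
res_none <- CLV_kmeans(X = X, method = "directional", sX = TRUE,
                       clust = 3, nstart = 20, strategy = "none")

# "kplusone": an additional cluster collects the noise variables,
# with rho as the correlation threshold
res_k1 <- CLV_kmeans(X = X, method = "directional", sX = TRUE,
                     clust = 3, nstart = 20,
                     strategy = "kplusone", rho = 0.3)

# Compare the group sizes under the two settings
table(get_partition(res_none))
table(get_partition(res_k1))
```

Whether any variable is actually set aside depends on the data and on rho, so the two tables may well coincide.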
Vigneau E., Qannari E.M. (2003). Clustering of variables around latent components. Comm. Stat., 32(4), 1131-1150.
Vigneau E., Chen M., Qannari E.M. (2015). ClustVarLV: An R Package for the clustering of Variables around Latent Variables. The R Journal, 7(2), 134-148.
Vigneau E., Chen M. (2016). Dimensionality reduction by clustering of variables while setting aside atypical variables. Electronic Journal of Applied Statistical Analysis, 9(1), 134-153.
CLV, LCLV
# NOT RUN {
data(apples_sh)
# Local groups with external variables Xr
resclvkmYX <- CLV_kmeans(X = apples_sh$pref, Xr = apples_sh$senso, method = "local",
                         sX = FALSE, sXr = TRUE, clust = 2, nstart = 20)
# }