Finding pivotal units from a data partition and a co-association matrix C according to three different methods.
piv_sel(C, clusters)
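pivots: A matrix with one row per group and three columns, holding the indices of the pivotal units selected by each method.
C: An N × N co-association matrix, whose entries give the proportion of partitions in which each pair of units falls in the same cluster.
clusters: A vector of integers from 1 to k assigning each of the N units to one of k groups.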
Leonardo Egidi legidi@units.it
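Given a set of N observations $y_1, \ldots, y_N$, suppose a clustering method is run H times, yielding H partitions into k groups. C is the co-association matrix, with entries $c_{ip} = n_{ip}/H$, where $n_{ip}$ is the number of partitions in which units i and p are assigned to the same cluster.
Let $J_j$ denote the set of units in group j. For each group, the pivot may be chosen as the unit $i^{*} \in J_j$ that maximizes
$$\sum_{p \in J_j} c_{ip}$$
or
$$\sum_{p \in J_j} c_{ip} - \sum_{p \notin J_j} c_{ip}.$$
These methods give the unit that maximizes the global within similarity ("maxsumint") and the unit that maximizes the difference between global within and between similarities ("maxsumdiff"), respectively.
Alternatively, we may choose the unit $i^{*} \in J_j$ that minimizes the global between similarity $\sum_{p \notin J_j} c_{ip}$ ("minsumnoint").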
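The three criteria can be sketched by hand in base R. This is a minimal illustration, not the package's implementation: the helper `pivot_criteria_sketch`, its column names and column order, and the choice to zero the diagonal (so that a unit's co-association with itself is ignored, and an `NA` diagonal is tolerated) are all assumptions made for this example.

```r
# Hypothetical helper (not part of pivmet): hand-computes the three criteria.
pivot_criteria_sketch <- function(C, clusters) {
  diag(C) <- 0  # assumption: ignore each unit's co-association with itself
  groups <- sort(unique(clusters))
  out <- matrix(NA_integer_, nrow = length(groups), ncol = 3,
                dimnames = list(NULL, c("maxsumint", "minsumnoint", "maxsumdiff")))
  for (g in seq_along(groups)) {
    inside  <- which(clusters == groups[g])
    # Within-group and between-group similarity sums for each unit of the group
    within  <- rowSums(C[inside, inside, drop = FALSE])
    between <- rowSums(C[inside, -inside, drop = FALSE])
    out[g, ] <- inside[c(which.max(within),
                         which.min(between),
                         which.max(within - between))]
  }
  out
}

# Toy co-association matrix for 5 units split into two groups
C_toy <- matrix(c(
  1.00, 0.90, 0.80, 0.40, 0.30,
  0.90, 1.00, 0.60, 0.10, 0.00,
  0.80, 0.60, 1.00, 0.05, 0.00,
  0.40, 0.10, 0.05, 1.00, 0.70,
  0.30, 0.00, 0.00, 0.70, 1.00), nrow = 5, byrow = TRUE)
toy_pivots <- pivot_criteria_sketch(C_toy, clusters = c(1, 1, 1, 2, 2))
print(toy_pivots)
```

In this toy case the three criteria pick three different units in the first group, which shows that they need not agree. Ties are broken by taking the first unit, as `which.max` and `which.min` do.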
See the vignette for further details.
Egidi, L., Pappadà, R., Pauli, F. and Torelli, N. (2018). Relabelling in Bayesian Mixture Models by Pivotal Units. Statistics and Computing, 28(4), 957-969.
library(pivmet)
# Iris data
data(iris)
# Select the four measurement columns
x <- iris[, 1:4]
N <- nrow(x)
H <- 1000
a <- matrix(NA, H, N)
# Perform H k-means partitions
for (h in 1:H) {
  a[h, ] <- kmeans(x, centers = 3)$cluster
}
# Build the co-association matrix (the diagonal is left as NA)
C <- matrix(NA, N, N)
for (i in 1:(N - 1)) {
  for (j in (i + 1):N) {
    C[i, j] <- sum(a[, i] == a[, j]) / H
    C[j, i] <- C[i, j]
  }
}
# Reference partition whose groups the pivots are selected from
km <- kmeans(x, centers = 3)
# Apply three pivotal criteria to the co-association matrix
ris <- piv_sel(C, clusters = km$cluster)
graphics::plot(iris[, 1], iris[, 2], xlab = "Sepal.Length", ylab = "Sepal.Width",
               col = km$cluster)
# Add the pivots chosen by the maxsumdiff criterion
points(x[ris$pivots[, 3], 1:2], col = 1:3,
       cex = 2, pch = 8)
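The double loop that builds the co-association matrix can also be written with `outer()`, accumulating one partition at a time. The helper name `coassoc_sketch` is hypothetical and the code is only a sketch. Unlike the loop in the example, it fills the diagonal with 1, since every unit always shares a cluster with itself.

```r
# Hypothetical helper (not part of pivmet): co-association matrix from an
# H x N matrix of cluster labels, one partition per row.
coassoc_sketch <- function(a) {
  H <- nrow(a)
  N <- ncol(a)
  C <- matrix(0, N, N)
  for (h in seq_len(H)) {
    # outer(..., "==") is TRUE where two units share a label in partition h
    C <- C + outer(a[h, ], a[h, ], "==")
  }
  C / H
}

# Three toy partitions of 3 units; the second is the first with labels swapped
a_toy <- rbind(c(1, 1, 2),
               c(2, 2, 1),
               c(1, 2, 2))
C_small <- coassoc_sketch(a_toy)
print(C_small)
```

Because only equality of labels within a partition is compared, the result does not depend on how each partition happens to number its clusters.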