clustering: Posterior similarity matrix and partition estimation

Description

The function computes the posterior similarity (coclustering) matrix (psm) and estimates a representative partition of the observations from the MCMC output. The user can either provide the desired number of clusters or estimate an optimal clustering solution by minimizing a loss function over the space of partitions. In the latter case, the function relies on the package salso (Dahl et al., 2021), which the user needs to load.
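As a rough illustration of what a posterior similarity matrix is, the base-R sketch below counts how often each pair of observations shares a cluster label across sampled partitions. The `labels` matrix is a made-up toy input, not the format of the Hidalgo output, and the loop is not the package's own implementation.

```r
# Hypothetical MCMC output: each row is one sampled partition of 6 observations
labels <- rbind(
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 2, 2, 2, 2),
  c(1, 1, 1, 1, 2, 2),
  c(2, 2, 2, 1, 1, 1)
)

n   <- ncol(labels)
psm <- matrix(0, n, n)

# For each sampled partition, add 1 to every pair that shares a label
for (s in seq_len(nrow(labels))) {
  psm <- psm + outer(labels[s, ], labels[s, ], "==")
}

# Average over samples: entry (i, j) estimates P(i and j are co-clustered)
psm <- psm / nrow(labels)
```

Only co-membership matters here, so the fourth row, which swaps the label names, still counts as agreeing with the first.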

Usage

clustering(
  object,
  clustering_method = c("dendrogram", "salso"),
  K = 2,
  nCores = 1,
  ...
)
# S3 method for hidalgo_psm
print(x, ...)
# S3 method for hidalgo_psm
plot(x, ...)

Value

A list containing the posterior similarity matrix (psm) and the estimated partition (clust).

Arguments

object

object of class Hidalgo, the output of the Hidalgo function.

clustering_method

character indicating the method to use to perform clustering. It can be

"dendrogram"

thresholding the adjacency dendrogram with a given number (K);

"salso"

estimation via minimization of several partition estimation criteria. The default loss function is the variation of information.

K

number of clusters to recover by thresholding the dendrogram obtained from the psm.
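The dendrogram option can be pictured with base R alone. The sketch below assumes hierarchical clustering on the dissimilarity 1 - psm with average linkage; the linkage actually used by clustering() is not stated here, and the toy `psm` is invented for illustration.

```r
# Toy posterior similarity matrix with two well-separated blocks (hypothetical)
psm <- matrix(0.1, 6, 6)
psm[1:3, 1:3] <- 0.9
psm[4:6, 4:6] <- 0.9
diag(psm) <- 1

# Turn similarities into dissimilarities and build a dendrogram
# (average linkage is an assumption made for this sketch)
tree <- hclust(as.dist(1 - psm), method = "average")

# Cut the dendrogram so that exactly K = 2 clusters are returned
K     <- 2
clust <- cutree(tree, k = K)
```

With a fixed K the cut always returns K groups, whether or not the psm supports that many.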

nCores

parameter for the salso function: the number of CPU cores to use. A value of zero uses all cores on the system.
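For orientation, the following sketch calls salso directly on a toy matrix of sampled partitions, using the variation of information loss and a single core. The `labels` matrix is hypothetical, and how clustering() passes the Hidalgo output to salso internally is not shown here; the call is guarded so that it only runs when salso is installed.

```r
# Hypothetical sampled partitions: one row per MCMC iteration
labels <- rbind(
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 2, 2, 2, 2),
  c(1, 1, 1, 1, 2, 2),
  c(2, 2, 2, 1, 1, 1)
)

est <- NULL
if (requireNamespace("salso", quietly = TRUE)) {
  # Search for a partition minimizing the expected variation of information;
  # nCores = 1 keeps the search on a single CPU core
  est <- salso::salso(labels, loss = salso::VI(), nCores = 1)
}
```

Unlike the dendrogram option, this approach does not take the number of clusters as an input.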

...

ignored.

x

object of class hidalgo_psm, obtained from the function clustering().

References

D. B. Dahl, D. J. Johnson, and P. Müller (2022), "Search Algorithms and Loss Functions for Bayesian Clustering", Journal of Computational and Graphical Statistics, doi:10.1080/10618600.2022.2069779.

David B. Dahl, Devin J. Johnson and Peter Müller (2022). "salso: Search Algorithms and Loss Functions for Bayesian Clustering". R package version 0.3.0. https://CRAN.R-project.org/package=salso

Examples

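library(salso)
# Simulate 500 observations in 5 dimensions, then set the first two
# coordinates of the first 250 rows to zero
X            <- replicate(5, rnorm(500))
X[1:250,1:2] <- 0
# Fit the Hidalgo model and estimate a partition with the default settings
h_out        <- Hidalgo(X)
clustering(h_out)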