Clustering of the draws in the point process representation (PPR) using \(k\)-means clustering.
identifyLCAMixture(Func, Mu, Phi, Eta, S, centers)
A named list containing:
"S": reordered assignments.
"Mu": reordered Mu matrix.
"Phi": reordered Phi matrix.
"Eta": reordered weights.
"non_perm_rate": proportion of draws where the clustering did not
result in a permutation and hence no relabeling could be
performed; this is the proportion of draws discarded.
Func: A numeric array of dimension \(M \times d \times K\); data for clustering in the PPR.
Mu: A numeric array of dimension \(M \times r \times K\); draws of cluster means.
Phi: A numeric array of dimension \(M \times K \times r\); draws of precisions.
Eta: A numeric array of dimension \(M \times K\); draws of cluster sizes.
S: A numeric matrix of dimension \(M \times N\); draws of cluster assignments.
centers: An integer or a numeric matrix of dimension \(K \times d\); used to initialize stats::kmeans().
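The two accepted forms of centers correspond to the two ways stats::kmeans() can be initialized: a number of clusters (random starting centers) or a matrix of starting centers with one row per cluster. The following is a minimal sketch on simulated data, not a call to identifyLCAMixture(); the names x and init and all numeric values are made up for illustration.

```r
set.seed(123)
K <- 3; d <- 2

# Hypothetical starting centers, one row per cluster (K x d)
init <- rbind(c(-5, -5), c(0, 5), c(5, -5))

# Toy data: 20 points scattered tightly around each row of init
x <- init[rep(1:K, each = 20), ] + matrix(rnorm(60 * d, sd = 0.3), ncol = d)

# Form 1: an integer; kmeans() picks K random rows of x as starting centers
km_int <- stats::kmeans(x, centers = K, nstart = 10)

# Form 2: a K x d matrix; cluster i starts at row i of init
km_mat <- stats::kmeans(x, centers = init)
```

With a matrix of starting centers the numbering of the resulting clusters follows the row order of the matrix, which is convenient when a particular ordering of the identified components is wanted.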
The following steps are implemented:
A functional of the draws of the component-specific
parameters (Func) is passed to the function. The functionals
of all components and iterations are stacked on top of each other to
obtain a matrix with \(d\) columns in which each row is the
functional of one component in one MCMC iteration.
The functionals are clustered into \(K_+\) clusters using \(k\)-means clustering. For each functional a group label is obtained.
The obtained labels of the functionals are used to construct a classification for each MCMC iteration. Those classifications which are a permutation of \((1,\ldots,K_+)\) are used to reorder the Mu, Phi and Eta draws and the assignment matrix S. This results in an identified mixture model.
Note that only iterations resulting in permutations are used for parameter estimation and deriving the final partition. Those MCMC iterations where the obtained classifications of the functionals are not a permutation of \((1,\ldots,K_+)\) are discarded as no unique assignment of functionals to components can be made. If the non-permutation rate, i.e. the proportion of MCMC iterations where the obtained classifications of the functionals are not a permutation, is high, this is an indication of a poor clustering solution, as the functionals are not clearly separated.
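The steps above can be sketched in base R as follows. This is an illustration under simplifying assumptions, not the package's implementation: the data are simulated with \(K_+ = K = 3\) well-separated groups, only Eta and S are relabeled (Mu and Phi would be permuted along their component dimension in the same way), and all object names other than the documented arguments are hypothetical.

```r
set.seed(42)
M <- 50; d <- 2; K <- 3; N <- 6

# Simulated draws with label switching: in each iteration the K components
# are a random permutation of three well-separated groups.
group_centers <- rbind(c(-5, -5), c(0, 5), c(5, -5))   # K x d
truth_eta <- c(0.5, 0.3, 0.2)
truth_S   <- c(1L, 1L, 2L, 2L, 3L, 3L)
Func <- array(NA_real_, dim = c(M, d, K))
Eta  <- matrix(NA_real_, M, K)
S    <- matrix(NA_integer_, M, N)
for (m in seq_len(M)) {
  perm <- sample(K)                 # component k plays the role of group perm[k]
  for (k in seq_len(K)) {
    Func[m, , k] <- group_centers[perm[k], ] + rnorm(d, sd = 0.3)
  }
  Eta[m, ] <- truth_eta[perm]
  S[m, ]   <- match(truth_S, perm)  # component that holds each observation's group
}
# Make iteration 1 ambiguous: two components carry the same functional
Func[1, , 2] <- Func[1, , 1]

# Step 1: stack the functionals; row (k - 1) * M + m is component k, iteration m
stacked <- matrix(aperm(Func, c(1, 3, 2)), nrow = M * K, ncol = d)

# Step 2: cluster the stacked functionals with k-means
km <- stats::kmeans(stacked, centers = group_centers)

# Step 3: one classification per iteration; labels[m, k] is the group of component k
labels  <- matrix(km$cluster, nrow = M, ncol = K)
is_perm <- apply(labels, 1, function(z) all(sort(z) == seq_len(K)))
non_perm_rate <- mean(!is_perm)     # proportion of discarded iterations

# Relabel the retained iterations
keep   <- which(is_perm)
Eta_id <- Eta[keep, , drop = FALSE]
S_id   <- S[keep, , drop = FALSE]
for (i in seq_along(keep)) {
  rho <- labels[keep[i], ]          # rho[k] is the new label of component k
  Eta_id[i, rho] <- Eta[keep[i], ]
  S_id[i, ]      <- rho[S[keep[i], ]]
}
```

In this toy setting only the deliberately ambiguous first iteration is discarded, and after relabeling every retained row of Eta_id and S_id agrees with truth_eta and truth_S, which is what an identified model should look like.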