This function implements the sequentially-allocated latent structure optimization (SALSO) to find a clustering that minimizes a chosen loss function. The SALSO method was presented at the workshop "Bayesian Nonparametric Inference: Dependence Structures and their Applications" in Oaxaca, Mexico on December 6, 2017.
salso(expectedPairwiseAllocationMatrix, loss = c("squaredError",
"absoluteError", "binder", "lowerBoundVariationOfInformation")[1],
nCandidates = 100, budgetInSeconds = 10, maxSize = 0,
maxScans = 10, multicore = TRUE)
An n-by-n symmetric matrix whose (i,j) element gives the estimated expected number of times that items i and j are in the same subset (i.e., cluster).
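As a rough illustration of what such a matrix contains, the base R sketch below estimates it from a small matrix of sampled clusterings, one clustering per row. The helper name `pairwiseMatrix` and the toy `draws` are assumptions made for this illustration and are not part of the package.

```r
# Hypothetical helper: proportion of sampled clusterings in which
# items i and j share a cluster label.
pairwiseMatrix <- function(draws) {
  n <- ncol(draws)
  total <- matrix(0, n, n)
  for (s in seq_len(nrow(draws))) {
    # outer(..., "==") is TRUE where two items share a label in draw s
    total <- total + outer(draws[s, ], draws[s, ], "==")
  }
  total / nrow(draws)
}

# Toy data: 4 sampled clusterings of 3 items (rows are samples).
draws <- rbind(c(1, 1, 2),
               c(1, 1, 1),
               c(1, 2, 2),
               c(1, 1, 2))
p <- pairwiseMatrix(draws)
print(p)
```

The result is symmetric with ones on the diagonal, since every item always shares a cluster with itself.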
One of "squaredError", "absoluteError", "binder", or "lowerBoundVariationOfInformation" to indicate that the optimization should seek to minimize squared error loss, absolute error loss, Binder loss (Binder 1978), or the lower bound of the variation of information loss (Wade & Ghahramani 2017), respectively. The first three are equivalent.
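One way to see the equivalence is to note that, for a 0/1 co-clustering indicator, the pairwise absolute and squared errors differ by a constant that does not depend on the candidate clustering. With equal costs for the two kinds of pairwise error, the expected Binder loss reduces to the absolute error sum. The base R sketch below checks this on a toy matrix. The helper `pairLoss` and the values in `p` are illustrative assumptions, not package code.

```r
# Pairwise matrix for 3 items (illustrative values, not from real data).
p <- matrix(c(1.00, 0.75, 0.25,
              0.75, 1.00, 0.50,
              0.25, 0.50, 1.00), nrow = 3, byrow = TRUE)

# Hypothetical helper: loss of a candidate clustering, summed over pairs i < j.
pairLoss <- function(labels, p, power) {
  together <- outer(labels, labels, "==") * 1  # 0/1 co-clustering indicator
  upper <- upper.tri(p)
  sum(abs(together[upper] - p[upper])^power)
}

# All five partitions of three items.
candidates <- list(c(1, 1, 1), c(1, 1, 2), c(1, 2, 1), c(1, 2, 2), c(1, 2, 3))
absolute <- sapply(candidates, pairLoss, p = p, power = 1)
squared  <- sapply(candidates, pairLoss, p = p, power = 2)

# The two losses differ by the same constant for every candidate,
# so they are minimized by the same clustering.
print(absolute - squared)
print(which.min(absolute) == which.min(squared))
```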
The (maximum) number of candidates to consider. Fewer than nCandidates may be considered if the time in budgetInSeconds is exceeded. The computational cost is linear in the number of candidates, and there are rapidly diminishing returns to more candidates.
The (maximum) number of seconds to devote to the optimization. When this time is exceeded, no more candidates are considered.
Either zero or a positive integer. If a positive integer, the optimization is constrained to produce solutions whose number of clusters is no more than the supplied value. If zero, the size is not constrained.
The maximum number of reallocation scans after the initial allocation. The actual number of scans may be less than maxScans since the algorithm stops if the result does not change between scans.
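The stopping rule can be pictured with a simplified greedy sweep in base R. This is only a sketch of the idea of scanning until nothing changes, not the package's SALSO implementation. The helpers `binderLoss` and `reallocate`, the choice of loss, and the toy matrix are all assumptions made for illustration.

```r
# Hypothetical helper: sum over pairs i < j of |1(same cluster) - p_ij|.
binderLoss <- function(labels, p) {
  together <- outer(labels, labels, "==") * 1
  upper <- upper.tri(p)
  sum(abs(together[upper] - p[upper]))
}

# Simplified sweep (an assumption, not the package's code): move each item
# to the label (existing or new) with the lowest loss, and repeat until a
# full scan changes nothing or maxScans is reached.
reallocate <- function(labels, p, maxScans = 10) {
  scans <- 0
  repeat {
    if (scans >= maxScans) break
    scans <- scans + 1
    before <- labels
    for (i in seq_along(labels)) {
      options <- c(unique(labels), max(labels) + 1)
      losses <- sapply(options, function(k) {
        trial <- labels
        trial[i] <- k
        binderLoss(trial, p)
      })
      labels[i] <- options[which.min(losses)]
    }
    if (identical(labels, before)) break
  }
  list(labels = labels, scans = scans)
}

# Toy matrix: items 1 and 2 usually cluster together, item 3 rarely joins.
p <- matrix(c(1.0, 0.9, 0.1,
              0.9, 1.0, 0.2,
              0.1, 0.2, 1.0), nrow = 3, byrow = TRUE)
result <- reallocate(c(1, 2, 3), p)
print(result)
```

On this toy input the sweep groups items 1 and 2, leaves item 3 on its own, and stops after the second scan because that scan changes nothing, well short of the `maxScans` limit.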
Logical indicating whether computations should take advantage of multiple CPU cores.
A clustering (as a vector of cluster labels).
Wade, S. and Ghahramani, Z. (2017). Bayesian cluster analysis: Point estimation and credible balls. Bayesian Analysis.
Binder, D. (1978). Bayesian Cluster Analysis. Biometrika, 65: 31–38.
# NOT RUN {
suppressWarnings({ # For testing purposes, suppress deprecation warning.
probabilities <- expectedPairwiseAllocationMatrix(iris.clusterings)
salso(probabilities)
})
# }