Among the supplied latent structures (clusterings or feature allocations), this function selects the structure that minimizes one of several loss functions.
dlso(x, loss = c("squaredError", "absoluteError", "binder",
"lowerBoundVariationOfInformation")[1], maxSize = 0)
x: A collection of clusterings or feature allocations. If x is a B-by-n matrix, each of the B rows represents a clustering of n items using cluster labels; for clustering b, items i and j are in the same cluster if and only if x[b,i] == x[b,j]. If x is a list of length B, each element of the list represents a feature allocation as a binary matrix with n rows and an arbitrary number of columns; for feature allocation b, items i and j share m features if, for k = 1, 2, ..., the expression x[[b]][i,k] == x[[b]][j,k] == 1 is true exactly m times.
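These two representations can be sketched in plain R. The objects below are hypothetical illustrations, not data shipped with the package:

```r
# Illustrative only: a hypothetical 3-by-4 clustering matrix,
# i.e. B = 3 clusterings of n = 4 items.
x <- rbind(c(1, 1, 2, 2),
           c(1, 2, 2, 2),
           c(1, 1, 1, 2))
x[1, 1] == x[1, 2]  # TRUE: in clustering 1, items 1 and 2 share a cluster
x[1, 2] == x[1, 3]  # FALSE: items 2 and 3 do not

# Illustrative only: a hypothetical feature allocation for n = 3 items
# with 2 features, as one element of the list form of x.
fa <- matrix(c(1, 1, 0,   # feature 1 is possessed by items 1 and 2
               1, 0, 1),  # feature 2 is possessed by items 1 and 3
             nrow = 3)
sum(fa[1, ] == 1 & fa[2, ] == 1)  # items 1 and 2 share m = 1 feature
```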
loss: One of "squaredError", "absoluteError", "binder", or "lowerBoundVariationOfInformation", indicating that the optimization should seek to minimize squared error loss, absolute error loss, Binder loss (Binder 1978), or the lower bound of the variation of information loss (Wade & Ghahramani 2017), respectively. For clusterings, the first three are equivalent. For feature allocations, only the first two are valid.
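As a rough sketch of what minimizing squared-error loss over supplied clusterings amounts to, under the usual association-matrix interpretation of Binder-type losses (this base-R illustration is an assumption about the criterion, not the package's implementation): each candidate clustering induces a binary co-clustering matrix, and the selected candidate is the one whose matrix is closest in squared error to the matrix averaged over all candidates.

```r
# A base-R sketch, assuming squared-error loss compares binary
# co-clustering matrices to their average (NOT the package's code).
x <- rbind(c(1, 1, 2),   # four candidate clusterings of n = 3 items
           c(1, 1, 2),
           c(1, 2, 2),
           c(1, 1, 1))
coclust <- function(labels) outer(labels, labels, "==") * 1  # 1 if items co-clustered
mats <- lapply(seq_len(nrow(x)), function(b) coclust(x[b, ]))
avg <- Reduce(`+`, mats) / length(mats)                      # mean co-clustering matrix
losses <- sapply(mats, function(m) sum((m - avg)^2))         # squared-error loss per candidate
best <- which.min(losses)
x[best, ]  # c(1, 1, 2): the candidate closest to the average association matrix
```

Because the repeated clustering c(1, 1, 2) dominates the average association matrix here, it is the one selected.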
maxSize: Either zero or a positive integer. If a positive integer, the optimization is constrained to produce solutions whose number of clusters or number of features is no more than the supplied value. If zero, the size is not constrained.
Value: A clustering (as a vector of cluster labels) or a feature allocation (as a binary matrix of feature indicators).
References:
Wade, S. and Ghahramani, Z. (2017). Bayesian cluster analysis: Point estimation and credible balls. Bayesian Analysis.
Binder, D. (1978). Bayesian cluster analysis. Biometrika, 65: 31–38.
Examples:
## Not run:
dlso(iris.clusterings)
dlso(USArrests.featureAllocations)
## End(Not run)