cl_boot: Bootstrap Resampling of Clustering Algorithms

Description

Generate bootstrap replicates of the results of applying a base clustering algorithm to a given data set.

Usage

cl_boot(x, B, k = NULL,
        algorithm = if (is.null(k)) "hclust" else "kmeans", 
        parameters = list(), resample = FALSE)

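Arguments

x
the data set of objects to be clustered, in a form accepted by the base clustering algorithm.

B
an integer giving the number of bootstrap replicates.

k
NULL (default), or an integer giving the number of classes for a partitioning base algorithm.

algorithm
the base clustering algorithm; defaults to "hclust" if k is NULL and to "kmeans" otherwise.

parameters
a list of additional arguments passed to the base algorithm.

resample
a logical indicating whether the base algorithm is run on bootstrap samples of the data (default: FALSE).

Value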

A cluster ensemble of length B. If resampling is not used (the default), its elements are the results of running the base algorithm on the given data set. If resampling is used, its elements are the memberships for the given data, predicted from the results of running the base algorithm on bootstrap samples of the data.
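
The two modes described above can be sketched as follows. This is a minimal illustration, assuming the clue package is installed and using its Cassini data set; the values B = 5 and k = 3 and the seed are arbitrary choices, not recommendations.

```r
library("clue")
data("Cassini")
set.seed(123)  # arbitrary seed, only for reproducibility

# Default (resample = FALSE): run kmeans 5 times on the full data set.
ens_init <- cl_boot(Cassini$x, B = 5, k = 3)

# resample = TRUE: run kmeans on bootstrap samples of the data, then
# predict memberships for the original data from each fit.
ens_boot <- cl_boot(Cassini$x, B = 5, k = 3, resample = TRUE)

# Both ensembles have length B.
length(ens_init)
length(ens_boot)
```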

Details

This is a rather simple-minded function with limited applicability. It is mostly useful for studying the effect of (uncontrolled) random initializations of fixed-point partitioning algorithms such as kmeans or cmeans; see the examples. To study the effect of varying control parameters or of explicitly provided random starting values, the respective cluster ensemble has to be generated explicitly. This is most conveniently done by using replicate to create a list lst of suitable instances of clusterings obtained by the base algorithm, and then using cl_ensemble(list = lst) to create the ensemble.
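
A minimal sketch of building such an ensemble explicitly, assuming the clue package and its Cassini data set. The number of replicates (10), the number of classes (3), and the way starting centers are drawn are illustrative assumptions, not part of the original documentation.

```r
library("clue")
data("Cassini")
set.seed(42)  # arbitrary seed, only for reproducibility

x <- Cassini$x

# Draw explicit random starting centers for each run and pass them to
# kmeans; simplify = FALSE keeps the results as a plain list.
lst <- replicate(10, {
    start <- x[sample(nrow(x), 3), , drop = FALSE]
    kmeans(x, centers = start)
}, simplify = FALSE)

# Turn the list of clusterings into a cluster ensemble.
ens <- cl_ensemble(list = lst)
length(ens)
```

The resulting ensemble can then be passed to functions such as cl_dissimilarity in the same way as the result of cl_boot.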

Examples

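## Study e.g. the effect of random kmeans() initializations.
library("clue")   # provides cl_boot(), cl_dissimilarity() and the Cassini data
data("Cassini")
## 15 replicates of kmeans() with k = 3 classes on the full data set.
pens <- cl_boot(Cassini$x, B = 15, k = 3)
## Pairwise dissimilarities between the 15 partitions.
diss <- cl_dissimilarity(pens)
summary(c(diss))
## Cluster the partitions themselves to see which runs agree.
plot(hclust(diss))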