clue (version 0.3-55)

hierarchy: Hierarchies

Description

Determine whether an R object represents a hierarchy of objects, or coerce to an R object representing such.

Usage

is.cl_hierarchy(x)
is.cl_dendrogram(x)

as.cl_hierarchy(x)
as.cl_dendrogram(x)

Arguments

x

an R object.

Value

For the testing functions, a logical indicating whether the given object represents a hierarchy of objects of the respective kind.

For the coercion functions, a container object inheriting from "cl_hierarchy", with a suitable representation of the hierarchy given by x.

Details

These functions are generic functions.

The methods provided in package clue handle the partitions and hierarchies obtained from clustering functions in the base R distribution, as well as packages RWeka, ape, cba, cclust, cluster, e1071, flexclust, flexmix, kernlab, mclust, movMF and skmeans (and of course, clue itself).

The hierarchies considered by clue are \(n\)-trees (hierarchies in the strict sense) and dendrograms (also known as valued \(n\)-trees or total indexed hierarchies), which are represented by the virtual classes "cl_hierarchy" and "cl_dendrogram" (which inherits from the former), respectively.

\(n\)-trees on a set \(X\) of objects correspond to collections \(H\) of subsets of \(X\), usually called classes of the hierarchy, which satisfy the following properties:

  • \(H\) contains all singletons with objects of \(X\), \(X\) itself, but not the empty set;

  • The intersection of two sets \(A\) and \(B\) in \(H\) is either empty or one of the sets.
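The two properties above can be checked directly on a small collection of subsets. The following is a minimal base R sketch; is_n_tree is a hypothetical helper written for illustration and is not part of clue, and classes are assumed to be given as vectors of object labels.

```r
# Hypothetical helper (not part of clue): test the two n-tree properties
# for a list H of subsets of X, each subset given as a vector of labels.
is_n_tree <- function(H, X) {
  H <- lapply(H, unique)
  # TRUE if the set s occurs among the classes in H.
  has <- function(s) any(vapply(H, function(h) setequal(h, s), logical(1)))
  # Property 1: all singletons and X itself are classes, the empty set is not.
  p1 <- all(unlist(H) %in% X) &&
    all(vapply(X, has, logical(1))) &&
    has(X) &&
    all(lengths(H) > 0)
  # Property 2: two classes are either disjoint or one contains the other,
  # i.e. their intersection is empty or equal to one of them.
  p2 <- TRUE
  for (A in H) for (B in H) {
    I <- intersect(A, B)
    if (length(I) > 0 && !setequal(I, A) && !setequal(I, B)) p2 <- FALSE
  }
  p1 && p2
}

X <- c("a", "b", "c", "d")
H_ok  <- list("a", "b", "c", "d", c("a", "b"), c("a", "b", "c"), X)
H_bad <- c(H_ok, list(c("b", "c")))  # {b, c} overlaps {a, b} without nesting

is_n_tree(H_ok, X)
is_n_tree(H_bad, X)
```

The second collection fails because the classes {a, b} and {b, c} intersect in {b}, which is neither of the two sets.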

The classes of a hierarchy can be obtained by cl_classes.

Dendrograms are \(n\)-trees where additionally a height \(h\) is associated with each of the classes, so that for two classes \(A\) and \(B\) with non-empty intersection we have \(h(A) \le h(B)\) iff \(A\) is a subset of \(B\). For each pair of objects one can then define \(u_{ij}\) as the height of the smallest class containing both \(i\) and \(j\): this results in a dissimilarity on \(X\) which satisfies the ultrametric (3-point) conditions \(u_{ij} \le \max(u_{ik}, u_{jk})\) for all triples \((i, j, k)\) of objects. Conversely, an ultrametric dissimilarity induces a unique dendrogram.
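As a rough illustration of the 3-point condition, the sketch below uses only base R: cophenetic() from package stats gives, for an hclust tree, the height at which two objects are first merged, which plays the role of \(u_{ij}\) here. The helper is_ultrametric and its tolerance argument are assumptions made for this example and are not part of clue.

```r
# Hypothetical helper (not part of clue): check u_ij <= max(u_ik, u_jk)
# for all triples (i, j, k) of a symmetric dissimilarity matrix u.
is_ultrametric <- function(u, tol = 1e-12) {
  u <- as.matrix(u)
  ok <- TRUE
  for (k in seq_len(nrow(u))) {
    # bound[i, j] = max(u[i, k], u[j, k])
    bound <- outer(u[, k], u[, k], pmax)
    ok <- ok && all(u <= bound + tol)
  }
  ok
}

hcl <- hclust(dist(USArrests))      # complete linkage, the hclust default
u <- cophenetic(hcl)                # merge heights as pairwise dissimilarities
is_ultrametric(u)

# A generic Euclidean distance usually violates the condition:
is_ultrametric(dist(c(0, 1, 3)))
```

In the second call the distance between the points 0 and 3 is 3, which exceeds the larger of their distances to the point 1 (namely 2), so the condition fails.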

The ultrametric dissimilarities of a dendrogram can be obtained by cl_ultrametric.

as.cl_hierarchy returns an object of class "cl_hierarchy" “containing” the given object x if it already represents a hierarchy (i.e., is.cl_hierarchy(x) is true); otherwise, the returned object contains the ultrametric obtained from x via as.cl_ultrametric.

as.cl_dendrogram returns an object which has class "cl_dendrogram" and inherits from "cl_hierarchy", and contains x if it represents a dendrogram (i.e., is.cl_dendrogram(x) is true), or the ultrametric obtained from x.
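The coercion functions, together with cl_classes and cl_ultrametric mentioned above, might be used as in the following sketch. It assumes that package clue is installed; the requireNamespace guard is only there so that the snippet does nothing instead of failing when it is not. The variable names are illustrative.

```r
# Assumption: package clue is installed; otherwise the body is skipped.
if (requireNamespace("clue", quietly = TRUE)) {
  hcl <- hclust(dist(USArrests))

  # Wrap the hclust object in clue's container classes.
  h <- clue::as.cl_hierarchy(hcl)
  d <- clue::as.cl_dendrogram(hcl)
  print(class(h))
  print(class(d))

  # A dendrogram object also inherits from "cl_hierarchy".
  print(inherits(d, "cl_hierarchy"))

  # Classes of the hierarchy and ultrametric dissimilarities of the dendrogram.
  classes <- clue::cl_classes(hcl)
  u <- clue::cl_ultrametric(hcl)
}
```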

Conceptually, hierarchies and dendrograms are virtual classes, allowing for a variety of representations.

There are group methods for comparing dendrograms and computing their minimum, maximum, and range based on the meet and join operations; see cl_meet. There is also a plot method.

Examples

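library(clue)  # provides is.cl_dendrogram() and is.cl_hierarchy()

# Hierarchical clustering from base R (stats::hclust).
hcl <- hclust(dist(USArrests))
# Test whether the result represents a dendrogram and a hierarchy.
is.cl_dendrogram(hcl)
is.cl_hierarchy(hcl)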
