protoclust (version 1.6.3)

protocut: Cut a Minimax Linkage Tree To Get a Clustering

Description

Cuts a minimax linkage tree to obtain one of the n - 1 clusterings it encodes. Works like cutree, except that it also returns the prototypes of the resulting clusters.

Usage

protocut(hc, k = NULL, h = NULL)

Arguments

hc

an object returned by protoclust

k

the number of clusters desired

h

the height at which to cut the tree

Value

A list describing the clustering obtained by cutting the tree:

cl

vector of cluster memberships

protos

vector of prototype indices for the k clusters created; protos[i] is the index of the prototype for all elements with cl == i.

imerge

vector describing the nodes where prototypes occur. We use the naming convention of the merge matrix in hclust: if imerge[i] is positive, it is the interior node (counting from the bottom) of the cluster with elements which(cl==i); if imerge[i] is negative, then this is a singleton cluster with a leaf as prototype.
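
A brief sketch of how the three returned components might be used together. The toy data and the variable names (x, res, pr, is_leaf) are illustrative assumptions and not part of the package; reading a negative imerge entry as a singleton follows the hclust merge convention described above.

```r
library(protoclust)

# toy data (illustrative only)
set.seed(1)
n <- 20
x <- matrix(rnorm(n * 2), n, 2)
hc <- protoclust(dist(x))

res <- protocut(hc, k = 5)

# cl: one cluster label per observation
table(res$cl)

# protos: one prototype index per cluster; map it back to every observation
pr <- res$protos[res$cl]

# a prototype is itself a member of the cluster it represents
stopifnot(all(res$cl[res$protos] == seq_along(res$protos)))

# imerge: negative entries mark singleton clusters, positive entries
# point at an interior node (a row of hc$merge)
is_leaf <- res$imerge < 0
data.frame(cluster = seq_along(res$protos),
           proto = res$protos,
           imerge = res$imerge,
           singleton = is_leaf)
```

The final data frame is only a convenient way to inspect the result side by side; nothing in the package requires it.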

Details

Given a minimax linkage hierarchical clustering, this function cuts the tree either at a given height or so that a specified number of clusters is created. It returns both the indices of the prototypes and their locations in the tree. The latter information is useful for plotting a dendrogram with prototypes (see plotwithprototypes). As with cutree, if both k and h are given, h is ignored. Unlike cutree, in the current version k and h cannot be vectors.
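
The two ways of cutting can be sketched as follows. This is a hedged illustration, not an official example: the toy data, the choice of the median merge height as the cut point, and the names h0, cut_h, ks and cuts are assumptions. Because k cannot be a vector, several values of k are handled by calling protocut once per value.

```r
library(protoclust)

# toy data (illustrative only)
set.seed(1)
n <- 30
x <- matrix(rnorm(n * 2), n, 2)
d <- dist(x)
hc <- protoclust(d)

# cut at a height instead of asking for a number of clusters
h0 <- median(hc$height)  # arbitrary cut height chosen for illustration
cut_h <- protocut(hc, h = h0)
length(cut_h$protos)     # number of clusters produced by this cut

# with minimax linkage, no point should be farther than h0 from its prototype
dmat <- as.matrix(d)
pr <- cut_h$protos[cut_h$cl]
d_to_proto <- dmat[cbind(seq_len(n), pr)]
stopifnot(max(d_to_proto) <= h0)

# k must be a single number, so loop over the values of interest
ks <- c(2, 5, 10)
cuts <- lapply(ks, function(k) protocut(hc, k = k))
names(cuts) <- paste0("k", ks)
sapply(cuts, function(ct) length(ct$protos))
```

Keeping the results in a named list makes it easy to pull out a single clustering later, for example cuts$k5$cl.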

References

Bien, J., and Tibshirani, R. (2011), "Hierarchical Clustering with Prototypes via Minimax Linkage," The Journal of the American Statistical Association, 106(495), 1075-1084.

See Also

protoclust, cutree, plotwithprototypes

Examples

# generate some data:
set.seed(1)
n <- 100
p <- 2
x <- matrix(rnorm(n * p), n, p)
rownames(x) <- paste("A", 1:n, sep="")
d <- dist(x)

# perform minimax linkage clustering:
hc <- protoclust(d)

# cut the tree to yield a 10-cluster clustering:
k <- 10 # number of clusters
cut <- protocut(hc, k=k)
h <- hc$height[n - k]

# plot dendrogram (and show cut):
plotwithprototypes(hc, imerge=cut$imerge, col=2)
abline(h=h, lty=2)

# get the prototype assigned to each point:
pr <- cut$protos[cut$cl]

# find point farthest from its prototype:
dmat <- as.matrix(d)
ifar <- which.max(dmat[cbind(1:n, pr[1:n])])

# note that this distance is exactly h:
stopifnot(dmat[ifar, pr[ifar]] == h)

# since this is a 2d example, make 2d display:
plot(x, type="n")
points(x, pch=20, col="lightblue")
lines(rbind(x[ifar, ], x[pr[ifar], ]), col=3)
points(x[cut$protos, ], pch=20, col="red")
text(x[cut$protos, ], labels=hc$labels[cut$protos])
tt <- seq(0, 2 * pi, length.out=100)
for (i in cut$protos) {
  lines(x[i, 1] + h * cos(tt), x[i, 2] + h * sin(tt))
}
