aggExCluster: Exemplar-based Agglomerative Clustering

Description

Runs exemplar-based agglomerative clustering.

Usage

## S4 method for signature 'matrix,missing'
aggExCluster(s, x, includeSim=FALSE)
## S4 method for signature 'matrix,APResult'
aggExCluster(s, x, includeSim=FALSE)
## S4 method for signature 'matrix,ExClust'
aggExCluster(s, x, includeSim=FALSE)
## S4 method for signature 'missing,APResult'
aggExCluster(s, x, includeSim=TRUE)
## S4 method for signature 'missing,ExClust'
aggExCluster(s, x, includeSim=TRUE)
## S4 method for signature 'function,ANY'
aggExCluster(s, x, includeSim=TRUE, ...)
## S4 method for signature 'character,ANY'
aggExCluster(s, x, includeSim=TRUE, ...)

Arguments

s

an $l\times l$ similarity matrix, or a similarity function specified either as the name of a package-provided similarity function (character string) or as a user-provided function object.

x

either a prior clustering of class APResult or ExClust or, if called with s being a function or function name, input data to be clustered.

includeSim

if TRUE, the similarity matrix (either computed internally or passed via the s argument) is stored in the slot sim of the returned AggExResult object. The default is FALSE when s is a similarity matrix and TRUE otherwise (see Usage).

...

all other arguments are passed unchanged to the selected similarity function.

Value

Upon successful completion, the function returns an AggExResult object.

Agglomerative clustering of an entire data set can be accomplished either by calling aggExCluster on a quadratic similarity matrix without further argument or by calling aggExCluster for a function or function name along with data to be clustered (as argument x). A full agglomeration run is performed that starts from $l$ clusters (all samples in separate one-element clusters) and ends with one cluster (all samples in one single cluster).

Agglomerative clustering starting from a given clustering result can be accomplished by calling aggExCluster for an APResult or ExClust object passed as parameter x. The similarity matrix can either be passed as argument s or, if s is missing, aggExCluster checks whether the similarity matrix is included in the clustering object x. A cluster hierarchy is created with numbers of clusters ranging from the number of clusters in x down to 1.
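The two modes described above can be sketched as follows. This is a minimal illustration, not an excerpt from the package: the toy data, the seed, and the variable names (aggAll, aggFromAP, cl2) are assumptions chosen for the example, and the similarity matrix is precomputed with negDistMat so that it can be passed explicitly as s in both calls.

```r
library(apcluster)

## toy data: two well-separated groups (values chosen for illustration only)
set.seed(1)
x <- rbind(cbind(rnorm(20, 0.2, 0.05), rnorm(20, 0.8, 0.05)),
           cbind(rnorm(20, 0.7, 0.05), rnorm(20, 0.3, 0.05)))

## precompute the similarity matrix (negative squared Euclidean distances)
s <- negDistMat(x, r=2)

## (1) full agglomeration run on the similarity matrix alone
aggAll <- aggExCluster(s)

## (2) start from an affinity propagation result, passing s explicitly
apres <- apcluster(s)
aggFromAP <- aggExCluster(s, apres)

## cut the full hierarchy at the level with two clusters
cl2 <- cutree(aggAll, k=2)
```

In mode (2), the hierarchy starts at the number of clusters found by affinity propagation, so it is usually much shallower than the full run in mode (1).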

Details

aggExCluster performs agglomerative clustering. Unlike other methods, e.g., those implemented in hclust, aggExCluster computes an exemplar for each cluster, and its merging objective is also geared towards the identification of meaningful exemplars.

For each pair of clusters, the merging objective is computed as follows:

An intermediate cluster is created as the union of the two clusters.

The potential exemplar is selected from the intermediate cluster as the sample that has the largest average similarity to all other samples in the intermediate cluster.

Then the average similarity of the exemplar with all samples in the first cluster and the average similarity with all samples in the second cluster are computed. These two values measure how well the joint exemplar describes the samples in the two clusters.

The merging objective is finally computed as the average of the two measures above. Hence, the merging objective can be regarded as a balanced average similarity to the joint exemplar.
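The steps above can be sketched in base R. The function mergeObjective below is a hypothetical helper written for illustration and is not part of apcluster. It assumes that s is a similarity matrix, that c1 and c2 are index vectors of two disjoint clusters, and that the diagonal is excluded when selecting the exemplar. The package's internal treatment of such details may differ.

```r
## Hypothetical helper (not part of apcluster): merging objective for two
## clusters c1 and c2, given as index vectors into the similarity matrix s.
mergeObjective <- function(s, c1, c2) {
    ## intermediate cluster: union of the two clusters
    joint <- c(c1, c2)
    sub <- s[joint, joint, drop=FALSE]

    ## average similarity of each candidate to all OTHER samples
    ## (assumption: the diagonal is excluded)
    avgSim <- (rowSums(sub) - diag(sub)) / (length(joint) - 1)

    ## joint exemplar: candidate with the largest average similarity
    ex <- joint[which.max(avgSim)]

    ## how well the joint exemplar describes each of the two clusters
    d1 <- mean(s[ex, c1])
    d2 <- mean(s[ex, c2])

    ## balanced average of the two measures
    list(exemplar=ex, objective=(d1 + d2) / 2)
}

## toy example: four points on a line, negative squared distances
p <- c(0, 1, 2, 10)
s <- -outer(p, p, function(a, b) (a - b)^2)

r1 <- mergeObjective(s, 1:2, 3)   # merge {0, 1} with the nearby point 2
r2 <- mergeObjective(s, 1:2, 4)   # merge {0, 1} with the distant point 10
```

In this toy example, r1$objective is -0.75 and r2$objective is -40.75, so the joint exemplar describes the first pair of clusters far better than the second.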

References

http://www.bioinf.jku.at/software/apcluster

Bodenhofer, U., Kothmeier, A., and Hochreiter, S. (2011) APCluster: an R package for affinity propagation clustering. Bioinformatics 27, 2463-2464. DOI: 10.1093/bioinformatics/btr406.

Examples

## load the package that provides aggExCluster, apcluster, and negDistMat
library(apcluster)

## create two Gaussian clouds
cl1 <- cbind(rnorm(50,0.2,0.05),rnorm(50,0.8,0.06))
cl2 <- cbind(rnorm(50,0.7,0.08),rnorm(50,0.3,0.05))
x <- rbind(cl1,cl2)

## compute agglomerative clustering from scratch
aggres1 <- aggExCluster(negDistMat(r=2), x)

## show results
show(aggres1)

## plot dendrogram
plot(aggres1)

## plot heatmap along with dendrogram
heatmap(aggres1)

## plot level with two clusters
plot(aggres1, x, k=2)

## run affinity propagation
apres <- apcluster(negDistMat(r=2), x, q=0.7)

## create hierarchy of clusters determined by affinity propagation
aggres2 <- aggExCluster(x=apres)

## show results
show(aggres2)

## plot dendrogram
plot(aggres2)

## plot heatmap
heatmap(aggres2)

## plot level with two clusters
plot(aggres2, x, k=2)
