COMMUNAL (version 1.0)

clusterKeys: Rekey cluster assignments.

Description

Reindexes (rekeys) the cluster assignments so that overlap across algorithms is maximized. Algorithms that could not find k clusters are ignored, i.e. those for which any cluster is smaller than the min.size argument. Use this function after determining the number of clusters.
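The idea of relabeling one algorithm's clusters to agree with another can be illustrated with a small base R sketch. This is not COMMUNAL's implementation: `rekey_to_reference` is a hypothetical helper that greedily matches labels by largest overlap against a single reference column, which is only one simple way to approximate "maximize overlap".

```r
# Hypothetical helper (not part of COMMUNAL): relabel `x` so that its
# cluster numbers agree as much as possible with the reference `ref`.
rekey_to_reference <- function(ref, x, k) {
  # k x k overlap counts: rows are labels in x, columns are labels in ref
  overlap <- table(factor(x, levels = 1:k), factor(ref, levels = 1:k))
  overlap <- matrix(as.integer(overlap), nrow = k)
  mapping <- integer(k)
  for (i in seq_len(k)) {
    # take the largest remaining overlap, then retire that row and column
    best <- which(overlap == max(overlap), arr.ind = TRUE)[1, ]
    mapping[best[[1]]] <- best[[2]]
    overlap[best[[1]], ] <- -1L
    overlap[, best[[2]]] <- -1L
  }
  mapping[x]
}

alg1 <- as.integer(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3))
alg2 <- as.integer(c(1,1,1,1,1,3,3,3,3,3,2,2,2,2,2))
rekeyed <- rekey_to_reference(alg1, alg2, k = 3)
```

Greedy matching is easy to follow but is not guaranteed to find the best overall assignment when clusters overlap heavily; an exhaustive search over permutations or an assignment solver would be needed for that.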

Usage

clusterKeys(clusters, k, min.size = 3)

Arguments

clusters
Data frame of cluster assignments, where rows are samples, columns are algorithms, assignments are integers. For example, the output of the getClustering method in "COMMUNAL".
k
Number of clusters selected.
min.size
Minimum cluster size. Algorithms that return any cluster smaller than this (or that do not have k clusters) are excluded.
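As a rough illustration of the filtering rule described for k and min.size, the check below tests a single algorithm's assignments. `has_k_valid_clusters` is a hypothetical helper written for this sketch; the actual check inside clusterKeys may differ in detail (for example, in how missing or zero labels are treated).

```r
# Hypothetical helper (not part of COMMUNAL): TRUE if `x` contains exactly
# k distinct clusters and each has at least min.size samples.
has_k_valid_clusters <- function(x, k, min.size = 3) {
  sizes <- table(x)
  length(sizes) == k && all(sizes >= min.size)
}

ok  <- has_k_valid_clusters(c(1,1,1,2,2,2,3,3,3), k = 3)  # three clusters of size 3
bad <- has_k_valid_clusters(c(1,1,1,1,2,2,2,3,3), k = 3)  # cluster 3 has only 2 samples
```

Applied column by column (for instance with `sapply(clusters, has_k_valid_clusters, k = k)`), this shows which algorithms would be kept under these assumptions.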

Value

Returns a matrix of rekeyed cluster assignments, such that cluster 'n' refers to the same cluster across all algorithms. Cluster 0 contains the samples for which no consistent 'core' cluster could be identified.
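To make the notion of a 'core' cluster and of cluster 0 concrete, here is a minimal sketch of a per-sample vote over an already rekeyed matrix. `core_by_vote` is a hypothetical helper, not COMMUNAL's returnCore; in particular, treating the threshold as "percent agreement greater than or equal to agreement.thresh" is an assumption.

```r
# Hypothetical helper (not part of COMMUNAL): for each sample (row), return
# the most common label if enough algorithms agree, otherwise 0.
core_by_vote <- function(mat, agreement.thresh = 50) {
  apply(mat, 1, function(row) {
    votes <- table(row)
    top <- votes[which.max(votes)]          # most frequent label (first on ties)
    pct <- 100 * top / length(row)          # percent of algorithms agreeing
    if (pct >= agreement.thresh) as.integer(names(top)) else 0L
  })
}

mat <- cbind(alg1 = c(1, 1, 2), alg2 = c(1, 2, 2), alg3 = c(1, 3, 2))
core <- core_by_vote(mat, agreement.thresh = 50)
```

In this toy matrix the second sample receives three different labels, so no label reaches the threshold and it falls into cluster 0.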

Examples

library(COMMUNAL)

# reindex cluster numbers so that they agree across algorithms
k <- 3
clusters <- data.frame(
  alg1=as.integer(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)),
  alg2=as.integer(c(1,1,1,1,1,3,3,3,3,3,2,2,2,2,2)),
  alg3=as.integer(c(3,3,3,3,3,1,1,1,1,1,2,2,2,2,2))
)
mat.key <- clusterKeys(clusters, k)
mat.key # cluster indices are relabeled
examineCounts(mat.key)
core <- returnCore(mat.key, agreement.thresh=50) # find 'core' clusters
table(core) # the 'core' clusters

# some cluster assignments are undetermined
k <- 3
clusters <- data.frame(
  alg1=as.integer(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,1,1,2,2,3,3)),
  alg2=as.integer(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,1,2,2,3,3,1)),
  alg3=as.integer(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,2,3,1,1,2,3))
)
mat.key <- clusterKeys(clusters, k)
mat.key # last six samples have conflicting assignments
examineCounts(mat.key)
core <- returnCore(mat.key, agreement.thresh=66) # at least 2 of 3 algs must agree
table(core)
core <- returnCore(mat.key, agreement.thresh=99) # all algs must agree
table(core)