plotRange3D: Visualize cluster stability

Description

Given clusterRange output for a dataset, visualize the cluster optimality output for a range of K over a range of variable measures.

Usage

plotRange3D(clusRange, ks=NULL, goodAlgs=NULL, goodMeasures=NULL, filename=NULL, colorbar=T, minSize = 3, plot3D=T, ...)

Arguments

clusRange

The output of clusterRange.

ks

range of cluster numbers k to plot. If NULL, plots all ks in clusRange.

goodAlgs

which algorithms to use in summarizing validation measures. If NULL, plots all algorithms in clusRange.

goodMeasures

which validation measures to use in summarizing validation measures. If NULL, plots all validity measures in clusRange.

filename

optionally specify filename to save a snapshot of the 3D image.

colorbar

Whether to draw the front right color legend in the output.

minSize

The minimum acceptable size for a cluster. plotRange3D passes all algorithms through getGoodAlgs to reduce the number of non-robust algorithms. Having some clusters

plot3D

Whether to draw the 3D plot at all. If the rgl package does not work locally, setting this to FALSE still lets the user see the 2D plot.

...

other arguments to pass down to 2D plot of mean z-score against k.

Value

A 3D plot is generated, using the package "rgl". A matrix of the plotted values is returned. A 2D plot of average metric against k (i.e., the mean over varRange) is also generated.

Details

A summarized validation measure value is computed for each value of k, for each dataset. This is done by first subsetting the data to the measures, ks, and algorithms of interest, and then computing averages of the measures for each dataset and k (number of clusters).

For some validation measures, a lower value implies better clustering, and for others a higher value is better. Prior to averaging, measures that favor a lower value are multiplied by negative one. Furthermore, each measure is scaled to have zero mean and unit variance across all the datasets prior to averaging, so each measure has equal weight, and we can compare the plot across datasets.
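The summarization described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data frame, the two measure columns, and every variable name below are hypothetical, and the real clusterRange output may be structured differently.

```r
# Hypothetical toy scores: one row per (dataset, k, algorithm) combination,
# one column per validation measure. None of these names come from the package.
set.seed(1)
scores <- expand.grid(dataset = c("d1", "d2"),
                      k = 2:4,
                      alg = c("algA", "algB"),
                      stringsAsFactors = FALSE)
scores$higherBetter <- runif(nrow(scores))         # e.g. a silhouette-like measure
scores$lowerBetter  <- runif(nrow(scores), 0, 50)  # e.g. a connectivity-like measure

measures <- c("higherBetter", "lowerBetter")
lower_is_better <- "lowerBetter"

oriented <- scores

# Step 1: multiply measures that favor a lower value by -1,
# so that a higher value always means better clustering.
for (m in lower_is_better) {
  oriented[[m]] <- -oriented[[m]]
}

# Step 2: scale each measure to zero mean and unit variance across
# all datasets, so every measure carries equal weight.
for (m in measures) {
  oriented[[m]] <- (oriented[[m]] - mean(oriented[[m]])) / sd(oriented[[m]])
}

# Step 3: average the scaled measures within each row, then average
# over algorithms for each dataset and k.
oriented$zMean <- rowMeans(oriented[measures])
summary_matrix <- tapply(oriented$zMean,
                         list(dataset = oriented$dataset, k = oriented$k),
                         mean)

print(summary_matrix)
```

Each entry of summary_matrix is a mean z-score for one dataset and one k, which is the kind of value the plot displays. Because every measure is standardized before averaging, no single measure dominates the summary because of its units.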

Examples

## Not run: 
# ## clusterRange output for breast cancer dataset
# data(BRCA.results) 
# 
# ## automated selection of optimal algorithms and validity measures
# goodAlgs <- getGoodAlgs(BRCA.results)
# goodMeasures <- getNonCorrNonMonoMeasures(BRCA.results)
# 
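# ## pass goodAlgs and goodMeasures by name; the second positional argument is ks
# (values <- plotRange3D(BRCA.results, goodAlgs=goodAlgs, goodMeasures=goodMeasures))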
# ## End(Not run)