heatmap.3: Enhanced Heatmap Representation with Dendrogram and Partition

Description

Draws an enhanced heatmap with dendrograms and partitions, where the partition is determined either by the elbow criterion or by a desired number of clusters. The plot includes: 1) dendrograms added to the left side and to the top, according to cluster analysis; 2) partitions shown as highlighted rectangles, according to the "elbow" rule or a desired number of clusters.

Usage

heatmap.3(x, diss = inherits(x, "dist"), Rowv = TRUE, Colv = TRUE, dendrogram = c("both", "row", "column", "none"), dist.row, dist.col, dist.FUN = gdist, dist.FUN.MoreArgs = list(method = "euclidean"), hclust.row, hclust.col, hclust.FUN = hclust, hclust.FUN.MoreArgs = list(method = "ward"), scale = c("none", "row", "column"), na.rm = TRUE, cluster.by.row = FALSE, cluster.by.col = FALSE, kr = NA, kc = NA, row.clusters = NA, col.clusters = NA, revR = FALSE, revC = FALSE, add.expr, breaks, x.center, color.FUN = gplots::bluered, sepList = list(NULL, NULL), sep.color = c("gray45", "gray45"), sep.lty = 1, sep.lwd = 2, cellnote, cex.note = 1, notecol = "cyan", na.color = par("bg"), trace = c("none", "column", "row", "both"), tracecol = "cyan", hline, vline, linecol = tracecol, labRow = TRUE, labCol = TRUE, srtRow = NULL, srtCol = NULL, sideRow = 4, sideCol = 1, margin.for.labRow, margin.for.labCol, ColIndividualColors, RowIndividualColors, cexRow, cexCol, labRow.by.group = FALSE, labCol.by.group = FALSE, key = TRUE, key.title = "Color Key", key.xlab = "Value", key.ylab = "Count", keysize = 1.5, mapsize = 9, mapratio = 4/3, sidesize = 3, cex.key.main = 0.75, cex.key.xlab = 0.75, cex.key.ylab = 0.75, density.info = c("histogram", "density", "none"), denscol = tracecol, densadj = 0.25, main = "Heatmap", sub = "", xlab = "", ylab = "", cex.main = 2, cex.sub = 1.5, font.main = 2, font.sub = 3, adj.main = 0.5, mgp.main = c(1.5, 0.5, 0), mar.main = 3, mar.sub = 3, if.plot = TRUE, plot.row.partition = FALSE, plot.col.partition = FALSE, cex.partition = 1.25, color.partition.box = "gray45", color.partition.border = "#FFFFFF", plot.row.individuals = FALSE, plot.col.individuals = FALSE, plot.row.clusters = FALSE, plot.col.clusters = FALSE, plot.row.clustering = FALSE, plot.col.clustering = FALSE, plot.row.individuals.list = FALSE, plot.col.individuals.list = FALSE, plot.row.clusters.list = FALSE, plot.col.clusters.list = FALSE, plot.row.clustering.list = FALSE, 
plot.col.clustering.list = FALSE, row.data = FALSE, col.data = FALSE, if.plot.info = FALSE, text.box, cex.text = 1, ...)

Arguments

x

data matrix or data frame, or a dissimilarity matrix or `dist' object, as determined by the value of the 'diss' argument.

diss

logical flag: if TRUE (the default for dist or dissimilarity objects), x is assumed to be a dissimilarity matrix; if FALSE, x is treated as a matrix of observations by variables.

Rowv

one of the following: TRUE, a `dend' object, a vector or NULL/FALSE; determines if and how the row dendrogram should be reordered.

Colv

one of the following: "Rowv", TRUE, a `dend' object, a vector or NULL/FALSE; determines if and how the column dendrogram should be reordered.

dendrogram

character string indicating whether to draw 'none', 'row', 'column' or 'both' dendrograms. Defaults to 'both'.

dist.row

a dist object for row observations.

dist.col

a dist object for column observations.

dist.FUN

function used to compute the distance (dissimilarity) between both rows and columns. Defaults to gdist.

dist.FUN.MoreArgs

a list of other arguments to be passed to dist.FUN. Defaults to list(method="euclidean").

hclust.row

a hclust object (as produced by hclust) for row observations.

hclust.col

a hclust object (as produced by hclust) for column observations.

hclust.FUN

function used to compute the hierarchical clustering when "Rowv" or "Colv" are not dendrograms. Defaults to hclust.

hclust.FUN.MoreArgs

a list of other arguments to be passed to hclust. Defaults to list(method="ward")
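As a rough illustration of how precomputed objects might be supplied through dist.row and hclust.row, the sketch below builds them with base R. It assumes that objects from stats::dist are accepted in place of gdist output and that GMD is installed (the call is guarded in case it is not). Note that R 3.1.0 and later name the original Ward method "ward.D", so the sketch uses that name rather than "ward".

```r
## Sketch only: precompute the row distances and clustering once,
## so they can be reused across several heatmap.3() calls.
x <- as.matrix(mtcars)

row.dist   <- dist(x, method = "euclidean")        # base R distances between rows
row.hclust <- hclust(row.dist, method = "ward.D")  # "ward" is named "ward.D" in R >= 3.1.0

## Guarded call: GMD may not be installed on every system.
if (requireNamespace("GMD", quietly = TRUE)) {
  GMD::heatmap.3(x, dist.row = row.dist, hclust.row = row.hclust,
                 dendrogram = "row", Colv = FALSE)
}
```

Reusing the same objects avoids recomputing distances each time only plotting options change.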

scale

character indicating if the values should be centered and scaled in either the row direction or the column direction, or none. The default is "none".

na.rm

logical, whether NA values will be removed when scaling.
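To make the effect of scale concrete, the following base R sketch centers and scales a matrix by column and by row. It is an illustration of the idea using scale(), not necessarily the exact code that heatmap.3 runs internally.

```r
## Sketch only: center and scale a matrix by column or by row with base R.
x <- as.matrix(mtcars)

## Column scaling: each column gets mean 0 and standard deviation 1.
x.col <- scale(x, center = TRUE, scale = TRUE)

## Row scaling: transpose, scale the former rows, then transpose back.
x.row <- t(scale(t(x), center = TRUE, scale = TRUE))

## scale() leaves out NA values when computing means and standard
## deviations, which corresponds to na.rm = TRUE.
round(colMeans(x.col), 10)      # all (numerically) zero
round(apply(x.row, 1, sd), 10)  # all one
```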

cluster.by.row

logical, whether to cluster row observations and reorder the input accordingly.

cluster.by.col

logical, whether to cluster column observations and reorder the input accordingly.

kr

numeric, number of clusters in rows; ignored when row.clusters is specified. Defaults to NA.

kc

numeric, number of clusters in columns; ignored when col.clusters is specified. Defaults to NA.

row.clusters

a numerical vector, indicating the cluster labels of row observations.

col.clusters

a numerical vector, indicating the cluster labels of column observations.
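A vector of cluster labels can be obtained in several ways; one common option is to cut a hierarchical clustering with cutree. The sketch below shows this with base R and then passes the labels as row.clusters. That the labels should follow the original row order of x is an assumption here, and the heatmap.3 call is guarded in case GMD is not installed.

```r
## Sketch only: derive row cluster labels from a hierarchical clustering.
x <- scale(as.matrix(mtcars))

row.hclust <- hclust(dist(x), method = "ward.D")
row.labels <- cutree(row.hclust, k = 3)   # integer labels 1..3, one per row

table(row.labels)                         # cluster sizes

if (requireNamespace("GMD", quietly = TRUE)) {
  ## Assumption: labels are given in the original row order of x.
  GMD::heatmap.3(x, row.clusters = row.labels, dendrogram = "row", Colv = FALSE)
}
```

When only the number of clusters is known, supplying it directly as the number of row or column clusters is the simpler route.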

revR

logical indicating if the row order should be 'rev'ersed for plotting.

revC

logical indicating if the column order should be 'rev'ersed for plotting, such that e.g., for the symmetric case, the symmetry axis is as usual.

add.expr

expression that will be evaluated after the call to image. Can be used to add components to the plot.

breaks

numeric, either a numeric vector giving the splitting points for binning x into colors, or an integer number of break points to be used, in which case the break points are spaced equally across range(x). Defaults to 16 when not specified.

x.center

numeric, the value of x on which the colors are centered.
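The relationship between an integer number of breaks, the resulting break points and a centering value can be sketched in base R as follows. The symmetric-range construction used for centering is an assumption made for illustration and may differ from what heatmap.3 does internally; the variable names are hypothetical.

```r
## Sketch only: turn an integer number of breaks into equally spaced break points.
x <- scale(as.matrix(mtcars))
n.breaks <- 16

breaks.plain <- seq(min(x, na.rm = TRUE), max(x, na.rm = TRUE),
                    length.out = n.breaks)

## One possible way to center the colors on a value (an assumption, not
## necessarily heatmap.3's method): make the range symmetric around it.
center <- 0
half.range <- max(abs(x - center), na.rm = TRUE)
breaks.centered <- seq(center - half.range, center + half.range,
                       length.out = n.breaks)

## n break points delimit n - 1 color bins.
bin.colors <- colorRampPalette(c("blue", "white", "red"))(n.breaks - 1)
```

With a symmetric range, the middle of the color ramp (white in this palette) falls on the centering value.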

color.FUN

function, or function name given as a character string, that generates the colors of the heatmap.

sepList

a list of length 2 giving the row and column lines of separation.

sep.color

color for lines of separation.

sep.lty

line type for lines of separation.

sep.lwd

line width for lines of separation.

cellnote

(optional) matrix of character strings which will be placed within each color cell, e.g. cell labels or p-value symbols.

cex.note

relative font size of cellnote.

notecol

color of cellnote.

na.color

Color to use for missing value (NA). Defaults to the plot background color.

trace

character string indicating whether a solid "trace" line should be drawn across "row"s or down "column"s, "both" or "none". The distance of the line from the center of each color-cell is proportional to the size of the measurement. Defaults to "none".

tracecol

character string giving the color for the "trace" line. Defaults to "cyan".

hline

Vector of values within cells where a horizontal dotted line should be drawn; only drawn if 'trace' is 'row' or 'both'. Defaults to the median of the breaks.

vline

Vector of values within cells where a vertical dotted line should be drawn; only drawn if 'trace' is 'column' or 'both'. Defaults to the median of the breaks.

linecol

the color of hline and vline. Defaults to the value of 'tracecol'.

labRow

character vectors with row labels to use; defaults to rownames(x).

labCol

character vectors with column labels to use; defaults to colnames(x).

srtRow

numerical, specifying (in degrees) how row labels should be rotated. See help("par", package="graphics").

srtCol

numerical, specifying (in degrees) how col labels should be rotated. See help("par", package="graphics").

sideRow

2 or 4, the side on which row labels are displayed.

sideCol

1 or 3, the side on which column labels are displayed.

margin.for.labRow

a numerical value giving the margin in which to plot labRow.

margin.for.labCol

a numerical value giving the margin in which to plot labCol.

ColIndividualColors

(optional) character vector of length ncol(x) containing the color names for a horizontal side bar that may be used to annotate the columns of x.

RowIndividualColors

(optional) character vector of length nrow(x) containing the color names for a vertical side bar that may be used to annotate the rows of x.

cexRow

positive number, used as 'cex.axis' for the row axis labeling. The default currently depends only on the number of rows.

cexCol

positive number, used as 'cex.axis' for the column axis labeling. The default currently depends only on the number of columns.

labRow.by.group

logical, whether to group unique labels for rows.

labCol.by.group

logical, whether to group unique labels for columns.

key

logical indicating whether a color-key should be shown.

key.title

character, title of the color-key ["Color Key"]

key.xlab

character, xlab of the color-key ["Value"]

key.ylab

character, ylab of the color-key ["Count"]

keysize

numeric value indicating the relative size of the key

mapsize

numeric value indicating the relative size of the heatmap.

mapratio

the width-to-height ratio of the heatmap.

sidesize

numeric value indicating the relative size of the sidebars.

cex.key.main

a numerical value giving the amount by which main-title of color-key should be magnified relative to the default.

cex.key.xlab

a numerical value giving the amount by which xlab of color-key should be magnified relative to the default.

cex.key.ylab

a numerical value giving the amount by which ylab of color-key should be magnified relative to the default.

density.info

character string indicating whether to superimpose a 'histogram', a 'density' plot, or no plot ('none') on the color-key.

denscol

character string giving the color for the density display specified by 'density.info', defaults to the same value as 'tracecol'.

densadj

Numeric scaling value for tuning the kernel width when a density plot is drawn on the color key. (See the 'adjust' parameter for the 'density' function for details.) Defaults to 0.25.

main

an overall title for the plot. See help("title", package="graphics").

sub

a subtitle for the plot, describing the distance and/or alignment gap (the "shift").

xlab

a title for the x axis. See help("title", package="graphics").

ylab

a title for the y axis. See help("title", package="graphics").

cex.main

a numerical value giving the amount by which main-title should be magnified relative to the default.

cex.sub

a numerical value giving the amount by which sub-title should be magnified relative to the default.

font.main

An integer which specifies which font to use for main-title.

font.sub

An integer which specifies which font to use for sub-title.

adj.main

The value of 'adj' determines the way in which main-title strings are justified.

mgp.main

the margin line (in 'mex' units) for the main-title.

mar.main

a numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the main-title.

mar.sub

a numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the sub-title.

if.plot

logical, whether to plot. Reordered matrix is returned without graphical output if FALSE.

plot.row.partition

logical, whether to plot row partition.

plot.col.partition

logical, whether to plot column partition.

cex.partition

a numerical value giving the amount by which partition should be magnified relative to the default.

color.partition.box

color for the partition box.

color.partition.border

color for the partition border.

plot.row.individuals

logical, whether to make a plot of row observations.

plot.col.individuals

logical, whether to make a plot of column observations.

plot.row.clusters

logical, whether to make a summary plot of row clusters.

plot.col.clusters

logical, whether to make a summary plot of column clusters.

plot.row.clustering

logical, whether to make a summary plot of overall row clustering.

plot.col.clustering

logical, whether to make a summary plot of overall column clustering.

plot.row.individuals.list

a list of expressions evaluated to draw the plot requested by plot.row.individuals.

plot.col.individuals.list

a list of expressions evaluated to draw the plot requested by plot.col.individuals.

plot.row.clusters.list

a list of expressions evaluated to draw the plot requested by plot.row.clusters.

plot.col.clusters.list

a list of expressions evaluated to draw the plot requested by plot.col.clusters.

plot.row.clustering.list

a list of expressions evaluated to draw the plot requested by plot.row.clustering.

plot.col.clustering.list

a list of expressions evaluated to draw the plot requested by plot.col.clustering.

row.data

(optional) data used by the plots requested with plot.row.individuals, plot.row.clusters or plot.row.clustering.

col.data

(optional) data used by the plots requested with plot.col.individuals, plot.col.clusters or plot.col.clustering.

if.plot.info

logical, whether to plot text.box.

text.box

character plotted when if.plot.info is TRUE.

cex.text

a numerical value giving the amount by which text.box should be magnified relative to the default.

...

arguments to be passed to method heatmap.3. See help("image", package="graphics").

Value

A matrix reordered according to the row and/or column dendrogram(s), together with the indices used for reordering.

Details

Enhanced heatmap representation with partitions and optional summary statistics. This is an enhanced version of the `heatmap.2' function in the package gplots. The enhancements include: 1) Improved performance through optional input of precomputed dist and hclust objects. 2) Highlighting of specific cells with rectangles, for instance the cells of clusters of interest. (Examples should be included in future.) 3) Add-on plots in addition to the heatmap, such as cluster-wise summary plots and overall clustering summary plots, placed to the right of or under the heatmap.

Examples


## ------------------------------------------------------------------------
## Example1: mtcars
## ------------------------------------------------------------------------
## load library
require("GMD")

## load data
data(mtcars)

## heatmap on raw data
x  <- as.matrix(mtcars)

dev.new(width=10,height=8)
heatmap.3(x)                               # default, with reordering and dendrogram
## Not run: 
# heatmap.3(x, Rowv=FALSE, Colv=FALSE)       # no reordering and no dendrogram
# heatmap.3(x, dendrogram="none")            # reordering without dendrogram
# heatmap.3(x, dendrogram="row")        # row dendrogram with row (and col) reordering
# heatmap.3(x, dendrogram="row", Colv=FALSE) # row dendrogram with only row reordering
# heatmap.3(x, dendrogram="col")             # col dendrogram
# heatmap.3(x, dendrogram="col", Rowv=FALSE) # col dendrogram with only col reordering
# heatmapOut <-
#   heatmap.3(x, scale="column")             # scaled by column
# names(heatmapOut)                          # view the list that is returned
# heatmap.3(x, scale="column", x.center=0)   # colors centered around 0
# heatmap.3(x, scale="column",trace="column")  # turn "trace" on
# ## End(Not run)

## coloring cars (row observations) by brand
brands <- sapply(rownames(x), function(e) strsplit(e," ")[[1]][1])
names(brands) <- c()
brands.index <- as.numeric(as.factor(brands))
RowIndividualColors <- rainbow(max(brands.index))[brands.index]
heatmap.3(x, scale="column", RowIndividualColors=RowIndividualColors)

## coloring attributes (column features) randomly (just for a test :)
heatmap.3(x, scale="column", ColIndividualColors=rainbow(ncol(x)))

## add a single plot for all row individuals
dev.new(width=12,height=8)
expr1 <- list(quote(plot(row.data[rowInd,"hp"],1:nrow(row.data),
                         xlab="hp",ylab="",yaxt="n",main="Gross horsepower")),
              quote(axis(2,1:nrow(row.data),rownames(row.data)[rowInd],las=2)))
heatmap.3(x, scale="column", plot.row.individuals=TRUE, row.data=x,
          plot.row.individuals.list=list(expr1))


## ------------------------------------------------------------------------
## Example2: ruspini
## ------------------------------------------------------------------------
## load library
require("GMD")
require(cluster)

## load data
data(ruspini)

## heatmap on a `dist' object
x <- gdist(ruspini)
main <- "Heatmap of Ruspini data"
dev.new(width=10,height=10)
heatmap.3(x, main=main, mapratio=1) # with a title and a map in square!
## Not run: 
# heatmap.3(x, main=main, revC=TRUE)  # reverse column for a symmetric look
# heatmap.3(x, main=main, kr=2, kc=2) # partition by predefined number of clusters
# ## End(Not run)
## show partition by elbow
css.multi.obj <- css.hclust(x,hclust(x))
elbow.obj <- elbow.batch(css.multi.obj,ev.thres=0.90,inc.thres=0.05)
heatmap.3(x, main=main, revC=TRUE, kr=elbow.obj$k, kc=elbow.obj$k)

## Not run: 
# ## show elbow info as subtitle
# heatmap.3(x, main=main, sub=sub("\n"," ",attr(elbow.obj,"description")),
# cex.sub=1.25,revC=TRUE,kr=elbow.obj$k, kc=elbow.obj$k)
# ## End(Not run)