Produces a heatmap visualizing all samples and variables of a dataset. Both samples and variables are clustered using methods suitable for mixed-type data. Different types of variables are indicated by different color schemes.
mix.heatmap(data, D.subjects, D.variables, dend.subjects, dend.variables, varweights,
dist.variables.method = c("associationMeasures", "distcor", "ClustOfVar"),
linkage="ward.D2", associationFun = association, rowlab, rowmar = 3, lab.cex = 1.5,
ColSideColors, RowSideColors,
col.cont = marray::maPalette(low = "lightblue", high = "darkblue", k = 50),
cont.fixed.range = FALSE, cont.range,
col.ord = list(low = "lightgreen", high = "darkgreen"),
col.cat = c("indianred1","darkred","orangered","orange","palevioletred1",
"violetred4","red3","indianred4"),
legend.colbar, legend.rowbar, legend.mat = FALSE, legend.cex = 1, legend.srt = 0)
data frame where columns are variables (of different data types) and rows are observations (subjects, samples)
A previously calculated distance matrix (of class dissimilarity) for subjects can be given. If missing, it is calculated by dist.subjects. If set to NULL, no clustering is done and the original order in data is preserved.
A previously calculated distance matrix (of class dissimilarity) for variables can be given. If missing, it is calculated by dist.variables. If set to NULL, no clustering is done and the original order in data is preserved.
A dendrogram for subjects can be given; then no distances between subjects are calculated and D.subjects is ignored.
A dendrogram for variables can be given; then no distances between variables are calculated and D.variables is ignored.
optional vector of variable weights, used for calculating Gower's distances between subjects
If "associationMeasures", similarities between variables are assessed by a combination of appropriate measures of association for different pairs of data types. If "distcor", distances between variables are calculated based on distance correlation. In both cases, a dendrogram is then derived by standard hierarchical clustering (hclust). If "ClustOfVar", variables are clustered by hclustvar from the ClustOfVar package.
agglomeration method used for hierarchical clustering; corresponds to parameter method of hclust
By default, appropriate association measures are chosen for each pair of variables, see association for details. Alternatively, the user can supply a function that calculates a similarity measure for any two variables. Ignored if dist.variables.method = "ClustOfVar" or "distcor"
row (variable) labels; if missing, column names of data are used
margin for row (variable) labels
size of row (variable) labels
vector of length nrow(data) specifying colors for a color bar added on top of the heatmap
vector of length ncol(data) specifying colors for a color bar added to the left of the heatmap
color palette for continuous variables; defaults to a blue color palette (light blue to dark blue)
If FALSE, the color range of each continuous variable is defined by that variable's own range. If TRUE, all continuous variables are assumed to have a similar range and therefore share the same color range; "extreme colors" then correspond to extreme values over all continuous variables and are applied to all of them equally. In either case, to prevent outliers from dominating the color scale, "extreme colors" are restricted to the 2.5% and 97.5% quantiles. Defaults to FALSE
if cont.fixed.range=TRUE, extreme value limits for coloring continuous variables can be specified; if missing, extreme values are taken from the data; ignored if cont.fixed.range=FALSE
List with names of colors for the lowest and highest categories of ordinal variables. A color palette will be created correspondingly based on the number of categories. Defaults to a green color palette
vector of colors for categorical variables
class labels for subject groups defined by ColSideColors
class labels for variable groups defined by RowSideColors
shall legend matrix for heatmap be shown?
size of legend text
legend matrix label string rotation in degrees; e.g. legend.srt = 90 produces vertical labels
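The quantile restriction described for cont.fixed.range can be illustrated with a small base R sketch. This is not the package's internal code; the variable names (x, lims, pal, cols) and the use of colorRampPalette in place of marray::maPalette are assumptions made for illustration only.

```r
# Illustrative sketch only: mimic restricting "extreme colors" to the
# 2.5% and 97.5% quantiles of a continuous variable (assumed names).
set.seed(1)
x <- c(rnorm(98), 25, -30)            # continuous variable with two outliers

# Color limits taken from the quantiles rather than from min/max
lims <- unname(quantile(x, probs = c(0.025, 0.975), na.rm = TRUE))

# Clamp values outside the limits so outliers get the extreme colors
x_clamped <- pmin(pmax(x, lims[1]), lims[2])

# 50-step light-to-dark blue palette (stand-in for the col.cont default)
pal <- colorRampPalette(c("lightblue", "darkblue"))(50)

# Assign each value to one of 50 equally wide bins between the limits
bins <- cut(x_clamped,
            breaks = seq(lims[1], lims[2], length.out = 51),
            include.lowest = TRUE)
cols <- pal[as.integer(bins)]
```

With this mapping, both outliers simply receive the darkest or lightest color instead of stretching the scale for all other values.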
A mixed-data heatmap with dendrograms and annotation
If no dendrograms or distance matrices are given, subjects and/or variables are clustered with methods for mixed-type data. Similarities between subjects are measured by Gower's general similarity coefficient with Podani's extension for ordinal variables, see gowdis. Similarities between variables can be assessed by a combination of appropriate measures of association for different pairs of data types, see association, or based on distance correlation. Standard hierarchical clustering is then applied, by default with Ward's minimum variance method. Alternatively, variables can be clustered by the ClustOfVar approach.
Variables are shown as rows of the heatmap, samples as columns.
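To make the subject clustering step more concrete, here is a simplified base R sketch of a Gower-type distance followed by Ward clustering. It is an illustration under stated assumptions, not the package's implementation: the helper gower_simple and the toy data are hypothetical, and the sketch omits Podani's ordinal extension and any handling of missing values.

```r
# Toy mixed-type data: one continuous and one categorical variable
toy <- data.frame(
  age = c(20, 30, 40, 60),
  sex = factor(c("f", "m", "f", "m"))
)

# Hypothetical helper: simplified Gower distance
# (numeric: range-normalised absolute difference; factor: simple matching)
gower_simple <- function(df, weights = rep(1, ncol(df))) {
  n <- nrow(df)
  d <- matrix(0, n, n, dimnames = list(rownames(df), rownames(df)))
  for (k in seq_along(df)) {
    x <- df[[k]]
    if (is.numeric(x)) {
      dk <- abs(outer(x, x, "-")) / diff(range(x))
    } else {
      dk <- outer(as.character(x), as.character(x), "!=") * 1
    }
    d <- d + weights[k] * dk            # weighted sum over variables
  }
  as.dist(d / sum(weights))             # weighted average distance
}

D <- gower_simple(toy)
hc <- hclust(D, method = "ward.D2")     # same linkage as the default
dend <- as.dendrogram(hc)
```

The weights argument plays the role that varweights plays for the real subject distances: variables with larger weights contribute more to the averaged distance.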
Hummel M, Edelmann D, Kopp-Schneider A (2017). Clustering of samples and variables with mixed-type data. PLOS ONE, 12(11):e0188274.
Gower J (1971). A general coefficient of similarity and some of its properties. Biometrics, 27:857-871.
Chavent M, Kuentz-Simonet V, Liquet B, Saracco J (2012). ClustOfVar: An R Package for the Clustering of Variables. Journal of Statistical Software, 50:1-16.
Szekely GJ, Rizzo ML, Bakirov NK (2007). Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35(6):2769-2794.
Lyons R (2013). Distance covariance in metric spaces. The Annals of Probability, 41(5):3284-3305.
dist.variables, dist.subjects, dendro.variables, dendro.subjects, distmap
# NOT RUN {
data(mixdata)
mix.heatmap(mixdata, rowmar=7, legend.mat=TRUE)
## with distance correlation
mix.heatmap(mixdata, dist.variables.method="distcor", rowmar=7, legend.mat=TRUE)
## with (random) color bars
colbar <- rep(5:7, length.out=nrow(mixdata))                      # one color per sample
rowbar <- rep(c("darkorange","grey"), length.out=ncol(mixdata))   # one color per variable
mix.heatmap(mixdata, ColSideColors=colbar, RowSideColors=rowbar,
legend.colbar=c("1","2","3"), legend.rowbar=c("a","b"), rowmar=7)
## example with variable weights
w <- rep(1:2, each=5)
mix.heatmap(mixdata, varweights=w, rowmar=7)
# }
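A further usage sketch, reusing precomputed distance matrices and switching off subject clustering. It assumes that dist.subjects accepts the data frame as its first argument and that dist.variables takes a method argument with the value "distcor"; check the help pages of those functions before relying on these signatures. The requireNamespace guard is only there so the snippet does nothing when CluMix is not installed.

```r
# Sketch under assumptions: argument names of dist.subjects() and
# dist.variables() are assumed, see their help pages.
if (requireNamespace("CluMix", quietly = TRUE)) {
  library(CluMix)
  data(mixdata)

  # Compute the distance matrices once and pass them to the heatmap
  D.subj <- dist.subjects(mixdata)
  D.var  <- dist.variables(mixdata, method = "distcor")
  mix.heatmap(mixdata, D.subjects = D.subj, D.variables = D.var, rowmar = 7)

  # D.subjects = NULL: no subject clustering, original sample order is kept
  mix.heatmap(mixdata, D.subjects = NULL, rowmar = 7)
}
```

Precomputing the distances is mainly useful when several heatmaps with different display options are drawn from the same data.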