cim(mat,
color = NULL,
row.names = TRUE,
col.names = TRUE,
row.sideColors = NULL,
col.sideColors = NULL,
row.cex = NULL,
col.cex = NULL,
cluster = "both",
dist.method = c("euclidean", "euclidean"),
clust.method = c("complete", "complete"),
cut.tree = c(0, 0),
transpose = FALSE,
comp = NULL,
symkey = TRUE,
keysize = c(1, 1),
zoom = FALSE,
main = NULL,
xlab = NULL,
ylab = NULL,
margins = c(5, 5),
lhei = NULL,
lwid = NULL,
sample.names = TRUE,
var.names = TRUE,
sample.sideColors = NULL,
var.sideColors = NULL,
center = TRUE,
scale = FALSE,
X.var.names = TRUE,
Y.var.names = TRUE,
x.sideColors = NULL,
y.sideColors = NULL,
mapping = "XY",
legend = NULL,
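...)
mat: a numeric matrix of values to be plotted, or an object of a class such as "pca", "spca", "ipca", "sipca", "rcc", "pls", "spls", "plsda".
color: a character vector of colors, such as that generated by terrain.colors, topo.colors, rainbow or similar functions.
row.names, col.names: logical; should the names of the rows and/or columns of mat be shown? If TRUE (default), rownames(mat) and/or colnames(mat) are used. Alternatively, character vectors with the row and/or column labels to use.
sample.names, var.names: logical; should the names of the samples and/or variables be shown? If TRUE (default), object$names$indiv and/or object$names$X are used. Alternatively, a character vector with the sample and/or variable labels to use.
X.var.names, Y.var.names: logical; should the names of the $X$- and/or $Y$-variables be shown? If TRUE (default), object$names$X and/or object$names$Y are used. Alternatively, a character vector with the $X$- and/or $Y$-variable labels to use.
row.sideColors: character vector of length nrow(mat) containing the color names for a vertical side bar that may be used to annotate the rows of mat.
col.sideColors: character vector of length ncol(mat) containing the color names for a horizontal side bar that may be used to annotate the columns of mat.
sample.sideColors: character vector of length nrow(object$X) containing the color names for a vertical side bar that may be used to annotate the samples.
var.sideColors: character vector of length ncol(object$X) containing the color names for a horizontal side bar that may be used to annotate the variables.
x.sideColors, y.sideColors: character vectors of length ncol(object$X) and ncol(object$Y) containing the color names for horizontal and vertical side bars that may be used to annotate the $X$- and/or $Y$-variables.
row.cex, col.cex: positive numbers used as cex.axis for the row or column axis labeling. The defaults currently depend only on the number of rows or columns, respectively.
mapping: character string indicating whether to map the "X", "Y" or "XY"-association matrix. See Details.
cluster: character string indicating whether to cluster "none", "row", "column" or "both". Defaults to "both".
dist.method: character vector of length two giving the distance measures used to cluster rows and columns. Possible values are "correlation" for Pearson correlation and all the distances supported by dist.
clust.method: character vector of length two giving the agglomeration methods used for rows and columns. Accepts the same values as hclust, such as "ward", "complete", etc.
transpose: logical; should the matrix be transposed for plotting? Defaults to FALSE.
center: either a logical value or a numeric vector of length equal to the number of columns of mat. See the scale function.
scale: either a logical value or a numeric vector of length equal to the number of columns of mat. See the scale function.
symkey: logical; should the color key be made symmetric about 0? Defaults to TRUE.
margins: numeric vector of length two containing the margins (see par(mar)) for column and row names respectively.
lhei, lwid: arguments passed to layout to divide the device up into two (or three if a side color is drawn) rows and two columns, with the row heights lhei and the column widths lwid.
The value returned by cim includes the row and column index permutation vectors, as returned by order.dendrogram, and objects of class "dendrogram" which describe the row and column trees produced by cim.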
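The "correlation" option of dist.method can be pictured with base R alone. The sketch below assumes the distance is one minus the Pearson correlation, which is a common convention but is not stated on this page; the simulated matrix and the object names are illustrative only.

```r
set.seed(7)
mat <- matrix(rnorm(60), nrow = 6, ncol = 10)

# Assumed definition: distance = 1 - Pearson correlation.
# Rows are compared through cor(t(mat)), columns through cor(mat).
row.dist <- as.dist(1 - cor(t(mat), method = "pearson"))
col.dist <- as.dist(1 - cor(mat, method = "pearson"))

# Any agglomeration method accepted by hclust can then be applied.
row.hc <- hclust(row.dist, method = "complete")
col.hc <- hclust(col.dist, method = "complete")
```

Any other measure supported by dist (for example "euclidean" or "manhattan") would replace the two as.dist() lines with dist(mat, method = ...) and dist(t(mat), method = ...).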
A Clustered Image Map (CIM) is a two-dimensional visualization of a real-valued matrix (basically image(t(mat))) with rows and/or columns reordered according to a hierarchical clustering method to identify interesting patterns. The dendrograms generated by the clustering are added to the left side and to the top of the image. By default, rows and columns are clustered with the complete linkage method and the Euclidean distance.
For the "pca", "spca", "ipca", "sipca", "plsda", "splsda" and "mlsplsda" methods, the mat matrix is object$X.
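The default behaviour described above (Euclidean distance, complete linkage, rows and columns reordered by their dendrograms) can be approximated in base R. This is only a sketch of the idea on simulated data, not the code used by cim; the object names are illustrative.

```r
set.seed(42)
mat <- matrix(rnorm(40), nrow = 8, ncol = 5,
              dimnames = list(paste0("r", 1:8), paste0("c", 1:5)))

# Cluster rows and columns: Euclidean distance, complete linkage
row.hc <- hclust(dist(mat, method = "euclidean"), method = "complete")
col.hc <- hclust(dist(t(mat), method = "euclidean"), method = "complete")

# Leaf order of each dendrogram gives the row/column permutation
rowInd <- order.dendrogram(as.dendrogram(row.hc))
colInd <- order.dendrogram(as.dendrogram(col.hc))

# Reorder the matrix and draw it as an image
reordered <- mat[rowInd, colInd]
image(t(reordered), axes = FALSE)
```

The permutation vectors play the same role as the index vectors returned through order.dendrogram, and plot(as.dendrogram(row.hc)) would draw the corresponding row tree.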
For the remaining methods, if mapping = "X" or mapping = "Y", the mat matrix is object$X or object$Y respectively. If mapping = "XY":
For the rcc method, the matrix mat is created where element $(j,k)$ is the scalar product between the pair of vectors of dimension length(comp) representing the variables $X_j$ and $Y_k$ on the axes defined by $Z_i$ with $i$ in comp, where $Z_i$ is the equiangular vector between the $i$-th $X$ and $Y$ canonical variates.
For the pls, spls and mlspls methods, if object$mode is "regression", the element $(j,k)$ of the matrix mat is given by the scalar product between the pair of vectors of dimension length(comp) representing the variables $X_j$ and $Y_k$ on the axes defined by $U_i$ with $i$ in comp, where $U_i$ is the $i$-th $X$ variate. If object$mode is "canonical", then $X_j$ and $Y_k$ are represented on the axes defined by $U_i$ and $V_i$ respectively.
By default four components are displayed in the plot: the color key at the top left, the column dendrogram at the top right, the row dendrogram at the bottom left and the image plot at the bottom right. When sideColors are provided, an additional row or column is inserted in the appropriate location. This layout can be overridden by specifying appropriate values for lwid and lhei: lwid controls the column widths and lhei controls the row heights. See the help page for layout for details on how to use these arguments.
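As a rough illustration of how lwid and lhei act, the sketch below sets up a two-by-two layout with base R's layout(). The panel numbering in lmat and the chosen widths and heights are assumptions made for the example, not the values used internally by cim.

```r
# Hypothetical arrangement of the four panels (numbers give the plotting order)
lmat <- rbind(c(1, 2),   # color key      | column dendrogram
              c(3, 4))   # row dendrogram | image plot

lwid <- c(1, 4)  # relative column widths: narrow left column, wide right column
lhei <- c(1, 4)  # relative row heights: short top row, tall bottom row

n.panels <- layout(lmat, widths = lwid, heights = lhei)
layout.show(n.panels)  # outline the panels on the current device
```

Increasing the first element of lwid or lhei gives more room to the dendrograms and the color key at the expense of the image.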
For the visualization of high-dimensional data sets, a zooming tool is available. Setting zoom = TRUE opens a new device, so that one device shows the CIM and the other shows the zoomed region, and starts an interactive zoom process: click two points in the image map region with the first mouse button. A rectangle is then drawn around the selected region, which is displayed enlarged on the new device. The process can be repeated to zoom in on other regions of interest.
The zoom process is terminated by clicking the second button and selecting 'Stop' from the menu, or from the 'Stop' menu on the graphics window.
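The "XY" mapping described earlier for pls-type objects in regression mode can be sketched numerically. The example below assumes that each variable is represented by its correlations with the variates, as in a correlation circle plot, and uses the first two principal components of a simulated X as a hypothetical stand-in for the variates $U_1$, $U_2$; a fitted object would supply its own variates.

```r
set.seed(1)
n <- 30
X <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("X", 1:5)))
Y <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, paste0("Y", 1:3)))

# Hypothetical stand-in for the X-variates U_1, U_2 (here comp = c(1, 2))
U <- prcomp(X, center = TRUE, scale. = FALSE)$x[, 1:2]

# Coordinates of each variable on the axes defined by U_1 and U_2
# (assumed to be correlations with the variates)
x.coord <- cor(X, U)   # one row per X variable, one column per component
y.coord <- cor(Y, U)   # one row per Y variable, one column per component

# Element (j, k) is the scalar product of the coordinate vectors of X_j and Y_k
sim <- x.coord %*% t(y.coord)
```

The resulting sim matrix has one row per $X$-variable and one column per $Y$-variable, and it is a matrix of this shape that would then be clustered and displayed.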
Weinstein, J. N., Myers, T. G., O'Connor, P. M., Friend, S. H., Fornace Jr., A. J., Kohn, K. W., Fojo, T., Bates, S. E., Rubinstein, L. V., Anderson, N. L., Buolamwini, J. K., van Osdol, W. W., Monks, A. P., Scudiero, D. A., Sausville, E. A., Zaharevitz, D. W., Bunow, B., Viswanadhan, V. N., Johnson, G. S., Wittes, R. E. and Paull, K. D. (1997). An information-intensive approach to the molecular pharmacology of cancer. Science 275, 343-349.
Gonzalez, I., Le Cao, K. A., Davis, M. J. and Dejean, S. (2012). Visualising associations between paired 'omics' data sets. BioData Mining 5(1).
heatmap, hclust, plotVar, plot3dVar, network.
library(mixOmics)
## default method: shows cross-correlation between 2 data sets
#------------------------------------------------------------------
data(nutrimouse)
X <- nutrimouse$lipid
Y <- nutrimouse$gene
cim(cor(X, Y), cluster = "none")
## CIM representation for objects of class 'rcc'
#------------------------------------------------------------------
nutri.rcc <- rcc(X, Y, ncomp = 3, lambda1 = 0.064, lambda2 = 0.008)
cim(nutri.rcc, xlab = "genes", ylab = "lipids", margins = c(5, 6))
#-- interactive 'zoom' available as below
cim(nutri.rcc, xlab = "genes", ylab = "lipids", margins = c(5, 6),
    zoom = TRUE)
#-- select the region and "see" the zoomed region, click on 'finish' or 'exit' to get out
# RStudio might throw a warning message.
#-- cim from X matrix with a side bar to indicate the diet
diet.col <- palette()[as.numeric(nutrimouse$diet)]
cim(nutri.rcc, mapping = "X", sample.names = nutrimouse$diet,
    sample.sideColors = diet.col, xlab = "lipids",
    clust.method = c("ward", "ward"), margins = c(6, 4))
#-- cim from Y matrix with a side bar to indicate the genotype
geno.col <- color.mixo(as.numeric(nutrimouse$genotype))
cim(nutri.rcc, mapping = "Y", sample.names = nutrimouse$genotype,
    sample.sideColors = geno.col, xlab = "genes",
    clust.method = c("ward", "ward"))
## CIM representation for objects of class 'spca' (also works for sipca)
#------------------------------------------------------------------
data(liver.toxicity)
X <- liver.toxicity$gene
liver.spca <- spca(X, ncomp = 2, keepX = c(30, 30), scale = FALSE)
dose.col <- color.mixo(as.numeric(as.factor(liver.toxicity$treatment[, 3])))
# side bar, no variable names shown
cim(liver.spca, sample.sideColors = dose.col, var.names = FALSE,
    sample.names = liver.toxicity$treatment[, 3],
    clust.method = c("ward", "ward"))
## CIM representation for objects of class '(s)pls'
#------------------------------------------------------------------
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- liver.toxicity$clinic
liver.spls <- spls(X, Y, ncomp = 3,
                   keepX = c(20, 50, 50), keepY = c(10, 10, 10))
# default
cim(liver.spls)
# transpose matrix, choose clustering method
cim(liver.spls, transpose = TRUE,
    clust.method = c("ward", "ward"), margins = c(5, 7))
# Here we visualise only the X variables selected
cim(liver.spls, mapping="X")
# Here we should visualise only the Y variables selected
cim(liver.spls, mapping="Y")
# Here we only visualise the similarity matrix between the variables by spls
cim(liver.spls, cluster="none")
# plotting two data sets with the similarity matrix as input in the function
# (see our BioData Mining paper for more details)
# Only the variables selected by the sPLS model in X and Y are represented
cim(liver.spls, mapping="XY")
# on the X matrix only, side col var to indicate dose
dose.col <- color.mixo(as.numeric(as.factor(liver.toxicity$treatment[, 3])))
cim(liver.spls, mapping = "X", sample.sideColors = dose.col,
    sample.names = liver.toxicity$treatment[, 3])
# CIM default representation includes the total of 120 genes selected, with the dose color
# with a sparse method, show only the variables selected on specific components
cim(liver.spls, comp = 1)
cim(liver.spls, comp = 2)
cim(liver.spls, comp = c(1,2))
cim(liver.spls, comp = c(1,3))
## CIM representation for objects of class '(s)plsda'
#------------------------------------------------------------------
# Setting up the Y outcome first
Y <- liver.toxicity$treatment[, 3]
liver.splsda <- splsda(X, Y, ncomp = 2, keepX = c(40, 30))
cim(liver.splsda, sample.sideColors = dose.col, sample.names = Y)
## CIM representation for objects of class splsda 'multilevel'
# with a two level factor (repeated sample and time)
#------------------------------------------------------------------
data(vac18.simulated)
X <- vac18.simulated$genes
design <- data.frame(samp = vac18.simulated$sample,
                     time = vac18.simulated$time,
                     stim = vac18.simulated$stimulation)
res.2level <- multilevel(X, ncomp = 2, design = design,
                         keepX = c(120, 10), method = 'splsda')
#define colors for the levels: stimulation and time
stim.col <- c("darkblue", "purple", "green4","red3")
stim.col <- stim.col[as.numeric(design$stim)]
time.col <- c("orange", "cyan")[as.numeric(design$time)]
# The row side bar indicates the two levels of the factor, stimulation and time.
# The sample names have been modified on the plot.
cim(res.2level, sample.sideColors = cbind(stim.col, time.col),
    sample.names = paste(design$time, design$stim, sep = "_"),
    var.names = FALSE,
    # setting up legend:
    legend = list(legend = c(levels(design$time), levels(design$stim)),
                  col = c("orange", "cyan", "darkblue", "purple", "green4", "red3"),
                  title = "Condition", cex = 0.7)
)
## CIM representation for objects of class spls 'multilevel'
#------------------------------------------------------------------
data(liver.toxicity)
repeat.indiv <- c(1, 2, 1, 2, 1, 2, 1, 2, 3, 3, 4, 3, 4, 3, 4, 4, 5, 6, 5, 5,
                  6, 5, 6, 7, 7, 8, 6, 7, 8, 7, 8, 8, 9, 10, 9, 10, 11, 9, 9,
                  10, 11, 12, 12, 10, 11, 12, 11, 12, 13, 14, 13, 14, 13, 14,
                  13, 14, 15, 16, 15, 16, 15, 16, 15, 16)
# sPLS is an unsupervised technique, and so we only indicate the sample repetitions
# in the design (1 factor only here, sample)
# sPLS takes as an input 2 data sets, and the variables selected
design <- data.frame(sample = repeat.indiv)
res.spls.1level <- multilevel(X = liver.toxicity$gene,
                              Y = liver.toxicity$clinic,
                              design = design,
                              ncomp = 2,
                              keepX = c(50, 50), keepY = c(5, 5),
                              method = 'spls',
                              mode = 'canonical')
stim.col <- c("darkblue", "purple", "green4","red3")
# showing only the Y variables, and only those selected in comp 1
cim(res.spls.1level, mapping = "Y",
    sample.sideColors = stim.col[factor(liver.toxicity$treatment[, 3])], comp = 1,
    # setting up legend: labels follow the factor level order used for the colors
    legend = list(legend = levels(factor(liver.toxicity$treatment[, 3])), col = stim.col,
                  title = "Dose", cex = 0.9))
# showing only the X variables, for all selected on comp 1 and 2
cim(res.spls.1level, mapping = "X",
    sample.sideColors = stim.col[factor(liver.toxicity$treatment[, 3])],
    # setting up legend: labels follow the factor level order used for the colors
    legend = list(legend = levels(factor(liver.toxicity$treatment[, 3])), col = stim.col,
                  title = "Dose", cex = 0.9))
# These are the cross correlations between the variables selected in X and Y.
# The similarity matrix is obtained as in our BioData Mining paper
cim(res.spls.1level, mapping="XY")