This function produces the union of the first and second minimum
spanning trees (MSTs) as an object of class igraph (check package igraph
for details). It can also return the first and second minimum spanning
trees when return.MST2only is FALSE (default). It starts by calculating
the correlation (coexpression) matrix and uses it to obtain a weight
matrix for a complete graph through the equation
$w_{ij} = 1 - |r_{ij}|$, where $r_{ij}$ is the correlation between
features $i$ and $j$ and $w_{ij}$ is the weight of the link between
vertices (nodes) $i$ and $j$ in the graph $G(V,E)$.
For the graph $G(V,E)$, where $V$ is the set of vertices and $E$ is the set of
edges, the first MST is defined as the acyclic subset $T_{1} \subseteq E$
that connects all vertices in $V$ and whose total length
$\sum_{(i,j) \in T_{1}} d(v_{i},v_{j})$ is minimal, with the length
$d(v_{i},v_{j})$ of an edge given by its weight $w_{ij}$
(Rahmatallah et al. 2014). The second MST is defined as the MST of the
reduced graph $G(V,E-T_{1})$. The union of the first and second MSTs is
denoted as MST2.
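The construction described above can be sketched with base R and igraph. This is a minimal illustration under stated assumptions, not the function's own code: the toy matrix x, its feature names and the object names (g, mst1, mst2nd, mst2) are hypothetical, and features are assumed to be in rows and samples in columns.

```r
library(igraph)

set.seed(1)
# Hypothetical toy data: 8 features (rows) measured in 30 samples (columns)
x <- matrix(rnorm(8 * 30), nrow = 8,
            dimnames = list(paste0("g", 1:8), NULL))

# Correlation between features; cor() correlates columns, hence t()
r <- cor(t(x), method = "pearson")

# Edge weights of the complete graph: w_ij = 1 - |r_ij|
w <- 1 - abs(r)

# Complete weighted undirected graph; the diagonal is ignored
g <- graph_from_adjacency_matrix(w, mode = "undirected",
                                 weighted = TRUE, diag = FALSE)

# First MST: mst() uses the 'weight' edge attribute by default
mst1 <- mst(g)

# Second MST: the MST of the reduced graph G(V, E - T1)
g_reduced <- igraph::difference(g, mst1)
mst2nd <- mst(g_reduced)

# MST2: union of the two (disjoint) edge sets
mst2 <- igraph::union(mst1, mst2nd)
```

Because both trees carry a weight attribute, the union stores them under suffixed names (weight_1, weight_2). Note also that graph_from_adjacency_matrix() treats a weight of exactly zero as a missing edge, so perfectly correlated feature pairs would need separate handling in this sketch.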
It was shown in Rahmatallah et al. 2014 that MST2 can be used as a graphical
visualization tool to highlight the most highly correlated genes in the
correlation network. A gene that is highly correlated with all the other genes
tends to occupy a central position and has a relatively high degree in the MST2
because the shortest paths connecting the vertices of the first and second MSTs
tend to pass through the vertex corresponding to this gene. In contrast, a gene
with low intergene correlations most likely occupies a non-central position in
the MST2 and has a degree of 2.
In rare cases, a feature may have a constant or nearly constant level across
the samples. This results in a zero or tiny standard deviation. Such a case
produces an error in the command cor used to compute the correlations
between features. To avoid this situation, standard deviations are checked in
advance and, if any is found below the minimum limit min.sd (default is 1e-3),
the execution stops and an error message is returned indicating the number of
features causing the problem (if there is only one, the index of that feature
is given too).
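A check of this kind might look like the following base R sketch. The helper check_sd() and the wording of its messages are hypothetical and only mimic the behaviour described above. The toy data are made up, with features assumed to be in rows.

```r
set.seed(2)
# Hypothetical toy data: 5 features (rows) in 20 samples (columns)
x <- matrix(rnorm(5 * 20), nrow = 5)
x[3, ] <- 7   # make feature 3 constant across samples

# check_sd() is a hypothetical helper, not the package's own code
check_sd <- function(x, min.sd = 1e-3) {
  # Per-feature standard deviation across samples
  sds <- apply(x, 1, sd)
  bad <- which(sds < min.sd)
  if (length(bad) == 1) {
    # Only one offending feature: report its index
    stop("feature ", bad, " has a standard deviation below ", min.sd)
  } else if (length(bad) > 1) {
    # Several offending features: report how many
    stop(length(bad), " features have a standard deviation below ", min.sd)
  }
  invisible(TRUE)
}

# Capture the error message instead of halting the session
res <- tryCatch(check_sd(x), error = function(e) conditionMessage(e))
print(res)
```

Running the check before calling cor() means the problem is reported with the offending feature's index rather than surfacing later in the correlation step.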