Aggregate the resulting clustering of the SOM algorithm into super-clusters.
# S3 method for somRes
superClass(sommap, method="ward.D", members=NULL, k=NULL,
           h=NULL, ...)
# S3 method for somSC
print(x, ...)
# S3 method for somSC
summary(object, ...)
# S3 method for somSC
projectIGraph(object, init.graph, ...)
# S3 method for somSC
plot(x, type=c("dendrogram", "grid", "hitmap", "lines",
               "barplot", "boxplot", "mds", "color",
               "poly.dist", "pie", "graph", "dendro3d",
               "radar", "projgraph"),
     plot.var=TRUE, plot.legend=FALSE, add.type=FALSE,
     print.title=FALSE,
     the.titles=paste("Cluster",
                      1:prod(x$som$parameters$the.grid$dim)),
     ...)
sommap: A somRes object.
method, members: Arguments passed to the hclust function.
k, h: Arguments passed to the cutree function (respectively, the number of super-clusters or the height at which to cut the dendrogram).
x, object: A somSC object.
init.graph: An igraph object which is projected according to the super-clusters. The number of vertices of init.graph must be equal to the number of rows in the original dataset processed by the SOM (case "korresp" is not handled by this function). In the projected graph, the vertices are positioned at the center of gravity of the super-clusters (more details in the section Details below).
type: The type of plot to draw. Default value is "dendrogram", which plots the dendrogram of the clustering. Case "grid" plots the grid colored according to the super-clustering. Case "projgraph" uses an igraph object passed to the argument variable and plots the projected graph as defined by the function projectIGraph.somSC. All other cases are those available in the function plot.somRes; the super-clusters are superimposed over these plots.
plot.var: A boolean indicating whether a graph showing the evolution of the explained variance should be plotted. This argument is only used when type="dendrogram". Its default value is TRUE.
plot.legend: A boolean indicating whether a legend should be added to the plot. This argument is only used when type is "grid", "hitmap" or "mds". Its default value is FALSE.
add.type: A boolean, whose default value is FALSE, indicating whether an additional variable is supplied to the argument variable. If so, the function plot.somRes is called with the argument what set to "add".
print.title: Whether the cluster titles must be printed in the center of the grid for type="grid". Defaults to FALSE (titles not displayed).
the.titles: If print.title = TRUE, the titles to display for type="grid". Defaults to "Cluster " followed by the cluster number.
...: Used for plot.somSC: further arguments passed either to the function plot (case type="dendrogram"), to plot.myGrid (case type="grid"), or to plot.somRes (all other cases).
The superClass function returns an object of class somSC, which is a list of the following elements:
cluster: The super-clustering of the prototypes (only if either k or h is given by the user).
tree: An hclust object.
som: The somRes object given as argument (see trainSOM for details).
The projectIGraph.somSC function returns an object of class igraph with the following attributes:
the graph attribute layout, which provides the layout of the projected graph according to the center of gravity of the super-clusters positioned on the SOM grid;
the vertex attributes name and size, which are, respectively, the vertex number on the grid and the number of vertices included in the corresponding cluster;
the edge attribute weight, which gives the number of edges (or the sum of the weights) between the vertices of the two corresponding clusters.
The superClass function can be used in 2 ways:
to choose the number of super-clusters via an hclust object: in this case, neither argument k nor argument h is supplied;
to cut the clustering into super-clusters: in this case, either argument k or argument h must be supplied. See cutree for details on these arguments.
The squared distance between prototypes is passed to the algorithm.
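The two modes described above can be sketched with base R alone. This is a minimal illustration, not the package's implementation: the matrix `prototypes` is a hypothetical stand-in for the SOM prototypes, and the cutting height is chosen arbitrarily.

```r
# Hypothetical stand-in for the prototypes of a 5 x 5 SOM on 4 variables
set.seed(1)
prototypes <- matrix(rnorm(25 * 4), nrow = 25, ncol = 4)

# Squared Euclidean distances between prototypes, kept as a "dist" object
sq_dist <- dist(prototypes)^2

# Mode 1: build the hierarchy only, then inspect it to choose k
tree <- hclust(sq_dist, method = "ward.D")
# plot(tree)  # uncomment to look at the dendrogram

# Mode 2: cut the hierarchy, either by number of groups (k) or by height (h)
by_k <- cutree(tree, k = 4)
by_h <- cutree(tree, h = median(tree$height))  # arbitrary height, for illustration

table(by_k)
```

With neither k nor h, only the tree is available and the dendrogram serves to pick a sensible number of groups; supplying one of them yields a group label for each prototype.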
summary on a somSC object (as returned by superClass) produces a complete summary of the results: it displays the number of clusters and super-clusters and the clustering itself, and performs ANOVA analyses. For type="numeric", the ANOVA is performed for each input variable and tests the difference of this variable across the super-clusters of the map. For type="relational", a dissimilarity ANOVA is performed (see (Anderson, 2001)), except that in the present version, a crude estimate of the p-value is used, which is based on the Fisher distribution and not on a permutation test.
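To make the numeric case concrete, the per-variable ANOVA can be sketched with base R. This is only an illustration of the idea, assuming a hypothetical grouping `super_cluster` obtained by clustering the observations directly rather than through a SOM; it is not the code run by summary.

```r
# Hypothetical grouping: hierarchical clustering of the iris measurements
x <- iris[, 1:4]
super_cluster <- factor(cutree(hclust(dist(x)^2, method = "ward.D"), k = 3))

# One-way ANOVA per variable: does its mean differ across the groups?
p_values <- sapply(x, function(v) {
  fit <- aov(v ~ super_cluster)
  summary(fit)[[1]][["Pr(>F)"]][1]  # p-value of the F test
})

p_values
```

Each p-value comes from the Fisher distribution of the F statistic, which matches the kind of test described above for numeric data.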
On plots, the different super classes are identified in the following ways:
either with different colors, when type is set among: "grid" (*, #), "hitmap" (*, #), "lines" (*, #), "barplot" (*, #), "boxplot", "mds" (*, #), "dendro3d" (*, #), "graph" (*, #);
or with titles, when type is set among: "color" (*), "poly.dist" (*, #), "pie" (#), "radar" (#).
In the list above, the charts available for a korresp SOM are marked with a * whereas those available for a relational SOM are marked with a #.
projectIGraph.somSC produces a projected graph from the igraph object passed to the argument variable, as described in (Olteanu and Villa-Vialaneix, 2015). The attributes of this graph are the same as the ones obtained from the SOM map itself in the function projectIGraph.somRes. plot.somSC used with type="projgraph" calculates this graph and represents it by positioning the super-vertices at the center of gravity of the super-clusters. This feature can be combined with pie.graph=TRUE to superimpose the information from an external factor related to the individuals in the original dataset (or, equivalently, to the vertices of the graph).
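The size and weight attributes of a projected graph can be illustrated with plain matrices. The sketch below is a base R approximation under stated assumptions: the adjacency matrix `adj` and the grouping `cl` are hypothetical toy data, and dropping within-cluster edges is a choice made here, not something confirmed by the package.

```r
# Toy undirected graph on 6 vertices, stored as an adjacency matrix
adj <- matrix(0, 6, 6)
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(5, 6), c(1, 6))
adj[edges] <- 1
adj <- adj + t(adj)  # make it symmetric

# Hypothetical super-cluster of each vertex
cl <- c(1, 1, 2, 2, 3, 3)

# Indicator matrix: rows are vertices, columns are super-clusters
member <- outer(cl, sort(unique(cl)), "==") * 1

# "size": number of original vertices in each super-cluster
size <- colSums(member)

# "weight": number of edges between the vertices of two super-clusters
weight <- t(member) %*% adj %*% member
diag(weight) <- 0  # assumption: within-cluster edges are not kept

size
weight
```

For a weighted graph, filling `adj` with edge weights instead of 0/1 values gives the sum of the weights between clusters, in line with the description of the weight attribute.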
Anderson M.J. (2001). A new method for non-parametric multivariate analysis of variance. Austral Ecology, 26, 32-46.
Olteanu M., Villa-Vialaneix N. (2015). Using SOMbrero for clustering and visualizing graphs. Journal de la Societe Francaise de Statistique, 156, 95-119.
library(SOMbrero)
set.seed(11051729)
my.som <- trainSOM(x.data=iris[,1:4])
# choose the number of super-clusters: only the dendrogram is computed
sc <- superClass(my.som)
plot(sc)
# cut the clustering into 4 super-clusters
sc <- superClass(my.som, k=4)
summary(sc)
plot(sc)
# color the hitmap by super-cluster and add a legend
plot(sc, type="hitmap", plot.legend=TRUE)