somRes results
Useful details on how to produce graphics that help interpret a somRes
object.
Important: all these graphics are available when the algorithm's type is
"numeric"; those which are available for a korresp SOM are marked with a * and
those which are available for a relational SOM are marked with a #.
For what="obs", the possible values for type are: "hitmap" (*, #), "color",
"lines", "barplot", "names" (*, #), "boxplot" and "radar".
For the cases what="obs" and what="add", if a neuron is empty, nothing will be
plotted at its location.
"hitmap" (*, #) plots areas proportional to the number of observations per
neuron. It is the default plot when what="obs".
"color" can have two more arguments: var, the index of the variable to be
considered (default, 1), and my.palette, the colors to be used. Neurons are
filled using the given colors according to the average value of the
observations for the chosen variable.
"lines" plots, for each neuron, the average value of the observations as a
line. Each point represents a variable. All variables of the dataset used to
train the algorithm are plotted.
"barplot" is similar to "lines" but uses barplots: each bar represents a
variable.
"radar" is similar to "lines" but uses radar charts: each slice represents a
variable. If needed, a legend can be added; its location has to be passed
through the key.loc argument (see stars).
"names" (*, #) prints on the grid the element names (i.e., the names of the
rows) in the neuron to which each element belongs.
"boxplot" plots, in every neuron, boxplots of the observations assigned to it.
This case can handle 5 variables at most. The default behavior is to plot the
boxplots for the first 5 variables of the data set; if there are fewer than 5
variables in the data set, they will all be plotted.
When the algorithm's type is korresp or relational, only the types "hitmap"
and "names" are available.
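The observation plots described above can be sketched as follows. This is a minimal illustration, not a reference usage: it assumes the SOMbrero package (which provides trainSOM and the plot method for somRes objects) is installed, and it only uses the "hitmap" and "color" types with their default arguments. The seed value is arbitrary.

```r
# Assumes the SOMbrero package is installed; the block is skipped otherwise.
if (requireNamespace("SOMbrero", quietly = TRUE)) {
  library(SOMbrero)
  set.seed(42)  # SOM training is stochastic; fix the seed for reproducibility
  iris.som <- trainSOM(x.data = iris[, 1:4])

  # default plot for observations: areas proportional to neuron sizes
  plot(iris.som, what = "obs", type = "hitmap")

  # neurons colored by the average value of the first variable (the default)
  plot(iris.som, what = "obs", type = "color")
}
```

Because the algorithm's type is "numeric" here, the other observation types listed above should be available as well.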
The what="energy" graphic is only available if some intermediate backups have
been registered (i.e., x$parameters$nb.save>1). It plots the evolution of the
energy level over the registered steps.
For what="prototypes", the possible values for type are: "3d" (*), "lines"
(*, #), "barplot" (*, #), "radar" (*, #), "color" (*), "smooth.dist" (*, #),
"poly.dist" (*, #), "umatrix" (*, #), "mds" (*, #) and "grid.dist" (*, #).
"lines" (*, #) has the same behavior as the "lines" case described in the
observations section, but according to the prototypes level;
"barplot" (*, #) has the same behavior as the "barplot" case described in the
observations section, but according to the prototypes level;
"radar" (*, #) has the same behavior as the "radar" case described in the
observations section, but according to the prototypes level;
"color" (*) has the same behavior as the "color" case described in the
observations section, but according to the prototypes level;
"3d" is similar to the "color" case, but in 3 dimensions, with x and y the
coordinates of the grid and z the value of the prototypes for the considered
variable;
"smooth.dist" (*, #) depicts the average distance between a prototype and its
neighbors on a map where x and y are the coordinates of the prototypes on the
grid;
"poly.dist" (*, #) also represents the distances between prototypes but with
polygons plotted for each neuron. The closer a polygon vertex is to the
border, the closer the corresponding pair of prototypes. The color used for
filling the polygon shows the number of observations in each neuron. A white
polygon means that there is no observation. With the default colors, a red
polygon means a high number of observations;
"umatrix" (*, #) is another way of plotting distances between prototypes. The
grid is plotted and filled with my.palette colors according to the mean
distance between the current neuron and the neighboring neurons. With the
default colors, red indicates proximity.
"mds" (*, #) plots the number of each neuron on a map according to a
Multidimensional Scaling (MDS) projection onto a two-dimensional space.
"grid.dist" (*, #) plots all pairwise distances on a two-dimensional map. The
number of points on this picture is equal to:
\(\frac{\textrm{number of neurons}\times(\textrm{number of neurons}-1)}{2}\).
The x axis corresponds to the prototype distances whereas the y axis depicts
the grid distances.
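The point count given for "grid.dist" can be checked with a short base R sketch. Everything here is a stand-in chosen for illustration: the 5 x 5 grid, the random "prototypes" and the Euclidean distance are assumptions, and the distances actually used by the package may differ.

```r
# Hypothetical 5 x 5 grid: 25 neurons with integer grid coordinates.
grid_coords <- expand.grid(x = 1:5, y = 1:5)
n_neurons <- nrow(grid_coords)

# Stand-in prototypes (random values); a trained SOM would supply real ones.
set.seed(1)
prototypes <- matrix(rnorm(n_neurons * 4), nrow = n_neurons)

grid_d  <- dist(grid_coords)  # pairwise Euclidean distances on the grid
proto_d <- dist(prototypes)   # pairwise Euclidean distances between prototypes

# One point per unordered pair of neurons: n * (n - 1) / 2
n_pairs <- n_neurons * (n_neurons - 1) / 2
stopifnot(length(grid_d) == n_pairs, length(proto_d) == n_pairs)

# Same axis layout as described above: prototype distances on x, grid on y.
plot(as.vector(proto_d), as.vector(grid_d),
     xlab = "prototype distances", ylab = "grid distances")
```

With 25 neurons this gives 300 points, one for each pair of distinct neurons.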
The case what="add" considers an additional variable, which has to be given to
the argument variable. Its length must match the number of observations in the
original data. Then the possible values for type are: "pie" (#), "color" (#),
"lines" (#), "boxplot" (#), "barplot" (#), "radar" (#), "names" (#), "words"
(#) and "graph" (#).
"color" (#) has the same behavior as the "color" case described in the
observations section. Then, the additional variable must be a numerical
vector;
"lines" (#) has the same behavior as the "lines" case described in the
observations section. Then, the additional variable must be a numerical matrix
or a data frame;
"boxplot" (#) has the same behavior as the "boxplot" case described in the
observations section. Then, the additional variable must be either a numeric
vector or a numeric matrix/data frame;
"barplot" (#) has the same behavior as the "barplot" case described in the
observations section. Then, the additional variable must be either a numeric
vector or a numeric matrix/data frame;
"radar" (#) has the same behavior as the "radar" case described in the
observations section. Then, the additional variable must be a numerical matrix
or data frame;
"pie" requires the argument variable to be a factor vector and plots one pie
for each neuron according to this factor;
"names" (#) has the same behavior as the "names" case described in the
observations section. Then, the names to be printed are the elements of the
variable given to the variable argument;
"words" (#) needs the argument variable to be a contingency table: names of
the columns will be used as words and the values express the frequency of a
given word in the observation. Then, for each neuron of the grid, the words
will be printed with sizes proportional to their frequency in the neuron;
The last option is "graph" (#). The argument variable must be an igraph object
(see library(igraph)). According to the existing edges in the graph and to the
clustering obtained with the SOM algorithm, a clustered graph will be produced
where each vertex represents a neuron and the width of an edge is proportional
to the number of edges in the given graph between the vertices assigned to the
corresponding neurons. The option can handle two more arguments: pie.graph and
pie.variable. These are used to display the vertices as pie charts. For this
case, pie.graph must be set to TRUE and a factor vector must be supplied
through pie.variable.
When the algorithm's type is korresp, no graphic is available for what="add".
All these graphics are available for a relational SOM.
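The shapes expected for the additional variable can be prepared and checked in base R before plotting. This is only a sketch: the object names are hypothetical, the iris data stand in for the training set, and the toy "words" table (one row per observation, species names used as words) is an assumption made for illustration.

```r
n_obs <- nrow(iris)  # number of observations used to train the map

add_factor  <- iris$Species            # factor vector, the shape "pie" expects
add_numeric <- iris$Petal.Width        # numeric vector, the shape "color" expects
add_matrix  <- as.matrix(iris[, 3:4])  # numeric matrix, for "lines" or "radar"

# Toy contingency table for "words": one row per observation, one column per
# word (species names stand in for words; purely illustrative).
add_words <- unclass(table(seq_len(n_obs), iris$Species))

# Each additional variable must match the number of observations.
stopifnot(
  is.factor(add_factor),   length(add_factor)  == n_obs,
  is.numeric(add_numeric), length(add_numeric) == n_obs,
  is.numeric(add_matrix),  nrow(add_matrix)    == n_obs,
  nrow(add_words) == n_obs
)
```

Running such checks first makes a length or type mismatch visible before the plot call is made.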
Further arguments, their reference functions and the plot.somRes cases are
summarized in the following list:
plot is called by the cases: what="energy", type="lines",
what="prototypes"/type="mds"
plot.myGrid is called by the cases: what="obs"/type="hitmap", type="color",
what="prototypes"/type="poly.dist", what="prototypes"/type="umatrix"
plot.igraph is called by the case what="add"/type="graph"
pie is called by the case what="add"/type="pie"
barplot is called by the cases type="barplot"
boxplot is called by the cases type="boxplot"
stars is called by the cases type="radar"
persp is called by the case what="prototypes"/type="3d"
wordcloud is called by the cases: type="names", what="add"/type="words"
# load the package providing trainSOM and the plot method for somRes objects
library(SOMbrero)
# run the SOM algorithm on the numerical data of the 'iris' data set,
# with nb.save=2 so that the energy plot is available
iris.som <- trainSOM(x.data=iris[,1:4], nb.save=2)
# plots
# on energy
plot(iris.som, what="energy")
# on prototypes
plot(iris.som, what="prototypes", type="3d", variable="Sepal.Length")
# on an additional variable: the flower species
plot(iris.som, what="add", type="pie", variable=iris$Species)