DBS is a flexible and robust clustering framework that consists
of three independent modules. The first module is the parameter-free
projection method Pswarm Pswarm
, which exploits the concepts of self-organization
and emergence, game theory, swarm intelligence and symmetry considerations.
The second module is a parameter-free high-dimensional data visualization technique,
which generates projected points on a topographic map with hypsometric colors GeneratePswarmVisualization
,
called the generalized U-matrix. The third module is a clustering method with no
sensitive parameters DBSclustering
. The clustering can be verified by the visualization and vice versa.
The term DBS refers to the method as a whole.
The GeneratePswarmVisualization
function generates the special case (please see [Thrun, 2018]) of the generalized Umatrix with the help of an unsupervised neural network (simplified emergent self-organizing map published in [Thrun/Ultsch, 2020]). From the generalized Umatrix a topographic map with hypsometric tints can be visualized. To see this visualization use plotTopographicMap
of the package GeneralizedUmatrix.
GeneratePswarmVisualization(Data, ProjectedPoints, LC, PlotIt=FALSE,
ComputeInR=FALSE, Parallel=TRUE, Tiled = FALSE, DataPerEpoch = 1)
list of
matrix [1:n,1:2], BestMatches of the Umatrix, contrary to ESOM they are always fixed, because predefined by GridPoints.
matrix [1:Lines,1:Columns],
array [1:Lines,1:Columns,1:d], d is the dimension of the weights, the same as in the ESOM algorithm
matrix [1:n,1:2],quantized projected points: projected points now lie on a predefined grid.
c(Lines,Columns), normally equal to grid size of Pswarm, sometimes it a better or a lower resolution for the visualization is better. Therefore here the grid size of the neurons is given back.
If PlotIt=FALSE: NULL, otherwise plotly object for ploting topview of topographic map
[1:n,1:d] array of data: n cases in rows, d variables in columns
matrix, ProjectedPoints[1:n,1:2] n by 2 matrix containing coordinates of the Projection: A matrix of the fitted configuration. see output of Pswarm
for further details
size of the grid c(Lines,Columns), number of Lines and Columns automatic calculated by setGridSize
in Pswarm
Sometimes is better to choose a different grid size, e.g. to to reduce computional effort contrary to SOM, here the grid size defined only the resolution of the visualizations The real grid size is predfined by Pswarm, but you may choose a factor x*res$LC if you so desire. Therefore, The resulting grid size is given back in the Output.
Optional, default(FALSE), If TRUE than uses plotTopographicMap
of the package GeneralizedUmatrix is plotted as a topview in the tiled option, see details for explanation.
Optional, =TRUE: Rcode, =FALSE C++ implementation
Optional, =TRUE: Parallel C++ implementation, =FALSE Sequential C++ implementation
Optional, =TRUE: arrangement of four grids for better understanding of edge behaviour, =FALSE: single grid.
Optional: Number between 0 and 1 stating the ratio of data per epoch for training of the generalized u-matrix approach.
Michael Thrun
Tiled: The topographic map is visualized 4 times because the projection is toroidal. The reason is that there are no border in the visualizations and clusters (if they exist) are not disrupted by borders of the plot.
If you used Pswarm
with distance matrix instead of a data matrix (in the sense that you do not have any data matrix available), you may transform your distances into data by using MDS
of the ProjectionBasedClustering package in order to use the GeneratePswarmVisualization
function. The correct dimension can be found through the Sheppard diagram or kruskals stress.
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, tools:::Rd_expr_doi("10.1007/978-3-658-20540-9"), 2018.
[Thrun/Ultsch, 2020] Thrun, M. C., & Ultsch, A.: Uncovering High-Dimensional Structures of Projections from Dimensionality Reduction Methods, MethodsX, Vol. 7, pp. 101093, tools:::Rd_expr_doi("10.1016/j.mex.2020.101093"), 2020.
Pswarm
and
plotTopographicMap
and GeneralizedUmatrix
of the package GeneralizedUmatrix
data("Lsun3D")
Data=Lsun3D$Data
Cls=Lsun3D$Cls
InputDistances=as.matrix(dist(Data))
# \donttest{
projList=Pswarm(InputDistances)
genUmatrixList=GeneratePswarmVisualization(Data,projList$ProjectedPoints,projList$LC)
library(GeneralizedUmatrix)
plotTopographicMap(genUmatrixList$Umatrix,genUmatrixList$Bestmatches,Cls)
# }
data2=matrix(runif(n = 100),10,10)
distance=as.matrix(dist(data2))
projList2=Pswarm(distance, LC = c(10,12))
genUmatrixList=GeneratePswarmVisualization(matrix(runif(n = 100),10,10),projList2$ProjectedPoints,projList2$LC)
Run the code above in your browser using DataLab