DatabionicSwarm (version 1.1.0)

DBSclustering: Databionic swarm clustering (DBS)

Description

Automated clustering approach of the Databionic swarm with abstract U-distances, which are geodesic distances based on high-dimensional distances combined with low-dimensional graph paths computed with ShortestGraphPathsC.
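
The idea of combining high-dimensional distances with low-dimensional graph paths can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data, the 2-nearest-neighbour graph on the projected points, and the Floyd-Warshall loop are all chosen here for demonstration, and ShortestGraphPathsC is not called.

```r
# Hypothetical toy input: 5 points in 3 dimensions, one data point per row
Data <- rbind(c(0, 0, 0), c(1, 0, 0), c(2, 0, 1), c(3, 1, 1), c(4, 1, 2))
# Hypothetical 2-D projected positions, one row per data point
Projected <- cbind(c(1, 2, 3, 4, 5), c(1, 1, 2, 2, 3))

n <- nrow(Data)
HighDist <- as.matrix(dist(Data))       # distances in the input space
LowDist  <- as.matrix(dist(Projected))  # distances on the projection plane

# Assumption: link every point to its 2 nearest neighbours in the projection
kNN <- 2
Adj <- matrix(FALSE, n, n)
for (i in seq_len(n)) {
  nb <- order(LowDist[i, ])[2:(kNN + 1)]  # position 1 is the point itself
  Adj[i, nb] <- TRUE
}
Adj <- Adj | t(Adj)                       # undirected graph

# Edges are weighted with the high-dimensional distances, non-edges with Inf
Geo <- ifelse(Adj, HighDist, Inf)
diag(Geo) <- 0

# Floyd-Warshall: shortest path lengths between all pairs of points
for (m in seq_len(n)) {
  Geo <- pmin(Geo, outer(Geo[, m], Geo[m, ], "+"))
}

round(Geo, 2)
```

Each entry of Geo is the length of the shortest path along the low-dimensional graph, measured with high-dimensional edge weights, so it can never be smaller than the direct high-dimensional distance between the same two points.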

Usage

DBSclustering(k, DataOrDistance, BestMatches, LC, StructureType = TRUE, PlotIt = FALSE,
                 method = "euclidean",...)

Arguments

k

Number of clusters. How many do you see in the topographic map (3D landscape)?

DataOrDistance

Either a [1:n,1:d] matrix of data (n cases, d dimensions), one data point per row, or a symmetric [1:n,1:n] distance matrix

BestMatches

[1:n,1:2] matrix with the positions of the best matches (the projected points), one row per data point

LC

grid size c(Lines,Columns)

StructureType

Optional, bool; =TRUE: compact structure of clusters assumed, =FALSE: connected structure of clusters assumed. For the two options for clusters, see [Thrun, 2018] or Handl et al. 2006

PlotIt

Optional, bool; if TRUE, plots the dendrogram

method

Optional, one of the 39 distance methods of parDist of package parallelDist; used only if a data matrix is given as DataOrDistance

...

Further arguments passed on to the parDist function, e.g. user-defined distance functions
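
The difference between the two settings of StructureType can be pictured with a small base-R experiment. This is a sketch under assumptions, not a description of what DBSclustering does internally: the toy data are invented, and single linkage and Ward's method from stats::hclust are used here only as familiar stand-ins for a connectedness-based and a compactness-based criterion.

```r
# Hypothetical toy data: two parallel, elongated lines of 20 points each
x <- 0:19
Data <- rbind(cbind(x, 0), cbind(x, 3))
Truth <- rep(1:2, each = 20)   # line membership, used only for comparison
D <- dist(Data)

# Connectedness stand-in: single linkage chains neighbouring points together
ClsConnected <- cutree(hclust(D, method = "single"), k = 2)

# Compactness stand-in: Ward's method prefers tight, ball-shaped groups
ClsCompact <- cutree(hclust(D, method = "ward.D2"), k = 2)

# Cross-tabulate each result against the line membership
table(Truth, ClsConnected)
table(Truth, ClsCompact)
```

In this toy case neighbouring points on a line are 1 unit apart while the lines are 3 units apart, so the connectedness-based cut separates the two lines. A compactness-based criterion is not guaranteed to follow such elongated shapes, which is why the assumed structure type matters.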

Value

Cls [1:n] vector with selected classes of the bestmatches. You can use plotTopographicMap(Umatrix,Bestmatches,Cls) for verification.
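
Before turning to the topographic map, the returned vector can be inspected with base R. The Cls vector below is a made-up stand-in for a real result (8 data points, 3 clusters), used only to show the kind of checks that are possible.

```r
# Hypothetical result for n = 8 data points and k = 3 clusters
Cls <- c(1, 1, 2, 3, 2, 1, 3, 3)

# Number of data points per cluster
ClusterSizes <- table(Cls)

# Row indices of the data points belonging to each cluster
IndicesPerCluster <- split(seq_along(Cls), Cls)

ClusterSizes
IndicesPerCluster
```

Very small clusters in such a table can hint at outliers, which is one situation where an interactive correction of the clustering may be worthwhile.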

Details

DBS is a flexible and robust clustering framework that consists of three independent modules. The first module is the parameter-free projection method Pswarm, which exploits the concepts of self-organization and emergence, game theory, swarm intelligence and symmetry considerations. The second module is a parameter-free high-dimensional data visualization technique (GeneratePswarmVisualization), which generates projected points on a topographic map with hypsometric colors, called the generalized U-matrix. The third module is a clustering method with no sensitive parameters (DBSclustering; see [Thrun, 2018, p. 104 ff]). The clustering can be verified by the visualization and vice versa. The term DBS refers to the method as a whole.

References

[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, https://doi.org/10.1007/978-3-658-20540-9, 2018.

Examples

library(DatabionicSwarm)
data("Lsun3D")
Data=Lsun3D$Data
InputDistances=as.matrix(dist(Data))

# project the data with Pswarm
projection=Pswarm(InputDistances)
# automatic clustering without GeneralizedUmatrix visualization
Cls=DBSclustering(k=3, Data,
                  projection$ProjectedPoints, projection$LC, PlotIt=TRUE)

visualization=GeneratePswarmVisualization(Data,
                  projection$ProjectedPoints, projection$LC)
## Sometimes an automatic clustering can be improved
## through an interactive approach,
## e.g. if outliers exist (see [Thrun/Ultsch, 2017])
library(ProjectionBasedClustering)
Cls2=ProjectionBasedClustering::interactiveClustering(visualization$Umatrix,
                  visualization$Bestmatches, Cls)