# dbscan v1.1-2

## Density Based Clustering of Applications with Noise (DBSCAN) and Related Algorithms

A fast reimplementation of several density-based algorithms of
the DBSCAN family for spatial data. Includes the DBSCAN (density-based spatial
clustering of applications with noise), OPTICS (ordering points to identify
the clustering structure), and HDBSCAN (hierarchical DBSCAN) clustering
algorithms and the LOF (local outlier factor) algorithm. The implementations
use the kd-tree data structure (from library ANN) for faster k-nearest neighbor
search. An R interface to fast kNN and fixed-radius NN search is also provided.

## Readme

# dbscan - Density Based Clustering of Applications with Noise (DBSCAN) and Related Algorithms - R package

This R package provides a fast C++ (re)implementation of several density-based algorithms with a focus on the DBSCAN family for clustering spatial data. The package includes:

**Clustering**

- **DBSCAN:** Density-based spatial clustering of applications with noise.
- **HDBSCAN:** Hierarchical DBSCAN with simplified hierarchy extraction.
- **OPTICS/OPTICSXi:** Ordering points to identify the clustering structure clustering algorithms.
- **FOSC:** Framework for Optimal Selection of Clusters for unsupervised and semisupervised clustering of hierarchical cluster tree.
- **Jarvis-Patrick clustering**
- **SNN Clustering:** Shared Nearest Neighbor Clustering.

**Outlier Detection**

- **LOF:** Local outlier factor algorithm.
- **GLOSH:** Global-Local Outlier Score from Hierarchies algorithm.

**Fast Nearest-Neighbor Search (using kd-trees)**

- **kNN search**
- **Fixed-radius NN search**

The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search, and are typically faster than the native R implementations (e.g., dbscan in package `fpc`), or the implementations in WEKA, ELKI and Python's scikit-learn.
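The nearest-neighbor search is also available directly. The following is a minimal sketch using the `kNN()` and `frNN()` functions listed in the function index below; the values `k = 5` and `eps = 0.5` are arbitrary choices for illustration and are not taken from the package documentation.

```
library("dbscan")
data("iris")
x <- as.matrix(iris[, 1:4])

# k nearest neighbors of every point
nn <- kNN(x, k = 5)
dim(nn$id)     # one row per point, one column per neighbor
head(nn$dist)  # the matching distances

# all neighbors within a fixed radius; neighborhood sizes vary per point
fr <- frNN(x, eps = 0.5)
summary(lengths(fr$id))
```

`nn$id` holds row indices into `x`, so `x[nn$id[1, ], ]` returns the neighbors of the first point.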

## Installation

**Stable CRAN version:** install from within R with

```
install.packages("dbscan")
```

**Current development version:** Download package from AppVeyor or install from GitHub (needs devtools).

```
library("devtools")
install_github("mhahsler/dbscan")
```

## Usage

Load the package and use the numeric variables in the iris dataset

```
library("dbscan")
data("iris")
x <- as.matrix(iris[, 1:4])
```

Run DBSCAN

```
db <- dbscan(x, eps = .4, minPts = 4)
db
```

```
DBSCAN clustering for 150 objects.
Parameters: eps = 0.4, minPts = 4
The clustering contains 4 cluster(s) and 25 noise points.
 0  1  2  3  4 
25 47 38 36  4 
Available fields: cluster, eps, minPts
```
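The README does not say how `eps = .4` was chosen. One common heuristic, shown here as an assumption rather than the authors' procedure, is to sort the distance from each point to its k-th nearest neighbor and look for a knee in the curve. The choice `k = minPts - 1` is also an assumption; some references use `k = minPts`.

```
library("dbscan")
data("iris")
x <- as.matrix(iris[, 1:4])

minPts <- 4
k <- minPts - 1  # assumption: the point itself counts toward minPts

# distance from every point to its k-th nearest neighbor, sorted ascending
kdist <- sort(kNN(x, k = k)$dist[, k])

plot(kdist, type = "l", xlab = "Points (sorted)",
     ylab = paste0(k, "-NN distance"))
abline(h = 0.4, lty = 2)  # the eps value used in this README
```

The package also lists `kNNdist` for calculating and plotting this distance; the sketch uses `kNN()` only to keep every step explicit.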

Visualize results (noise is shown in black)

```
pairs(x, col = db$cluster + 1L)
```
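Besides plotting, the cluster labels can be compared with a known grouping. As a small sketch, the following cross-tabulates the DBSCAN labels (0 is noise) against the `Species` column of iris, which was not used for clustering.

```
library("dbscan")
data("iris")
x <- as.matrix(iris[, 1:4])

db <- dbscan(x, eps = 0.4, minPts = 4)

# rows: DBSCAN cluster labels (0 = noise), columns: true species
tab <- table(cluster = db$cluster, species = iris$Species)
tab
```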

Calculate LOF (local outlier factor) and visualize (larger bubbles in the visualization have a larger LOF)

```
lof <- lof(x, k = 4)
pairs(x, cex = lof)
```
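To list the most outlying points rather than plot them, the scores can be sorted. This is a minimal sketch; the cutoff of five points is arbitrary. It keeps the `k` argument used above, although other releases of the package may name this parameter differently.

```
library("dbscan")
data("iris")
x <- as.matrix(iris[, 1:4])

lof_scores <- lof(x, k = 4)

# indices of the five points with the largest LOF
top <- order(lof_scores, decreasing = TRUE)[1:5]
data.frame(row = top, lof = lof_scores[top])
```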

Run OPTICS

```
opt <- optics(x, eps = 1, minPts = 4)
opt
```

```
OPTICS clustering for 150 objects.
Parameters: minPts = 4, eps = 1, eps_cl = NA, xi = NA
Available fields: order, reachdist, coredist, predecessor, minPts, eps, eps_cl, xi
```

Extract DBSCAN-like clustering from OPTICS and create a reachability plot (extracted DBSCAN clusters at eps_cl=.4 are colored)

```
opt <- extractDBSCAN(opt, eps_cl = .4)
plot(opt)
```
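The extracted labels can also be inspected without a plot. A short sketch, relying on the `cluster` field that `extractDBSCAN()` adds to the OPTICS object:

```
library("dbscan")
data("iris")
x <- as.matrix(iris[, 1:4])

opt <- optics(x, eps = 1, minPts = 4)
res <- extractDBSCAN(opt, eps_cl = 0.4)

# cluster sizes; label 0 marks noise points
table(res$cluster)
```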

Extract a hierarchical clustering using the Xi method (captures clusters of varying density)

```
opt <- extractXi(opt, xi = .05)
opt
plot(opt)
```

Run HDBSCAN (captures stable clusters)

```
hdb <- hdbscan(x, minPts = 4)
hdb
```

```
HDBSCAN clustering for 150 objects.
Parameters: minPts = 4
The clustering contains 2 cluster(s) and 0 noise points.
  1   2 
100  50 
Available fields: cluster, minPts, cluster_scores, membership_prob, outlier_scores, hc
```
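The fields listed in the output can be read directly from the result. As a sketch, the following summarizes `membership_prob` and lists the points with the largest `outlier_scores`; the cutoff of five is arbitrary.

```
library("dbscan")
data("iris")
x <- as.matrix(iris[, 1:4])

hdb <- hdbscan(x, minPts = 4)

# how strongly points belong to their assigned cluster
summary(hdb$membership_prob)

# indices of the five points with the highest outlier score
head(order(hdb$outlier_scores, decreasing = TRUE), 5)
```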

Visualize the results as a simplified tree

```
plot(hdb, show_flat = TRUE)
```

Show how strongly each point belongs to its cluster (points with a lower membership probability are drawn more transparent)

```
# Fade each point's cluster color by its membership probability
colors <- mapply(function(col, i) adjustcolor(col, alpha.f = hdb$membership_prob[i]),
                 palette()[hdb$cluster + 1], seq_along(hdb$cluster))
plot(x, col = colors, pch = 20)
```
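The Jarvis-Patrick and shared nearest neighbor clustering functions listed above follow the same calling pattern. This is a rough sketch only: the parameter values are illustrative assumptions and have not been tuned for iris.

```
library("dbscan")
data("iris")
x <- as.matrix(iris[, 1:4])

# Jarvis-Patrick: k neighbors per point, kt shared neighbors required to link
jp <- jpclust(x, k = 20, kt = 10)
table(jp$cluster)

# SNN clustering: density-based clustering on shared-neighbor counts
snn <- sNNclust(x, k = 20, eps = 7, minPts = 16)
table(snn$cluster)
```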

## License

The dbscan package is licensed under the GNU General Public License (GPL) Version 3. The **OPTICSXi** R implementation was directly ported from the ELKI framework's Java implementation (GNU AGPLv3), with explicit permission granted by the original author, Erich Schubert.

## Further Information

- Development version of dbscan on github.
- List of changes from NEWS.md
- dbscan reference manual

*Maintainer:* Michael Hahsler

## Functions in dbscan

| Name | Description |
| --- | --- |
| reachability | Density Reachability Structures |
| hullplot | Plot Convex Hulls of Clusters |
| glosh | Global-Local Outlier Score from Hierarchies |
| extractFOSC | Framework for Optimal Selection of Clusters |
| NN | Nearest Neighbors Auxiliary Functions |
| frNN | Find the Fixed Radius Nearest Neighbors |
| dbscan | DBSCAN |
| jpclust | Jarvis-Patrick Clustering |
| DS3 | DS3: Spatial data with arbitrary shapes |
| hdbscan | HDBSCAN |
| kNN | Find the k Nearest Neighbors |
| pointdensity | Calculate Local Density at Each Data Point |
| lof | Local Outlier Factor Score |
| sNN | Shared Nearest Neighbors |
| optics | OPTICS |
| kNNdist | Calculate and plot the k-Nearest Neighbor Distance |
| moons | Moons Data |
| sNNclust | Shared Nearest Neighbor Clustering |

## Vignettes of dbscan

- figures/dbscan_a.pdf
- figures/dbscan_b.pdf
- figures/dbscan_benchmark.pdf
- figures/optics_benchmark.pdf
- dbscan.Rnw
- dbscan.bib
- hdbscan.Rmd

## Details

| Field | Value |
| --- | --- |
| Date | 2018-05-18 |
| LinkingTo | Rcpp |
| VignetteBuilder | knitr |
| BugReports | https://github.com/mhahsler/dbscan |
| License | GPL (>= 2) |
| Copyright | ANN library is copyright by University of Maryland, Sunil Arya and David Mount. All other code is copyright by Michael Hahsler and Matthew Piekenbrock. |
| SystemRequirements | C++11 |
| NeedsCompilation | yes |
| Packaged | 2018-05-19 02:24:18 UTC; hahsler |
| Repository | CRAN |
| Date/Publication | 2018-05-19 03:54:52 UTC |
| Suggests | dendextend, DMwR, fpc, igraph, knitr, microbenchmark, testthat |
| Imports | graphics, methods, Rcpp (>= 0.12.12), stats |
| Contributors | Matthew Piekenbrock, David Mount, Sunil Arya |
