dbscan v0.9-8

by Michael Hahsler

Density Based Clustering of Applications with Noise (DBSCAN) and Related Algorithms

A fast reimplementation of several density-based algorithms of the DBSCAN family for spatial data. It includes the DBSCAN (density-based spatial clustering of applications with noise) and OPTICS (ordering points to identify the clustering structure) clustering algorithms and the LOF (local outlier factor) algorithm. The implementations use the kd-tree data structure (from the ANN library) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided.

Readme

dbscan - Density Based Clustering of Applications with Noise (DBSCAN) and Related Algorithms - R package

This R package provides a fast C++ reimplementation of several density-based algorithms of the DBSCAN family for spatial data. It includes the DBSCAN (density-based spatial clustering of applications with noise) and OPTICS/OPTICSXi (ordering points to identify the clustering structure) clustering algorithms and the LOF (local outlier factor) algorithm. The implementations use the kd-tree data structure (from the ANN library) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided.

This implementation is typically faster than the native R implementation in package fpc, or the implementations in WEKA, ELKI and Python's scikit-learn.
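The nearest neighbor interface mentioned above can be sketched as follows. This is a minimal illustration, not part of the original README: it assumes that kNN() and frNN() return lists with elements named id and dist, and the values k = 4 and eps = .4 are chosen only to match the examples further down.

```r
library("dbscan")

## use the numeric variables in the iris dataset
data("iris")
x <- as.matrix(iris[, 1:4])

## k nearest neighbors of every point (k = 4 is an illustrative choice)
nn <- kNN(x, k = 4)
dim(nn$id)     # one row per point, one column per neighbor
head(nn$dist)  # distances to those neighbors

## all neighbors within a fixed radius (eps = .4 is an illustrative choice)
fr <- frNN(x, eps = .4)
## number of neighbors found inside the radius for the first few points
head(sapply(fr$id, length))
```

kNN() always returns the same number of neighbors per point, while the number of neighbors returned by frNN() varies with the local density, which is the quantity DBSCAN compares against minPts.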

Installation

  • Stable CRAN version: install from within R with install.packages("dbscan").
  • Current development version: download the package from AppVeyor or install it via install_github("mhahsler/dbscan") (requires devtools).

Examples

library("dbscan")

## use the numeric variables in the iris dataset
data("iris")
x <- as.matrix(iris[, 1:4])

## DBSCAN
db <- dbscan(x, eps = .4, minPts = 4)
db
## visualize results (noise is shown in black)
pairs(x, col = db$cluster + 1L)

## LOF (local outlier factor) 
lof <- lof(x, k = 4)
## larger bubbles in the visualization have a larger LOF
pairs(x, cex = lof)

## OPTICS
opt <- optics(x, eps = 1, minPts = 4, eps_cl = .4)
opt
## create a reachability plot (extracted DBSCAN clusters at eps_cl=.4 are colored)
plot(opt)
## plot the extracted DBSCAN clustering
pairs(x, col = opt$cluster + 1L)
## extract a hierarchical clustering using the Xi method (captures clusters of varying density)
opt <- optics(x, eps = 1, minPts = 4, xi = .05)
opt
plot(opt)
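The examples use eps = .4 without saying where the value comes from. A common heuristic is to look for a "knee" in the sorted k-nearest neighbor distances, and the sketch below shows one way to do that with the package's kNNdistplot() and then draw the result with hullplot(). It assumes that kNNdistplot() accepts the data and k, and that hullplot() accepts a two-column matrix plus an integer cluster vector; restricting the hull plot to the first two variables is only an illustrative choice.

```r
library("dbscan")

data("iris")
x <- as.matrix(iris[, 1:4])

## sorted nearest neighbor distances (k = 4); look for a knee in the curve
kNNdistplot(x, k = 4)
abline(h = .4, lty = 2)  # the eps value used in the DBSCAN example

## cluster with that eps and draw convex hulls around the clusters,
## using only the first two variables for the 2-D plot
db <- dbscan(x, eps = .4, minPts = 4)
hullplot(x[, 1:2], db$cluster)

## cluster sizes (cluster 0 holds the noise points)
table(db$cluster)
```

If the curve shows no clear knee, that may be a sign that the data contain clusters of varying density, which is the case the OPTICS Xi extraction is meant for.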

License

The dbscan package is licensed under the GNU General Public License (GPL) Version 3. The OPTICSXi R implementation was ported directly from the Java source code available from the ELKI framework (GNU AGPLv3), with explicit permission granted by the original author, Erich Schubert.

Further Information

Maintainer: Michael Hahsler

Functions in dbscan

Name      Description
hullplot  Plot Convex Hulls of Clusters
frNN      Find the Fixed Radius Nearest Neighbors
kNN       Find the k Nearest Neighbors
lof       Local Outlier Factor Score
optics    OPTICS
dbscan    DBSCAN
kNNdist   Calculate and plot the k-Nearest Neighbor Distance
Details

Date 2016-08-05
LinkingTo Rcpp
BugReports https://github.com/mhahsler/dbscan/issues
License GPL (>= 2)
Copyright ANN library is copyright by University of Maryland, Sunil Arya and David Mount. All other code is copyright by Michael Hahsler.
RoxygenNote 5.0.1
NeedsCompilation yes
Packaged 2016-08-05 21:12:17 UTC; hahsler
Repository CRAN
Date/Publication 2016-08-06 00:41:11
