dbscan (version 1.1-6)

lof: Local Outlier Factor Score

Description

Calculate the Local Outlier Factor (LOF) score for each data point using a kd-tree to speed up kNN search.

Usage

lof(x, k = 4, ...)

Arguments

x

a data matrix or a dist object.

k

size of the neighborhood.

...

further arguments are passed on to kNN.

Value

A numeric vector of length nrow(x) (one entry per data point) containing the LOF values for all data points.

Details

LOF compares the local reachability density (lrd) of a point to the lrd of its neighbors. A LOF score of approximately 1 indicates that the lrd around the point is comparable to the lrd of its neighbors and that the point is not an outlier. Points that have a substantially lower lrd than their neighbors are considered outliers and produce scores significantly larger than 1.
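The comparison described above can be sketched in base R. This is only an illustration under stated assumptions, not the dbscan implementation: the helper name lof_sketch is hypothetical, it uses a brute-force distance matrix instead of a kd-tree, and it takes exactly k neighbors per point (ties broken by row order).

```r
# Hypothetical helper (not part of dbscan): brute-force LOF with exactly k neighbors.
lof_sketch <- function(x, k = 4) {
  d <- as.matrix(dist(x))   # all pairwise Euclidean distances
  n <- nrow(d)
  diag(d) <- Inf            # a point is never its own neighbor

  # indices of the k nearest neighbors of every point (one row per point)
  nn <- matrix(0L, n, k)
  for (i in seq_len(n)) nn[i, ] <- order(d[i, ])[seq_len(k)]

  # k-distance: distance from each point to its k-th nearest neighbor
  kdist <- d[cbind(seq_len(n), nn[, k])]

  # lrd: inverse of the mean reachability distance to the neighbors, where
  # reach-dist(p, o) = max(k-distance(o), d(p, o))
  lrd <- numeric(n)
  for (i in seq_len(n)) {
    o <- nn[i, ]
    lrd[i] <- 1 / mean(pmax(kdist[o], d[i, o]))
  }

  # LOF: mean lrd of the neighbors divided by the point's own lrd
  lof <- numeric(n)
  for (i in seq_len(n)) lof[i] <- mean(lrd[nn[i, ]]) / lrd[i]
  lof
}

# Five points in a tight group plus one isolated point (row 6)
pts <- rbind(c(0, 0), c(1, 0), c(0, 1), c(1, 1), c(0.5, 0.5), c(10, 10))
scores <- lof_sketch(pts, k = 3)
round(scores, 2)
```

In this toy data the grouped points score close to 1, while the isolated point in row 6 scores far above 1 because its lrd is much lower than that of its neighbors.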

Note on duplicate points: If the data contain more than k duplicates of a point, the reachability distances around that point are all 0 and the LOF ratio becomes undefined (NaN, effectively 0/0). For some applications, the duplicate points should stay in the data (e.g., they are the result of rounding the values of several points). We set LOF to 1 in this case since there is already enough density from the points in the same location to make them not outliers. The original paper by Breunig et al. (2000) assumes that the points are real duplicates and suggests removing the duplicates before computing LOF. If duplicate points are removed first, then this LOF implementation in dbscan behaves like the one described by Breunig et al.
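The duplicate case can be reproduced with a few lines of base R. This is a sketch of the arithmetic only, assuming exactly k neighbors per point and a brute-force distance matrix; the variable names are made up for illustration and the code is not taken from dbscan.

```r
# Five identical points (rows 1-5) plus two distinct points; with k = 3
# there are more than k duplicates of the point (1, 1).
x_dup <- rbind(matrix(1, nrow = 5, ncol = 2), c(2, 2), c(3, 1))
k <- 3
n <- nrow(x_dup)

d <- as.matrix(dist(x_dup))
diag(d) <- Inf                                    # exclude self-distances

nn    <- t(apply(d, 1, function(row) order(row)[seq_len(k)]))  # n x k neighbor ids
kdist <- apply(d, 1, function(row) sort(row)[k])                # k-distance per point

# For the duplicates every reachability distance is 0, so 1 / mean(...) is Inf
lrd <- sapply(seq_len(n), function(i) 1 / mean(pmax(kdist[nn[i, ]], d[i, nn[i, ]])))

# Inf / Inf is NaN in R, which is the undefined ratio described in the note
raw_lof <- sapply(seq_len(n), function(i) mean(lrd[nn[i, ]]) / lrd[i])

# The convention described in the note: treat undefined scores as 1 (not an outlier)
fixed_lof <- raw_lof
fixed_lof[is.nan(fixed_lof)] <- 1

raw_lof[1:5]
fixed_lof[1:5]
```

The raw scores of the five duplicates are NaN, and the replacement step sets them to 1. In this sketch the two non-duplicate points border the zero-distance group and come out as Inf rather than NaN, so the replacement does not touch them.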

References

Breunig, M., Kriegel, H., Ng, R., and Sander, J. (2000). LOF: identifying density-based local outliers. In ACM Int. Conf. on Management of Data, pages 93-104. doi:10.1145/335191.335388

See Also

kNN, pointdensity, glosh.

Examples

# NOT RUN {
library(dbscan)

set.seed(665544)
n <- 100
### 10 random cluster centers (recycled to length n) plus Gaussian noise
x <- cbind(
  x=runif(10, 0, 5) + rnorm(n, sd=0.4),
  y=runif(10, 0, 5) + rnorm(n, sd=0.4)
  )

### calculate LOF score
lof <- lof(x, k=3)

### distribution of outlier factors
summary(lof)
hist(lof, breaks=10)

### point size is proportional to LOF
plot(x, pch = ".", main = "LOF (k=3)")
points(x, cex = (lof-1)*3, pch = 1, col="red")
text(x[lof>2,], labels = round(lof, 1)[lof>2], pos = 3)
# }
