Local Outlier Factor Score
Calculate the Local Outlier Factor (LOF) score for each data point using a kd-tree to speed up kNN search.
lof(x, k = 4, ...)
- x: a data matrix or a dist object.
- k: size of the neighborhood.
- ...: further arguments are passed on to
LOF compares the local density of a point to the local densities
of its neighbors. Points that have a substantially lower density than
their neighbors are considered outliers.
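The density comparison described above can be sketched in base R. This is a brute-force illustration under stated assumptions, not the kd-tree implementation documented on this page: the helper name lof_naive is hypothetical, distances are Euclidean, ties at the k-th neighbor are broken by index order instead of including every tied point, and duplicate points are not handled.

```r
# Hypothetical helper for illustration only; O(n^2) memory and time.
lof_naive <- function(x, k) {
  d <- as.matrix(dist(x))      # pairwise Euclidean distances
  n <- nrow(d)
  diag(d) <- Inf               # a point is not its own neighbor

  # k x n matrix: column i holds the indices of the k nearest neighbors of i
  nn <- matrix(
    vapply(seq_len(n), function(i) order(d[i, ])[seq_len(k)], integer(k)),
    nrow = k
  )

  # k-distance: distance from each point to its k-th nearest neighbor
  kdist <- d[cbind(seq_len(n), nn[k, ])]

  # local reachability density: inverse of the mean reachability distance,
  # where reach(i, j) = max(k-distance of j, distance between i and j)
  lrd <- vapply(seq_len(n), function(i) {
    nb <- nn[, i]
    1 / mean(pmax(kdist[nb], d[i, nb]))
  }, numeric(1))

  # LOF: mean density of the neighbors divided by the point's own density
  vapply(seq_len(n), function(i) mean(lrd[nn[, i]]) / lrd[i], numeric(1))
}

# Four corners of a unit square plus one distant point
x_demo <- rbind(cbind(c(0, 1, 0, 1), c(0, 0, 1, 1)), c(10, 10))
s <- lof_naive(x_demo, k = 2)
```

In this small example the four square corners have equal densities and score 1, while the distant fifth point gets a score well above 1.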
A LOF score of approximately 1 indicates that the density around the point
is comparable to that of its neighbors. Scores significantly larger than
1 indicate outliers. Note: If there are more than
k duplicate points in the data, then lof can become
NaN caused by an infinite local density.
In this case we set lof to 1. The paper by Breunig et al. (2000) suggests a different method: removing all duplicate points first.
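The arithmetic behind the NaN case, and the two ways of dealing with it, can be sketched in base R. The score vector and the small matrix below are made-up illustration values, and the snippet does not call the lof function itself.

```r
# With more than k identical points, every reachability distance inside the
# group is 0, so the local reachability density is 1 / 0 = Inf.
lrd_dup <- 1 / mean(c(0, 0, 0))
ratio <- lrd_dup / lrd_dup          # Inf / Inf evaluates to NaN

# Convention described above: replace NaN scores by 1 (treated as inliers).
scores <- c(1.1, ratio, 0.9)        # illustrative scores, not real output
scores[is.nan(scores)] <- 1

# Alternative attributed to Breunig et al. (2000): drop duplicate rows first.
x_dup <- rbind(c(0, 0), c(0, 0), c(0, 0), c(1, 1), c(2, 3))
x_unique <- unique(x_dup)           # keeps the first copy of each row
```

Note that removing duplicates changes the number of rows, so scores computed on the reduced matrix no longer line up with the rows of the original data.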
A numeric vector containing one LOF value per data point
(length nrow(x) for a data matrix).
Breunig, M., Kriegel, H., Ng, R., and Sander, J. (2000). LOF: identifying density-based local outliers. In ACM Int. Conf. on Management of Data, pages 93-104.
set.seed(665544)
n <- 100
x <- cbind(
  x = runif(10, 0, 5) + rnorm(n, sd = 0.4),
  y = runif(10, 0, 5) + rnorm(n, sd = 0.4)
)

### calculate LOF score
lof <- lof(x, k = 3)

### distribution of outlier factors
summary(lof)
hist(lof, breaks = 10)

### point size is proportional to LOF
plot(x, pch = ".", main = "LOF (k=3)")
points(x, cex = (lof - 1) * 3, pch = 1, col = "red")
text(x[lof > 2, ], labels = round(lof, 1)[lof > 2], pos = 3)