HighDimOut (version 1.0.0)

Func.ABOD: Angle-based outlier detection (ABOD) algorithm

Description

This function performs the basic and approximated versions of the angle-based outlier detection (ABOD) algorithm. The ABOD method is especially useful for high-dimensional data, because angles are a more robust measure than distances in high-dimensional space. The basic version calculates the angle variance for each observation from the whole data set; its results are more reliable, but it can be very slow. The approximated version calculates the angle variance from a subset of the data, which increases the calculation speed. This function is based on the work of Kriegel, H.-P., Schubert, M., Zimek, A., Angle-based outlier detection in high-dimensional data, 2008.
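To make the idea of an angle variance concrete, here is a minimal sketch in base R. It is not the package's implementation: `abof_sketch` is a hypothetical helper, and it uses the plain (unweighted) variance of the distance-scaled dot product between difference vectors, which simplifies the weighted variance defined in the paper. It assumes numeric data, at least four rows, and no duplicated rows.

```r
# Hypothetical helper (not part of HighDimOut): naive angle-variance score.
# Simplification: unweighted variance; assumes no duplicated rows.
abof_sketch <- function(data) {
  X <- as.matrix(data)
  n <- nrow(X)
  scores <- numeric(n)
  for (i in seq_len(n)) {
    # Difference vectors from observation i to every other observation
    D <- sweep(X[-i, , drop = FALSE], 2, X[i, ])
    sq <- rowSums(D^2)                   # squared lengths of those vectors
    W <- tcrossprod(D) / outer(sq, sq)   # dot products scaled by squared lengths
    scores[i] <- var(W[upper.tri(W)])    # variance over all distinct pairs
  }
  scores
}

# Illustrative data: 30 clustered points plus one distant point in row 31
set.seed(1)
pts <- rbind(matrix(rnorm(60), ncol = 2), c(10, 10))
which.min(abof_sketch(pts))
```

A point far from the bulk of the data sees all other points within a narrow range of directions and at large distances, so its score is small. The cost grows roughly cubically with the number of observations, which is why a subset-based approximation is attractive.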

Usage

Func.ABOD(data, basic = FALSE, perc)

Arguments

data
is the data frame containing the observations. Each row represents an observation and each variable is stored in one column.
basic
is a logical value indicating whether the basic method is used. The basic version can be very slow when the data set is large.
perc
defines the percentage of the data to use when calculating the angle variance, given as a fraction (the example passes perc=0.2). It is only needed when basic=FALSE.
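The effect of restricting the calculation to a subset can be sketched as follows. This is an illustration only: `abof_subset_sketch` is a hypothetical helper, and drawing one random reference sample of size `perc * n` is an assumption about how the subset might be chosen, not a description of what Func.ABOD does internally. The score is again the simplified, unweighted variance.

```r
# Hypothetical helper (not part of HighDimOut): angle variance against a
# random reference sample. The sampling scheme is an assumption.
abof_subset_sketch <- function(data, perc) {
  X <- as.matrix(data)
  n <- nrow(X)
  # Keep at least 4 reference points so every observation has >= 3 pairs
  m <- min(n, max(4, ceiling(perc * n)))
  ref <- sample(n, m)
  vapply(seq_len(n), function(i) {
    idx <- setdiff(ref, i)               # never compare a point with itself
    D <- sweep(X[idx, , drop = FALSE], 2, X[i, ])
    sq <- rowSums(D^2)
    W <- tcrossprod(D) / outer(sq, sq)   # scaled dot products
    var(W[upper.tri(W)])
  }, numeric(1))
}

# Illustrative data: 30 clustered points plus one distant point in row 31
set.seed(1)
pts <- rbind(matrix(rnorm(60), ncol = 2), c(10, 10))
head(order(abof_subset_sketch(pts, perc = 0.2)), 3)
```

With a fixed sample of m reference points, each observation needs only on the order of m^2 pair evaluations instead of n^2, at the price of noisier scores for small values of perc.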

Value

The function returns a vector containing the angle variance for each observation. Lower values indicate observations that are more likely to be outliers.

Examples

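library(HighDimOut)
library(ggplot2)
# Approximated ABOD on the first two columns of TestData, using 20% of the data
res.ABOD <- Func.ABOD(data = TestData[, 1:2], basic = FALSE, perc = 0.2)
data.temp <- TestData[, 1:2]
data.temp$Ind <- NA
# Flag the 10 observations with the lowest angle variance as outliers
data.temp[order(res.ABOD, decreasing = FALSE)[1:10], "Ind"] <- "Outlier"
data.temp[is.na(data.temp$Ind), "Ind"] <- "Inlier"
data.temp$Ind <- factor(data.temp$Ind)
# Scatter plot with inliers and outliers distinguished by color and shape
ggplot(data = data.temp) + geom_point(aes(x = x, y = y, color = Ind, shape = Ind))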