Learn R Programming

HighDimOut (version 1.0.0)

Func.SOD: Subspace outlier detection (SOD) algorithm

Description

This function performs suspace outlier detection algorithm The implemented method is based on the work of Krigel, H.P., Kroger, P., Schubert, E., Zimek, A., Outlier detection in axis-parallel subspaces of high dimensional data, 2009.

Usage

Func.SOD(data, k.nn, k.sel, alpha = 0.8)

Arguments

data
is the data frame containing the observations. Each row represents an observation and each variable is stored in one column.
k.nn
specifies the value used for calculating the shared nearest neighbors. Note that k.nn should be greater than k.sel.
k.sel
specifies the number shared nearest neighbors. It can be interpreted as the number of reference set for constructing the subspace hyperplane.
alpha
specifies the lower limit for selecting subspace. 0.8 is set as default as suggested in the original paper.

Value

The function returns a vector containing the SOD outlier scores for each observation

Examples

Run this code
library(ggplot2)
res.SOD <- Func.SOD(data = TestData[,1:2], k.nn = 10, k.sel = 5, alpha = 0.8)
data.temp <- TestData[,1:2]
data.temp$Ind <- NA
data.temp[order(res.SOD, decreasing = TRUE)[1:10],"Ind"] <- "Outlier"
data.temp[is.na(data.temp$Ind),"Ind"] <- "Inlier"
data.temp$Ind <- factor(data.temp$Ind)
ggplot(data = data.temp) + geom_point(aes(x = x, y = y, color=Ind, shape=Ind))

Run the code above in your browser using DataLab