CORElearn (version 1.57.3)

rfOutliers: Random forest based outlier detection

Description

Detects training cases that differ from all other cases, based on the random forest instance proximity measure.

Usage

rfOutliers(model, dataset)

Value

For each instance in the dataset, the function returns a numeric score measuring its strangeness relative to the other cases.

Arguments

model

a random forest model returned by CoreModel

dataset

a training set used to generate the model

Author

John Adeyanju Alao (as a part of his BSc thesis) and Marko Robnik-Sikonja (thesis supervisor)

Details

Strangeness is computed from the random forest model via its proximity matrix (see rfProximity). Following Breiman (2001), a case whose score is greater than 10 can be considered an outlier.
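
To make the idea of a proximity-based score concrete, here is a minimal base R sketch of the outlier measure as it is commonly described for Breiman's random forests: the raw score of a case is the number of cases divided by the sum of its squared proximities to the other cases of its class, and the raw scores are then centred on the class median and scaled by the median absolute deviation. This is an illustration only and is not taken from the CORElearn sources; the function name breiman_outlier_score, the toy proximity matrix, and the use of R's mad() with its default scaling constant are all assumptions, so the values need not match what rfOutliers returns.

```r
# Hypothetical helper (not part of CORElearn): Breiman-style outlier score
# computed from a square proximity matrix `prox` and class labels `cls`.
breiman_outlier_score <- function(prox, cls = rep(1, nrow(prox))) {
  stopifnot(nrow(prox) == ncol(prox), length(cls) == nrow(prox))
  n <- nrow(prox)
  cls <- factor(cls)
  score <- rep(NA_real_, n)
  for (lev in levels(cls)) {
    idx <- which(cls == lev)
    p <- prox[idx, idx, drop = FALSE]
    diag(p) <- 0                    # ignore each case's proximity to itself
    raw <- n / rowSums(p^2)         # low proximity to own class -> large raw score
    # centre on the class median and scale by the median absolute deviation
    score[idx] <- (raw - median(raw)) / mad(raw)
  }
  score
}

# Toy proximity matrix (made up for illustration): cases 1-5 are close to
# each other, case 6 is far from everything.
idx <- 1:5
prox <- matrix(0.05, nrow = 6, ncol = 6)
prox[idx, idx] <- 0.9 - 0.1 * abs(outer(idx, idx, "-"))
diag(prox) <- 1

scores <- breiman_outlier_score(prox)
flagged <- which(abs(scores) > 10)  # apply the threshold of 10 from the text
print(flagged)
```

With scores returned by rfOutliers, the same thresholding step would be which(abs(outliers) > 10); taking the absolute value mirrors the plot(abs(outliers)) call in the Examples section.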

References

Leo Breiman: Random Forests. Machine Learning Journal, 45:5-32, 2001

See Also

CoreModel, rfProximity, rfClustering.

Examples

library(CORElearn)
# first build a random forest model using CORElearn
dataset <- iris
md <- CoreModel(Species ~ ., dataset, model="rf", rfNoTrees=30,
                maxThreads=1)
outliers <- rfOutliers(md, dataset)
plot(abs(outliers))
# for a nicer display try
plot(md, dataset, rfGraphType="outliers")

destroyModels(md) # clean up
