CORElearn (version 1.57.3)

rfClustering: Random forest based clustering

Description

Creates a clustering of the training instances of a random forest. A random forest provides a proximity measure between its training instances, based on their out-of-bag classification. This proximity is usually passed to visualizations (e.g., scaling) and attribute importance measures; here it is used to cluster the instances.

Usage

rfClustering(model, noClusters=4)

Value

An object of class pam representing the clustering (see ?pam.object for details). Its most important component is the vector of cluster assignments (named clustering), with one entry for each training instance used to generate the model.
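
As an illustration of how the returned object might be inspected, the following sketch extracts the assignment vector and cross-tabulates it against the known class labels. In pam objects the assignment vector is the clustering component. The data set, the number of trees, and the variable names (model, result, assignments) are choices made for this example only, not part of the function's interface.

```r
library(CORElearn)

# Build a small random forest on iris (settings chosen for illustration only)
model <- CoreModel(Species ~ ., iris, model="rf", rfNoTrees=30, maxThreads=1)

# Cluster the training instances into 3 clusters
result <- rfClustering(model, noClusters=3)
destroyModels(model)  # the pam object remains usable after clean up

# One cluster label per training instance
assignments <- result$clustering

# Compare clusters with the known classes
print(table(cluster=assignments, species=iris$Species))
```

Because the forest is built with random sampling, the exact table will differ between runs.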

Arguments

model

a random forest model returned by CoreModel

noClusters

number of clusters

Author

John Adeyanju Alao (as a part of his BSc thesis) and Marko Robnik-Sikonja (thesis supervisor)

Details

The method calls the pam function for clustering, initializing its distance matrix with the random forest based similarity obtained by calling rfProximity with the argument model.
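
The steps described above can be approximated by hand as follows. This is a minimal sketch, not the package's actual source: it assumes that rfProximity is called with outProximity=FALSE so that it returns distances rather than similarities, and that the result is handed to pam from the cluster package as a dissimilarity. The names model, rfDist and manualCluster are hypothetical.

```r
library(CORElearn)
library(cluster)  # provides pam()

# Random forest model (settings chosen for illustration only)
model <- CoreModel(Species ~ ., iris, model="rf", rfNoTrees=30, maxThreads=1)

# Assumption: request a distance matrix instead of a proximity matrix
rfDist <- rfProximity(model, outProximity=FALSE)

# diss=TRUE tells pam that its input is a dissimilarity, not raw data
manualCluster <- pam(as.dist(rfDist), k=4, diss=TRUE)

destroyModels(model)  # clean up
```

Working with the distance matrix directly can be useful when a different number of clusters or a different clustering method is to be tried without rebuilding the forest.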

References

Leo Breiman: Random Forests. Machine Learning Journal, 45:5-32, 2001

See Also

CoreModel, rfProximity, pam

Examples

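library(CORElearn)

# build a random forest on the iris data
set <- iris
md <- CoreModel(Species ~ ., set, model="rf", rfNoTrees=30, maxThreads=1)
# cluster the training instances into 5 clusters
mdCluster <- rfClustering(md, 5)

destroyModels(md) # clean up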
