gamclass (version 0.52)

RFcluster: Random forests estimate of predictive accuracy for clustered data

Description

This function adapts random forests to work (albeit clumsily and inefficiently) with clustered categorical outcome data. For example, there may be multiple observations on each individual, so that each individual forms a cluster. Predictions are made for the OOB (out-of-bag) clusters.

Usage

RFcluster(formula, id, data, nfold = 15,
              ntree=500, progress=TRUE, printit = TRUE, seed = 29)

Arguments

formula
Model formula
id
numeric, identifies clusters
data
data frame that supplies the data
nfold
numeric, number of folds
ntree
numeric, number of trees (number of bootstrap samples)
progress
Print information on progress of calculations
printit
Print summary information on accuracy
seed
Set seed, if required, so that results are exactly reproducible

Value

  • class: Predicted values from cross-validation
  • OOBaccuracy: Cross-validation estimate of accuracy
  • confusion: Confusion matrix

Details

Bootstrap samples are taken of observations in the in-bag clusters. Predictions are made for all observations in the OOB clusters.
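The cluster-level split described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data frame, the choice of three folds, and all variable names are assumptions made for illustration only. It shows whole clusters being assigned to folds, with bootstrap sampling restricted to observations from the in-bag clusters.

```r
# Illustrative sketch only; not the code used inside RFcluster().
set.seed(29)

# Hypothetical toy data: 6 clusters (e.g. individuals), 4 observations each.
toy <- data.frame(id = rep(1:6, each = 4), x = rnorm(24))

# Assign whole clusters, not individual rows, to folds.
nfold <- 3                                  # assumed value for the toy data
clusters <- unique(toy$id)
fold_of_cluster <- sample(rep(seq_len(nfold), length.out = length(clusters)))

# Treat fold 1 as the OOB clusters; the remaining clusters are in-bag.
oob_ids <- clusters[fold_of_cluster == 1]
inbag   <- toy[!(toy$id %in% oob_ids), ]
oob     <- toy[toy$id %in% oob_ids, ]

# Bootstrap sample of observations, drawn from the in-bag clusters only.
boot <- inbag[sample(nrow(inbag), replace = TRUE), ]

# No cluster appears in both the bootstrap sample and the OOB set.
stopifnot(length(intersect(boot$id, oob$id)) == 0)
```

Because the split is made on cluster identifiers rather than on rows, observations from the same cluster never appear on both sides, which is the point of treating clusters as the unit that is in-bag or out-of-bag.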

References

http://www.maths.anu.edu.au/~johnm/nzsr/taws.html

Examples

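library(gamclass)      # provides RFcluster()
library(mlbench)       # provides the Vowel data set
library(randomForest)
data(Vowel)
# V1 identifies the clusters; Class is the categorical outcome
RFcluster(formula = Class ~ ., id = V1, data = Vowel, nfold = 15,
          ntree = 500, progress = TRUE, printit = TRUE, seed = 29)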
