gamclass (version 0.62.5)

RFcluster: Random forests estimate of predictive accuracy for clustered data

Description

This function adapts random forests to work (albeit clumsily and inefficiently) with clustered categorical outcome data. For example, there may be multiple observations on each individual, so that each individual forms a cluster. Predictions are made for the OOB (out-of-bag) clusters.

Usage

RFcluster(formula, id, data, nfold = 15,
              ntree=500, progress=TRUE, printit = TRUE, seed = 29)

Value

class

Predicted values from cross-validation

OOBaccuracy

Cross-validation estimate of accuracy

confusion

Confusion matrix
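
As an illustration of how an accuracy figure relates to a confusion matrix, the base R sketch below builds both from toy vectors. The vectors `observed` and `predicted` are made up for this example, and the sketch assumes a confusion matrix of raw counts. Whether RFcluster returns counts or proportions is not stated here, so treat this as a generic calculation rather than a description of the function's output.

```r
# Toy observed and predicted classes (hypothetical data, not RFcluster output)
observed  <- factor(c("a", "a", "b", "b", "c", "c"), levels = c("a", "b", "c"))
predicted <- factor(c("a", "b", "b", "b", "c", "a"), levels = c("a", "b", "c"))

# Cross-tabulate observed against predicted classes (counts)
confusion <- table(observed = observed, predicted = predicted)

# Accuracy is the proportion of observations on the diagonal
accuracy <- sum(diag(confusion)) / sum(confusion)

print(confusion)
print(accuracy)
```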

Arguments

formula

Model formula

id

numeric, identifies clusters

data

data frame that supplies the data

nfold

numeric, number of folds

ntree

numeric, number of trees (number of bootstrap samples)

progress

Print information on progress of calculations

printit

Print summary information on accuracy

seed

Set seed, if required, so that results are exactly reproducible

Author

John Maindonald

Details

Bootstrap samples are taken of observations in the in-bag clusters. Predictions are made for all observations in the OOB clusters.
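
The idea of keeping whole clusters either in-bag or out-of-bag can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code that RFcluster runs: the toy `id` vector, the fold assignment scheme, and the names `fold_of_cluster`, `inbag_rows`, `oob_rows` and `boot_rows` are all hypothetical.

```r
set.seed(29)

# Toy data: 6 clusters (e.g. individuals) with 4 observations each
id <- rep(1:6, each = 4)
nfold <- 3

# Assign each cluster, not each observation, to a fold
clusters <- unique(id)
fold_of_cluster <- sample(rep(seq_len(nfold), length.out = length(clusters)))
names(fold_of_cluster) <- clusters

# Every observation inherits the fold of its cluster
fold <- unname(fold_of_cluster[as.character(id)])

for (k in seq_len(nfold)) {
  oob_rows   <- which(fold == k)   # all observations in the held-out clusters
  inbag_rows <- which(fold != k)   # observations in the remaining clusters

  # Bootstrap sample of observations drawn only from in-bag clusters
  boot_rows <- sample(inbag_rows, replace = TRUE)

  # A model would be fitted to boot_rows and used to predict oob_rows.
  # No cluster may contribute to both sets.
  stopifnot(length(intersect(id[boot_rows], id[oob_rows])) == 0)
}
```

Because folds are assigned at the cluster level, observations from one individual never appear on both the fitting side and the prediction side, which is what keeps the accuracy estimate from being inflated by within-cluster similarity.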

References

https://maths-people.anu.edu.au/~johnm/nzsr/taws.html

Examples

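if (FALSE) {
library(gamclass)      # provides RFcluster()
library(mlbench)       # provides the Vowel data
library(randomForest)
data(Vowel)
## V1 in Vowel identifies the speaker; each speaker is treated as a cluster
RFcluster(formula = Class ~ ., id = V1, data = Vowel, nfold = 15,
          ntree = 500, progress = TRUE, printit = TRUE, seed = 29)
}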