OTE (version 1.0)

OTClass: Train the ensemble of optimal trees for classification.

Description

This function selects optimal trees for classification from a total of t.initial trees grown by random forest. The number of trees in the initial set, t.initial, is specified by the user; if it is not specified, the default t.initial = 1000 is used.
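
The idea of ranking trees by out-of-bag performance and keeping the best share of them can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the out-of-bag errors are simulated, the rounding rule is a guess, and the names oob.error, n.keep and selected are hypothetical.

```r
# Illustrative sketch only: simulated out-of-bag (OOB) errors, hypothetical names.
set.seed(1)
t.initial <- 1000                         # size of the initial set of trees
p <- 0.2                                  # share of trees to keep
oob.error <- runif(t.initial, 0.1, 0.4)   # stand-in for each tree's OOB error

# Number of trees retained (the rounding rule is an assumption)
n.keep <- round(p * t.initial)

# Indices of the n.keep trees with the lowest simulated OOB error
selected <- order(oob.error)[seq_len(n.keep)]

length(selected)
```

Every retained tree in this sketch has a lower simulated error than any discarded tree, which is the property a ranking on out-of-bag performance is meant to provide.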

Usage

OTClass(XTraining, YTraining, p = 0.2, t.initial = NULL, nf = NULL, ns = NULL, info = TRUE)

Arguments

XTraining
An n x d training data matrix or data frame, where n is the number of observations and d is the number of features.
YTraining
A vector of length n consisting of class labels for the training data. Should be binary (0,1).
p
Proportion of the t.initial trees to be selected as the best performers on out-of-bag observations. The default p = 0.2 corresponds to the best 20 percent.
t.initial
Size of the initial set of classification trees.
nf
Number of features to be sampled for splitting the nodes of the trees. If NULL, the default sqrt(number of features) is used.
ns
Node size: minimal number of samples in the nodes. If NULL, the default of 1 is used.
info
If TRUE, displays processing information.
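
How the NULL defaults described above might resolve can be sketched as follows. The helper resolve.defaults is hypothetical and is not part of the package; rounding sqrt(d) down to an integer is an assumption, since the text above only states sqrt(number of features).

```r
# Hypothetical helper, not part of OTE: resolves the documented NULL defaults.
resolve.defaults <- function(d, t.initial = NULL, nf = NULL, ns = NULL) {
  if (is.null(t.initial)) t.initial <- 1000  # documented default size of the initial set
  if (is.null(nf)) nf <- floor(sqrt(d))      # documented sqrt(d); flooring is an assumption
  if (is.null(ns)) ns <- 1                   # documented default node size
  list(t.initial = t.initial, nf = nf, ns = ns)
}

# Resolve the defaults for a data set with d = 24 features
defaults <- resolve.defaults(d = 24)
str(defaults)
```

Values supplied by the user are passed through unchanged, so resolve.defaults(d = 24, nf = 6) would keep nf = 6.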

Value

A trained object consisting of the selected trees.

Details

Large values of t.initial are recommended for better performance, as far as the available computational resources allow.

References

Khan, Z., Gul, A., Perperoglou, A., Mahmoud, O., Miftahuddin, M., Adler, W. and Lausen, B. (2014) "An ensemble of optimal trees for classification and regression". Journal name to appear.

Liaw, A. and Wiener, M. (2002) "Classification and regression by random forest". R News, 2(3), 18--22.

See Also

Predict.OTClass, OTReg, OTProb

Examples

#Load the package and the data

  library(OTE)
  data(Body)
  data <- Body
  
#Divide the data into training and test parts

  set.seed(9123) 
  n <- nrow(data)
  training <- sample(1:n,round(2*n/3))
  testing <- (1:n)[-training]
  X <- data[,1:24]
  Y <- data[,25]
  
#Train OTClass on the training data

  Opt.Trees <- OTClass(XTraining=X[training,],YTraining = Y[training],t.initial=200)
  
#Predict on test data

  Prediction <- Predict.OTClass(Opt.Trees, X[testing,],YTesting=Y[testing])
  
#Objects returned

  names(Prediction)
  Prediction$Confusion.Matrix
  Prediction$Predicted.Class.Labels
