RPEnsemble (version 0.2)

RPChooseSS: A sample splitting version of RPChoose

Description

Chooses the best projection based on an estimate of the test error of the classifier trained on (XTrain, YTrain). The error estimate is the number of misclassifications made on the validation set (XVal, YVal).

Usage

RPChooseSS(XTrain, YTrain, XVal, YVal, XTest, d, B2 = 100, base = "LDA", k = c(3, 5), projmethod = "Haar", ...)

Arguments

XTrain
An n by p matrix containing the training data feature vectors
YTrain
A vector of length n of the classes (either 1 or 2) of the training data
XVal
An n.val by p matrix containing the validation data feature vectors
YVal
A vector of length n.val of the classes (either 1 or 2) of the validation data
XTest
An n.test by p matrix of the test data feature vectors
d
The lower dimension of the image space of the projections
B2
The block size: the number of random projections generated, from which the best one is chosen
base
The base classifier: one of "knn", "LDA", "QDA" or "other"
k
The options for k if base = "knn"
projmethod
Either "Haar" or "axis"
...
Optional further arguments if base = "other"
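
To make the projmethod = "axis" option more concrete, the sketch below builds an axis-aligned projection by hand. It assumes that an axis-aligned projection simply selects d of the p coordinates at random; the helper name axis_projection is hypothetical and is not part of RPEnsemble, and the package's internal generator may differ in detail.

```r
# Hypothetical helper (not an RPEnsemble function): a p x d matrix whose
# columns are distinct standard basis vectors, so X %*% A picks d columns of X.
axis_projection <- function(p, d) {
  idx <- sample.int(p, d)            # d distinct coordinates out of p
  A <- matrix(0, p, d)
  A[cbind(idx, seq_len(d))] <- 1     # one 1 per column, in the chosen row
  A
}

set.seed(1)
A <- axis_projection(p = 20, d = 2)
X <- matrix(rnorm(5 * 20), 5, 20)    # 5 toy feature vectors in 20 dimensions
Z <- X %*% A                         # 5 x 2 projected data
```

A "Haar" projection plays the same role but uses a random matrix with orthonormal columns instead of a coordinate selection.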

Value

A vector of length n.val + n.test: the first n.val entries are the estimated classes of the validation set; the last n.test entries are the estimated classes of the test set.

Details

Projects the data using B2 Haar or axis-aligned random projections. For each projection, the validation set is classified using the training set, and the projection yielding the smallest error estimate over the validation set is retained. The validation set and test set are then classified using the chosen projection.
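
The procedure described above can be sketched in a few lines of R. This is a minimal illustration, not the package's implementation: it assumes the LDA base classifier from MASS and Haar projections obtained from the QR decomposition of a Gaussian matrix. The names haar_projection, choose_ss_sketch and make_data are hypothetical, and the toy data are made up for the example.

```r
library(MASS)  # provides lda()

# Hypothetical helper: p x d matrix with orthonormal columns drawn from
# Haar measure, via the QR decomposition of a Gaussian matrix.
haar_projection <- function(p, d) {
  qrG <- qr(matrix(rnorm(p * d), p, d))
  # Flip column signs so that the distribution is exactly Haar
  sweep(qr.Q(qrG), 2, sign(diag(qr.R(qrG))), "*")
}

# Hypothetical sketch of the sample-splitting selection step (LDA only).
choose_ss_sketch <- function(XTrain, YTrain, XVal, YVal, XTest, d, B2 = 100) {
  p <- ncol(XTrain)
  best_err <- Inf
  best_A <- NULL
  for (b in seq_len(B2)) {
    A <- haar_projection(p, d)
    fit <- lda(XTrain %*% A, grouping = YTrain)
    val_pred <- predict(fit, XVal %*% A)$class
    # Error estimate: number of mistakes on the validation set
    err <- sum(as.integer(as.character(val_pred)) != YVal)
    if (err < best_err) {
      best_err <- err
      best_A <- A
    }
  }
  # Classify validation and test sets with the retained projection
  fit <- lda(XTrain %*% best_A, grouping = YTrain)
  pred <- predict(fit, rbind(XVal, XTest) %*% best_A)$class
  as.integer(as.character(pred))
}

# Toy two-class data: class 2 is shifted in the first two coordinates.
make_data <- function(n, p) {
  y <- rep(1:2, length.out = n)
  x <- matrix(rnorm(n * p), n, p)
  x[y == 2, 1:2] <- x[y == 2, 1:2] + 2
  list(x = x, y = y)
}

set.seed(1)
Train <- make_data(50, 20)
Val <- make_data(50, 20)
Test <- make_data(100, 20)
out <- choose_ss_sketch(Train$x, Train$y, Val$x, Val$y, Test$x, d = 2, B2 = 10)
val_errors <- sum(out[1:50] != Val$y)      # errors on the validation set
test_errors <- sum(out[51:150] != Test$y)  # errors on the test set
```

Because the projection is selected to minimise the validation error, val_errors is an optimistic estimate; the test-set count is the fairer measure of performance.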

References

Cannings, T. I. and Samworth, R. J. (2015) Random projection ensemble classification. http://arxiv.org/abs/1504.04595

See Also

RPParallel, RPChoose, lda, qda, knn

Examples

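library(RPEnsemble)
set.seed(100)

# Simulate training, validation and test sets from model 1 in p = 20 dimensions
Train <- RPModel(1, 50, 20, 0.5)
Validate <- RPModel(1, 50, 20, 0.5)
Test <- RPModel(1, 100, 20, 0.5)

# Choose the best of B2 = 5 Haar projections into d = 2 dimensions
Choose.out5 <- RPChooseSS(XTrain = Train$x, YTrain = Train$y,
                          XVal = Validate$x, YVal = Validate$y, XTest = Test$x,
                          d = 2, B2 = 5, base = "LDA", projmethod = "Haar")

# Same again, choosing from B2 = 10 projections
Choose.out10 <- RPChooseSS(XTrain = Train$x, YTrain = Train$y,
                           XVal = Validate$x, YVal = Validate$y, XTest = Test$x,
                           d = 2, B2 = 10, base = "LDA", projmethod = "Haar")

# Entries 1 to 50 are validation predictions: count validation errors
sum(Choose.out5[1:50] != Validate$y)
sum(Choose.out10[1:50] != Validate$y)

# Entries 51 to 150 are test predictions: count test errors
sum(Choose.out5[51:150] != Test$y)
sum(Choose.out10[51:150] != Test$y)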