1. Impose missing values on all covariates of the training dataset under a missing-completely-at-random (MCAR) mechanism.
2. Impute every variable by hot-deck imputation: for each variable, missing values are filled with randomly selected observed values of that variable.
3. Build one tree in random forests on the imputed training dataset, and use it to predict the binary outcomes of the original testing dataset.
4. Repeat steps 1 to 3 number.trees times (see the sketch after this list).
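The roughening loop can be made concrete with a short sketch. The code below is an illustrative reimplementation of steps 1 to 4, not the package's actual internals; the names rrfc4_sketch and x.rough, and the list(pred=...) return shape, are assumptions modelled on the example at the bottom of this page.

#A minimal sketch of RRFC4 (illustrative only, not the package code)
library(randomForest)
rrfc4_sketch=function(dat,yvar,tr,te,mispct,number.trees){
  x.tr=dat[tr,-yvar,drop=FALSE]
  y.tr=dat[tr,yvar]            #binary outcome, assumed to be a factor
  x.te=dat[te,-yvar,drop=FALSE]
  pred=matrix(NA,length(te),number.trees)
  n=nrow(x.tr)
  for(b in 1:number.trees){
    x.rough=x.tr
    for(j in 1:ncol(x.rough)){
      #Step 1: impose MCAR missingness on covariate j
      miss=sample(n,round(mispct*n))
      #Step 2: hot-deck imputation with randomly drawn observed values
      x.rough[miss,j]=sample(x.rough[-miss,j],length(miss),replace=TRUE)
    }
    #Step 3: grow a single tree and record its class-2 vote on the test set
    tree=randomForest(x.rough,y.tr,ntree=1)
    pred[,b]=predict(tree,x.te,type="prob")[,2]
  }
  #Step 4: one column of test-set predictions per roughened tree
  list(pred=pred)
}

Averaging the rows of pred, as the example below does with apply(pred.rrfc4$pred,1,mean), yields the ensemble's predicted probability of the second class.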
rrfc4(dat, yvar = ncol(dat), tr, te, mispct, number.trees)
Liaw, A. and Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18-22.
Xiong, K. (2014). Roughened Random Forests for Binary Classification. PhD dissertation, State University of New York at Albany.
rrfa, rrfb, rrfc1, rrfc2, rrfc3, rrfc5, rrfc6, rrfc7, rrfd, rrfe
if(require(MASS)){
if(require(caTools)){
if(require(randomForest)){
dat=rbind(Pima.tr,Pima.te)
number.trees=50
#number.trees=500
tr=1:200   #training rows
te=201:532 #testing rows
mispct=0.4 #fraction of values set missing in each roughening step
yvar=ncol(dat)
#AUC value for the testing dataset based on the original random forests
rf=randomForest(dat[tr,-yvar],dat[tr,yvar],dat[te,-yvar],ntree=number.trees)
print(colAUC(rf$test$votes[,2],dat[te,yvar]))
#AUC value for the testing dataset based on RRFC4, averaging the
#per-tree predictions in pred.rrfc4$pred across trees
pred.rrfc4=rrfc4(dat,yvar,tr,te,mispct,number.trees)
print(colAUC(apply(pred.rrfc4$pred,1,mean),dat[te,yvar]))
}}}