adabag (version 4.1)

autoprune: Automatically builds a pruned tree of class rpart

Description

Automatically builds a pruned tree of class rpart by searching the cptable for the simplest tree whose cross-validation error does not exceed the minimum cross-validation error plus one standard deviation.

Usage

autoprune(formula, data, subset=1:length(data[,1]), ...)

Arguments

formula

a formula, as in the lm function.

data

a data frame in which to interpret the variables named in the formula.

subset

optional expression indicating that only a subset of the rows of the data should be used in the fit, as in the rpart function. The default, 1:length(data[,1]), uses all rows.

...

further arguments passed to or from other methods.

Value

An object of class rpart.

Details

The cross-validation estimate of the error (xerror) has a random component. To reduce the effect of this randomness, the 1-SE rule (or 1-SD rule) selects the simplest model whose xerror is less than or equal to the minimum xerror plus the standard deviation of that minimum xerror (the xstd column of the cptable).
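The rule described above can be sketched directly with rpart. This is a minimal illustration and not necessarily how autoprune is implemented internally; the choice of cp = 0 for growing the initial tree and the variable names (fit, cpt, best, threshold, chosen, pruned) are assumptions made for the example.

```r
library(rpart)

set.seed(1)  # the cross-validation folds are random, so fix the seed for repeatability

# Grow a large tree first; cp = 0 is an illustrative choice, not autoprune's documented setting
fit <- rpart(Species ~ ., data = iris, cp = 0)
cpt <- fit$cptable  # columns: CP, nsplit, rel error, xerror, xstd

# Row with the minimum cross-validation error and the 1-SE threshold derived from it
best <- which.min(cpt[, "xerror"])
threshold <- cpt[best, "xerror"] + cpt[best, "xstd"]

# Rows of the cptable run from the simplest tree to the most complex one,
# so the first row under the threshold is the simplest acceptable tree
chosen <- min(which(cpt[, "xerror"] <= threshold))

# Prune back to the complexity parameter of the chosen row
pruned <- prune(fit, cp = cpt[chosen, "CP"])
pruned
```

Because the minimum-xerror row always satisfies the threshold, the selected row is never more complex than the minimum-error tree.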

References

Breiman, L., Friedman, J.H., Olshen, R. and Stone, C.J. (1984). Classification and Regression Trees. Wadsworth International Group, Belmont.

Therneau, T., Atkinson, B. and Ripley, B. (2014). rpart: Recursive Partitioning and Regression Trees. R package version 4.1-5.

See Also

rpart

Examples

# NOT RUN {
## the rpart and adabag libraries should be loaded
library(rpart)
library(adabag)
data(iris)
iris.prune<-autoprune(Species~., data=iris)
iris.prune

## Comparing the test error of rpart and autoprune
library(mlbench)
data(BreastCancer)
## Randomly select two thirds of the rows for training; the rest form the test set
l <- nrow(BreastCancer)
sub <- sample(1:l, 2*l/3)

BC.rpart <- rpart(Class~.,data=BreastCancer[sub,-1],cp=-1, maxdepth=5)
BC.rpart.pred <- predict(BC.rpart,newdata=BreastCancer[-sub,-1],type="class")
tb <-table(BC.rpart.pred,BreastCancer$Class[-sub])
tb
1-(sum(diag(tb))/sum(tb))


## Same training rows, but the tree is pruned automatically with the 1-SE rule
BC.prune <- autoprune(Class~., data=BreastCancer[,-1], subset=sub)
BC.prune.pred <- predict(BC.prune, newdata=BreastCancer[-sub,-1], type="class")
tb <- table(BC.prune.pred, BreastCancer$Class[-sub])
tb
1-(sum(diag(tb))/sum(tb))



# }
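
The test error is computed twice in the example above with 1-(sum(diag(tb))/sum(tb)). As a small convenience, the same quantity can be wrapped in a helper. The function name test_error is hypothetical and is not part of adabag or rpart; the sketch assumes that predictions and true labels have the same length and contain no missing values.

```r
# Hypothetical helper (not part of adabag): proportion of misclassified cases.
# With no missing values this equals 1 - sum(diag(tb)) / sum(tb) for the
# confusion table tb <- table(pred, truth).
test_error <- function(pred, truth) {
  stopifnot(length(pred) == length(truth))
  # Compare labels as character so that differing factor level sets do not matter
  mean(as.character(pred) != as.character(truth))
}

# Toy usage with made-up labels
pred  <- factor(c("benign", "malignant", "benign", "benign"))
truth <- factor(c("benign", "malignant", "malignant", "benign"))
test_error(pred, truth)
```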
