adabag (version 4.2)

autoprune: Builds automatically a pruned tree of class rpart

Description

Builds automatically a pruned tree of class rpart looking in the cptable for the minimum cross validation error plus a standard deviation

Usage

autoprune(formula, data, subset=1:length(data[,1]), ...)

Value

An object of class rpart

Arguments

formula

a formula, as in the lm function.

data

a data frame in which to interpret the variables named in the formula.

subset

optional expression saying that only a subset of the rows of the data should be used in the fit, as in the rpart function.

...

further arguments passed to or from other methods.

Author

Esteban Alfaro-Cortes Esteban.Alfaro@uclm.es, Matias Gamez-Martinez Matias.Gamez@uclm.es and Noelia Garcia-Rubio Noelia.Garcia@uclm.es

Details

The cross validation estimation of the error (xerror) has a random component. To avoid this randomness the 1-SE rule (or 1-SD rule) selects the simplest model with a xerror equal or less than the minimum xerror plus the standard deviation of the minimum xerror.

References

Breiman, L., Friedman, J.H., Olshen, R. and Stone, C.J. (1984): "Classification and Regression Trees". Wadsworth International Group. Belmont

Therneau, T., Atkinson, B. and Ripley, B. (2014). rpart: Recursive Partitioning and Regression Trees. R package version 4.1-5

See Also

rpart

Examples

Run this code
## rpart library should be loaded
library(rpart)
data(iris)
iris.prune<-autoprune(Species~., data=iris)
iris.prune

## Comparing the test error of rpart and autoprune
library(mlbench)
data(BreastCancer)
l <- length(BreastCancer[,1])
sub <- sample(1:l,2*l/3)

BC.rpart <- rpart(Class~.,data=BreastCancer[sub,-1],cp=-1, maxdepth=5)
BC.rpart.pred <- predict(BC.rpart,newdata=BreastCancer[-sub,-1],type="class")
tb <-table(BC.rpart.pred,BreastCancer$Class[-sub])
tb
1-(sum(diag(tb))/sum(tb))


BC.prune<-autoprune(Class~.,data=BreastCancer[,-1],subset=sub)
BC.rpart.pred <- predict(BC.prune,newdata=BreastCancer[-sub,-1],type="class")
tb <-table(BC.rpart.pred,BreastCancer$Class[-sub])
tb
1-(sum(diag(tb))/sum(tb))



Run the code above in your browser using DataCamp Workspace