bagging

Bagging Classification and Regression Trees

Bootstrap aggregated classification and regression trees.

Keywords
tree
Usage
## S3 method for class 'default':
bagging(y, X=NULL, nbagg=25, method=c("standard", "double"),
        coob=TRUE, control=rpart.control(minsize=2, cp=0), ...)
## S3 method for class 'formula':
bagging(formula, data, subset, na.action=na.rpart, ...)
Arguments
y
vector of responses: either numerical (regression) or factors (classification).
X
data frame of predictors.
nbagg
number of bootstrap replications.
method
"standard" for Bagging and "double" for Double-Bagging.
coob
logical: if TRUE, an out-of-bag estimate of the misclassification error (classification) or mean-squared error (regression) is computed.
control
options that control details of the rpart algorithm, see rpart.control.
formula
formula describing the model: y ~ x + w + z, where y is the response and x,w,z are predictors, see lm for details.
data
optional data frame containing the variables in the model formula.
subset
optional vector specifying a subset of observations to be used.
na.action
function which indicates what should happen when the data contain NAs. Defaults to na.rpart.
...
additional parameters to methods (e.g. rpart).
Details

Bootstrap aggregated classification and regression trees were suggested by Breiman (1996, 1998) in order to stabilise trees. This function is based on trees computed by rpart: if y is a factor, classification trees are constructed, otherwise regression trees. nbagg bootstrap samples are drawn and a tree is grown on each of them. If coob is TRUE, the out-of-bag sample, i.e. the observations not contained in a given bootstrap sample, is used to estimate the prediction error. Double-Bagging (Hothorn and Lausen, 2002) computes a linear discriminant analysis (LDA) on the out-of-bag sample and uses the discriminant variables as additional predictors for the classification trees; consequently, an out-of-bag estimate of the misclassification error is not available for method="double".
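The procedure described above can be sketched in base R. This is a schematic illustration, not ipred's implementation: a trivial decision stump on a single predictor stands in for rpart, and only the out-of-bag voting logic is shown.

```r
## Schematic of bagging with an out-of-bag (OOB) error estimate:
## draw nbagg bootstrap samples, fit a base learner to each, and
## aggregate by majority vote. For the OOB estimate, each observation
## is predicted only by the learners whose bootstrap sample did not
## contain it. A simple stump replaces rpart here.
set.seed(290875)
n <- 200
x <- rnorm(n)
y <- factor(ifelse(x + rnorm(n, sd = 0.5) > 0, 1, 0))

nbagg <- 25
fit_stump <- function(x, y) {
  m <- tapply(x, y, mean)            # class means of the predictor
  list(cut = mean(m),                # threshold: midpoint of class means
       pos = names(which.max(m)))    # class predicted above the threshold
}
predict_stump <- function(st, x)
  ifelse(x > st$cut, st$pos, setdiff(c("0", "1"), st$pos))

votes <- matrix(NA_character_, nrow = n, ncol = nbagg)
for (b in seq_len(nbagg)) {
  idx <- sample(n, replace = TRUE)   # bootstrap sample
  st  <- fit_stump(x[idx], y[idx])
  oob <- setdiff(seq_len(n), idx)    # out-of-bag observations
  votes[oob, b] <- predict_stump(st, x[oob])
}
## OOB prediction: majority vote over the learners that did not see
## the observation during fitting
oob_pred <- apply(votes, 1, function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0) NA else names(which.max(table(v)))
})
oob_err <- mean(oob_pred != y, na.rm = TRUE)
oob_err
```

In ipred this bookkeeping is handled internally; with coob=TRUE the analogous estimate is returned as the err component of the fitted object.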

print.bagging and summary.bagging are available for inspecting the results, and predict.bagging for prediction. Additionally, prune.bagging can be used to prune each of the nbagg trees. By default, the trees are not pruned and tree growing is not stopped until the nodes are pure.

Value

  • An object of class bagging, a list containing the following components:
  • mt: list of length nbagg containing the rpart trees.
  • oob: out-of-bag predictions for each observation.
  • err: out-of-bag error estimate.
  • nbagg: number of bootstrap samples and trees used.
  • method: method used.
  • ldasc: discriminant functions of the LDA (Double-Bagging only).

References

Leo Breiman (1996), Bagging Predictors. Machine Learning 24(2), 123--140.

Leo Breiman (1998), Arcing Classifiers. The Annals of Statistics 26(3), 801--824.

Torsten Hothorn and Berthold Lausen (2002), Double-Bagging: Combining classifiers by bootstrap aggregation. Submitted; preprint available at http://www.mathpreprints.com/math/Preprint/hothorn/20020227.2/1.

Aliases
  • bagging
  • bagging.formula
  • bagging.default
Examples
# classification: 10 standard-normal predictors, class given by the
# sign of the row mean
X <- as.data.frame(matrix(rnorm(1000), ncol = 10))
y <- factor(ifelse(apply(X, 1, mean) > 0, 1, 0))
learn <- cbind(y, X)

mt <- bagging(y ~ ., data = learn, coob = TRUE)
mt

# independent test set from the same distribution
X <- as.data.frame(matrix(rnorm(1000), ncol = 10))
y <- factor(ifelse(apply(X, 1, mean) > 0, 1, 0))

cls <- predict(mt, newdata = X)

cat("Misclass error est: ", mean(y != cls), "\n")
cat("Misclass error oob: ", mt$err, "\n")

# regression: response is the row mean plus noise
X <- as.data.frame(matrix(rnorm(1000), ncol = 10))
y <- apply(X, 1, mean) + rnorm(nrow(X))

learn <- cbind(y, X)

mt <- bagging(y ~ ., data = learn, coob = TRUE)
mt

X <- as.data.frame(matrix(rnorm(1000), ncol = 10))
y <- apply(X, 1, mean) + rnorm(nrow(X))

haty <- predict(mt, newdata = X)

cat("MSE error: ", mean((haty - y)^2), "\n")

# BreastCancer data from package mlbench
library("mlbench")
data("BreastCancer")
BreastCancer$Id <- NULL

# Test set error bagging (nbagg = 50): 3.7% (Breiman, 1998, Table 5)

bagging(Class ~ Cl.thickness + Cell.size
                + Cell.shape + Marg.adhesion
                + Epith.c.size + Bare.nuclei
                + Bl.cromatin + Normal.nucleoli
                + Mitoses, data = BreastCancer, coob = TRUE)
Documentation reproduced from package ipred, version 0.4-0, License: GPL
