caret (version 5.07-005)

bag.default: A General Framework For Bagging

Description

bag provides a framework for bagging classification or regression models. The user can provide their own functions for model building, prediction and aggregation of predictions (see Details below).

Usage

bag(x, ...)

## S3 method for class 'default'
bag(x, y, B = 10, vars = ncol(x), bagControl = bagControl(), ...)

bagControl(fit = NULL, predict = NULL, aggregate = NULL, downSample = FALSE)

ldaBag
plsBag
nbBag
ctreeBag
svmBag
nnetBag

## S3 method for class 'bag'
predict(object, newdata = NULL, ...)

Arguments

Value

bag produces an object of class bag with elements:

  • fits: a list with two sub-objects: the fit object has the actual model fit for that bagged sample and the vars object is either NULL or a vector of integers corresponding to which predictors were sampled for that model
  • control: a mirror of the arguments passed into bagControl
  • call: the call
  • B: the number of bagging iterations
  • dims: the dimensions of the training set

Details

The function is a framework into which users can plug any model to assess the effect of bagging. Example functions can be found in ldaBag, plsBag, nbBag, ctreeBag, svmBag and nnetBag. Each is a list with elements fit, pred and aggregate.
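
As a rough illustration of the three-element structure, the following sketch defines a hypothetical list, lmBag, built on base R's lm. The name lmBag and the argument signatures (fit(x, y, ...), pred(object, x), aggregate(x, type)) are assumptions modelled on the bundled lists, not something this page specifies. The functions are exercised by hand on a few bootstrap samples, so the block runs without caret.

```r
## Hypothetical plug-in list for bagged linear regression (not part of caret).
## The signatures below are assumed to mirror those of ldaBag, ctreeBag, etc.
lmBag <- list(
  ## fit: build one model from predictors x and outcome y
  fit = function(x, y, ...) {
    dat <- as.data.frame(x)
    dat$.outcome <- y
    lm(.outcome ~ ., data = dat, ...)
  },
  ## pred: predictions from one fitted model for new predictors x
  pred = function(object, x) {
    predict(object, newdata = as.data.frame(x))
  },
  ## aggregate: combine a list of per-model prediction vectors (median here)
  aggregate = function(x, type = "class") {
    apply(do.call("cbind", x), 1, median)
  }
)

## Exercise the three pieces by hand on simulated data
set.seed(1)
x <- data.frame(a = rnorm(50), b = rnorm(50))
y <- 2 * x$a - x$b + rnorm(50, sd = 0.1)

fits <- lapply(1:5, function(i) {
  idx <- sample(nrow(x), replace = TRUE)   # bootstrap sample of rows
  lmBag$fit(x[idx, , drop = FALSE], y[idx])
})
preds  <- lapply(fits, lmBag$pred, x = x)  # one prediction vector per model
bagged <- lmBag$aggregate(preds)           # pooled prediction per sample
```

With caret loaded, a list like this would presumably be supplied through bagControl(fit = lmBag$fit, predict = lmBag$pred, aggregate = lmBag$aggregate), in the same way ctreeBag is used under Examples.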

Note that when vars is not NULL, the sub-setting occurs before the fit and predict functions are called. As a result, the user probably does not need to account for the change in predictors in their functions.
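
To make the order of operations concrete, here is a minimal base R sketch of one bagging iteration with predictor sub-sampling. It is not caret's internal code: bagOnce, predictOnce, myFit and myPred are hypothetical names. The point is only that the columns are sub-set before the user's functions see the data, so those functions need no knowledge of vars.

```r
## Hypothetical helper: one bagging iteration (illustrative, not caret internals)
bagOnce <- function(x, y, vars, fit) {
  rows <- sample(nrow(x), replace = TRUE)            # bootstrap the rows
  cols <- if (is.null(vars)) NULL else sort(sample(ncol(x), vars))
  xSub <- if (is.null(cols)) x[rows, , drop = FALSE] else x[rows, cols, drop = FALSE]
  ## the user's fit function only ever sees the sub-setted predictors
  list(fit = fit(xSub, y[rows]), vars = cols)
}

## Hypothetical helper: apply the same column subset before predicting
predictOnce <- function(model, newdata, pred) {
  if (!is.null(model$vars)) newdata <- newdata[, model$vars, drop = FALSE]
  pred(model$fit, newdata)
}

## Simple user functions that are unaware of any column sampling
myFit  <- function(x, y, ...) lm(y ~ ., data = cbind(as.data.frame(x), y = y))
myPred <- function(object, x) predict(object, newdata = as.data.frame(x))

set.seed(2)
x <- as.data.frame(matrix(rnorm(200), ncol = 4))     # 50 samples, 4 predictors
y <- x[[1]] + rnorm(50)

m <- bagOnce(x, y, vars = 2, fit = myFit)            # model on 2 sampled columns
p <- predictOnce(m, x, myPred)                       # predictions for all rows
```

The list returned by bagOnce has the same two sub-objects described under Value for each entry of fits: the model itself and the indices of the sampled predictors.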

When using bag with train, classification models should use type = "prob" inside the predict function so that predict.train(object, newdata, type = "prob") will work.
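
The sketch below shows one way an aggregate function could support both class and probability output. It assumes that each model's prediction function returns a matrix of class probabilities with one column per class. The name poolProbs and the choice of simple averaging are illustrative assumptions, not the behaviour of any bundled list.

```r
## Hypothetical aggregate function: average class probabilities across models
poolProbs <- function(x, type = "class") {
  ## x is assumed to be a list of n-by-k probability matrices, one per model
  pooled  <- Reduce(`+`, x) / length(x)
  classes <- colnames(pooled)
  if (type == "class") {
    ## pick the class with the largest pooled probability in each row
    factor(classes[max.col(pooled, ties.method = "first")], levels = classes)
  } else {
    as.data.frame(pooled)
  }
}

## Two toy "models", each giving probabilities for three samples
p1 <- matrix(c(0.9, 0.1,
               0.4, 0.6,
               0.2, 0.8), ncol = 2, byrow = TRUE,
             dimnames = list(NULL, c("Active", "Inactive")))
p2 <- matrix(c(0.7, 0.3,
               0.7, 0.3,
               0.1, 0.9), ncol = 2, byrow = TRUE,
             dimnames = list(NULL, c("Active", "Inactive")))

poolProbs(list(p1, p2), type = "class")  # predicted classes
poolProbs(list(p1, p2), type = "prob")   # pooled class probabilities
```

Returning probabilities from the prediction step and deferring the class decision to the aggregate step keeps both output types available from the same set of bagged models.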

Examples

library(caret)

## A simple example of bagging conditional inference regression trees:
data(BloodBrain)

treebag <- bag(bbbDescr, logBBB, B = 10,
               bagControl = bagControl(fit = ctreeBag$fit,
                                       predict = ctreeBag$pred,
                                       aggregate = ctreeBag$aggregate))




## An example of pooling posterior probabilities to generate class predictions
data(mdrr)

## remove some zero variance predictors and linear dependencies
mdrrDescr <- mdrrDescr[, -nearZeroVar(mdrrDescr)]
mdrrDescr <- mdrrDescr[, -findCorrelation(cor(mdrrDescr), .95)]

basicLDA <- train(mdrrDescr, mdrrClass, "lda")

## name the bagControl argument explicitly so it is not matched by position
bagLDA2 <- train(mdrrDescr, mdrrClass, 
                 "bag", 
                 B = 10, 
                 bagControl = bagControl(fit = ldaBag$fit,
                                         predict = ldaBag$pred,
                                         aggregate = ldaBag$aggregate),
                 tuneGrid = data.frame(.vars = c((1:10)*10 , ncol(mdrrDescr))))
