bag
provides a framework for bagging classification or regression models. The user can provide their own functions for model building, prediction and aggregation of predictions (see Details below).
Usage:

bag(x, ...)

## Default S3 method:
bag(x, y, B = 10, vars = ncol(x), bagControl = NULL, ...)

bagControl(fit = NULL, predict = NULL, aggregate = NULL,
           downSample = FALSE, oob = TRUE, allowParallel = TRUE)

ldaBag
plsBag
nbBag
ctreeBag
svmBag
nnetBag

## S3 method for class 'bag':
predict(object, newdata = NULL, ...)
Arguments:

fit: a function that has arguments x, y and ... and produces a model object that can later be used for prediction. Example functions are found in ldaBag, plsBag, nbBag, svmBag and nnetBag.
predict: a function that has arguments object and x and produces predictions from the fitted model. The output of the function can be any type of object (see the example below where posterior probabilities are generated). Example functions are found in ldaBag, plsBag, nbBag, svmBag and nnetBag.
aggregate: a function with arguments x and type that takes the output of the predict function and reduces the bagged predictions to a single prediction per sample. The type argument can be used to switch between predicting classes or class probabilities for classification models. Example functions are found in ldaBag, plsBag, nbBag, svmBag and nnetBag.
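To illustrate how the three functions fit together, here is a minimal sketch of a user-supplied set for bagging ordinary linear regressions. The name lmBag and the implementation details are hypothetical (this object is not shipped with caret); it only demonstrates the fit/predict/aggregate contract described above.

```r
library(caret)

## Hypothetical example: a user-supplied list of bagging functions
## for plain linear models, mirroring the structure of ldaBag etc.
lmBag <- list(
  ## fit: takes x, y and ..., returns a model object
  fit = function(x, y, ...) {
    lm(y ~ ., data = data.frame(x, y = y))
  },
  ## pred: takes the fitted object and new x, returns predictions
  pred = function(object, x) {
    predict(object, newdata = data.frame(x))
  },
  ## aggregate: takes a list of per-model predictions and reduces
  ## them to one prediction per sample (here, a simple average)
  aggregate = function(x, type = "raw") {
    rowMeans(do.call(cbind, x))
  }
)

## It could then be used as, e.g.:
## fit <- bag(x, y, B = 10,
##            bagControl = bagControl(fit = lmBag$fit,
##                                    predict = lmBag$pred,
##                                    aggregate = lmBag$aggregate))
```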
vars: an integer. If this argument is not NULL, a random sample of size vars is taken of the predictors in each bagging iteration. If NULL, all predictors are used.

object: an object of class bag.
Value:

bag produces an object of class bag. The objects ldaBag, plsBag, nbBag, svmBag and nnetBag are example lists, each with elements fit, pred and aggregate, that can be plugged into bagControl.

One note: when vars is not NULL, the sub-setting occurs before the fit and predict functions are called, so the user probably does not need to account for the change in predictors in their own functions.
When using bag with train, classification models should use type = "prob" inside of the predict function so that predict.train(object, newdata, type = "prob") will work.
If a parallel backend is registered, the foreach package is used to train the models in parallel.
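One possible setup, assuming the doParallel package is installed (any foreach-compatible backend should work the same way):

```r
## Register a parallel backend before calling bag() or train();
## the B bootstrap models are then fit in parallel via foreach.
library(doParallel)

cl <- makeCluster(2)    # start two worker processes
registerDoParallel(cl)

## ... call bag() or train() as usual here ...

stopCluster(cl)         # release the workers when finished
```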
Examples:

## A simple example of bagging conditional inference regression trees:
data(BloodBrain)
treebag <- bag(bbbDescr, logBBB, B = 10,
               bagControl = bagControl(fit = ctreeBag$fit,
                                       predict = ctreeBag$pred,
                                       aggregate = ctreeBag$aggregate))

## An example of pooling posterior probabilities to generate class predictions
data(mdrr)

## remove some zero variance predictors and linear dependencies
mdrrDescr <- mdrrDescr[, -nearZeroVar(mdrrDescr)]
mdrrDescr <- mdrrDescr[, -findCorrelation(cor(mdrrDescr), .95)]

basicLDA <- train(mdrrDescr, mdrrClass, "lda")

bagLDA2 <- train(mdrrDescr, mdrrClass,
                 "bag",
                 B = 10,
                 bagControl = bagControl(fit = ldaBag$fit,
                                         predict = ldaBag$pred,
                                         aggregate = ldaBag$aggregate),
                 tuneGrid = data.frame(vars = c((1:10)*10, ncol(mdrrDescr))))