# bag.default

##### A General Framework For Bagging

`bag` provides a framework for bagging classification or regression models. The user can provide their own functions for model building, prediction and aggregation of predictions (see Details below).

- Keywords
- models

##### Usage

bag(x, ...)

## S3 method for class 'default':
bag(x, y, B = 10, vars = ncol(x), bagControl = NULL, ...)

bagControl(fit = NULL,
predict = NULL,
aggregate = NULL,
downSample = FALSE,
oob = TRUE,
allowParallel = TRUE)

ldaBag
plsBag
nbBag
ctreeBag
svmBag
nnetBag

## S3 method for class 'bag':
predict(object, newdata = NULL, ...)

##### Arguments

- x
- a matrix or data frame of predictors
- y
- a vector of outcomes
- B
- the number of bootstrap samples to train over.
- bagControl
- a list of options.
- ...
- arguments to pass to the model function
- fit
- a function that has arguments `x`, `y` and `...` and produces a model object that can later be used for prediction. Example functions are found in `ldaBag`, `plsBag`, `nbBag` and `svmBag`.
- predict
- a function that generates predictions for each sub-model. The function should have arguments `object` and `x`. The output of the function can be any type of object (see the example below where posterior probabilities are generated).
- aggregate
- a function with arguments `x` and `type`. The function takes the output of the `predict` function and reduces the bagged predictions to a single prediction per sample. The `type` argument can be used to swi…
- downSample
- a logical: for classification, should the data set be randomly sampled so that each class has the same number of samples as the smallest class?
- oob
- a logical: should out-of-bag statistics be computed and the predictions retained?
- allowParallel
- if a parallel backend is loaded and available, should the function use it?
- vars
- an integer. If this argument is not `NULL`, a random sample of size `vars` is taken of the predictors in each bagging iteration. If `NULL`, all predictors are used.
- object
- an object of class `bag`.
- newdata
- a matrix or data frame of samples for prediction. Note that this argument must have a non-null value.
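The `fit`, `predict` and `aggregate` arguments can be illustrated with a minimal base R sketch for regression. The names `lmFit`, `lmPred` and `lmAgg` are hypothetical and not part of caret, the sketch assumes the aggregation function receives a list with one set of predictions per sub-model, and the loop only mimics the idea of bagging rather than reproducing the package's internals.

```r
# Hypothetical user-supplied functions (not part of caret) with the
# argument names described in the Arguments list.
lmFit <- function(x, y, ...) {
  dat <- as.data.frame(x)
  dat$.outcome <- y
  lm(.outcome ~ ., data = dat, ...)
}

lmPred <- function(object, x) {
  predict(object, newdata = as.data.frame(x))
}

# Assumption: `x` is a list holding one prediction vector per sub-model.
# `type` is accepted but unused in this regression sketch.
lmAgg <- function(x, type = "raw") {
  rowMeans(do.call(cbind, x))
}

set.seed(1)
x <- data.frame(a = rnorm(50), b = rnorm(50))
y <- 2 * x$a - x$b + rnorm(50, sd = 0.1)

# Illustrative bagging loop: fit one model per bootstrap sample,
# predict with each model, then reduce to one prediction per row.
B <- 10
fits <- lapply(seq_len(B), function(i) {
  idx <- sample(nrow(x), replace = TRUE)
  lmFit(x[idx, , drop = FALSE], y[idx])
})
preds <- lapply(fits, lmPred, x = x)
bagged <- lmAgg(preds)
```

With caret loaded, functions of this shape could be supplied as `bagControl(fit = lmFit, predict = lmPred, aggregate = lmAgg)`.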

##### Details

The function is basically a framework where users can plug in any model to assess the effect of bagging. Example functions can be found in `ldaBag`, `plsBag`, `nbBag`, `svmBag` and `nnetBag`. Each has elements `fit`, `pred` and `aggregate`.

One note: when `vars` is not `NULL`, the sub-setting occurs before the `fit` and `predict` functions are called. In this way, the user probably does not need to account for the change in predictors in their functions.

When using `bag` with `train`, classification models should use `type = "prob"` inside of the `predict` function so that `predict.train(object, newdata, type = "prob")` will work.
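For classification, one way to read this is that the prediction function returns class probabilities and the aggregation function decides, via `type`, whether to return classes or pooled probabilities. The base R sketch below assumes a two-class problem and uses logistic regression; `glmFit`, `glmPred` and `probAgg` are hypothetical names, and averaging the probabilities is just one possible pooling rule, not necessarily the one used by the functions shipped with caret.

```r
# Hypothetical two-class example; none of these functions are part of caret.
glmFit <- function(x, y, ...) {
  dat <- as.data.frame(x)
  dat$.outcome <- y
  glm(.outcome ~ ., data = dat, family = binomial(), ...)
}

# Return a matrix of class probabilities, one column per class level.
glmPred <- function(object, x) {
  p <- predict(object, newdata = as.data.frame(x), type = "response")
  out <- cbind(1 - p, p)  # glm models the probability of the second level
  colnames(out) <- levels(object$model$.outcome)
  out
}

# Average the probability matrices over sub-models; `type` selects
# between class predictions and pooled probabilities.
probAgg <- function(x, type = "class") {
  pooled <- Reduce(`+`, x) / length(x)
  if (type == "class") {
    factor(colnames(pooled)[max.col(pooled, ties.method = "first")],
           levels = colnames(pooled))
  } else {
    as.data.frame(pooled)
  }
}

set.seed(3)
x <- data.frame(a = rnorm(100), b = rnorm(100))
y <- factor(ifelse(x$a + rnorm(100) > 0, "yes", "no"))

fits <- lapply(1:10, function(i) {
  idx <- sample(nrow(x), replace = TRUE)
  glmFit(x[idx, , drop = FALSE], y[idx])
})
probs <- lapply(fits, glmPred, x = x)
classPred <- probAgg(probs, type = "class")
probPred <- probAgg(probs, type = "prob")
```

Keeping probabilities at the sub-model level means both kinds of output can be derived from the same bagged predictions.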

If a parallel backend is registered, the

##### Value

`bag` produces an object of class `bag` with elements

- fits
- a list with two sub-objects: the `fit` object has the actual model fit for that bagged sample and the `vars` object is either `NULL` or a vector of integers corresponding to which predictors were sampled for that model
- control
- a mirror of the arguments passed into `bagControl`
- call
- the call
- B
- the number of bagging iterations
- dims
- the dimensions of the training set

##### Examples

```
## The data sets and helper functions below come from the caret package
library(caret)

## A simple example of bagging conditional inference regression trees:
data(BloodBrain)

## treebag <- bag(bbbDescr, logBBB, B = 10,
##                bagControl = bagControl(fit = ctreeBag$fit,
##                                        predict = ctreeBag$pred,
##                                        aggregate = ctreeBag$aggregate))

## An example of pooling posterior probabilities to generate class predictions
data(mdrr)

## remove some zero variance predictors and linear dependencies
mdrrDescr <- mdrrDescr[, -nearZeroVar(mdrrDescr)]
mdrrDescr <- mdrrDescr[, -findCorrelation(cor(mdrrDescr), .95)]

## basicLDA <- train(mdrrDescr, mdrrClass, "lda")

## bagLDA2 <- train(mdrrDescr, mdrrClass,
##                  "bag",
##                  B = 10,
##                  bagControl = bagControl(fit = ldaBag$fit,
##                                          predict = ldaBag$pred,
##                                          aggregate = ldaBag$aggregate),
##                  tuneGrid = data.frame(vars = c((1:10)*10, ncol(mdrrDescr))))
```

*Documentation reproduced from package caret, version 6.0-37, License: GPL-2*