# blackboost

##### Gradient Boosting with Regression Trees

Gradient boosting for optimizing arbitrary loss functions, with regression trees as base-learners.

- Keywords: models, regression

##### Usage

```
blackboost(formula, data = list(),
           weights = NULL, na.action = na.pass,
           offset = NULL, family = Gaussian(),
           control = boost_control(),
           oobweights = NULL,
           tree_controls = partykit::ctree_control(
               teststat = "quad",
               testtype = "Teststatistic",
               mincriterion = 0,
               minsplit = 10,
               minbucket = 4,
               maxdepth = 2,
               saveinfo = FALSE),
           ...)
```

##### Arguments

- formula
a symbolic description of the model to be fit.

- data
a data frame containing the variables in the model.

- weights
an optional vector of weights to be used in the fitting process.

- na.action
a function which indicates what should happen when the data contain `NA`s.

- offset
a numeric vector to be used as offset (optional).

- family
a `Family` object.

- control
a list of parameters controlling the algorithm. For more details see `boost_control`.

- oobweights
an additional vector of out-of-bag weights, which is used for the out-of-bag risk (i.e., if `boost_control(risk = "oobag")`). This argument is also used internally by `cvrisk`.

- tree_controls
an object of class `"TreeControl"`, which can be obtained using `ctree_control`. Defines hyper-parameters for the trees which are used as base-learners. It is wise to make sure to understand the consequences of altering any of its arguments. By default, two-way interactions (but not deeper trees) are fitted.

- …
additional arguments passed to `mboost_fit`, including `weights`, `offset`, `family` and `control`. For default values see `mboost_fit`.
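As a rough sketch of how `weights`, `oobweights`, `control` and `tree_controls` fit together, the call below boosts stumps (`maxdepth = 1`) instead of the default depth-two trees and tracks the out-of-bag risk. The in-bag/out-of-bag split, the seed and the object names (`inbag`, `stumps`, `cars.stump`) are illustrative assumptions; the remaining `tree_controls` settings are copied from the defaults listed under Usage.

```r
library("mboost")

set.seed(290875)  # arbitrary seed, only to make the split reproducible

## hypothetical split: roughly two thirds of the rows in-bag, the rest out-of-bag
inbag <- rbinom(nrow(cars), size = 1, prob = 2/3)

## stumps instead of the default maxdepth = 2; other settings as under Usage
stumps <- partykit::ctree_control(
    teststat = "quad", testtype = "Teststatistic",
    mincriterion = 0, minsplit = 10, minbucket = 4,
    maxdepth = 1, saveinfo = FALSE)

cars.stump <- blackboost(dist ~ speed, data = cars,
                         weights = inbag, oobweights = 1 - inbag,
                         control = boost_control(mstop = 50, risk = "oobag"),
                         tree_controls = stumps)

## risk evaluated on the out-of-bag observations along the boosting path
oob <- risk(cars.stump)
```

Plotting `oob` against the iteration index is one simple way to see whether 50 iterations are too many or too few for a given split.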

##### Details

This function implements the 'classical' gradient boosting utilizing regression trees as base-learners. Essentially, the same algorithm is implemented in package `gbm`. The main difference is that arbitrary loss functions to be optimized can be specified via the `family` argument to `blackboost`, whereas `gbm` uses hard-coded loss functions. Moreover, the base-learners (conditional inference trees, see `ctree`) are a little bit more flexible.
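To illustrate the role of the `family` argument, the following sketch fits the same tree ensemble under two loss functions: squared error (`Gaussian()`, the default) and absolute error (`Laplace()`, another `Family` object shipped with mboost). The object names and the choice of `mstop = 50` are assumptions made for the example only.

```r
library("mboost")

## squared-error loss (the default family)
cars.l2 <- blackboost(dist ~ speed, data = cars, family = Gaussian(),
                      control = boost_control(mstop = 50))

## absolute-error loss, selected purely through the family argument
cars.l1 <- blackboost(dist ~ speed, data = cars, family = Laplace(),
                      control = boost_control(mstop = 50))

## fitted values of both models side by side
fits <- data.frame(speed = cars$speed,
                   L2 = as.numeric(predict(cars.l2)),
                   L1 = as.numeric(predict(cars.l1)))
head(fits)
```

Only the `family` argument changes between the two calls; the base-learner and the boosting control settings stay the same.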

The regression fit is a black box prediction machine and thus hardly interpretable.

Partial dependence plots are not yet available; see the Examples section for plotting additive tree models.

##### Value

An object of class `mboost` with `print` and `predict` methods being available.
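A short sketch of working with the returned object, assuming the `cars` data and hypothetical new speed values in `nd`:

```r
library("mboost")

cars.gb <- blackboost(dist ~ speed, data = cars,
                      control = boost_control(mstop = 50))

## print method: short description of the fitted model
print(cars.gb)

## predict method: fitted values for new observations (illustrative speeds)
nd <- data.frame(speed = c(10, 15, 20))
pred <- predict(cars.gb, newdata = nd)
pred
```

Without `newdata`, `predict` returns the fitted values for the training data, as used in the Examples section.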

##### References

Peter Buehlmann and Torsten Hothorn (2007). Boosting algorithms: regularization, prediction and model fitting. *Statistical Science*, **22**(4), 477-505.

Torsten Hothorn, Kurt Hornik and Achim Zeileis (2006). Unbiased recursive partitioning: A conditional inference framework. *Journal of Computational and Graphical Statistics*, **15**(3), 651-674.

Yoav Freund and Robert E. Schapire (1996). Experiments with a new boosting algorithm. In *Machine Learning: Proc. Thirteenth International Conference*, 148-156.

Jerome H. Friedman (2001). Greedy function approximation: A gradient boosting machine. *The Annals of Statistics*, **29**, 1189-1232.

Greg Ridgeway (1999). The state of boosting. *Computing Science and Statistics*, **31**, 172-181.

##### See Also

See `mboost_fit` for the generic boosting function, `glmboost` for boosted linear models, and `gamboost` for boosted additive models.

See `baselearners` for possible base-learners.

See `cvrisk` for cross-validated stopping iteration.

Furthermore see `boost_control`, `Family` and `methods`.

##### Examples

```
library("mboost")

### a simple two-dimensional example: cars data
cars.gb <- blackboost(dist ~ speed, data = cars,
                      control = boost_control(mstop = 50))
cars.gb

### plot fit
plot(dist ~ speed, data = cars)
lines(cars$speed, predict(cars.gb), col = "red")

############################################################
## Do not run this example automatically as it takes
## some time (~ 5-10 seconds depending on the system)

### set up and plot additive tree model
if (require("partykit")) {
    ctrl <- ctree_control(maxdepth = 3)
    ## keep two species only and drop the unused factor level
    viris <- subset(iris, Species != "setosa")
    viris$Species <- viris$Species[, drop = TRUE]
    ## one tree base-learner per predictor gives an additive model
    imod <- mboost(Species ~ btree(Sepal.Length, tree_controls = ctrl) +
                       btree(Sepal.Width, tree_controls = ctrl) +
                       btree(Petal.Length, tree_controls = ctrl) +
                       btree(Petal.Width, tree_controls = ctrl),
                   data = viris, family = Binomial())[500]
    layout(matrix(1:4, ncol = 2))
    plot(imod)
}
```

*Documentation reproduced from package mboost, version 2.9-1, License: GPL-2*