mboost (version 1.0-1)

Family: Gradient Boosting Families

Description

boost_family objects provide a convenient way to specify loss functions and corresponding risk functions to be optimized by one of the boosting algorithms implemented in this package.

Usage

Family(ngradient, loss = NULL, risk = NULL, 
       offset = function(y, w) 0, 
       fW = function(f) rep(1, length(f)), 
       check_y = function(y) TRUE,
       weights = TRUE, name = "user-specified")
AdaExp()
Binomial()
GaussClass()
GaussReg()
Huber(d = NULL)
Laplace()
Poisson()
CoxPH()

Arguments

ngradient
a function with arguments y, f and w implementing the negative gradient of the loss function (which is to be minimized).
loss
an optional loss function with arguments y and f, to be minimized (not maximized).
risk
an optional risk function with arguments y, f and w; by default, the weighted sum of the loss function.
offset
a function with arguments y and w (weights) for computing a scalar offset.
fW
a transformation of the fit used to form the diagonal weights matrix when approximating the boosting hat matrix, for loss functions other than squared error.
check_y
a function for checking the class / mode of a response variable.
weights
a logical indicating if weights are allowed.
name
a character giving the name of the loss function for pretty printing.
d
delta parameter for Huber loss function. If omitted, it is chosen adaptively.
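
For illustration, these arguments can be assembled into a complete user-defined family. The following is a hedged sketch, not part of the package; it essentially re-creates the absolute-error (Laplace) family, and all names are illustrative:

myAbsError <- Family(
    ngradient = function(y, f) sign(y - f),            # -d/df |y - f|
    loss      = function(y, f) abs(y - f),
    risk      = function(y, f, w = 1) sum(w * abs(y - f)),
    offset    = function(y, w) median(y),              # weights ignored for brevity
    check_y   = function(y) is.numeric(y),
    name      = "my absolute error")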

Value

An object of class boost_family.

Details

The boosting algorithms implemented in glmboost, gamboost or blackboost aim at minimizing the (weighted) empirical risk function risk(y, f, w) with respect to f. By default, the risk function is the weighted sum of the loss function loss(y, f) but can be chosen arbitrarily. The ngradient(y, f) function is the negative gradient of loss(y, f) with respect to f. For binary classification problems we assume that the response y is coded by $-1$ and $+1$.
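
The relation between loss and ngradient can be checked numerically. A minimal sketch for squared-error loss, using a central finite difference (variable names are illustrative):

loss <- function(y, f) (y - f)^2 / 2
ngradient <- function(y, f) y - f    # -d/df (y - f)^2 / 2 = y - f
y <- 1.5; f <- 0.2; eps <- 1e-6
## finite-difference approximation of the negative gradient
-(loss(y, f + eps) - loss(y, f - eps)) / (2 * eps)    # approximately 1.3
ngradient(y, f)                                       # 1.3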

Pre-fabricated functions for the most commonly used loss functions are available as well.

The offset function returns the population minimizer evaluated at the response, i.e., $1/2 \log(p / (1 - p))$ for Binomial() or AdaExp(), the weighted mean $(\sum w_i)^{-1} \sum w_i y_i$ for GaussReg(), and the median for Huber() and Laplace().
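
A hedged sketch of these population minimizers written as offset functions (assuming a binary response coded $-1$/$+1$ for the Binomial case; weighted medians are skipped for brevity):

## Binomial() / AdaExp(): half the log-odds of the positive class
offset_binomial <- function(y, w) {
    p <- weighted.mean(y == 1, w)
    1/2 * log(p / (1 - p))
}
## GaussReg(): the weighted mean of the response
offset_gauss <- function(y, w) weighted.mean(y, w)
## Huber() / Laplace(): the median of the response (weights ignored here)
offset_median <- function(y, w) median(y)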

Examples

Laplace()

## a new family corresponding to squared-error loss; the constant factor 2
## in the gradient of (y - f)^2 is conventionally dropped
Family(ngradient = function(y, f) y - f,
       loss = function(y, f) (y - f)^2,
       name = "My Gauss Variant")
