mboost (version 2.0-1)

Family: Gradient Boosting Families

Description

boost_family objects provide a convenient way to specify loss functions and corresponding risk functions to be optimized by one of the boosting algorithms implemented in this package.

Usage

Family(ngradient, loss = NULL, risk = NULL,
       offset = function(y, w)
           optimize(risk, interval = range(y), 
                    y = y, w = w)$minimum,
       check_y = function(y) y,
       weights = c("any", "none", "zeroone", "case"),
       nuisance = function() return(NULL),
       name = "user-specified", fW = NULL, 
       response = function(f) NA)
AdaExp()
Binomial()
GaussClass()
GaussReg()
Gaussian()
Huber(d = NULL)
Laplace()
Poisson()
CoxPH()
QuantReg(tau = 0.5, qoffset = 0.5)
ExpectReg(tau = 0.5)
NBinomial(nuirange = c(0, 100))
PropOdds(nuirange = c(-0.5, -1), offrange = c(-5, 5))
Weibull(nuirange = c(0, 100))
Loglog(nuirange = c(0, 100))
Lognormal(nuirange = c(0, 100))

Arguments

ngradient
a function with arguments y, f and w implementing the negative gradient of the loss function (which is to be minimized).
loss
an optional loss function with arguments y and f.
risk
an optional risk function with arguments y, f and w to be minimized (!); by default, the weighted sum of the loss function.
offset
a function with arguments y and w (weights) for computing a scalar offset (a short sketch follows this argument list).
fW
transformation of the fit for the diagonal weights matrix for an approximation of the boosting hat matrix for loss functions other than squared error.
response
inverse link function of a GLM or any other transformation on the scale of the response.
check_y
a function for checking and transforming the class / mode of a response variable.
nuisance
a function for extracting nuisance parameters from the family.
weights
a character indicating if weights are allowed.
name
a character giving the name of the loss function for pretty printing.
d
delta parameter for Huber loss function. If omitted, it is chosen adaptively.
tau
the quantile or expectile to be estimated, a number strictly between 0 and 1.
qoffset
quantile of response distribution to be used as offset.
nuirange
range of starting values for the nuisance parameter.
offrange
interval to search for offset in.
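The default offset can be inspected directly, since a boost_family object stores its offset function as a slot. A minimal sketch (assuming mboost is attached; the hand-built L1 family below is purely illustrative, not one of the pre-fabricated families):

library("mboost")

## A hand-built L1 family; no offset is supplied, so the default is used:
## it minimizes the (weighted sum of the) loss over range(y) via optimize()
fam <- Family(ngradient = function(y, f, w = 1) sign(y - f),
              loss = function(y, f) abs(y - f),
              name = "L1 sketch")

y <- c(1, 2, 3, 10)
fam@offset(y, w = rep(1, length(y)))  ## anywhere in [2, 3] minimizes the L1 risk
median(y)                             ## 2.5, the population minimizer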

Value

  • An object of class boost_family.

Warning

The coefficients resulting from boosting with family Binomial are $1/2$ of the coefficients of a logit model obtained via glm. This is due to the internal recoding of the response to $-1$ and $+1$ (see Details below).
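To make the factor of two concrete, here is a sketch with simulated data (the data-generating model is invented for illustration; glmboost and boost_control are mboost's linear-model boosting function and its control settings):

library("mboost")

set.seed(29)
d <- data.frame(x = rnorm(500))
d$y <- factor(rbinom(500, size = 1, prob = plogis(1 + 2 * d$x)))

## boost long enough that the fit is close to the unregularized optimum
fit_boost <- glmboost(y ~ x, data = d, family = Binomial(),
                      control = boost_control(mstop = 1000))
fit_glm <- glm(y ~ x, data = d, family = binomial())

coef(fit_boost)  ## slope roughly 1/2 of the glm slope below
coef(fit_glm)    ## (the boosting intercept is reported without the offset)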

Details

The boosting algorithm implemented in mboost minimizes the (weighted) empirical risk function risk(y, f, w) with respect to f. By default, the risk function is the weighted sum of the loss function loss(y, f) but can be chosen arbitrarily. The ngradient(y, f) function is the negative gradient of loss(y, f) with respect to f.

Pre-fabricated functions for the most commonly used loss functions are available as well. Buehlmann and Hothorn (2007) give a detailed overview of the available loss functions. The offset function returns the population minimizers evaluated at the response, i.e., $1/2 \log(p / (1 - p))$ for Binomial() or AdaExp(), $(\sum w_i)^{-1} \sum w_i y_i$ for Gaussian(), and the median for Huber() and Laplace().

A short summary of the available families is given in the following paragraphs:

AdaExp() and Binomial() implement families for binary classification. AdaExp() uses the exponential loss, which essentially leads to the AdaBoost algorithm of Freund and Schapire (1996). Binomial() implements the negative binomial log-likelihood of a logistic regression model as loss function. Thus, using the Binomial family closely corresponds to fitting a logistic model. However, the coefficients resulting from boosting with family Binomial are $1/2$ of the coefficients of a logit model obtained via glm, due to the internal recoding of the response to $-1$ and $+1$ (see below). Buehlmann and Hothorn (2007) argue that the family Binomial is the preferred choice for binary classification. For binary classification problems, the response y has to be a factor; internally, y is re-coded to $-1$ and $+1$ (Buehlmann and Hothorn 2007).

Gaussian() is the default family in mboost. It implements $L_2$Boosting for continuous response. Note that the families GaussReg() and GaussClass() (for regression and classification) are now deprecated.

Huber() implements a robust version of boosting with continuous response, using the Huber loss. Laplace() implements another strategy for continuous outcomes and uses the $L_1$-loss instead of the $L_2$-loss as used by Gaussian().

Poisson() implements a family for fitting count data with boosting methods. The implemented loss function is the negative Poisson log-likelihood. Note that the natural link function $\log(\mu) = \eta$ is assumed.

CoxPH() implements the negative partial log-likelihood for Cox models. Hence, survival models can be boosted using this family.

QuantReg() implements boosting for quantile regression, which is introduced in Fenske et al. (2009). ExpectReg() works analogously for expectiles, which were introduced to regression by Newey and Powell (1987).

Families with an additional scale parameter can be used for fitting models as well: PropOdds() leads to proportional odds models for ordinal outcome variables. When using this family, an ordered set of threshold parameters is re-estimated in each boosting iteration. NBinomial() leads to regression models with a negative binomial conditional distribution of the response. Weibull(), Loglog(), and Lognormal() implement the negative log-likelihood functions of accelerated failure time models with Weibull, log-logistic, and lognormal distributed outcomes, respectively. Hence, parametric survival models can be boosted using these families. For details see Schmid and Hothorn (2008) and Schmid et al. (2010).
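As a small illustration of swapping families, the following sketch (simulated, heavy-tailed data; all functions are from mboost and base R) fits the same median regression twice. QuantReg(tau = 0.5) and Laplace() both target the conditional median, so the estimated coefficients should end up close to each other:

library("mboost")

set.seed(42)
d <- data.frame(x = runif(200))
d$y <- 2 + 3 * d$x + rt(200, df = 3)  ## heavy-tailed noise

## both families target the conditional median
m_quant <- glmboost(y ~ x, data = d, family = QuantReg(tau = 0.5))
m_l1 <- glmboost(y ~ x, data = d, family = Laplace())

coef(m_quant)
coef(m_l1)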

References

Peter Buehlmann and Torsten Hothorn (2007), Boosting algorithms: regularization, prediction and model fitting. Statistical Science, 22(4), 477--505.

Nora Fenske, Thomas Kneib, and Torsten Hothorn (2009), Identifying risk factors for severe childhood malnutrition by boosting additive quantile regression. Technical Report Nr. 52, Institut fuer Statistik, LMU Muenchen. http://epub.ub.uni-muenchen.de/10510/

Yoav Freund and Robert E. Schapire (1996), Experiments with a new boosting algorithm. In Machine Learning: Proc. Thirteenth International Conference, 148--156.

Whitney K. Newey and James L. Powell (1987), Asymmetric least squares estimation and testing. Econometrica, 55, 819--847.

Matthias Schmid and Torsten Hothorn (2008), Flexible boosting of accelerated failure time models. BMC Bioinformatics, 9(269).

Matthias Schmid, Sergej Potapov, Annette Pfahlberg, and Torsten Hothorn (2010), Estimation and regularization techniques for regression models with multidimensional prediction functions. Statistics and Computing, in press.

See Also

mboost for the usage of Family objects. See boost_family-class for objects resulting from a call to Family.

Examples

Laplace()

## Define a new family: squared-error loss with negative gradient y - f
## (the constant factor 2 in the gradient is dropped, as in Gaussian())
myGauss <- Family(ngradient = function(y, f, w = 1) y - f,
                  loss = function(y, f) (y - f)^2,
                  name = "My Gauss Variant")
myGauss
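The user-specified family can then be passed to a fitting function like any pre-fabricated one. A minimal sketch using the myGauss family defined above, with simulated data (the data-generating model is invented for illustration):

set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- 1 + 2 * d$x + rnorm(100)

fit <- glmboost(y ~ x, data = d, family = myGauss)
coef(fit)  ## compare with the default Gaussian() family:
coef(glmboost(y ~ x, data = d, family = Gaussian()))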
