gamboostLSS (version 2.0-1.1)

gamboostLSS-package: Boosting algorithms for GAMLSS

Description

Boosting methods for fitting generalized additive models for location, scale and shape (GAMLSS).

Details

This package provides boosting algorithms for fitting GAMLSS (generalized additive models for location, scale and shape). For GAMLSS theory, see Rigby and Stasinopoulos (2005) or the resources provided at http://gamlss.org. For a tutorial on gamboostLSS, see Hofner et al. (2016). Thomas et al. (2018) developed a novel non-cyclic approach to fitting gamboostLSS models; it is suitable for combination with stability selection (stabsel) and speeds up model tuning via cvrisk (sketched below).
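A minimal sketch of the non-cyclic workflow, assuming a data frame dat with response y and covariates x1 and x2 (GaussianLSS() stands in for any available family):

library("gamboostLSS")

# Non-cyclic fitting (Thomas et al., 2018): in each boosting step only the
# overall best-fitting base-learner, across all distribution parameters,
# is updated, so a single mstop value has to be tuned
mod <- glmboostLSS(y ~ x1 + x2, data = dat, families = GaussianLSS(),
                   method = "noncyclic")

# Tune the stopping iteration via cross-validated empirical risk
cvr <- cvrisk(mod)          # defaults to a 25-fold bootstrap
mstop(mod) <- mstop(cvr)    # set the model to the optimal iteration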

The fitting methods glmboostLSS and gamboostLSS are alternatives to the algorithms provided by gamlss in the gamlss package. They offer shrinkage of effect estimates, intrinsic variable selection and model choice for potentially high-dimensional data settings.

glmboostLSS (for linear effects) and gamboostLSS (for smooth effects) are based on their counterparts for generalized additive models, glmboost and gamboost (contained in package mboost; see Hothorn et al. 2010, 2015), and are similar in their usage.
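For instance, a smooth-effects model is specified just like with mboost's gamboost (a sketch, again assuming a data frame dat with y, x1 and x2):

# P-spline (bbs) base-learners for all distribution parameters
mod <- gamboostLSS(y ~ bbs(x1) + bbs(x2), data = dat,
                   families = NBinomialLSS(),
                   control = boost_control(mstop = 100))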

The package includes some pre-defined GAMLSS distributions, but the user can also specify new distributions with Families.
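In addition, many distributions implemented in the gamlss.dist package can be converted for boosting with as.families(); a sketch using the normal family "NO" (any other gamlss.dist family name could be substituted):

library("gamlss.dist")
# Turn the gamlss.dist normal family into a boosting families object
mod <- glmboostLSS(y ~ x1 + x2, data = dat,
                   families = as.families("NO"))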

A wide range of base-learners is available for modeling covariate effects (see baselearners), including linear (bols), non-linear (bbs), random (brandom) and spatial (bspatial, or bmrf for Markov random fields) effects. Each base-learner can be specified separately for each predictor. The choice of base-learners is crucial, as it determines the kind of effect a covariate has on each distribution parameter in the final GAMLSS (see the sketch below).
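Per-parameter specifications are passed as a named list of formulas, one per distribution parameter (a sketch; the names mu and sigma match the parameters of NBinomialLSS()):

# Linear effect of x1 and smooth effect of x2 on mu,
# but only a linear effect of x1 on sigma
mod <- gamboostLSS(formula = list(mu    = y ~ bols(x1) + bbs(x2),
                                  sigma = y ~ bols(x1)),
                   data = dat, families = NBinomialLSS())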

References

Hofner, B., Mayr, A. and Schmid, M. (2016). gamboostLSS: An R package for model building and variable selection in the GAMLSS framework. Journal of Statistical Software, 74(1), 1-31. Available as vignette("gamboostLSS_Tutorial").

Mayr, A., Fenske, N., Hofner, B., Kneib, T. and Schmid, M. (2012). Generalized additive models for location, scale and shape for high-dimensional data - a flexible approach based on boosting. Journal of the Royal Statistical Society, Series C (Applied Statistics), 61(3), 403-427.

Schmid, M., Potapov, S., Pfahlberg, A. and Hothorn, T. (2010). Estimation and regularization techniques for regression models with multidimensional prediction functions. Statistics and Computing, 20(2), 139-150.

Rigby, R. A. and Stasinopoulos, D. M. (2005). Generalized additive models for location, scale and shape (with discussion). Journal of the Royal Statistical Society, Series C (Applied Statistics), 54, 507-554.

Stasinopoulos, D. M. and Rigby, R. A. (2007). Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, 23(7).

Buehlmann, P. and Hothorn, T. (2007). Boosting algorithms: Regularization, prediction and model fitting. Statistical Science, 22(4), 477-505.

Hothorn, T., Buehlmann, P., Kneib, T., Schmid, M. and Hofner, B. (2010). Model-based boosting 2.0. Journal of Machine Learning Research, 11, 2109-2113.

Hothorn, T., Buehlmann, P., Kneib, T., Schmid, M. and Hofner, B. (2015). mboost: Model-based boosting. R package version 2.4-2. https://CRAN.R-project.org/package=mboost

Thomas, J., Mayr, A., Bischl, B., Schmid, M., Smith, A. and Hofner, B. (2018). Gradient boosting for distributional regression - faster tuning and improved variable selection via noncyclical updates. Statistics and Computing, 28, 673-687. DOI 10.1007/s11222-017-9754-6 (preliminary version: http://arxiv.org/abs/1611.10171).

See Also

See gamboostLSS and glmboostLSS for model fitting; available distributions (families) are documented in Families.

See also the mboost package for more on model-based boosting, or the gamlss package for the original GAMLSS algorithms provided by Rigby and Stasinopoulos.

Examples

# Generate covariates (seed fixed for reproducibility)
set.seed(1907)
x1 <- runif(100)
x2 <- runif(100)
eta_mu    <-  2 - 2 * x1
eta_sigma <- -1 + 2 * x2

# Generate response: negative binomial distribution
y <- rnbinom(100, size = exp(eta_sigma), mu = exp(eta_mu))

# Model fitting: 300 boosting steps, same formula for both distribution parameters
mod1 <- glmboostLSS(y ~ x1 + x2, families = NBinomialLSS(),
                    control = boost_control(mstop = 300), center = TRUE)

# Shrunken effect estimates (off2int adds the offset to the intercept)
coef(mod1, off2int=TRUE)

# Empirical risk with respect to mu
plot(risk(mod1)$mu)

# Empirical risk with respect to sigma
plot(risk(mod1)$sigma)
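
# A possible next step (a sketch, not part of the original example):
# tune the stopping iterations via cross-validated empirical risk
# (defaults to a 25-fold bootstrap; this can take a while)
cvr <- cvrisk(mod1)
mstop(mod1) <- mstop(cvr)   # set mod1 to the optimal iterations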
