gamboostLSS (version 1.1-2)

methods: Methods for mboostLSS

Description

Methods for GAMLSS models fitted by boosting algorithms.

Usage

## extract coefficients
## S3 method for class 'glmboostLSS':
coef(object, which = NULL,
     aggregate = c("sum", "cumsum", "none"),
     off2int = FALSE, parameter = names(object), ...)
## S3 method for class 'mboostLSS':
coef(object, which = NULL,
     aggregate = c("sum", "cumsum", "none"),
     parameter = names(object), ...)

## plot partial effects
## S3 method for class 'glmboostLSS':
plot(x, main = names(x), parameter = names(x),
     off2int = FALSE, ...)
## S3 method for class 'gamboostLSS':
plot(x, main = names(x), parameter = names(x), ...)

## extract and plot marginal prediction intervals
predint(x, which, pi = 0.9, newdata = NULL, ...)
PI(x, which, pi = 0.9, newdata = NULL, ...)
## S3 method for class 'predint':
plot(x, main = "Marginal Prediction Interval(s)",
     xlab = NULL, ylab = NULL, lty = c("solid", "dashed"),
     lcol = c("black", "black"), log = "", ...)

## extract mstop
## S3 method for class 'mboostLSS':
mstop(object, parameter = names(object), ...)
## S3 method for class 'oobag':
mstop(object, parameter = names(object), ...)
## S3 method for class 'cvriskLSS':
mstop(object, parameter = NULL, ...)

## set mstop
## S3 method for class 'mboostLSS':
[(x, i, return = TRUE, ...)

## extract risk
## S3 method for class 'mboostLSS':
risk(object, merge = FALSE, parameter = names(object), ...)

## extract selected base-learners
## S3 method for class 'mboostLSS':
selected(object, parameter = names(object), ...)

## extract fitted values
## S3 method for class 'mboostLSS':
fitted(object, parameter = names(object), ...)

## make predictions
## S3 method for class 'mboostLSS':
predict(object, newdata = NULL,
        type = c("link", "response", "class"), which = NULL,
        aggregate = c("sum", "cumsum", "none"),
        parameter = names(object), ...)

## update weights of the fitted model
## S3 method for class 'mboostLSS':
update(object, weights, oobweights = NULL, risk = NULL,
       mstop = NULL, ...)

## extract model weights
## S3 method for class 'mboostLSS':
model.weights(x, ...)

Arguments

x
an object of the appropriate class (see usage).
object
an object of the appropriate class (see usage).
which
a subset of base-learners to take into account when computing predictions or coefficients. If which is given (as an integer vector or characters corresponding to base-learners), a list or matrix is returned. In plot_PI
aggregate
a character specifying how to aggregate predictions or coefficients of single base-learners. The default returns the prediction or coefficient for the final number of boosting iterations. "cumsum" returns a matrix with the
parameter
This can be either a vector of indices or a vector of parameter names which should be processed. See examples for details. Per default all distribution parameters of the GAMLSS family are returned.
off2int
logical indicating whether the offset should be added to the intercept (if there is any) or if the offset is neglected for plotting (default).
merge
logical. Should the risk vectors of the single components be merged to one risk vector for the model in total? Per default (merge = FALSE) a (named) list of risk vectors is returned.
i
integer. Index specifying the model to extract. If i is smaller than the initial mstop, a subset is used. If i is larger than the initial mstop, additional boosting steps are performed until
return
a logical indicating whether the changed object is returned.
main
a title for the plots.
xlab, ylab
x- and y-axis labels for the plots.
pi
the level(s) of the prediction interval(s). Per default, a 90% prediction interval is used.
lty
(vector) of line types to be used for plotting the prediction intervals. The vector should contain length(pi) + 1 elements. If fewer elements are specified, the last element is recycled. The first value lty[1] is used
lcol
(vector) of (line) colors to be used for plotting the prediction intervals. The vector should contain length(pi) + 1 elements. If fewer elements are specified, the last element is recycled. The first value lcol[1] is u
log
a character string which determines if and if so which axis should be logarithmic. See plot.default for details.
newdata
optional; a data frame in which to look for variables with which to predict or with which to plot the marginal prediction intervals.
type
the type of prediction required. The default is on the scale of the predictors; the alternative "response" is on the scale of the response variable. Thus for a binomial model the default predictions are on the log-odds scale
weights
a numeric vector of weights for the model
oobweights
an additional vector of out-of-bag weights (used internally by cvrisk; see there for details).
risk
a character indicating how the empirical risk should be computed for each boosting iteration. Per default risk is set to the risk type specified for model fitting via boost_control
mstop
number of boosting iterations.
...
Further arguments to the functions.

Warning

The [.mboostLSS function changes the original object, i.e., LSSmodel[10] changes LSSmodel directly!
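
The in-place behaviour can be checked with a small sketch. It assumes that gamboostLSS is installed; the simulated data and the object names (dat, model, m20) are illustrative only and not part of the package.

```r
library(gamboostLSS)

## small simulated data set (illustrative assumption, not from the package)
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rnorm(n, mean = 1 + 2 * dat$x1, sd = exp(0.5 * dat$x2))

model <- glmboostLSS(y ~ ., families = GaussianLSS(), data = dat,
                     control = boost_control(mstop = 50))
mstop(model)                 # iterations per distribution parameter

## no assignment is made, yet 'model' itself is reduced to 20 iterations;
## return = FALSE only suppresses returning the changed object
model[20, return = FALSE]
m20 <- mstop(model)
m20

## to get the original fit back, subset again
model[50, return = FALSE]
mstop(model)
```

If the original fit has to be kept untouched, note its mstop before subsetting, as done here, and subset back afterwards.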

Details

These functions can be used to extract details from fitted models. print shows a dense representation of the model fit.

The function coef extracts the regression coefficients of linear predictors fitted using the glmboostLSS function or additive predictors fitted using gamboostLSS. Per default, only coefficients of selected base-learners are returned for all distribution parameters. However, any desired coefficient can be extracted using the which argument. Furthermore, one can extract only coefficients for a single distribution parameter via the parameter argument (see examples for details).
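
As a rough sketch of these options for a linear model, the calls below combine which, parameter and off2int. The simulated data and the object names are assumptions made for illustration; only the arguments listed in the usage section are used.

```r
library(gamboostLSS)

## small simulated data set (illustrative assumption, not from the package)
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rnorm(n, mean = 1 + 2 * dat$x1, sd = exp(0.5 * dat$x2))

model <- glmboostLSS(y ~ ., families = GaussianLSS(), data = dat,
                     control = boost_control(mstop = 50))

## coefficients of the selected base-learners, for all parameters
cf_all <- coef(model)

## restrict to one distribution parameter, by name or by index
cf_sigma   <- coef(model, parameter = "sigma")
cf_sigma_2 <- coef(model, parameter = 2)

## request a specific covariate via 'which'
cf_x1 <- coef(model, which = "x1")

## glmboostLSS only: add the offset to the intercept (if there is any)
cf_off <- coef(model, off2int = TRUE)
```

With all distribution parameters requested, the result is a list with one element per parameter, so single components can be picked with cf_all$mu or cf_all$sigma.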

Analogously, the function plot per default displays the coefficient paths for the complete GAMLSS but can be restricted to single distribution parameters or covariates (or subsets) using the parameter or which arguments, respectively.

The function predint (or PI which is just an alias) computes marginal prediction intervals and returns a data frame with the predictors used for the marginal prediction interval, the computed median prediction and the marginal prediction intervals. A plot function (plot.predint) for the resulting object exists. Note that marginal predictions from AFT models (i.e., families LogLogLSS, LogNormalLSS, and WeibullLSS) represent the predicted true survival time and not the observed survival time which is possibly subject to censoring. Hence, comparing observed survival times with the marginal prediction interval is only sensible for uncensored observations.

The predict function can be used for predictions for the distribution parameters depending on new observations whereas fitted extracts the regression fits for the observations in the learning sample. For predict, newdata can be specified -- otherwise the fitted values are returned. If which is specified, marginal effects of the corresponding base-learner(s) are returned. The argument type can be used to make predictions on the scale of the link (i.e., the linear predictor X * beta), the response (i.e. h(X * beta), where h is the response function) or the class (in case of classification).
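
The difference between fitted and predict, and between the link and response scales, can be sketched as follows. The simulated data, the new observations in newdat and all object names are illustrative assumptions; the sketch relies on GaussianLSS modelling sigma on the log scale.

```r
library(gamboostLSS)

## small simulated data set (illustrative assumption, not from the package)
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rnorm(n, mean = 1 + 2 * dat$x1, sd = exp(0.5 * dat$x2))

model <- glmboostLSS(y ~ ., families = GaussianLSS(), data = dat,
                     control = boost_control(mstop = 50))

## regression fits for the learning sample, one element per parameter
fit <- fitted(model)

## predictions for new observations; all covariates must be supplied
newdat <- data.frame(x1 = c(-1, 0, 1), x2 = c(0, 0, 0))
pred_link <- predict(model, newdata = newdat, type = "link")
pred_resp <- predict(model, newdata = newdat, type = "response")

## for sigma the response function is exp(), so both scales are linked by
all.equal(as.numeric(pred_resp$sigma), exp(as.numeric(pred_link$sigma)))
```

For mu the Gaussian family uses the identity as response function, so both types give the same values there.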

The function update updates models fit with gamboostLSS and is primarily used within cvrisk. It updates the weights and refits the model to the altered data. Furthermore, the type of risk and the number of boosting iterations mstop can be modified.

The function model.weights is a generic version of the same function provided by package stats, which is required to make model.weights work with mboostLSS models.

References

Mayr, A., Fenske, N., Hofner, B., Kneib, T. and Schmid, M. (2012): Generalized additive models for location, scale and shape for high-dimensional data - a flexible approach based on boosting. Journal of the Royal Statistical Society, Series C (Applied Statistics) 61(3): 403-427.

Buehlmann, P. and Hothorn, T. (2007), Boosting algorithms: regularization, prediction and model fitting. Statistical Science, 22(4), 477-505.

Rigby, R. A. and D. M. Stasinopoulos (2005). Generalized additive models for location, scale and shape (with discussion). Journal of the Royal Statistical Society, Series C (Applied Statistics), 54, 507-554.

See Also

glmboostLSS, gamboostLSS and blackboostLSS for fitting of GAMLSS.

Available distributions (families) are documented here: Families.

See methods in the mboost package for the corresponding methods for mboost objects.

Examples

### generate data
set.seed(1907)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
x3 <- rnorm(1000)
x4 <- rnorm(1000)
x5 <- rnorm(1000)
x6 <- rnorm(1000)
mu    <- exp(1.5 + x1^2 +0.5 * x2 - 3 * sin(x3) -1 * x4)
sigma <- exp(-0.2 * x4 +0.2 * x5 +0.4 * x6)
y <- numeric(1000)
for( i in 1:1000)
    y[i] <- rnbinom(1, size = sigma[i], mu = mu[i])
dat <- data.frame(x1, x2, x3, x4, x5, x6, y)

### fit a model
model <- gamboostLSS(y ~ ., families = NBinomialLSS(), data = dat,
                     control = boost_control(mstop = 100))

### use a model with more iterations for a better fit
model[400]
### extract coefficients
coef(model)

### only for distribution parameter mu
coef(model, parameter = "mu")

### only for covariate x1
coef(model, which = "x1")


### plot complete model
par(mfrow = c(4, 3))
plot(model)
### plot first parameter only
par(mfrow = c(2, 3))
plot(model, parameter = "mu")
### now plot only effect of x3 of both parameters
par(mfrow = c(1, 2))
plot(model, which = "x3")
### first component second parameter (sigma)
par(mfrow = c(1, 1))
plot(model, which = 1, parameter = 2)

### plot marginal prediction interval
pi <- predint(model, pi = 0.9, which = "x1")
pi <- predint(model, pi = c(0.8, 0.9), which = "x1")
plot(pi, log = "y")  # warning as some y values are 0 and cannot be shown on a log scale
## here it would be better to plot x1 against
## sqrt(y) and sqrt(pi)

### subset model for mstop = 300 (one-dimensional)
model[300]
# WARNING: Subsetting via model[mstopnew] changes the model directly!
# For the original fit one has to subset again: model[mstop]

par(mfrow = c(2, 2))
plot(risk(model, parameter = "mu")[[1]])
plot(risk(model, parameter = "sigma")[[1]])

### get back to original fit
model[400]
plot(risk(model, parameter = "mu")[[1]])
plot(risk(model, parameter = "sigma")[[1]])

### use different mstop values for the components
model[c(100, 200)]
## same as
  model[c(mu = 100, sigma = 200)]
## or
  model[list(mu = 100, sigma = 200)]
## or
  model[list(100, 200)]

plot(risk(model, parameter = "mu")[[1]])
plot(risk(model, parameter = "sigma")[[1]])