methods: Methods for mboostLSS

Description

Methods for GAMLSS models fitted by boosting algorithms.

Usage

## extract coefficients
## S3 method for class 'glmboostLSS':
coef(object, which = NULL,
     aggregate = c("sum", "cumsum", "none"),
     off2int = FALSE, parameter = names(object), ...)
## S3 method for class 'mboostLSS':
coef(object, which = NULL,
     aggregate = c("sum", "cumsum", "none"),
     parameter = names(object), ...)
## plot partial effects
## S3 method for class 'glmboostLSS':
plot(x, main = names(x), parameter = names(x),
     off2int = FALSE, ...)
## S3 method for class 'gamboostLSS':
plot(x, main = names(x), parameter = names(x), ...)
## extract and plot marginal prediction intervals
predint(x, which, pi = 0.9, newdata = NULL, ...)
PI(x, which, pi = 0.9, newdata = NULL, ...)
## S3 method for class 'predint':
plot(x, main = "Marginal Prediction Interval(s)",
     xlab = NULL, ylab = NULL, lty = c("solid", "dashed"),
     lcol = c("black", "black"), log = "", ...)
## extract mstop
## S3 method for class 'mboostLSS':
mstop(object, parameter = names(object), ...)
## S3 method for class 'oobag':
mstop(object, parameter = names(object), ...)
## S3 method for class 'cvriskLSS':
mstop(object, parameter = NULL, ...)
## set mstop
## S3 method for class 'mboostLSS':
[(x, i, return = TRUE, ...)
## extract risk
## S3 method for class 'mboostLSS':
risk(object, merge = FALSE, parameter = names(object), ...)
## extract selected base-learners
## S3 method for class 'mboostLSS':
selected(object, parameter = names(object), ...)
## extract fitted values
## S3 method for class 'mboostLSS':
fitted(object, parameter = names(object), ...)
## make predictions
## S3 method for class 'mboostLSS':
predict(object, newdata = NULL,
        type = c("link", "response", "class"), which = NULL,
        aggregate = c("sum", "cumsum", "none"),
        parameter = names(object), ...)
## update weights of the fitted model
## S3 method for class 'mboostLSS':
update(object, weights, oobweights = NULL,
       risk = NULL, mstop = NULL, ...)
## extract model weights
## S3 method for class 'mboostLSS':
model.weights(x, ...)

Arguments

x

an object of the appropriate class (see usage).

object

an object of the appropriate class (see usage).

which

a subset of base-learners to take into account when computing predictions or coefficients. If which is given (as an integer vector or characters corresponding to base-learners), a list or matrix is returned. In plot_PI

aggregate

a character specifying how to aggregate predictions or coefficients of single base-learners. The default returns the prediction or coefficient for the final number of boosting iterations. "cumsum" returns a matrix with the

parameter

This can be either a vector of indices or a vector of parameter names which should be processed. See examples for details. By default, all distribution parameters of the GAMLSS family are returned.

off2int

logical indicating whether the offset should be added to the intercept (if there is any) or if the offset is neglected for plotting (default).

merge

logical. Should the risk vectors of the single components be merged into one risk vector for the model as a whole? By default (merge = FALSE), a (named) list of risk vectors is returned.

i

integer. Index specifying the model to extract. If i is smaller than the initial mstop, a subset is used. If i is larger than the initial mstop, additional boosting steps are performed until

return

a logical indicating whether the changed object is returned.

main

a title for the plots.

xlab, ylab

x- and y-axis labels for the plots.

pi

the level(s) of the prediction interval(s); by default, a 90% prediction interval is used.

lty

(vector) of line types to be used for plotting the prediction intervals. The vector should contain length(pi) + 1 elements. If fewer elements are specified, the last element is recycled. The first value lty[1] is used

lcol

(vector) of (line) colors to be used for plotting the prediction intervals. The vector should contain length(pi) + 1 elements. If fewer elements are specified, the last element is recycled. The first value lcol[1] is u

log

a character string that determines which axes, if any, are drawn on a logarithmic scale. See plot.default for details.

newdata

optional; a data frame in which to look for variables with which to predict or with which to plot the marginal prediction intervals.

type

the type of prediction required. The default is on the scale of the predictors; the alternative "response" is on the scale of the response variable. Thus for a binomial model the default predictions are on the log-odds scale

weights

a numeric vector of weights for the model

oobweights

an additional vector of out-of-bag weights (used internally by cvrisk; see there for details).

risk

a character indicating how the empirical risk should be computed for each boosting iteration. By default, risk is set to the risk type specified for model fitting via boost_control

mstop

number of boosting iterations.

...

Further arguments to the functions.

Warning

The [.mboostLSS function changes the original object, i.e.,
  LSSmodel[10] changes LSSmodel directly!
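
A minimal sketch of this in-place behaviour, assuming the gamboostLSS package is installed. The simulated data and the object names (dat, fit, before, after) are made up for illustration only:

```r
library(gamboostLSS)

## simulated data (hypothetical): mean and standard deviation depend on x
set.seed(1)
dat <- data.frame(x = rnorm(200))
dat$y <- rnorm(200, mean = 1 + 2 * dat$x, sd = exp(0.3 * dat$x))

fit <- glmboostLSS(y ~ x, families = GaussianLSS(), data = dat,
                   control = boost_control(mstop = 50))
before <- mstop(fit)   # number of iterations per parameter after fitting

## no assignment is needed: subsetting alters 'fit' itself;
## return = FALSE only suppresses returning (and printing) the object
fit[10, return = FALSE]
after <- mstop(fit)    # number of iterations per parameter after subsetting
```

Comparing before and after shows that fit itself has changed, although nothing was assigned. To get back to the larger model, subset again with the earlier value (here fit[50]).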

Details

These functions can be used to extract details from fitted models.
  print shows a dense representation of the model fit.

The function coef extracts the regression coefficients of linear predictors fitted using the glmboostLSS function or additive predictors fitted using gamboostLSS. By default, only the coefficients of selected base-learners are returned, for all distribution parameters. However, any desired coefficient can be extracted using the which argument. Furthermore, the coefficients of a single distribution parameter can be extracted via the parameter argument (see examples for details).

Analogously, the function plot by default displays the coefficient paths for the complete GAMLSS but can be restricted to single distribution parameters or covariates (or subsets) using the parameter or which arguments, respectively.

The function predint (or PI, which is just an alias) computes marginal prediction intervals and returns a data frame with the predictors used for the marginal prediction interval, the computed median prediction and the marginal prediction intervals. A plot function (plot.predint) for the resulting object exists. Note that marginal predictions from AFT models (i.e., families LogLogLSS, LogNormalLSS, and WeibullLSS) represent the predicted true survival time and not the observed survival time, which is possibly subject to censoring. Hence, comparing observed survival times with the marginal prediction interval is only sensible for uncensored observations.

The predict function can be used to predict the distribution parameters for new observations, whereas fitted extracts the regression fits for the observations in the learning sample. For predict, newdata can be specified; otherwise the fitted values are returned. If which is specified, marginal effects of the corresponding base-learner(s) are returned. The argument type can be used to make predictions on the scale of the link (i.e., the linear predictor X * beta), the response (i.e., h(X * beta), where h is the response function) or the class (in case of classification).
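
The difference between the link and the response scale can be sketched as follows. This is an illustration under assumptions, not output from the package authors: the data, the object names (dat, fit, newdat, eta, resp, sig) and the choice of GaussianLSS are made up, and the sketch relies on GaussianLSS modelling sigma with a log link as its documentation states.

```r
library(gamboostLSS)

## simulated data (hypothetical): mean and standard deviation depend on x
set.seed(1)
dat <- data.frame(x = rnorm(200))
dat$y <- rnorm(200, mean = 1 + 2 * dat$x, sd = exp(0.3 * dat$x))
fit <- glmboostLSS(y ~ x, families = GaussianLSS(), data = dat,
                   control = boost_control(mstop = 50))

newdat <- data.frame(x = c(-1, 0, 1))   # hypothetical new observations

## one list element per distribution parameter (here: mu and sigma)
eta  <- predict(fit, newdata = newdat, type = "link")
resp <- predict(fit, newdata = newdat, type = "response")

## restrict the prediction to sigma, on the response scale
sig <- predict(fit, newdata = newdat, type = "response",
               parameter = "sigma")
```

Under the log-link assumption, resp$sigma should equal exp(eta$sigma), while mu is the same on both scales because GaussianLSS uses the identity link for it.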

The function update updates models fitted with gamboostLSS and is primarily used within cvrisk. It updates the weights and refits the model to the altered data. Furthermore, the type of risk and the number of boosting iterations mstop can be modified.

The function model.weights is a generic version of the same function provided by package stats, which is required to make model.weights work with mboostLSS models.
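
A rough sketch of how update and model.weights might be combined, assuming the gamboostLSS package is installed. The data, the bootstrap-type weights w and all object names are made up for illustration; in practice cvrisk performs this kind of refit internally.

```r
library(gamboostLSS)

## simulated data (hypothetical): mean and standard deviation depend on x
set.seed(1)
dat <- data.frame(x = rnorm(200))
dat$y <- rnorm(200, mean = 1 + 2 * dat$x, sd = exp(0.3 * dat$x))
fit <- glmboostLSS(y ~ x, families = GaussianLSS(), data = dat,
                   control = boost_control(mstop = 50))

## bootstrap-type integer weights: how often each observation is drawn
w <- as.numeric(rmultinom(1, size = nrow(dat), prob = rep(1, nrow(dat))))

## refit the model with the new weights and extract them again
refit <- update(fit, weights = w)
wts <- model.weights(refit)
```

Observations with weight 0 do not contribute to the refit, which is what makes them usable as out-of-bag observations in resampling schemes.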

References

Mayr, A., Fenske, N., Hofner, B., Kneib, T. and Schmid, M. (2012). Generalized additive models for location, scale and shape for high-dimensional data - a flexible approach based on boosting. Journal of the Royal Statistical Society, Series C (Applied Statistics), 61(3), 403-427.

Buehlmann, P. and Hothorn, T. (2007). Boosting algorithms: regularization, prediction and model fitting. Statistical Science, 22(4), 477-505.

Rigby, R. A. and Stasinopoulos, D. M. (2005). Generalized additive models for location, scale and shape (with discussion). Journal of the Royal Statistical Society, Series C (Applied Statistics), 54, 507-554.

See Also

glmboostLSS, gamboostLSS and blackboostLSS for fitting GAMLSS.
Available distributions (families) are documented in Families.
See methods in the mboost package for the corresponding methods for mboost objects.

Examples

### generate data
set.seed(1907)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
x3 <- rnorm(1000)
x4 <- rnorm(1000)
x5 <- rnorm(1000)
x6 <- rnorm(1000)
mu    <- exp(1.5 + x1^2 +0.5 * x2 - 3 * sin(x3) -1 * x4)
sigma <- exp(-0.2 * x4 +0.2 * x5 +0.4 * x6)
y <- numeric(1000)
for( i in 1:1000)
    y[i] <- rnbinom(1, size = sigma[i], mu = mu[i])
dat <- data.frame(x1, x2, x3, x4, x5, x6, y)

### fit a model
model <- gamboostLSS(y ~ ., families = NBinomialLSS(), data = dat,
                     control = boost_control(mstop = 100))

### use a model with more iterations for a better fit
model[400]
### extract coefficients
coef(model)

### only for distribution parameter mu
coef(model, parameter = "mu")

### only for covariate x1
coef(model, which = "x1")


### plot complete model
par(mfrow = c(4, 3))
plot(model)
### plot first parameter only
par(mfrow = c(2, 3))
plot(model, parameter = "mu")
### now plot only effect of x3 of both parameters
par(mfrow = c(1, 2))
plot(model, which = "x3")
### first component second parameter (sigma)
par(mfrow = c(1, 1))
plot(model, which = 1, parameter = 2)

### plot marginal prediction interval
pi <- predint(model, pi = 0.9, which = "x1")
pi <- predint(model, pi = c(0.8, 0.9), which = "x1")
plot(pi, log = "y")  # warning as some y values are 0 (not positive)
## here it would be better to plot x1 against
## sqrt(y) and sqrt(pi)

### subset model for mstop = 300 (one-dimensional)
model[300]
# WARNING: Subsetting via model[mstopnew] changes the model directly!
# For the original fit one has to subset again: model[mstop]

par(mfrow = c(2, 2))
plot(risk(model, parameter = "mu")[[1]])
plot(risk(model, parameter = "sigma")[[1]])

### get back to original fit
model[400]
plot(risk(model, parameter = "mu")[[1]])
plot(risk(model, parameter = "sigma")[[1]])

### use different mstop values for the components
model[c(100, 200)]
## same as
  model[c(mu = 100, sigma = 200)]
## or
  model[list(mu = 100, sigma = 200)]
## or
  model[list(100, 200)]

plot(risk(model, parameter = "mu")[[1]])
plot(risk(model, parameter = "sigma")[[1]])