bruto: Fit an Additive Spline Model by Adaptive Backfitting

Description

Fit an additive spline model by adaptive backfitting.

Usage

bruto(x, y, w, wp, dfmax, cost, maxit.select, maxit.backfit, 
      thresh = 0.0001, trace.bruto = FALSE, start.linear = TRUE,
      fit.object, …)

Arguments

x

a matrix of numeric predictors (does not include the column of 1s).

y

a vector or matrix of responses.

w

optional observation weight vector.

wp

optional weight vector for each column of y; the RSS and GCV criteria use a weighted sum of squared residuals.

dfmax

a vector of maximum df (degrees of freedom) for each term.

cost

cost per degree of freedom; default is 2.

maxit.select

maximum number of iterations during the selection stage.

maxit.backfit

maximum number of iterations for the final backfit stage (with fixed lambda).

thresh

convergence threshold (default is 0.0001); iterations cease when the relative change in GCV is below this threshold.

trace.bruto

logical flag. If TRUE, a progress report is printed during the fitting; the default is FALSE.

start.linear

logical flag. If TRUE (default), the model starts with the linear fit.

fit.object

an object returned by a previous call to bruto(); if supplied, the same model is fit to the (presumably new) y.

…

further arguments to be passed to or from methods.
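
As a rough illustration of the tuning arguments described above, the sketch below fits the trees data twice, once with the defaults and once with a larger cost per degree of freedom and a cap on the df of each term. The particular values (cost = 4, dfmax = c(3, 3)) are arbitrary choices made for this illustration, not recommendations, and the sketch assumes the mda package is installed.

```r
library(mda)
data(trees)

x <- trees[, -3]   # predictors: Girth and Height
y <- trees[3]      # response: Volume

# Default settings (cost per df is 2 according to the argument description).
fit_default <- bruto(x, y)

# Illustrative settings: a higher cost per df and at most 3 df per term.
# Both values are assumptions chosen only for this example.
fit_costly <- bruto(x, y, cost = 4, dfmax = c(3, 3))

# Compare the status and df selected for each column of x.
data.frame(term         = names(x),
           type_default = fit_default$type,
           df_default   = fit_default$df,
           type_costly  = fit_costly$type,
           df_costly    = fit_costly$df)
```

A larger cost would be expected to favour simpler terms, but the outcome depends on the data, so the comparison is best read from the printed table rather than assumed.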

Value

A multiresponse additive model fit object of class "bruto" is returned. The model is fit by adaptive backfitting using smoothing splines. If there are np columns in y, then np additive models are fit, but the same amount of smoothing (df) is used for each term across all responses. For each term, the procedure chooses between df = 0 (term omitted), df = 1 (term linear) or df > 1 (term fitted by smoothing spline). The model selection is based on an approximation to the GCV criterion, which is used at each step of the backfitting procedure. Once the selection process stops, the model is backfit using the chosen amount of smoothing.
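
The three-way choice for a term (omitted, linear, or smooth) can be illustrated for a single predictor with base R. The sketch below is not the bruto() implementation: it uses the ordinary GCV formula, ignores the cost argument and the backfitting over several terms, and fixes the smooth candidate at df = 4 in smooth.spline(), all of which are assumptions made for this illustration.

```r
data(trees)
x <- trees$Girth
y <- trees$Volume
n <- length(y)

# Ordinary GCV: mean squared residual inflated by the total df used,
# where total_df counts the intercept as one degree of freedom.
gcv <- function(res, total_df) mean(res^2) / (1 - total_df / n)^2

fit_lin    <- lm(y ~ x)                    # linear term
fit_smooth <- smooth.spline(x, y, df = 4)  # smooth term (df chosen arbitrarily)

scores <- c(
  excluded = gcv(y - mean(y), 1),                          # intercept only
  linear   = gcv(residuals(fit_lin), 2),                   # intercept + slope
  smooth   = gcv(y - predict(fit_smooth, x)$y, fit_smooth$df)
)

scores
names(which.min(scores))  # candidate with the smallest GCV score
```

Note that the df reported by smooth.spline() includes the constant and linear parts of the fit, so it is used directly as the total df here.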

A bruto object has the following components of interest:

lambda

a vector of chosen smoothing parameters, one for each column of x.

df

the df chosen for each column of x.

type

a factor with levels "excluded", "linear" or "smooth", indicating the status of each column of x.

gcv.select, gcv.backfit, df.select

the sequences of GCV values and df selected during the execution of the function.

nit

the number of iterations used.

fitted.values

a matrix of fitted values.

residuals

a matrix of residuals.

call

the call that produced this object.
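
The components listed above can be inspected directly on a fitted object. The following is a minimal sketch, assuming the mda package is installed and using the same trees fit as in the Examples section; the summary table layout is just one convenient way to line up the per-term components.

```r
library(mda)
data(trees)

fit <- bruto(trees[, -3], trees[3])

# One row per column of x: its status, chosen df and smoothing parameter.
term_summary <- data.frame(term   = names(trees)[-3],
                           type   = fit$type,
                           df     = fit$df,
                           lambda = fit$lambda)
term_summary

# Fitted values and residuals are matrices with one column per response.
dim(fit$fitted.values)
dim(fit$residuals)

# The call that produced the object.
fit$call
```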

References

Trevor Hastie and Rob Tibshirani, Generalized Additive Models, Chapman and Hall, 1990 (page 262).

Trevor Hastie, Rob Tibshirani and Andreas Buja, "Flexible Discriminant Analysis by Optimal Scoring", JASA, 1994, 89, 1255-1270.

Examples

library(mda)
data(trees)
## fit Volume (column 3) on Girth and Height
fit1 <- bruto(trees[,-3], trees[3])
fit1$type
fit1$df
## examine the fitted functions
par(mfrow=c(1,2), pty="s")
## hold both predictors at their means, then vary one at a time
Xp <- matrix(sapply(trees[1:2], mean), nrow(trees), 2, byrow=TRUE)
xr <- sapply(trees, range)
for(i in 1:2) {
  Xp1 <- Xp; Xp1[,i] <- seq(xr[1,i], xr[2,i], length.out=nrow(trees))
  Xf <- predict(fit1, Xp1)
  plot(Xp1[ ,i], Xf, xlab=names(trees)[i], ylab="", type="l")
}