gam is used to fit generalized additive models, specified by giving a symbolic description of the additive predictor and a description of the error distribution. gam uses the backfitting algorithm to combine different smoothing or fitting methods. The methods currently supported are local regression and smoothing splines.

gam(formula, family = gaussian, data, weights, subset, na.action,
    start, etastart, mustart, control = gam.control(...),
    model = FALSE, method, x = FALSE, y = TRUE, ...)

gam.fit(x, y, smooth.frame, weights = rep(1, nobs), start = NULL,
        etastart = NULL, mustart = NULL, offset = rep(0, nobs),
        family = gaussian(), control = gam.control())

formula: a formula of the form response ~ predictors. See the documentation of lm and formula for details. Built-in nonparametric smoothing terms are indicated by s for smoothing splines or lo for local regression.

data: variables not found in data are taken from environment(formula), typically the environment from which gam is called.

na.action: handling of NAs. The default is set by the na.action setting of options.

control: see gam.control for details. These can also be set as arguments to gam() itself.

method: "glm.fit" uses iteratively reweighted least squares (IWLS). The only current alternative is "model.frame", which returns the model frame.

x, y: for gam, logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For gam.fit, x is a model matrix.

smooth.frame: for gam.fit only. This is essentially a subset of the model frame corresponding to the smooth terms, and has the ingredients needed for smoothing each variable in the backfitting
algorithm. The elements of this frame are produced by the

gam returns an object of class gam, which inherits from both glm and lm.

Gam objects can be examined by print, summary, plot, and anova. Components can be extracted using the extractor functions predict, fitted, residuals, deviance, formula, and family. They can be modified using update. A gam object has all the components of a glm object, with a few more. This also means it can be queried, summarized, etc. by methods for glm and lm objects. Other generic functions that have methods for gam objects are step and preplot.

The following components must be included in a legitimate gam object. The residuals, fitted values, coefficients and effects should be extracted by the generic functions of the same name, rather than by the $ operator. The family function returns the entire family object used in the fitting, and deviance can be used to extract the deviance of the fit.

coefficients: the coefficients of the parametric part of the additive.predictors, which multiply the columns of the model matrix. The names of the coefficients are the names of the single-degree-of-freedom effects (the columns of the model matrix). If the model is overdetermined there will be missing values in the coefficients corresponding to inestimable coefficients.

additive.predictors: the additive fit, given by the product of the model matrix and the coefficients, plus the columns of the smooth component.

fitted.values: the fitted mean values, obtained by transforming the component additive.predictors using the inverse link function.

smooth, nl.df, nl.chisq, var: these four characterize the nonparametric aspect of the fit. smooth is a matrix of smooth terms, with a column corresponding to each smooth term in the model; if no smooth terms are in the gam model, all these components will be missing. Each column corresponds to the strictly nonparametric part of the term, while the parametric part is obtained from the model matrix. nl.df is a vector giving the approximate degrees of freedom for each column of smooth. For smoothing splines specified by s(x), the approximate df will be the trace of the implicit smoother matrix minus 2. nl.chisq is a vector containing a type of score test for the removal of each of the columns of smooth. var is a matrix like smooth, containing the approximate pointwise variances for the columns of smooth.

smooth.frame: this is essentially a subset of the model frame corresponding to the smooth terms, and has the ingredients needed for making predictions from a gam object.

residuals: the residuals from the final weighted additive fit; also known as working residuals, these are typically not interpretable without rescaling by the weights.

deviance: up to a constant, minus twice the maximized log-likelihood. Similar to the residual sum of squares. Where sensible, the constant is chosen so that a saturated model has deviance zero.

null.deviance: the deviance for the null model, comparable with deviance. The null model will include the offset, and an intercept if there is one in the model.

iter: the number of local scoring iterations used to compute the estimates.

family: a three-element character vector giving the name of the family, the link, and the variance function; mainly for printing purposes.

weights: the working weights, that is, the weights in the final iteration of the local scoring fit.

prior.weights: the case weights initially supplied.

df.residual: the residual degrees of freedom.

df.null: the residual degrees of freedom for the null model.

The object will also have the components of an lm object: coefficients, residuals, fitted.values, call, terms, and some others involving the numerical fit. See lm.object.

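The relationship between additive.predictors, fitted.values and deviance described above can be illustrated with a base R family object. This is only a sketch: the values of eta and y are made up for illustration and do not come from a gam fit, and it assumes a binomial family with 0/1 responses, for which the saturated model has log-likelihood zero.

```r
# Hypothetical additive-predictor values and 0/1 responses (illustration only).
eta <- c(-2, 0, 1.5)
y   <- c(0, 1, 1)
w   <- rep(1, length(y))          # unit prior weights

fam <- binomial()                 # family object with the default logit link

# fitted.values: inverse link applied to the additive predictor.
mu <- fam$linkinv(eta)

# deviance: sum of the family's unit deviances.
dev <- sum(fam$dev.resids(y, mu, w))

# For 0/1 data the saturated log-likelihood is 0, so the deviance equals
# minus twice the log-likelihood evaluated at mu.
loglik <- sum(dbinom(y, size = 1, prob = mu, log = TRUE))
print(c(deviance = dev, minus2loglik = -2 * loglik))
```

For a fitted model, the deviance and family extractor functions should be used in place of these manual calculations.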
See also: glm, family, lm.

Author: Written by Trevor Hastie, following closely the design in the "Generalized Additive Models" chapter (Hastie, 1992) in Chambers and Hastie (1992), and the philosophy in Hastie and Tibshirani (1990). This version of gam is adapted from the S version to match the glm and lm functions in R.

Note that this version of gam is different from the function with the same name in the R package mgcv, which uses only smoothing splines with a focus on automatic smoothing parameter selection via GCV.

References:

Hastie, T. J. (1992) Generalized additive models. Chapter 7 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

Hastie, T. and Tibshirani, R. (1990) Generalized Additive Models. London: Chapman and Hall.

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer.

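Examples:

library(gam)

# Logistic additive model: smoothing spline in Age with 4 df
data(kyphosis)
gam(Kyphosis ~ s(Age, 4) + Number, family = binomial, data = kyphosis, trace = TRUE)

# Local regression terms, with missing values handled by na.gam.replace
data(airquality)
gam(Ozone^(1/3) ~ lo(Solar.R) + lo(Wind, Temp), data = airquality, na.action = na.gam.replace)

# Mixing parametric and smooth terms on a subset of the data
gam(Kyphosis ~ poly(Age, 2) + s(Start), data = kyphosis, family = binomial, subset = Number > 2)

# Fit, summarize, plot and predict
data(gam.data)
gam.object <- gam(y ~ s(x, 6) + z, data = gam.data)
summary(gam.object)
plot(gam.object, se = TRUE)
data(gam.newdata)
predict(gam.object, type = "terms", newdata = gam.newdata)
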
Keywords: models, regression, nonparametric, smooth

gam remains faithful to the philosophy of generalized additive models as outlined in the references.

An object gam.slist (currently set to c("lo","s","random")) lists the smoothers supported by gam. Corresponding to each of these is a smoothing function (gam.lo, gam.s, etc.) that takes particular arguments and produces particular output, custom built to serve as a building block in the backfitting algorithm. This allows users to add their own smoothing methods. See the documentation for these methods for further information.
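
The role of smoothers as building blocks in backfitting can be sketched in base R. The code below is a minimal illustration for a Gaussian response with two terms, not the code used by gam: the data are simulated, and the helper names smooth_spline and smooth_loess are hypothetical stand-ins for smoothers such as gam.s and gam.lo, built here on stats::smooth.spline and stats::loess.

```r
# Simulated data (illustration only): y = f1(x1) + f2(x2) + noise.
set.seed(1)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.1)

# Hypothetical smoother building blocks: each takes a covariate and a
# partial residual and returns the smoothed values at the data points.
smooth_spline <- function(x, r) predict(smooth.spline(x, r, df = 6), x)$y
smooth_loess  <- function(x, r) loess(r ~ x, span = 0.5)$fitted

alpha <- mean(y)                  # intercept
f1 <- rep(0, n)                   # current estimate of the x1 term
f2 <- rep(0, n)                   # current estimate of the x2 term

for (iter in 1:50) {
  f1_old <- f1
  f2_old <- f2

  # Smooth the partial residuals for each term in turn, then center.
  f1 <- smooth_spline(x1, y - alpha - f2)
  f1 <- f1 - mean(f1)
  f2 <- smooth_loess(x2, y - alpha - f1)
  f2 <- f2 - mean(f2)

  # Stop when the relative change in the fitted terms is negligible.
  change <- sum((f1 - f1_old)^2 + (f2 - f2_old)^2) /
    (1e-8 + sum(f1_old^2 + f2_old^2))
  if (change < 1e-8) break
}

fit <- alpha + f1 + f2            # additive fit
```

Because each term is updated only through a function of the form smoother(x, partial residual), swapping in a different smoother changes one line of the loop, which is the design idea behind listing the supported smoothers in gam.slist.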