A function of package mgcv which allows selection of the numerical method
used to optimize the smoothing parameter estimation criterion for a gam.
It is used to set argument method of gam.
gam.method(am="magic",gam="outer",outer="newton",gcv="deviance",
family=NULL)
am: "magic" if the Wood (2004) method (magic) is to be used, or "mgcv" if the faster, but less [...]
gam: "perf.magic" for the performance iteration (see details) with magic as the basic estimation engine. [...]
outer: "newton" for modified Newton method backed up by steepest descent, based on exact first and second derivatives. "nlm" to use [...]
gcv: "deviance", "GACV" or "pearson", specifying the flavour of GCV to use with outer iteration. "deviance" simply replaces the residual sum of squares term in a GCV score with the deviance, following [...]
family: [...] gam to check the supplied method argument. In this circumstance the family argument is passed, to check that it works with the specified method. Negative binomial families only work with performance iteration.
The performance iteration has two disadvantages. First, in the presence of co-linearity or concurvity (a frequent problem when spatial smoothers are included in a model with other covariates), the process can fail to converge. Suppose we start with some coefficient and smoothing parameter estimates, implying a working penalized linear model: the optimal smoothing parameters and coefficients for this working model may in turn imply a working model for which the original estimates are better than the most recent estimates. This sort of effect can prevent convergence.
Secondly, it is often possible to find a set of smoothing parameters that result in a lower GCV or UBRE score, for the final working model, than the final score that results from the performance iterations. This is because the performance iteration is only approximately optimizing this score (since optimization is only performed on the working model). The disadvantage here is not that the model with the lower score would perform better (it usually doesn't), but rather that it makes model comparison on the basis of GCV/UBRE score difficult.
Both disadvantages of performance iteration are surmountable by using what is
basically O'Sullivan's (1986) suggestion. Here the P-IRLS scheme is iterated
to convergence for a fixed set of smoothing parameters, with an appropriate
GCV/UBRE score evaluated at convergence. This score at convergence is
optimized in some way. This is termed "outer" optimization, since the
optimization is outer to the P-IRLS loop. Outer iteration is slower than
performance iteration.
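The outer scheme described above can be sketched in base R for a single smoothing parameter. This is only an illustration under stated assumptions, not mgcv's implementation: the simulated Poisson data, the B-spline basis with a second-difference penalty, the helper names pirls and gcv_outer, and the use of optimize for the one-dimensional search are all choices made for the example.

```r
library(splines)  # ships with R; provides bs()

set.seed(1)
n <- 200
x <- sort(runif(n))
y <- rpois(n, exp(1 + sin(2 * pi * x)))       # simulated Poisson counts

X <- unclass(bs(x, df = 15, intercept = TRUE)) # illustrative B-spline basis
D <- diff(diag(ncol(X)), differences = 2)      # second-order differences
S <- crossprod(D)                              # penalty matrix

# P-IRLS for a Poisson log-link model at a FIXED smoothing parameter.
# Hypothetical helper written for this sketch.
pirls <- function(lambda, tol = 1e-8, maxit = 100) {
  eta <- log(y + 0.5)                          # starting linear predictor
  dev_old <- Inf
  for (it in seq_len(maxit)) {
    mu <- exp(eta)
    w <- mu                                    # IRLS weights (Poisson, log link)
    z <- eta + (y - mu) / mu                   # working response
    XtWX <- crossprod(X, w * X)
    beta <- solve(XtWX + lambda * S, crossprod(X, w * z))
    eta <- drop(X %*% beta)
    mu <- exp(eta)
    dev <- 2 * sum(ifelse(y > 0, y * log(y / mu), 0) - (y - mu))
    if (abs(dev - dev_old) < tol * (abs(dev) + 0.1)) break
    dev_old <- dev
  }
  # Effective degrees of freedom: trace of the influence matrix at convergence
  XtWX <- crossprod(X, mu * X)
  edf <- sum(diag(solve(XtWX + lambda * S, XtWX)))
  list(beta = beta, mu = mu, dev = dev, edf = edf)
}

# Deviance-based GCV score, evaluated only after P-IRLS has converged
gcv_outer <- function(rho) {                   # rho = log(lambda)
  fit <- pirls(exp(rho))
  n * fit$dev / (n - fit$edf)^2
}

# "Outer" optimization: the search over rho sits outside the P-IRLS loop
opt <- optimize(gcv_outer, interval = c(-8, 12))
fit <- pirls(exp(opt$minimum))
print(c(log_lambda = opt$minimum, edf = fit$edf, gcv = opt$objective))
```

Every evaluation of gcv_outer runs a full P-IRLS fit, which is why outer iteration costs more than performance iteration.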
The "appropriate GCV/UBRE" score in the previous paragraph can be defined in one of two ways: either (i) the deviance or (ii) the Pearson statistic can be used in place of the residual sum of squares in the GCV/UBRE score. Option (ii) makes the GCV/UBRE score correspond to the score for the working linear model at convergence of the P-IRLS, but in practice tends to result in oversmoothing, particularly with low n binomial data or low mean counts. Hence the default is to use (i).
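The two flavours can be compared directly on a fitted model. The sketch below assumes the usual GCV form n * stat / (n - edf)^2, with stat being either the deviance or the Pearson statistic; the function name gcv_flavours is hypothetical, and an unpenalized glm is used only so that the effective degrees of freedom are simply the number of coefficients. UBRE is not covered.

```r
# Deviance-based and Pearson-based GCV scores for fitted means mu.
# edf is the effective degrees of freedom of the fit.
gcv_flavours <- function(y, mu, edf, family = poisson()) {
  n <- length(y)
  wt <- rep(1, n)                                    # unit prior weights
  dev <- sum(family$dev.resids(y, mu, wt))           # deviance
  pearson <- sum((y - mu)^2 / family$variance(mu))   # Pearson statistic
  c(deviance = n * dev / (n - edf)^2,
    pearson  = n * pearson / (n - edf)^2)
}

set.seed(3)
n <- 150
x <- runif(n)
y <- rpois(n, exp(-1 + x))                           # low mean counts
fit <- glm(y ~ x, family = poisson)
scores <- gcv_flavours(y, fitted(fit), edf = length(coef(fit)))
print(scores)
```

For a Gaussian family the two statistics coincide, so the distinction only matters in the generalized case.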
Several alternative optimization methods can be used for outer
optimization. Usually the fastest and most reliable approach is to use a
modified Newton optimizer with exact first and second derivatives, and this
is the default. nlm can be used with finite differenced first derivatives.
This is not ideal theoretically, since it is possible for the finite
difference estimates of derivatives to be very badly in error on rare
occasions when the P-IRLS convergence tolerance is close to being matched
exactly, so that two components of a finite differenced derivative require
different numbers of iterations of P-IRLS in their evaluation. An alternative
is provided in which nlm uses numerically exact first derivatives; this is
faster and less problematic than the other scheme. A further alternative is
to use a quasi-Newton scheme with exact derivatives, based on optim. In
practice this usually seems to be slower than the nlm method.
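The difference between these optimizer choices can be illustrated on a toy problem. The sketch below assumes a Gaussian ridge regression, for which the GCV score and its derivative with respect to the log smoothing parameter have closed forms. No P-IRLS is involved, so the finite-difference problem described above cannot arise here. The names gcv, gcv_grad and gcv_exact are hypothetical, and none of this reproduces mgcv's internal code.

```r
set.seed(2)
n <- 100; p <- 10
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% rnorm(p, sd = 0.3)) + rnorm(n)

sv <- svd(X)
d2 <- sv$d^2
z2 <- drop(crossprod(sv$u, y))^2
r0 <- sum(y^2) - sum(z2)          # RSS part outside the column space of X

gcv <- function(rho) {            # rho = log(lambda)
  lam <- exp(rho)
  rss <- r0 + sum((lam / (d2 + lam))^2 * z2)
  trA <- sum(d2 / (d2 + lam))     # trace of the influence matrix
  n * rss / (n - trA)^2
}

gcv_grad <- function(rho) {       # exact derivative of gcv w.r.t. rho
  lam  <- exp(rho)
  rss  <- r0 + sum((lam / (d2 + lam))^2 * z2)
  trA  <- sum(d2 / (d2 + lam))
  drss <- sum(2 * lam * d2 * z2 / (d2 + lam)^3)   # d rss / d lambda
  dtrA <- -sum(d2 / (d2 + lam)^2)                 # d trA / d lambda
  lam * n * (drss / (n - trA)^2 + 2 * rss * dtrA / (n - trA)^3)
}

# (a) nlm with its default finite-difference derivatives
fit_nlm_fd <- nlm(gcv, p = 0)

# (b) nlm with exact first derivatives supplied as a "gradient" attribute
gcv_exact <- function(rho) {
  val <- gcv(rho)
  attr(val, "gradient") <- gcv_grad(rho)
  val
}
fit_nlm_ex <- nlm(gcv_exact, p = 0)

# (c) quasi-Newton (BFGS) via optim, with the exact gradient
fit_bfgs <- optim(0, gcv, gr = gcv_grad, method = "BFGS")

print(c(nlm_fd = fit_nlm_fd$estimate,
        nlm_exact = fit_nlm_ex$estimate,
        bfgs = fit_bfgs$par))
```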
In summary: performance iteration is fast, but can fail to converge. Outer iteration is a little slower, but more reliable. At present only performance iteration is available for negative binomial families.
Wood, S.N. (2000) Modelling and Smoothing Parameter Estimation with Multiple Quadratic Penalties. J.R.Statist.Soc.B 62(2):413-428
Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B 65(1):95-114
Wood, S.N. (2004) Stable and efficient multiple smoothing parameter estimation for generalized additive models. J. Amer. Statist. Ass.
gam.control, gam, gam.fit, glm.control
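A usage sketch, following the signature given above. It assumes an installed mgcv release that still provides gam.method, which may not hold for current versions, so the call is guarded and skipped otherwise. The simulated data and the variable names are made up for the illustration.

```r
# Check whether the installed mgcv provides gam.method() before calling it
has_gam_method <- requireNamespace("mgcv", quietly = TRUE) &&
  exists("gam.method", envir = asNamespace("mgcv"), inherits = FALSE)

if (has_gam_method) {
  library(mgcv)
  set.seed(0)
  dat <- data.frame(x = runif(200))
  dat$y <- rpois(200, exp(1 + sin(2 * pi * dat$x)))
  # Outer iteration, modified Newton optimizer, deviance-based GCV
  m <- gam.method(gam = "outer", outer = "newton", gcv = "deviance")
  fit <- gam(y ~ s(x), family = poisson, data = dat, method = m)
  print(fit)
} else {
  message("gam.method() not found in the installed mgcv; call skipped")
}
```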