If more than one fitted model is provided then anova.glm is used. If only one
model is provided then the significance of each model term is assessed using
Wald tests: see summary.gam for details of the actual computations. In the
latter case print.anova.gam is used as the printing method.

P-values are usually reliable if the smoothing parameters are known, or
the model is unpenalized. If smoothing parameters have been estimated then the
p-values are typically somewhat too low, i.e. terms that appear 'not
significant' really are not, while terms that appear significant may in fact be
non-significant if the p-value is close to whatever significance level you
are choosing to operate at. This occurs because the uncertainty associated
with the smoothing parameters is neglected in the calculations of the
distributions under the null, which tends to lead to underdispersion in these
distributions, and in turn to p-value estimates that are too low. (In
simulations where the null is correct, I have seen p-values that are as low as half of what they should
be.)
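
The two calling modes described above can be sketched as follows. This is a minimal illustration only: the simulated data, the variable names (x0, x1, y) and the model formulae are assumptions made for the example and are not part of the original text.

```r
library(mgcv)

set.seed(1)
n <- 200
# Hypothetical simulated data: y depends smoothly on x0 and not at all on x1.
dat <- data.frame(x0 = runif(n), x1 = runif(n))
dat$y <- sin(2 * pi * dat$x0) + rnorm(n, sd = 0.3)

# Both fits use estimated smoothing parameters (the default).
b0 <- gam(y ~ s(x0), data = dat)
b1 <- gam(y ~ s(x0) + s(x1), data = dat)

# Single model: each term is assessed with a Wald-type test and the
# result is printed by print.anova.gam.
a1 <- anova(b1)
print(a1)

# More than one model: the comparison is handed on to anova.glm.
a2 <- anova(b0, b1, test = "F")
print(a2)
```

Because the smoothing parameters here are estimated, the p-values reported for s(x0) and s(x1) are subject to the caveat above and should be read as approximate, particularly when they fall near the chosen significance level.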
If it is important to have p-values that are as accurate as possible, then,
at least in the single model case, it is probably advisable to perform tests
using unpenalized smooths (i.e. s(...,fx=TRUE)) with the basis dimension, k,
left at what would
have been used with penalization. Such tests are not as powerful, of
course, but the p-values are more accurate. Whether or not extra accuracy is
required will usually depend on whether or not hypothesis testing is a key
objective of the analysis.
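
As a rough sketch of the unpenalized approach, the same model can be fitted with and without penalization while keeping k fixed. The simulated data, the variable names and the choice k = 10 are assumptions for illustration only.

```r
library(mgcv)

set.seed(2)
n <- 200
# Hypothetical simulated data: x1 has no true effect on y.
dat <- data.frame(x0 = runif(n), x1 = runif(n))
dat$y <- sin(2 * pi * dat$x0) + rnorm(n, sd = 0.3)

k <- 10  # basis dimension, kept the same in both fits

# Penalized fit: smoothing parameters are estimated from the data.
b_pen <- gam(y ~ s(x0, k = k) + s(x1, k = k), data = dat)

# Unpenalized fit: fx = TRUE turns each smooth into a fixed
# regression spline with k - 1 degrees of freedom.
b_unpen <- gam(y ~ s(x0, k = k, fx = TRUE) + s(x1, k = k, fx = TRUE),
               data = dat)

# Term-wise tests for each fit, for side-by-side comparison.
print(anova(b_pen))
print(anova(b_unpen))
```

In the unpenalized fit there are no smoothing parameters to estimate, so their uncertainty cannot distort the null distributions; the price is that each term uses its full k - 1 degrees of freedom, which is why the tests are less powerful.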