The following formula is used for VGLMs:
\(-2 \mbox{log-likelihood} + k n_{par}\), where \(n_{par}\) represents the number of
parameters in the fitted model, and \(k = 2\) for the usual AIC.
One could assign \(k = \log(n)\) (where \(n\) is the number of observations)
for the so-called BIC or SBC (Schwarz's Bayesian criterion).
This is the function `AICvlm()`.
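
As a minimal sketch of the formula above (written in Python for illustration only, and not the package's own code), the criterion can be computed directly from a log-likelihood and a parameter count. The function name `information_criterion` and the numeric values are hypothetical.

```python
import math

def information_criterion(loglik, n_par, k=2.0):
    """Return -2 * loglik + k * n_par (k = 2 gives the usual AIC)."""
    return -2.0 * loglik + k * n_par

# Hypothetical values, not taken from any real fit.
loglik = -204.5   # maximized log-likelihood
n_par = 3         # number of estimated parameters
n_obs = 100       # number of observations

aic = information_criterion(loglik, n_par)                     # k = 2
bic = information_criterion(loglik, n_par, k=math.log(n_obs))  # k = log(n)
print(aic, bic)
```

Because \(\log(n) > 2\) once \(n \geq 8\), the BIC penalizes each extra parameter more heavily than the AIC for all but very small samples.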

This code relies on the log-likelihood being defined, and computed,
for the object.
When comparing fitted objects, the smaller the AIC, the better the fit.
The log-likelihood, and hence the AIC, is defined only up to an additive
constant.
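
To see why the additive constant does not affect comparisons, suppose (as an illustrative assumption) that two models fitted to the same data have log-likelihoods \(\ell_1\) and \(\ell_2\) that are both shifted by the same constant \(c\), with \(p_1\) and \(p_2\) parameters respectively. The symbols \(\ell_i\), \(p_i\) and \(c\) are introduced here only for this sketch.

\[
\mathrm{AIC}_1 - \mathrm{AIC}_2
  = \{-2(\ell_1 + c) + k p_1\} - \{-2(\ell_2 + c) + k p_2\}
  = -2(\ell_1 - \ell_2) + k (p_1 - p_2).
\]

The constant cancels, so the ranking of the models is unchanged. This holds only when the constant is the same for both fits, e.g., the same response and the same family.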

Any estimated scale parameter (in GLM parlance) is counted as one
parameter.

For VGAMs and CAO models, the nonlinear effective degrees of freedom of each
smoothed component are used. This formula is heuristic.
These are the functions `AICvgam()` and `AICcao()`.

The finite sample correction is usually recommended when the
sample size is small or when the number of parameters is large.
When the sample size is large, the difference between the corrected and
uncorrected AIC tends to be negligible.
The correction is described in Hurvich and Tsai (1989), and is based
on a (univariate) linear model with normally distributed errors.
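
The behaviour of the correction can be illustrated with the commonly quoted form \(\mbox{AICc} = \mbox{AIC} + 2p(p+1)/(n-p-1)\), where \(p\) is the number of parameters and \(n\) the number of observations. The following Python sketch assumes that form; whether the package uses exactly this expression is not stated here, and the function names and numbers are hypothetical.

```python
def correction_term(n_par, n_obs):
    """Small-sample term 2p(p+1)/(n-p-1); assumes n_obs > n_par + 1."""
    if n_obs - n_par - 1 <= 0:
        raise ValueError("need n_obs > n_par + 1")
    return 2.0 * n_par * (n_par + 1) / (n_obs - n_par - 1)

def aic_corrected(loglik, n_par, n_obs):
    """Usual AIC (k = 2) plus the small-sample correction term."""
    return -2.0 * loglik + 2.0 * n_par + correction_term(n_par, n_obs)

# Hypothetical setting: 5 parameters, small versus large sample.
small = correction_term(5, 20)     # 60 / 14
large = correction_term(5, 2000)   # 60 / 1994
print(small, large)
```

With 5 parameters the term is above 4 for 20 observations but only about 0.03 for 2000, which matches the remark that the corrected and uncorrected values differ little in large samples.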