# ginla

##### GAM Integrated Nested Laplace Approximation Newton Enhanced

Apply Integrated Nested Laplace Approximation (INLA, Rue et al. 2009) to models estimable by `gam`

or `bam`

, using the INLA variant described in Wood (2019). Produces marginal posterior densities for each coefficient, selected coefficients or linear transformations of the coefficient vector.

- Keywords
- models, regression, smooth

##### Usage

`ginla(G,A=NULL,nk=16,nb=100,J=1,interactive=FALSE,int=0,approx=0)`

##### Arguments

- G
A pre-fit gam object, as produced by

`gam(...,fit=FALSE)`

or`bam(...,discrete=TRUE,fit=FALSE)`

.- A
Either a matrix of transforms of the coefficients that are of interest, or an array of indices of the parameters of interest. If

`NULL`

then distributions are produced for all coefficients.- nk
Number of values of each coefficient at which to evaluate its log marginal posterior density. These points are then spline interpolated.

- nb
Number of points at which to evaluate posterior density of coefficients for returning as a gridded function.

- J
How many determinant updating steps to take in the log determinant approximation step. Not recommended to increase this.

- interactive
If this is

`>0`

or`TRUE`

then every approximate posterior is plotted in red, overlaid on the simple Gaussian approximate posterior. If`2`

then waits for user to press return between each plot. Useful for judging whether anything is gained by using INLA approach.- int
0 to skip integration and just use the posterior modal smoothing parameter. >0 for integration using the CCD approach proposed in Rue et al. (2009).

- approx
0 for full approximation; 1 to update Hessian, but use approximate modes; 2 as 1 and assume constant Hessian. See details.

##### Details

Let \(\beta\), \(\theta\) and \(y\) denote the model coefficients, hyperparameters/smoothing parameters and response data, respectively. In principle, INLA employs Laplace approximations for \(\pi(\beta_i|\theta,y)\) and \(\pi(\theta|y)\) and then obtains the marginal posterior distribution \(\pi(\beta_i|y)\) by intergrating the approximations to \(\pi(\beta_i|\theta,y)\pi(\theta|y)\) w.r.t \(\theta\) (marginals for the hyperparameters can also be produced). In practice the Laplace approximation for \(\pi(\beta_i|\theta,y)\) is too expensive to compute for each \(\beta_i\) and must itself be approximated. To this end, there are two quantities that have to be computed: the posterior mode \(\beta^*|\beta_i\) and the determinant of the Hessian of the joint log density \(\log \pi(\beta,\theta,y)\) w.r.t. \(\beta\) at the mode. Rue et al. (2009) originally approximated the posterior conditional mode by the conditional mode implied by a simple Gaussian approximation to the posterior \(\pi(\beta|y)\). They then approximated the log determinant of the Hessian as a function of \(\beta_i\) using a first order Taylor expansion, which is cheap to compute for the sparse model representaiton that they use, but not when using the dense low rank basis expansions used by `gam`

. They also offer a more expensive alternative approximation based on computing the log determiannt with respect only to those elements of \(\beta\) with sufficiently high correlation with \(\beta_i\) according to the simple Gaussian posterior approximation: efficiency again seems to rest on sparsity. Wood (2018) suggests computing the required posterior modes exactly, and basing the log determinant approximation on a BFGS update of the Hessian at the unconditional model. The latter is efficient with or without sparsity, whereas the former is a `for free' improvement. Both steps are efficient because it is cheap to obtain the Cholesky factor of \(H[-i,-i]\) from that of \(H\) - see `choldrop`

. This is the approach taken by this routine.

The `approx`

argument allows two further approximations to speed up computations. For `approx==1`

the exact posterior conditional modes are not used, but instead the conditional modes implied by the simple Gaussian posterior approximation. For `approx==2`

the same approximation is used for the modes and the Hessian is assumed constant. The latter is quite fast as no log joint density gradient evaluations are required.

Note that for many models the INLA estimates are very close to the usual Gaussian approximation to the posterior, the `interactive`

argument is useful for investigating this issue.

`bam`

models are only supported with the `disrete=TRUE`

option. The `discrete=FALSE`

approach would be too inefficient. AR1 models are not supported (related arguments are simply ignored).

##### Value

A list with elements `beta`

and `density`

, both of which are matrices. Each row relates to one coefficient (or linear coefficient combination) of interest. Both matrices have `nb`

columns. If `int!=0`

then a further element `reml`

gives the integration weights used in the CCD integration, with the central point weight given first.

##### WARNINGS

This routine is still somewhat experimental, so details are liable to change. Also currently not all steps are optimally efficient.

The routine is written for relatively expert users.

`ginla`

is not designed to deal with rank deficient models.

##### References

Rue, H, Martino, S. & Chopin, N. (2009) Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations (with discussion). Journal of the Royal Statistical Society, Series B. 71: 319-392.

Wood (2019) Simplified Integrated Laplace Approximation. In press Biometrika.

##### Examples

```
# NOT RUN {
require(mgcv); require(MASS)
## example using a scale location model for the motorcycle data. A simple plotting
## routine is produced first...
plot.inla <- function(x,inla,k=1,levels=c(.025,.1,.5,.9,.975),
lcol = c(2,4,4,4,2),lwd = c(1,1,2,1,1),lty=c(1,1,1,1,1),
xlab="x",ylab="y",cex.lab=1.5) {
## a simple effect plotter, when distributions of function values of
## 1D smooths have been computed
require(splines)
p <- length(x)
betaq <- matrix(0,length(levels),p) ## storage for beta quantiles
for (i in 1:p) { ## work through x and betas
j <- i + k - 1
p <- cumsum(inla$density[j,])*(inla$beta[j,2]-inla$beta[j,1])
## getting quantiles of function values...
betaq[,i] <- approx(p,y=inla$beta[j,],levels)$y
}
xg <- seq(min(x),max(x),length=200)
ylim <- range(betaq)
ylim <- 1.1*(ylim-mean(ylim))+mean(ylim)
for (j in 1:length(levels)) { ## plot the quantiles
din <- interpSpline(x,betaq[j,])
if (j==1) {
plot(xg,predict(din,xg)$y,ylim=ylim,type="l",col=lcol[j],
xlab=xlab,ylab=ylab,lwd=lwd[j],cex.lab=1.5,lty=lty[j])
} else lines(xg,predict(din,xg)$y,col=lcol[j],lwd=lwd[j],lty=lty[j])
}
} ## plot.inla
## set up the model with a `gam' call...
G <- gam(list(accel~s(times,k=20,bs="ad"),~s(times)),
data=mcycle,family=gaulss(),fit=FALSE)
b <- gam(G=G,method="REML") ## regular GAM fit for comparison
## Now use ginla to get posteriors of estimated effect values
## at evenly spaced times. Create A matrix for this...
rat <- range(mcycle$times)
pd0 <- data.frame(times=seq(rat[1],rat[2],length=20))
X0 <- predict(b,newdata=pd0,type="lpmatrix")
X0[,21:30] <- 0
pd1 <- data.frame(times=seq(rat[1],rat[2],length=10))
X1 <- predict(b,newdata=pd1,type="lpmatrix")
X1[,1:20] <- 0
A <- rbind(X0,X1) ## A maps coefs to required function values
## call ginla. Set int to 1 for integrated version.
## Set interactive = 1 or 2 to plot marginal posterior distributions
## (red) and simple Gaussian approximation (black).
inla <- ginla(G,A,int=0)
par(mfrow=c(1,2),mar=c(5,5,1,1))
fv <- predict(b,se=TRUE) ## usual Gaussian approximation, for comparison
## plot inla mean smooth effect...
plot.inla(pd0$times,inla,k=1,xlab="time",ylab=expression(f[1](time)))
## overlay simple Gaussian equivalent (in grey) ...
points(mcycle$times,mcycle$accel,col="grey")
lines(mcycle$times,fv$fit[,1],col="grey",lwd=2)
lines(mcycle$times,fv$fit[,1]+2*fv$se.fit[,1],lty=2,col="grey",lwd=2)
lines(mcycle$times,fv$fit[,1]-2*fv$se.fit[,1],lty=2,col="grey",lwd=2)
## same for log sd smooth...
plot.inla(pd1$times,inla,k=21,xlab="time",ylab=expression(f[2](time)))
lines(mcycle$times,fv$fit[,2],col="grey",lwd=2)
lines(mcycle$times,fv$fit[,2]+2*fv$se.fit[,2],col="grey",lty=2,lwd=2)
lines(mcycle$times,fv$fit[,2]-2*fv$se.fit[,2],col="grey",lty=2,lwd=2)
## ... notice some real differences for the log sd smooth, especially
## at the lower and upper ends of the time interval.
# }
```

*Documentation reproduced from package mgcv, version 1.8-31, License: GPL (>= 2)*