get.measures: Information Criteria for boral models

Description

Calculates some information criteria for a boral model, which could be used for model selection.

Usage

get.measures(y, X = NULL, family, trial.size = NULL, site.eff, num.lv, fit.mcmc)

Arguments

y

The response matrix that the boral model was fitted to.

X

The model matrix used in the boral model. Defaults to NULL, in which case it is assumed no model matrix was used.

family

Either a single element, or a vector of length equal to the number of columns in $y$. The former assumes all columns of $y$ come from this distribution. The latter option allows for different distributions for each column of $y$. Elements can be one of "b

trial.size

Either equal to NULL, a single element, or a vector of length equal to the number of columns in $y$. If a single element, then all columns assumed to be binomially distributed will have trial size set to this. If a vector, different trial sizes are allowe

site.eff

A logical value indicating whether row effects were included in the model.

num.lv

The number of latent variables used in the fitted boral model.

fit.mcmc

All MCMC samples for the fitted boral model, as obtained from JAGS. These can be extracted by fitting a boral model using boral with save.model = TRUE, and then applying as.mcmc on

Value

A list with the following components:
waic

WAIC based on the conditional log-likelihood.

eaic

EAIC based on the mean of the conditional log-likelihood.

ebic

EBIC based on the mean of the conditional log-likelihood.

aic.median

AIC (using the marginal log-likelihood) evaluated at the posterior median.

bic.median

BIC (using the marginal log-likelihood) evaluated at the posterior median.

comp.lm

Compound Laplace-Metropolis estimator of the model likelihood, evaluated at the posterior median.

all.cond.logLik

The conditional log-likelihood evaluated at all MCMC samples. This is done via repeated application of calc.condlogLik.

num.params

Number of estimated parameters used in the fitted model.

Warning

Using information criteria for variable selection should be done with extreme caution, for two reasons: 1) The implementation of these criteria is both heuristic and experimental. 2) Deciding what model to fit for ordination purposes should be driven by the science. For example, it may be the case that the criteria suggest a model with 3 or 4 latent variables. However, if we are interested in visualizing the data for ordination purposes, then models with 1 or 2 latent variables are far more appropriate. As another example, whether or not we include row effects when ordinating multivariate abundance data depends on whether we are interested in differences between sites in terms of abundance (site.eff = FALSE) or in terms of species composition (site.eff = TRUE).

Details

Currently, six information criteria have been implemented: 1) Widely Applicable Information Criterion (WAIC, Watanabe, 2010) based on the conditional log-likelihood; 2) expected AIC (EAIC, Carlin and Louis, 2011); 3) expected BIC (EBIC, Carlin and Louis, 2011); 4) AIC (using the marginal likelihood) evaluated at the posterior median; 5) BIC (using the marginal likelihood) evaluated at the posterior median; 6) Compound Laplace-Metropolis estimator of the model likelihood (Lewis and Raftery, 1997).

1) WAIC has been argued to be a more natural extension of AIC to the Bayesian and hierarchical modelling context (Gelman et al., 2013), and is based on the conditional log-likelihood calculated at each of the MCMC samples.
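As a rough illustration of item 1, the sketch below computes WAIC from a matrix of pointwise conditional log-likelihoods, using the variance-based penalty described in Gelman et al. (2013). The helper `waic_from_loglik` and the toy Poisson data are assumptions made for illustration only; they are not part of boral, and the package's own calculation may differ in detail.

```r
# Hypothetical helper (not part of boral): WAIC from an S x N matrix of
# pointwise conditional log-likelihoods (rows = MCMC samples, columns = units).
waic_from_loglik <- function(loglik) {
  # Log pointwise predictive density, computed with log-sum-exp for stability
  col_max <- apply(loglik, 2, max)
  lppd_i <- col_max + log(colMeans(exp(sweep(loglik, 2, col_max))))
  # Effective number of parameters: posterior variance of each log-likelihood
  p_waic_i <- apply(loglik, 2, var)
  -2 * (sum(lppd_i) - sum(p_waic_i))
}

# Toy example: Poisson counts with simulated posterior draws of the rate
set.seed(1)
y_toy <- rpois(20, lambda = 3)
lambda_draws <- rgamma(500, shape = sum(y_toy) + 1, rate = length(y_toy))
loglik <- sapply(y_toy, function(yi) dpois(yi, lambda_draws, log = TRUE))
waic_value <- waic_from_loglik(loglik)
```

When every MCMC sample gives the same log-likelihood, the penalty term is zero and the result reduces to -2 times the summed log-likelihood.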

2 & 3) EAIC and EBIC were suggested by Carlin and Louis (2011). Both criteria are of the form -2*mean(conditional log-likelihood) + penalty*(no. of parameters in the model), where the mean is taken over all the MCMC samples. EAIC applies a penalty of 2, while EBIC applies a penalty of $\log(n)$.
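The formula in items 2 and 3 can be sketched directly in R. The helper `expected_ics` and the numbers fed to it are made up for illustration, and $n$ is assumed here to be the number of rows of $y$; in practice the inputs would correspond to the all.cond.logLik and num.params components returned by get.measures.

```r
# Hypothetical helper (not part of boral): EAIC and EBIC from the conditional
# log-likelihood evaluated at each MCMC sample.
expected_ics <- function(cond_loglik, num_params, n) {
  mean_dev <- -2 * mean(cond_loglik)  # -2 * mean(conditional log-likelihood)
  c(eaic = mean_dev + 2 * num_params,
    ebic = mean_dev + log(n) * num_params)
}

# Made-up values purely for illustration
cond_loglik <- c(-512.3, -509.8, -511.1, -510.4)
ics <- expected_ics(cond_loglik, num_params = 40, n = 28)
```

The two criteria share the same first term, so they differ only by (log(n) - 2) times the number of parameters.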

4 & 5) AIC and BIC take the form -2*(marginal log-likelihood) + penalty*(no. of parameters in the model), where the log-likelihood is evaluated at the posterior median. If the parameter-wise posterior distributions are unimodal and approximately symmetric, these will produce similar results to an AIC and BIC where the log-likelihood is evaluated at the posterior mode. AIC applies a penalty of 2, while BIC applies a penalty of $\log(n)$.
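The plug-in idea behind items 4 and 5 can be sketched as follows: take the parameter-wise posterior medians of the MCMC samples, evaluate the log-likelihood there, and add the penalty. The helper `ics_at_median` and the toy model are assumptions for illustration. The toy model has no latent variables, so there is nothing to integrate out and the marginal and conditional log-likelihoods coincide; the integration over latent variables needed for a boral model is not shown.

```r
# Hypothetical helper (not part of boral): plug the posterior medians into a
# user-supplied log-likelihood function and add the AIC or BIC penalty.
ics_at_median <- function(samples, loglik_fn, n) {
  theta_med <- apply(samples, 2, median)  # parameter-wise posterior medians
  k <- length(theta_med)                  # number of parameters
  ll <- loglik_fn(theta_med)
  c(aic.median = -2 * ll + 2 * k,
    bic.median = -2 * ll + log(n) * k)
}

# Toy example: Poisson counts with a single rate parameter
set.seed(2)
y_toy <- rpois(30, lambda = 4)
samples <- matrix(rgamma(1000, shape = sum(y_toy) + 1, rate = length(y_toy)),
                  ncol = 1, dimnames = list(NULL, "lambda"))
loglik_fn <- function(theta) sum(dpois(y_toy, theta[["lambda"]], log = TRUE))
ics_med <- ics_at_median(samples, loglik_fn, n = length(y_toy))
```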

6) The model likelihood is the probability of the data given a model, and both BIC and the compound Laplace-Metropolis estimator are based on asymptotic approximations to this. However, while the first term in both criteria is of the same form, namely -2*(marginal log-likelihood), where the log-likelihood is evaluated at the posterior median, the compound Laplace-Metropolis estimator explicitly calculates the determinant of the relevant Hessian matrix (evaluated at the posterior median) to derive its penalty.

In our very limited experience, if information criteria are to be used for model selection between boral models, we found that BIC at the posterior median and the compound Laplace-Metropolis estimator tend to perform best. WAIC, AIC, and DIC (see get.dic) tend to overselect the number of latent variables. For WAIC and DIC, part of this overfitting could be due to the fact that both criteria are calculated from the conditional rather than the marginal log-likelihood (see Millar, 2009).

Intuitively, comparing boral models with and without latent variables (using information criteria such as those returned) amounts to testing whether the columns of the response matrix $y$ are correlated. With multivariate abundance data for example, where $y$ is a matrix of $n$ sites and $p$ species, comparing models with and without latent variables tests whether there is any evidence of correlation between species.

References

Carlin, B. P., & Louis, T. A. (2011). Bayesian methods for data analysis. CRC Press.
Gelman, A., Hwang, J., & Vehtari, A. (2013). Understanding predictive information criteria for Bayesian models. Statistics and Computing, 1-20.
Lewis, S. M., & Raftery, A. E. (1997). Estimating Bayes factors via posterior simulation with the Laplace-Metropolis estimator. Journal of the American Statistical Association, 92, 648-655.
Millar, R. B. (2009). Comparison of hierarchical Bayesian models for overdispersed count data using DIC and Bayes' factors. Biometrics, 65, 962-969.
Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. The Journal of Machine Learning Research, 11, 3571-3594.

Examples

library(boral)
library(mvabund) ## Load a dataset from the mvabund package
data(spider)
y <- spider$abun
n <- nrow(y); p <- ncol(y)

spider.fit.pois <- boral(y, family = "poisson", num.lv = 2, 
     site.eff = TRUE, save.model = FALSE, calc.ics = TRUE)

spider.fit.pois$ics ## Returns information criteria

spider.fit.nb <- boral(y, family = "negative.binomial", num.lv = 2, 
     site.eff = TRUE, save.model = FALSE, calc.ics = TRUE)

spider.fit.nb$ics ## Returns the information criteria
