manyany: Fitting Many Univariate Models to Multivariate Abundance Data

Description

manyany is used to fit many univariate models (GLMs, GAMs, or otherwise) to high-dimensional data, such as multivariate abundance data in ecology. This is the base model-fitting function; see plot.manyany for assumption checking and anova.manyany for significance testing.

Usage

manyany(fn, yMat, formula, data, family="negative.binomial", composition = FALSE, 
var.power=NA, ...)

Arguments

fn

a character string giving the name of the function for the univariate model to be applied, e.g. "glm".

yMat

a matrix of response variables, e.g. multivariate abundances.

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under Details.

data

a data frame containing predictor variables (a matrix is also acceptable). This is REQUIRED and needs to have more than one variable in it (even if only one is used in the model).

family

a description of the error distribution function to be used in the model, either as a character string, a family object, or a list of such objects, one for each response variable in the dataset.

composition

logical. FALSE (default) fits a separate model to each species. TRUE fits a single model to all variables, including site as a row effect, such that all other terms model relative abundance (compositional effects).

var.power

the power parameter, if using the Tweedie distribution.

...

further arguments passed to the fitting function.

Value

manyany returns an object inheriting from "manyany". The function anova (i.e. anova.manyany) will produce a significance test comparing two manyany objects. Currently there is no summary resampling function for objects of this class. The generic accessor functions fitted.values, residuals, logLik, AIC, plot can be used to extract various useful features of the value returned by manyany. An object of class "manyany" is a list containing at least the following components:
logL

a vector of log-likelihood terms for each response variable in the fitted model.

fitted.values

the matrix of fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.

residuals

the matrix of probability integral transform (PIT) residuals. If the fitted model is a good fit, these will be approximately standard uniformly distributed.

linear.predictor

the linear fit on link scale.

family

a vector of family arguments, one for each response variable.

call

the matched call.

model

the model.frame from the model for the last response variable.

terms

a list of terms from the model for the last response variable.
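
The PIT residuals described above can be illustrated with a small base-R sketch. It assumes a Poisson GLM fitted to simulated counts and uses the common randomised form for discrete data, drawing uniformly between F(y - 1) and F(y). The data, variable names, and this formula are assumptions made for illustration, and manyany's internal computation may differ.

```r
# Illustrative sketch only: randomised PIT residuals for a Poisson fit,
# computed with base R. This is not taken from the manyany source code.
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))  # simulated counts (assumption)

fit <- glm(y ~ x, family = poisson())
mu  <- fitted(fit)

# The Poisson CDF jumps at each count, so draw uniformly between
# F(y - 1) and F(y) to obtain a continuous residual on (0, 1).
lower <- ppois(y - 1, mu)
upper <- ppois(y, mu)
pit   <- lower + runif(n) * (upper - lower)

# Under a well-fitting model these should look roughly Uniform(0, 1).
summary(pit)
```

Marked departures from uniformity, such as residuals piling up near 0 or 1, would suggest that the assumed family or mean model fits poorly.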

Details

manyany can be used to fit the specified model type to many variables simultaneously, a generalisation of manyglm. It should be able to handle any fixed effects modelling function that has predict and logLik functions and that accepts a family argument, provided that the family is on our list (currently 'gaussian', 'poisson', 'binomial', 'negative.binomial' and 'tweedie'). Models for manyany are specified symbolically; see for example the details section of lm and formula.

Unlike manyglm, this function accepts family functions as arguments instead of just character strings, giving greater flexibility. For example, you can use family=binomial(link="cloglog") to fit a model using the complementary log-log link, rather than the default logit link.

A data argument is required, and it must be a data frame containing more than one variable. It need not contain the matrix of response variables, which is specified separately as yMat.

Setting composition=TRUE enables compositional analyses, where predictors are used to model relative abundance rather than mean abundance. This is achieved by vectorising the response matrix and fitting a single model across all variables, with a row effect to account for differences in total abundance across rows. The default composition=FALSE just fits a separate model for each variable.
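
The difference between the two composition settings can be sketched in base R, without calling manyany. The simulated data, the `treatment` predictor, and the long-format model formula below are hypothetical choices for illustration. They mimic the idea described above (separate per-variable fits versus one stacked model with a row effect) and are not manyany's actual implementation.

```r
# Illustrative sketch only, in base R; not manyany's internal code.
set.seed(2)
n_sites <- 20
n_spp   <- 4
treatment <- rep(c(0, 1), each = n_sites / 2)  # hypothetical predictor
yMat <- matrix(rpois(n_sites * n_spp, lambda = 5),
               nrow = n_sites,
               dimnames = list(NULL, paste0("sp", 1:n_spp)))

# Analogue of composition = FALSE: one model per column of the response matrix.
separate_fits <- lapply(colnames(yMat), function(sp) {
  glm(yMat[, sp] ~ treatment, family = poisson())
})
logL_separate <- sapply(separate_fits, function(f) as.numeric(logLik(f)))

# Analogue of composition = TRUE: stack the matrix into one long response and
# fit a single model with a row (site) effect and species-specific terms.
long <- data.frame(
  y         = as.vector(yMat),  # column-major stacking
  site      = factor(rep(seq_len(n_sites), times = n_spp)),
  species   = factor(rep(colnames(yMat), each = n_sites)),
  treatment = rep(treatment, times = n_spp)
)
comp_fit <- glm(y ~ site + species + species:treatment,
                family = poisson(), data = long)

# The site effect absorbs the overall treatment effect, so one interaction
# coefficient is aliased (NA); the others are effects relative to that species.
coef(comp_fit)[grep("treatment", names(coef(comp_fit)))]
```

Because the row effect absorbs anything that is constant within a site, the remaining treatment terms can only describe how species respond relative to one another, which is what makes the analysis compositional.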

References

Warton D. I., Wright S., and Wang, Y. (2012). Distance-based multivariate analyses confound location and dispersion effects. Methods in Ecology and Evolution, 3(1), 89-101.

Examples


library(mvabund)
data(spider)
abund <- spider$abund
X <- as.matrix(spider$x)

## To fit a log-linear model assuming counts are Poisson:
spidPois <- manyany("glm", abund, abund ~ X, data = X, family = poisson())

logLik(spidPois) # a number of generic functions are applicable to manyany objects
