fCWM_SNC: Calculate community weighted means and species niche centroids for double constrained correspondence analysis

Description

Double constrained correspondence analysis (dc-CA) can be calculated directly from community weighted means (CWMs), with the trait data from which the CWMs are calculated, and the environmental data and weights for species and sites (the abundance totals for species and sites). Statistical testing at the species level requires also the species niche centroids (SNCs). The function fCWM_SNC calculates the CWMs and SNCs from the trait and environmental data, respectively, using a formula interface, so as to allow categorical traits and environmental variables. The resulting object can be set as the response argument in dc_CA so as to give the same output as a call to dc_CA with the abundance data as response, at least up to sign changes of the axes.

Usage

fCWM_SNC(
  response = NULL,
  dataEnv = NULL,
  dataTraits = NULL,
  formulaEnv = NULL,
  formulaTraits = NULL,
  divideBySiteTotals = NULL
)

Value

The default returns a list of CWM, SNC, weights, formulaTraits, inertia (weighted variance explained by the traits and by the environmental variables) and a list of data with elements dataEnv and dataTraits.

Arguments

response: matrix, data frame of the abundance data (dimension n x m) or list with community weighted means (CWMs) from fCWM_SNC, NULL. If NULL, the response should be at the left-hand side of formulaEnv. See Details for analyses starting from community weighted means. Rownames of response, if any, are carried through.
dataEnv: matrix or data frame of the row predictors, with rows corresponding to those in response. (dimension n x p).
dataTraits: matrix or data frame of the column predictors, with rows corresponding to the columns in response. (dimension m x q).
formulaEnv: two-sided or one-sided formula for the rows (samples) with row predictors in dataEnv. The left hand side of the formula is ignored if it is specified in the response argument. Specify row covariates (if any) by adding + Condition(covariate-formula) to formulaEnv as in rda. The covariate-formula should not contain a ~ (tilde). Default: NULL for ~., i.e. all variables in dataEnv are predictor variables.
formulaTraits: formula or one-sided formula for the columns (species) with column predictors in dataTraits. When two-sided, the left hand side of the formula is not used. Specify column covariates (if any ) by adding + Condition(covariate-formula) to formulaTraits as in cca. The covariate-formula should not contain a ~ (tilde). Default: NULL for ~., i.e. all variables in dataTraits are predictor traits.
divideBySiteTotals: logical; default TRUE for closing the data by dividing the rows in the response by their total. However, the default is FALSE, when the species totals are proportional to N2*(N-N2) with N2 the Hill numbers of order 2 of the species and N the number of sites, as indicator that the response data have been pre-processed to N2-based marginals using ipf2N2.

Details

The argument formulaTraits determines which CWMs are calculated. The CWMs are calculated from the trait data (non-centered, non-standardized). With trait covariates, the other predictor traits are adjusted for the trait covariates by weighted regression, after which the overall weighted mean trait is added. This has the advantage that each CWM has the scale of the original trait.

The SNCs are calculated analogously from environmental data.

Empty (all zero) rows and columns in response are removed from the response and the corresponding rows from dataEnv and dataTraits. Subsequently, any columns with missing values are removed from dataEnv and dataTraits. It gives an error (object 'name_of_variable' not found), if variables with missing entries are specified in formulaEnv and formulaTraits.

In the current implementation, formulaEnv and formulaTraits should contain variable names as is, i.e. transformations of variables in the formulas gives an error ('undefined columns selected') when the scores function is applied.

References

Kleyer, M., Dray, S., Bello, F., Lepš, J., Pakeman, R.J., Strauss, B., Thuiller, W. & Lavorel, S. (2012) Assessing species and community functional responses to environmental gradients: which multivariate methods? Journal of Vegetation Science, 23, 805-821. tools:::Rd_expr_doi("10.1111/j.1654-1103.2012.01402.x")

ter Braak, CJF, Šmilauer P, and Dray S. 2018. Algorithms and biplots for double constrained correspondence analysis. Environmental and Ecological Statistics, 25(2), 171-197. tools:::Rd_expr_doi("10.1007/s10651-017-0395-x")

ter Braak C.J.F. and P. Šmilauer (2018). Canoco reference manual and user's guide: software for ordination (version 5.1x). Microcomputer Power, Ithaca, USA, 536 pp.

Oksanen, J., et al. (2022) vegan: Community Ecology Package. R package version 2.6-4. https://CRAN.R-project.org/package=vegan.

Examples

Run this code

data("dune_trait_env")

# rownames are carried forward in results
rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites

CWMSNC <- fCWM_SNC(formulaEnv = ~ A1 + Moist + Manure + Use + Condition(Mag),
                   formulaTraits = ~ SLA + Height + LDMC + Condition(Seedmass) + Lifespan,
                   response = dune_trait_env$comm[, -1],  # must delete "Sites"
                   dataEnv = dune_trait_env$envir,
                   dataTraits = dune_trait_env$traits)
names(CWMSNC)
#CWMSNC$SNC <- NULL # would give correct dc-CA but no species-level t-values or test
mod <- dc_CA(response = CWMSNC) # formulas and data are in CWMSNC!
# note that output also gives the environment-constrained inertia,
# which is a bonus compare to the usual way to carry out a dcCA.
anova(mod)

Run the code above in your browser using DataLab