simdata: Simulation of Ecological Data Sets

Description

Simulates multiple ecological data sets using parameters estimated from a pilot study. The output can be used in downstream SSP functions for quality evaluation and sampling effort estimation.

Usage

simdata(Par, cases, N, sites, jitter.base = 0)

Value

A list of simulated community data sets, to be used by datquality and sampsd.

Arguments

Par: A list of parameters estimated by assempar.
cases: Number of data sets to simulate.
N: Number of samples to simulate in each site.
sites: Number of sites to simulate in each data set.
jitter.base: Numeric scalar in \([0,1)\). Standard-deviation multiplier for the Gaussian jitter added to fs and fw, the frequencies of occurrence among and within sites. Defaults to 0 (no jitter).

Details

Presence/absence data are simulated with Bernoulli trials based on the empirical frequencies of occurrence among sites (for site-level presence) and within sites (for local occurrence patterns). To better reflect realistic variability among nested sampling units (e.g., sites within regions), the simulation can perturb these base parameters in a controlled way (see jitter.base). This jittering introduces stochastic variation in occurrence probabilities across sites while preserving the overall probabilistic structure of each species, so the simulated communities show levels of multivariate dispersion closer to those observed in empirical data.

The presence/absence matrices are then converted into abundance matrices using values drawn from Poisson or negative binomial distributions (for count data), or from log-normal distributions (for continuous data such as coverage or biomass), depending on the aggregation properties estimated from the pilot data.
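The steps described above can be illustrated with a minimal base R sketch. It is not the SSP implementation: the function names (clamp01, draw_counts, simulate_dataset), the parameter values, and the exact form of the jitter (a Gaussian with standard deviation equal to jitter_base times the base probability, clamped to [0, 1]) are assumptions made for illustration only. Counts use a Poisson draw unless the assumed variance exceeds the mean, in which case a negative binomial is used.

```r
# Illustrative sketch only; NOT the code used by simdata().
# All names and parameter values below are hypothetical.
set.seed(1)

# Keep jittered probabilities inside [0, 1]
clamp01 <- function(p) pmin(pmax(p, 0), 1)

# Draw n counts for one species: negative binomial if overdispersed
# (variance v greater than mean m), Poisson otherwise
draw_counts <- function(n, m, v) {
  if (v > m) {
    as.numeric(rnbinom(n, mu = m, size = m^2 / (v - m)))
  } else {
    as.numeric(rpois(n, lambda = m))
  }
}

simulate_dataset <- function(fs, fw, mean_abund, var_abund,
                             n_sites, n_samples, jitter_base = 0) {
  n_sp <- length(fs)
  blocks <- lapply(seq_len(n_sites), function(s) {
    # Site-specific occurrence probabilities (assumed jitter form)
    fs_s <- clamp01(rnorm(n_sp, mean = fs, sd = jitter_base * fs))
    fw_s <- clamp01(rnorm(n_sp, mean = fw, sd = jitter_base * fw))
    # Bernoulli trial: is each species present at this site?
    site_present <- rbinom(n_sp, size = 1, prob = fs_s)
    # Bernoulli trials within the site, only for species present there
    pa <- matrix(
      rbinom(n_samples * n_sp, size = 1,
             prob = rep(fw_s * site_present, each = n_samples)),
      nrow = n_samples, ncol = n_sp
    )
    # Abundance draws, one column per species
    counts <- vapply(seq_len(n_sp), function(j) {
      draw_counts(n_samples, mean_abund[j], var_abund[j])
    }, numeric(n_samples))
    # Presences take the drawn abundance; absences stay at zero
    pa * matrix(counts, nrow = n_samples, ncol = n_sp)
  })
  do.call(rbind, blocks)  # rows = samples stacked by site, columns = species
}

sim <- simulate_dataset(
  fs = c(0.9, 0.5, 0.2), fw = c(0.8, 0.4, 0.6),
  mean_abund = c(12, 3, 7), var_abund = c(40, 3, 7.5),
  n_sites = 4, n_samples = 5, jitter_base = 0.1
)
dim(sim)
```

In this sketch an abundance draw of zero turns a simulated presence into an absence; the sketch does not correct for that, and it makes no claim about how the package handles it.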

This process is repeated cases times, producing a list of simulated data sets that reflect the statistical properties of the original assemblage, but without incorporating environmental constraints or species co-occurrence structures.
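As a rough picture of the repetition step, the sketch below builds a list of independent data sets in base R. The generator simulate_one is a hypothetical stand-in rather than SSP code, the Poisson means are arbitrary, and the site column is an illustrative choice, not a statement about the structure simdata returns. Each species column is drawn independently, which mirrors the absence of co-occurrence structure noted above.

```r
# Hypothetical stand-in for one simulated data set; NOT SSP code.
simulate_one <- function(N, sites, lambda = c(5, 1, 10)) {
  n_rows <- N * sites
  # Each species (column) is drawn independently of the others
  counts <- vapply(lambda, function(l) as.numeric(rpois(n_rows, l)),
                   numeric(n_rows))
  counts <- matrix(counts, nrow = n_rows)
  data.frame(counts, site = rep(seq_len(sites), each = N))
}

cases <- 3
set.seed(7)
# Repeat the simulation `cases` times and collect the results in a list
sims <- lapply(seq_len(cases), function(i) simulate_one(N = 4, sites = 2))

length(sims)    # one list element per simulated data set
dim(sims[[1]])  # N * sites rows; species columns plus the site label
```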

References

Anderson, M. J., & Walsh, D. C. I. (2013). PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: What null hypothesis are you testing? Ecological Monographs, 83(4), 557–574.

Anderson, M. J., de Valpine, P., Punnett, A., & Miller, A. E. (2019). A pathway for multivariate analysis of ecological communities using copulas. Ecology and Evolution, 9, 3276–3294.

Guerra-Castro, E. J., Cajas, J. C., Simões, N., Cruz-Motta, J. J., & Mascaró, M. (2021). SSP: an R package to estimate sampling effort in studies of ecological communities. Ecography, 44(4), 561–573. doi:10.1111/ecog.05284

McArdle, B. H., & Anderson, M. J. (2004). Variance heterogeneity, transformations, and models of species abundance: a cautionary tale. Canadian Journal of Fisheries and Aquatic Sciences, 61, 1294–1302.

Examples

## Single site simulation
data(micromollusk)
par.mic <- assempar(data = micromollusk, type = "P/A", Sest.method = "average")
sim.mic <- simdata(par.mic, cases = 2, N = 10, sites = 1)

## Multiple site simulation
data(sponges)
par.spo <- assempar(data = sponges, type = "counts", Sest.method = "average")
set.seed(42)
sim.spo <- simdata(par.spo, cases = 2, N = 20, sites = 10, jitter.base = 0.5)