simdata: Simulation of Data Sets

Description

simdata simulates as many data sets as requested, using the parameters in the list returned by assempar. It returns an object of class list containing all the simulated data, for use by datquality and sampsd.

Usage

simdata(Par, cases, N, sites)

Arguments

Par

A list of parameters estimated by assempar

cases

Number of data sets to be simulated

N

Total number of samples to be simulated in each site

sites

Total number of sites to be simulated in each data set

Value

simulated.data

An object of class list that includes all the simulated data. This object is used by sampsd and datquality.

Details

The presence/absence (P/A) of each species at each site is simulated with Bernoulli trials, with a probability of success equal to the empirical frequency of occurrence of that species among sites in the pilot data. For each site where a species is present, a second set of Bernoulli trials (with a probability of success equal to the estimated empirical frequency within the sites where the species appears) simulates the distribution of the species within that site.

Once created, the P/A matrices are converted to abundance matrices by replacing each presence with a random value drawn from an appropriate statistical distribution, with parameters equal to those estimated from the pilot data. Counts of individuals are simulated with a Poisson or a negative binomial distribution, depending on the degree of aggregation of each species in the pilot data (McArdle & Anderson 2004; Anderson & Walsh 2013). Continuous variables (e.g. coverage, biomass) are simulated with the log-normal distribution.

The simulation procedure is repeated to generate as many simulated data matrices as needed.
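The procedure described above can be sketched in base R. This is an illustration under stated assumptions, not the SSP source code: the function simulate_site, the parameter vectors fs, fw, mu and size, and the use of an infinite size to flag a Poisson species are all hypothetical, and the structure of the real assempar output may differ.

```r
# Illustrative sketch only; NOT the SSP implementation.
set.seed(42)

# Hypothetical pilot-data estimates for three species
fs   <- c(0.9, 0.5, 0.2)  # frequency of occurrence among sites
fw   <- c(0.8, 0.6, 0.3)  # frequency of occurrence within occupied sites
mu   <- c(12, 4, 2)       # mean abundance
size <- c(Inf, 1.5, 0.8)  # negative binomial dispersion; Inf flags a Poisson species

simulate_site <- function(N, fs, fw, mu, size) {
  n.sp <- length(fs)
  # Stage 1: Bernoulli trial per species for presence at the site
  at.site <- rbinom(n.sp, size = 1, prob = fs)
  # Stage 2: Bernoulli trials per sample, only for species present at the site
  pa <- matrix(0L, nrow = N, ncol = n.sp)
  for (j in seq_len(n.sp)) {
    if (at.site[j] == 1) pa[, j] <- rbinom(N, size = 1, prob = fw[j])
  }
  # Stage 3: replace presences with random counts
  counts <- pa
  for (j in seq_len(n.sp)) {
    present <- which(pa[, j] == 1)
    if (length(present) == 0) next
    counts[present, j] <- if (is.infinite(size[j])) {
      rpois(length(present), lambda = mu[j])
    } else {
      rnbinom(length(present), size = size[j], mu = mu[j])
    }
  }
  list(pa = pa, counts = counts)
}

# 3 data sets ("cases"), each with 2 sites of N = 10 samples
sim <- replicate(
  3,
  lapply(seq_len(2), function(s) simulate_site(10, fs, fw, mu, size)),
  simplify = FALSE
)
```

In this sketch a plain Poisson or negative binomial draw can still return zero for a presence; whether and how SSP handles that case is not covered here. For continuous variables, the count draws would be swapped for rlnorm() with the estimated log-scale mean and standard deviation.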

References

Anderson, M. J., & Walsh, D. C. I. (2013). PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: What null hypothesis are you testing? Ecological Monographs, 83(4), 557-574.

Anderson, M. J., de Valpine, P., Punnett, A., & Miller, A. E. (2019). A pathway for multivariate analysis of ecological communities using copulas. Ecology and Evolution, 9, 3276-3294.

Guerra-Castro, E. J., Cajas, J. C., Dias Marques Simoes, F. N., Cruz-Motta, J. J., & Mascaro, M. (2020). SSP: An R package to estimate sampling effort in studies of ecological communities. bioRxiv, 2020.03.19.996991.

McArdle, B. H., & Anderson, M. J. (2004). Variance heterogeneity, transformations, and models of species abundance: a cautionary tale. Canadian Journal of Fisheries and Aquatic Sciences, 61, 1294-1302.

Examples

# NOT RUN {
### To keep these examples fast, cases, sites and N are set to small values.
library(SSP)

##Single site: micromollusk from Cayo Nuevo (Yucatan, Mexico)
data(micromollusk)

#Estimation of parameters of pilot data
par.mic<-assempar(data = micromollusk,
                  type= "P/A",
                  Sest.method = "average")

#Simulation of 3 data sets, each one with 10 potential sampling units from a single site
sim.mic<-simdata(par.mic, cases = 3, N = 10, sites = 1)

##Multiple sites: Sponges from Alacranes National Park (Yucatan, Mexico).
data(sponges)

#Estimation of parameters of pilot data
par.spo<-assempar(data = sponges,
                  type= "counts",
                  Sest.method = "average")

#Simulation of 3 data sets, each one with 10 potential sampling units in 3 sites.
sim.spo<-simdata(par.spo, cases = 3, N = 10, sites = 3)
# }