
simba (version 0.3-4)

makead: Create artificial data set (species matrix).

Description

The functions allow for the automated creation of artificial data (species matrix). The user can choose between random organization or a gradient. The gradient can be defined via a gradient vector which allows for fine tuning of the gradient. ads has a different implementation and produces better results for gradients.

Usage

makead(nspec, nplots, avSR = NULL, anc = NULL, grad.v = NULL, 
cf = 0.2, puq = 0.01)

ads(nspec, nplots, avSR = NULL, anc = NULL, grad.v = NULL, reord = TRUE, cf = 0.2, puq = 0.01)

ads.hot(nspec, nplots, avSR = NULL, anc = NULL, grad.v = NULL, frac = 0.5, reord = TRUE, cf = 0.2, puq = 0.01)

ads.fbg(nspec, nplots, grad.v, n.iter = 100, method = "ads", ...)

Arguments

nspec
Number of species to include in the data-set. Ignored if anc is given (anc != NULL).
nplots
Number of plots to include in the data-set. Ignored if anc is given (anc != NULL).
avSR
Average species richness. If anc is given and avSR is left at its default (NULL), it is calculated from the data; if avSR != NULL, the given value is used instead. Not implemented in ads in the current version.
anc
If a model species matrix is available (either a real data-set, or another artificial data-set) on which creation should be based, give it here. Rows must be plots and columns be species. The first three parameters are then obtained from this set. However
grad.v
A numeric vector describing the gradient or, in the case of ads.hot, the hotspot. Its length must equal nplots (or nrow(anc), if anc is given). See details.
cf
Determines the probability of the species to occur on the plots. In other words, it changes the shape of the species accumulation curve. Set to NULL if no natural species accumulation should be applied (may sometimes increase the visibility of the gradien
puq
Percentage of ubiquitous species. Set to NULL if the produced gradients seem to be unclear or if you don't want ubiquitous species to be in the data-set. Only used if a gradient vector is given (which is then not applied to the given percentage of species
reord
Triggers reordering of the columns in the produced gradient matrix (see details). May considerably change the resulting matrix. Defaults to TRUE.
frac
Numeric between 0 and 1 giving the proportion of species that should occur on the hotspot gradient only (see details).
n.iter
Number of iterations ads.fbg runs when searching for the species matrix that best represents the prescribed gradient (see details).
method
Which of the functions makead, ads, or ads.hot should be used to create the matrix.
...
Further arguments to the function specified in method

Value

The three functions for creating an artificial species matrix (makead, ads, ads.hot) each return a presence/absence species matrix with rows representing plots/sampling units and columns representing species. ads.fbg returns a list with:

  • mat: The species matrix, as returned by the three artificial data set functions.
  • r2.adj: The adjusted R² value for the regression of the first-axis DCA scores of the resulting species matrix against the position on the prescribed gradient, as described by the gradient vector grad.v.
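
As a rough illustration of how a value like r2.adj can be computed, the sketch below regresses first-axis scores on the gradient vector and extracts the adjusted R². The helper name adj_r2 and the toy scores are assumptions made for this example and are not part of simba; with real data the scores would come from a DCA.

```r
# Hypothetical helper (not part of simba): adjusted R^2 of first-axis
# scores regressed on the gradient positions.
adj_r2 <- function(axis1, grad.v) {
  fit <- lm(axis1 ~ grad.v)
  summary(fit)$adj.r.squared
}

# Gradient positions for 60 plots
grad.v <- 1:60

# Toy stand-in for first-axis DCA scores (assumption: real scores would
# come from an ordination of the species matrix)
axis1 <- grad.v + sin(grad.v)

r2 <- adj_r2(axis1, grad.v)
r2
```

With vegan installed, first-axis site scores could presumably be obtained with scores(decorana(mat), display = "sites", choices = 1); whether ads.fbg extracts them in exactly this way is not stated here.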

Details

There are three different implementations for creating an artificial species matrix, plus a fourth function, ads.fbg, which uses any of the three to find a "best" gradient representation.

makead first applies the natural species accumulation curve. The gradient for each species is represented by a vector of values between 0 and 1. Both matrices are added, so that values between 0 and 2 result. An iterative procedure then defines a break value: entries above it are converted to 1 and entries below it to 0, resulting in a presence/absence matrix. However, the random element seems to be too strong to yield evident gradient representations.

Therefore ads was implemented. It works differently. First, a gradient is applied. As with makead, the gradient is always applied in two directions, so that half of the species are more likely to occur on plots on one side of the gradient, whereas the others are more likely to occur on the other side. As a result, the number of occurrences of each species oscillates around nplots/2. If puq is specified, the given percentage of species is separated from the whole matrix before the gradient is applied. The parameter cf is used to produce a vector representing a quasi-natural occurrence of the species on the plots: most species are rare and few species are very common. This is described by the power function $y = \frac{1}{x^{cf}}$ with x starting at 2, which gives a vector of length nspec representing the number of times each species occurs. These numbers are applied to the gradient matrix: from the occurrences of each species, only as many as specified by the respective number are randomly sampled. In cases where the number given by the vector exceeds the occurrences resulting from the gradient matrix, the species in the gradient matrix is replaced by a new one whose occurrence does not follow the gradient and which has the number of occurrences given by the vector.
The idea behind this is that, in nature too, a species occurring on more than about half of the plots is likely to be independent of a specific gradient. In both cases (makead and ads), a totally random species matrix (taking natural species occurrence into account, see cf) is obtained by randomly shuffling these occurrences within the columns (species) of the "natural species occurrence" matrix.

In contrast to the other two functions, ads.hot allows for the creation of an artificial data-set that includes a hotspot of species richness and composition. In this case, frac specifies which proportion of the total number of species should occur on the hotspot gradient only. All other species occur randomly on the plots. The hotspot gradient (grad.v) can be used to influence how pronounced the hotspot is.

The function ads.fbg finds the best gradient representation obtainable with one of the above functions. A gradient is considered to be represented best when the correlation between the first-axis scores of a DCA (calculated with decorana from package vegan) and the gradient positions described by the gradient vector grad.v is maximized. ads.fbg simply runs the specified function n.iter times and returns the best resulting matrix together with the r2.adj value that was obtained.
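
The power function used for cf can be sketched in a few lines of base R. This is only an illustration of the formula given above, not simba's internal code; in particular, the rescaling of y to plot counts is an assumption made for this example, and the variable names are hypothetical.

```r
# Illustration of y = 1 / x^cf with x starting at 2 (not simba's own code)
nspec  <- 200   # number of species
nplots <- 60    # number of plots
cf     <- 0.2   # default value of cf from the Usage section

x <- seq(2, length.out = nspec)  # x = 2, 3, ..., nspec + 1
y <- 1 / x^cf                    # decreasing curve, one value per species

# Assumption for this sketch: rescale so that the most common species
# occurs on all plots and every species occurs at least once.
occ <- pmax(1, round(y / max(y) * nplots))

# Occurrence counts, from the most common to the rarest species
plot(occ, type = "l", xlab = "species", ylab = "number of occurrences")
```

Larger values of cf make the curve fall off faster, so fewer species end up being common.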

Examples

library(simba)

## create a random data-set with 200 species on 60 plots
artda <- makead(200, 60, avSR=25)

## create a gradient running from North to South (this requires a
## spatially explicit model of your data, which can be obtained
## with hexgrid())
coor <- hexgrid(0, 4000, 200)
coor <- coor[order(coor$ROW),] # sort the coordinates by row
## then the gradient vector can easily be generated from the ROW names
gradvek <- as.numeric(coor$ROW)
## check how many plots your array has (grad.v must have length nplots)
nrow(coor)
## create a data-set with 200 species
artda <- ads(200, 100, grad.v=gradvek)
## see the species frequency distribution curve
plot(sort(colSums(artda)))
