The major function of the package, calculating the bias effect of sampling bias due to geographic structures, such as the vicinity to cities, airports, rivers and roads. Results are projected to space, and can be compared numerically.
calculate_bias(
x,
gaz = NULL,
res = 1,
buffer = NULL,
restrict_sample = NULL,
terrestrial = TRUE,
inp_raster = NULL,
mcmc_rescale_distances = 1000,
mcmc_iterations = 1e+05,
mcmc_burnin = 20000,
mcmc_outfile = NULL,
prior_q = c(1, 0.01),
prior_w = c(1, 1),
plot_raster = FALSE,
verbose = FALSE,
run_null_model = FALSE,
use_hyperprior = TRUE
)An object of the S3-class ‘sampbias’, which is a list including the following objects:
A list of summary statistics
for the sampbias analyses, including the total number of occurrence points
in x, the total number of species in x, the extent of the
output rasters as well as the settings for res, binsize, and
convexhull used in the analyses.
a SpatRaster
indicating occurrence records per grid cell, with resolution res.
a SpatRaster with indicating the number of species per
grid cell, with resolution res.
a list of SpatRaster,
with the same length as gaz. Each element is the spatial projection of the
bias effect for a sources of bias in gaz. The last raster in the list
is the average over all bias sources.
a data.frame,
with the estimated bias effect for each bias source in gaz, at the
distances specified by biasdist.
an object of the class data.frame, with one species
occurrence record per line, and at least three columns, named
‘species’, ‘decimalLongitude’, and ‘decimalLatitude’.
a list of geographic gazetteers as SpatVector or sf.
If NULL, a set of default gazetteers, representing large scale occurrence
of airports, cities, rivers, and roads is used. See Details.
numerical. The raster resolution for the distance calculation to
the geographic features and the data visualization, in decimal degrees. The
default is to one degree, but higher resolution will be desirable for most
analyses. res together with the extent of the input data determine
computation time and memory requirements.
numerical. The size of the geographic buffer around the extent
of ras for the distance calculations in degrees, to account for
geographic structures neighbouring the study area (such as a road right
outside the study area). Should be a multiple of res. Default is to
res * 10. See Details.
a SpatVector object. If provided the area for
the bias test will be restricted to raster cells within these polygons (and
the extent of the sampled points in x). Make sure to use adequate values
for res. Default = NULL.
logical. If TRUE, the empirical distribution (and the
output maps) are restricted to terrestrial areas. Uses the
rnaturalearth:::ne_countries to define what is
terrestrial. Default = TRUE.
an object of class SpatRaster. A template raster for
the counts and distance calculation. Can be used to provide a special
resolution, or for different coordinate reference systems. See vignette.
numerical. rescaling factor for the distance calculation
numerical. the number of iterations for the MCMC, by default 100,000
numerical. the burn-in for the MCMC, default is to 20,000
character string. the path on where to write the results of the MCMC, optional.
the gamma prior for the sampling rate $q$, which represents the expected number of occurrences per cell in the absence of biases. In the format c(shape,rate).
the gamma prior for the steepness of the Poisson rate decline, such that w approximating 0 results in a null model of uniform sampling rate q across cells. In the format c(shape,rate).
logical. If TRUE, a plot of the occurrence raster is shown for diagnostic purposes. Default = FALSE
logical. If TRUE, progress is reported. Default = FALSE.
logical. Run a null model with bias weights set to zero.
logical. If TRUE a hyperprior on the bias weights is used for regularization to avoid over-parametrization.
The default gazetteers delivered with the package are simplified from http://www.naturalearthdata.com/downloads/. They include major features, and for small scale analyses custom gazetteers should be used.
For computational convenience, the gazetteers are cropped to the extent of the point occurrence data sets. To account for the fact, that, relevant structures might lay directly outside this extent, but still influencing the distribution of samples in the study area, the buffer option, gives the area, around the extent that should be included in the distance calculation.
Visit https://github.com/azizka/sampbias/wiki for more information on distance calculation and the algorithm behind sampbias.
summary.sampbias is.sampbias
plot.sampbias
# \donttest{
#simulate data
x <- data.frame(species = rep(sample(x = LETTERS, size = 5), times = 20),
decimalLongitude = runif(n = 100, min = 0, max = 20),
decimalLatitude = runif(n = 100, min = -4, max = 4))
out <- calculate_bias(x, terrestrial = TRUE, buffer = 0)
summary(out)
plot(out)
# }
Run the code above in your browser using DataLab