gen.comm.mat: Create a community matrix based on BIN abundances/incidences

Description

This function generates a community matrix (~site X species) from the data retrieved by bold.fetch(), based on BIN abundances/incidences.

Usage

gen.comm.mat(
  bold.df,
  taxon.rank,
  taxon.name = NULL,
  site.cat = NULL,
  grids = FALSE,
  gridsize = NULL,
  pre.abs = FALSE,
  view.grids = FALSE
)

Value

An 'output' list containing:

comm.matrix = A site X species like matrix based on BINs.
grids = A sf data frame containing the grid geometry and corresponding cell id.
grid_plot = A plot of the grids overlaid on a world map with cell ids.

Arguments

bold.df: The bold ‘data.frame’ generated from bold.fetch().
taxon.rank: A single character value specifying the taxonomic hierarchical rank. This argument has no default and must be provided.
taxon.name: A character vector of one or more taxonomic names associated with the ‘taxon.rank’. Default value is NULL.
site.cat: A character vector of one or more countries for which a community matrix should be created. Default value is NULL.
grids: A logical value specifying whether the community matrix should be based on grids as ‘sites’. Default value is FALSE.
gridsize: A numeric value specifying the size of the grid when grids = TRUE. Size is in sq.m. Default value is NULL.
pre.abs: A logical value specifying whether the generated matrix should be converted into a 'presence-absence' matrix. Default value is FALSE.
view.grids: A logical value specifying whether the grids should be displayed overlaid on a map with their respective cell ids. Default value is FALSE.

Details

The function transforms the data downloaded by bold.fetch() into a site X species like matrix. Instead of species counts (or abundances), however, the value in each cell is the count (or abundance) of a specific BIN from a site.cat site category or a 'grid'. These counts can be generated at any taxonomic hierarchical level for a single or multiple taxa. (This can also be done for 'bin_uri'; the difference is that the number in each cell would be the number of times that respective BIN is found at a particular site.cat or 'grid'.) site.cat can be any of the geography fields (metadata on the fields can be checked using bold.fields.info()).

Alternatively, grids = TRUE generates grids based on the BIN occurrence data (latitude, longitude), with the size of the grid determined by the user (in sq.m.). For grid generation, rows with no latitude and longitude data are removed (even if the corresponding site.cat information is available), while NULL entries for site.cat are allowed if they have latitude and longitude values. This is because grids are drawn based on bounding boxes, which only use latitude and longitude values. grids converts the Coordinate Reference System (CRS) of the data to a 'Mollweide' projection, in which a distance-based grid can be correctly specified. A cell id is also given to each grid, with the lowest number assigned to the lowest latitudinal point in the dataset. The cell ids can be changed by the user by editing the grids_final sf data frame stored in the output.

The grids can be visualized with view.grids = TRUE. The plot obtained shows the grids with their respective names. Please note that if the data has many closely located grids, visualization with view.grids can get confusing.

The argument pre.abs converts the counts (or abundances) to 1 and 0. The matrix can then be used directly as input for functions from packages like vegan for biodiversity analyses.
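
The shape of the result can be illustrated with a minimal base R sketch. It is not the internal implementation of gen.comm.mat; it only shows, on toy data, what a site X BIN count matrix and its presence-absence counterpart look like. The data frame records and its site column are hypothetical stand-ins for a bold.fetch() result and a site.cat field.

```r
# Toy records: 'site' is a hypothetical stand-in for a site.cat field,
# 'bin_uri' holds the BIN identifier of each record.
records <- data.frame(
  site    = c("A", "A", "A", "B", "B", "C"),
  bin_uri = c("BIN1", "BIN1", "BIN2", "BIN2", "BIN3", "BIN1"),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are sites, columns are BINs, cells are record counts.
counts <- table(records$site, records$bin_uri)

# Convert the table into a plain integer matrix, keeping row and column names.
comm_counts <- matrix(
  as.integer(counts),
  nrow = nrow(counts),
  dimnames = list(rownames(counts), colnames(counts))
)

# Presence-absence version (what pre.abs = TRUE is described as producing):
# any count above zero becomes 1, everything else stays 0.
comm_pa <- (comm_counts > 0) * 1L

print(comm_counts)
print(comm_pa)
```

A matrix of this form, with sites as rows and BINs as columns, is the layout that community-ecology functions such as those in vegan generally expect.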