This function screens taxonomic names for common substitutes,
abbreviations, qualifiers, and notations expressing uncertainty in
taxonomic identifications. When any of these notations are present,
the taxonomic name is considered uncertain, while in their absence, the
taxonomic name is considered certain. A pre-defined named list of terms
is screened for by default (i.e.
list(subspecies = c("(?<!n\\. )ssp\\.", "(?<!n\\. )subsp\\."), ...)),
with the following names and values:
subspecies: ssp., subsp. (while ignoring n. ssp. and n. subsp.)
species: sp., spp. (while ignoring n. sp. and n. spp.)
genus: gen. (while ignoring n. gen.)
family: fam. (while ignoring n. fam.)
indeterminable: indeterminabilis, indeterminata, indet., ind.
uncertain: incerta, ind., ?, "", ''
confer: confer, cf., cfr., conf.
dubia: dubia, sp. dub., nomen dubium
incertae: incertae sedis, inc. sed.
problematica: problematica
informal: informal
unavailable: NA
trace: ex., exuvia, exuviae
not_specified: NO_X_SPECIFIED, where X is any character string
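The "while ignoring" behaviour listed above comes from the negative lookbehind in the default patterns. The following is a minimal sketch of how the subspecies pattern shown earlier behaves when matched with base R; the taxon names are hypothetical, and the use of grepl() here is an assumption for illustration, not a description of how this function is implemented internally.

```r
# Default pattern shown above for the "subspecies" term
pattern <- "(?<!n\\. )ssp\\."

# Hypothetical example names (not taken from any dataset)
taxa <- c("Homo sapiens ssp.", "Homo sapiens n. ssp.", "Homo sapiens")

# perl = TRUE is needed because lookbehind assertions are only
# supported by the PCRE engine, not by R's default regex engine
flagged <- grepl(pattern, taxa, perl = TRUE)

# Pair each name with whether the pattern flagged it
data.frame(taxa = taxa, uncertain = flagged)
```

Only the first name should be flagged: the second is preceded by "n. " (new subspecies), and the third carries no notation at all.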
Additional terms to screen for can be provided to the terms argument
as a named list (e.g. terms = list(custom = "species1")). In addition,
the pre-defined named list can be modified to omit or update certain
terms (e.g. terms = list(species = NULL) or
terms = list(genus = c("(?<!n\\. )gen\\."))). Note that while this function
aims to minimise false positives (e.g. using "sp." rather than "sp" to avoid
mid-name matches, and ignoring "n. gen." (new genus) while flagging "gen."),
it is the responsibility of the user to understand the scale of risk for
screened terms with respect to the input data.
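To make the false-positive trade-off concrete, the sketch below compares a loose match on "sp" with the stricter "sp\\." and with the genus pattern quoted above. The names are hypothetical, and base R grepl() is used only as an illustration; it is not a claim about the internals of this function. The final line builds a list of the shape described for the terms argument.

```r
# Hypothetical names: the genus "Aspidoceras" contains "sp" mid-name
taxa <- c("Aspidoceras", "Aspidoceras sp.", "Aspidoceras n. gen.", "gen. indet.")

# Loose match: any occurrence of the literal "sp", including mid-name
loose <- grepl("sp", taxa, fixed = TRUE)

# Stricter match: "sp" must be followed by a literal full stop
strict <- grepl("sp\\.", taxa)

# Genus pattern from the text: flags "gen." unless preceded by "n. "
genus <- grepl("(?<!n\\. )gen\\.", taxa, perl = TRUE)

data.frame(taxa, loose, strict, genus)

# A named list of the shape described for the terms argument:
# one added custom term, and the "species" entry set to NULL
custom_terms <- list(custom = "species1", species = NULL)
```

The loose pattern flags every name containing "Aspidoceras", whereas the stricter pattern flags only "Aspidoceras sp.". Stricter patterns can still produce false matches, so it is worth checking flagged names against the input data.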
The pre-defined list is intended to be comprehensive, and is informed by:
Sigovini, M., Keppel, E., & Tagliapietra, D. (2016). Open Nomenclature in
the biodiversity era. Methods in Ecology and Evolution, 7(10), 1217-1225.
doi:10.1111/2041-210X.12594.
If you would like additional terms to be screened for by default, please
raise a GitHub Issue.