Taxonomic names

NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true") knitr::opts_chunk$set( comment = "#>", collapse = TRUE, warning = FALSE, message = FALSE, purl = NOT_CRAN, eval = NOT_CRAN )

You have probably, or will, run into problems with taxonomic names. For example, you may think you know how a taxonomic name is spelled, but then GBIF will not agree with you. Or, perhaps GBIF will have multiple versions of the taxon, spelled in slightly different ways. Or, the version of the name that they think is the right one does not match what yo think is the right one.

This isn't really anyone's fault. It's a result of there not being one accepted taxonomic source of truth across the globe. There are many different taxonomic databases. GBIF makes their own backbone taxonomy that they use as a source of internal truth for taxonomic names. The accepted names in the backbone taxonomy match those in the database of occurrences - so do try to figure out what backbone taxonomy version of the name you want.

Another source of problems stems from the fact that names are constantly changing. Sometimes epithets change, sometimes generic names, and sometimes higher names like family or tribe. These changes can take a while to work their way in to GBIF's data.

The following are some examples of confusing name bits. We'll update these if GBIF's name's change. The difference between each pair of names is highlighted in bold.

Load rgbif


Helper function

To reduce code duplication, we'll use a little helper function to make a call to name_backbone() for each input name, then rbind them together:

name_rbind <- function(..., rank = "species") { columns <- c('usageKey', 'scientificName', 'canonicalName', 'rank', 'status', 'confidence', 'matchType', 'synonym') df <- lapply(list(...), function(w) { rgbif::name_backbone(w, rank = rank)[, columns] }) data.frame(, df)) }

And another function to get the taxonomic data provider

taxon_provider <- function(x) { tt <- name_usage(key = x)$data datasets(uuid = tt$constituentKey)$data$title }

We use taxon_provider() below to get the taxonomy provider in the bulleted list of details for each taxon (even though you don't see it called, we use it, but the code isn't shown :)).

Pinus sylvestris vs. P. silvestris

(c1 <- name_rbind("Pinus sylvestris", "Pinus silvestris"))
  • P. sylvestris w/ occurrences count from the data provider
occ_count(c1$usageKey[[1]]) taxon_provider(c1$usageKey[[1]])
  • P. silvestris w/ occurrences count from the data provider
occ_count(c1$usageKey[[2]]) taxon_provider(c1$usageKey[[2]])

Macrozamia platyrachis vs. M. platyrhachis

(c2 <- name_rbind("Macrozamia platyrachis", "Macrozamia platyrhachis"))
  • M. platyrachis w/ occurrences count from the data provider
occ_count(c2$usageKey[[1]]) taxon_provider(c2$usageKey[[1]])
  • M. platyrhachis w/ occurrences count from the data provider
occ_count(c2$usageKey[[2]]) taxon_provider(c2$usageKey[[2]])

Cycas circinalis vs. C. circinnalis

(c3 <- name_rbind("Cycas circinalis", "Cycas circinnalis"))
  • C. circinalis w/ occurrences count from the data provider
occ_count(c3$usageKey[[1]]) taxon_provider(c3$usageKey[[1]])
  • C. circinnalis w/ occurrences count from the data provider
occ_count(c3$usageKey[[2]]) taxon_provider(c3$usageKey[[2]])

Isolona perrieri vs. I. perrierii

(c4 <- name_rbind("Isolona perrieri", "Isolona perrierii"))
  • I. perrieri w/ occurrences count from the data provider
occ_count(c4$usageKey[[1]]) taxon_provider(c4$usageKey[[1]])
  • I. perrierii w/ occurrences count from the data provider
occ_count(c4$usageKey[[2]]) taxon_provider(c4$usageKey[[2]])

Wiesneria vs. Wisneria

(c5 <- name_rbind("Wiesneria", "Wisneria", rank = "genus"))
  • Wiesneria w/ occurrences count from the data provider
occ_count(c5$usageKey[[1]]) taxon_provider(c5$usageKey[[1]])
  • Wisneria w/ occurrences count from the data provider
occ_count(c5$usageKey[[2]]) taxon_provider(c5$usageKey[[2]])

The take away messages from this vignette

  • Make sure you are using the name you think you're using
  • Realize that GBIF's backbone taxonomy is used for occurrence data
  • Searching for occurrences by name matches against backbone names, not other names (e.g., synonyms)
  • GBIF may at some points in time have multiple version of the same name in their own backbone taxonomy - These can usually be separated by data provider (e.g., Catalogue of Life vs. International Plant Names Index)
  • There are different ways to search for names - make sure are familiar with the four different name search functions, all starting with name_