country_dictionary provides a set of lookup tables used to standardize
country names and country codes in occurrence datasets.
The dictionary is built from rnaturalearthdata::map_units110
and consolidates a wide variety of country name variants (in several
languages and formats), as well as multiple coding systems, into a single
suggested standardized name.
This object is used internally by functions that clean or harmonize
country fields, ensuring that country names and codes in occurrence
datasets (e.g., "Brasil", "brasil", "BR", "BRA", "République Française")
are all mapped consistently to a single standardized form ("brazil",
"france", etc.).
country_dictionary
A named list of two data frames:
country_name: A data frame with two columns:
  country_name: Character. Lowercased and accent-stripped country
  name variants (from multiple rnaturalearthdata fields such as
  name, name_long, abbrev, formal_en, and alternative names in
  several languages).
  country_suggested: Character. The standardized country name,
  derived from the name column of map_units110, also lowercased and
  accent-stripped.
country_code: A data frame with two columns:
  country_code: Character. Country codes from several systems,
  including ISO-2, ISO-3, FIPS, postal codes, and others, after
  filtering invalid or ambiguous codes.
  country_suggested: Character. The standardized country name
  corresponding to each code.
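The two-table layout described above lends itself to a simple lookup with match(). The following is a minimal sketch only: the toy tables name_tbl and code_tbl and the helper suggest_country() are hypothetical stand-ins written for illustration, not the package's data or its internal functions.

```r
# Toy stand-ins for the two data frames (hypothetical rows, not the real data)
name_tbl <- data.frame(
  country_name      = c("brasil", "brazil", "republique francaise", "france"),
  country_suggested = c("brazil", "brazil", "france", "france"),
  stringsAsFactors  = FALSE
)
code_tbl <- data.frame(
  country_code      = c("BR", "BRA", "FR", "FRA"),
  country_suggested = c("brazil", "brazil", "france", "france"),
  stringsAsFactors  = FALSE
)

# Hypothetical helper: try the name table first, then fall back to the code table
suggest_country <- function(x, name_tbl, code_tbl) {
  x <- trimws(x)
  # Names are stored lowercased, codes uppercased, so normalize case per table
  by_name <- name_tbl$country_suggested[match(tolower(x), name_tbl$country_name)]
  by_code <- code_tbl$country_suggested[match(toupper(x), code_tbl$country_code)]
  ifelse(is.na(by_name), by_code, by_name)
}

result <- suggest_country(c("Brasil", "BR", "fra", "Atlantis"), name_tbl, code_tbl)
print(result)
```

Values that match neither table come back as NA, which makes unresolved entries easy to flag for manual review.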
The dictionary is generated by:
extracting multiple name and code fields from
rnaturalearthdata::map_units110,
converting names to lowercase and removing accents,
converting codes to uppercase,
removing invalid or ambiguous codes (e.g., -99, "J", various
country mismatches),
and ensuring uniqueness across all entries.
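The cleaning steps listed above can be sketched on a few made-up values. This is an illustration under stated assumptions, not the script that built the dictionary: raw_names, raw_codes, invalid_codes, and strip_accents() are hypothetical, and the accent map covers only a handful of lowercase Latin letters (it assumes a UTF-8 session).

```r
# Hypothetical raw values; the real ones come from rnaturalearthdata::map_units110
raw_names <- c("Brasil", "BRASIL", "République Française")
raw_codes <- c("br", "BRA", "-99", "J", "br")

# Assumed accent map: each character in `old` maps to the one at the
# same position in `new` (24 characters each)
strip_accents <- function(x) {
  chartr("áàâãäéèêëíìîïóòôõöúùûüçñ",
         "aaaaaeeeeiiiiooooouuuucn", x)
}

# Names: lowercase, remove accents, keep unique entries
clean_names <- unique(strip_accents(tolower(raw_names)))

# Codes: uppercase, drop codes treated as invalid or ambiguous, keep unique entries
invalid_codes <- c("-99", "J")   # examples named in the text; the full list is not shown
clean_codes <- toupper(raw_codes)
clean_codes <- unique(clean_codes[!clean_codes %in% invalid_codes])

print(clean_names)
print(clean_codes)
```

Lowercasing before stripping accents keeps the accent map short, since only lowercase accented letters need an entry.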
data(country_dictionary)
head(country_dictionary$country_name)
head(country_dictionary$country_code)