RuHere (version 1.0.1)

fix_countries: Identify and correct coordinates based on country information

Description

This function identifies and corrects inverted and transposed coordinates based on country information.

Usage

fix_countries(
  occ,
  long = "decimalLongitude",
  lat = "decimalLatitude",
  country_column,
  correct_country = "correct_country",
  distance = 5,
  progress_bar = FALSE,
  verbose = TRUE
)

Value

The original occ data.frame with the coordinates in the long and lat columns corrected, and an additional column (country_issues) indicating whether the coordinates are:

  • correct: the record falls within the assigned country;

  • inverted: longitude and/or latitude have reversed signs;

  • swapped: longitude and latitude are transposed (i.e., each appears in the other's column);

  • incorrect: the record falls outside the assigned country and could not be corrected.

Arguments

occ

(data.frame) a dataset with occurrence records, preferably with country information checked using check_countries().

long

(character) column name with longitude. Default is 'decimalLongitude'.

lat

(character) column name with latitude. Default is 'decimalLatitude'.

country_column

(character) name of the column containing the country information.

correct_country

(character) name of the column with logical value indicating whether each record falls within the country specified in the metadata. Default is 'correct_country'. See details.

distance

(numeric) maximum distance (in kilometers) that a record may fall outside the border of the country given in country_column and still be treated as inside it. Default is 5.

progress_bar

(logical) whether to display a progress bar during processing. If TRUE, the 'pbapply' package must be installed. Default is FALSE.

verbose

(logical) whether to print messages about function progress. Default is TRUE.

Details

The function checks and corrects coordinate errors in occurrence records by testing whether each point falls within the expected country polygon (from RuHere’s internal world map).

The input occurrence data must contain a column (specified in the correct_country argument) with logical values indicating which records to check and fix — only those marked as FALSE will be processed. This column can be obtained by running the check_countries() function.
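As a rough illustration of that selection step, the base R sketch below builds a toy table and keeps only the rows marked FALSE. The data values are invented, and treating NA as "do not process" is an assumption of this sketch, not documented behavior of fix_countries().

```r
# Toy occurrence table (invented values); column names follow the function defaults
occ <- data.frame(
  decimalLongitude  = c(-47.9, -15.8, -43.2),
  decimalLatitude   = c(-15.8, -47.9, -22.9),
  country_suggested = "brazil",
  correct_country   = c(TRUE, FALSE, TRUE)
)

# Rows eligible for checking: only those explicitly marked FALSE.
# %in% returns FALSE for NA, so NA rows are skipped (an assumption of this sketch).
needs_fix <- occ$correct_country %in% FALSE
to_fix <- occ[needs_fix, ]

print(to_fix)
```

Only the second record is selected here, since it is the only one flagged as falling outside its assigned country.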

It runs a series of seven tests to detect common issues such as inverted signs or swapped latitude/longitude values. Inverted coordinates have their signs flipped (e.g., -45 instead of 45), placing the point in the opposite hemisphere, while swapped coordinates have latitude and longitude values exchanged (e.g., -47, -15 instead of -15, -47).
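The documentation does not list the seven tests. One plausible reading, assumed here purely for illustration, is that they cover every combination of sign flips and a longitude/latitude swap other than the unchanged pair. The sketch below enumerates those seven candidates in base R and checks each against a rough bounding box standing in for a country polygon. The helper names (candidate_fixes, in_box) and the box limits are hypothetical and are not part of RuHere.

```r
# Hypothetical helper: the seven sign/swap variants of one coordinate pair
candidate_fixes <- function(long, lat) {
  data.frame(
    test = c("flip_long", "flip_lat", "flip_both",
             "swap", "swap_flip_long", "swap_flip_lat", "swap_flip_both"),
    long = c(-long,  long, -long, lat, -lat,   lat,  -lat),
    lat  = c( lat,  -lat,  -lat,  long, long, -long, -long),
    stringsAsFactors = FALSE
  )
}

# Stand-in for a country polygon: a crude bounding box (assumed limits)
in_box <- function(long, lat, box) {
  long >= box[["xmin"]] & long <= box[["xmax"]] &
    lat >= box[["ymin"]] & lat <= box[["ymax"]]
}

box <- c(xmin = -74, xmax = -34, ymin = -34, ymax = 6)

# A record stored as long = -15, lat = -47 when long = -47, lat = -15 was intended
cand <- candidate_fixes(long = -15, lat = -47)
cand$inside <- in_box(cand$long, cand$lat, box)

# Keep the first candidate that lands inside the box, if any
first_hit <- cand[cand$inside, ][1, ]
print(first_hit)
```

In this toy case only the plain swap falls inside the box, so the record would be labeled as swapped. A real implementation would test against country polygons, not a rectangle.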

For each test, country borders are buffered by distance km to account for minor positional errors.
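To give a feel for what a buffer of a few kilometers means in coordinate terms, the sketch below pads a bounding box by a distance in km, using the approximation of about 111.32 km per degree of latitude. This is a simplified stand-in and an assumption of this example: the source does not describe how RuHere builds its buffer, and the helper expand_box and the box limits are hypothetical.

```r
# Hypothetical helper: pad a lon/lat bounding box by a distance in kilometers
expand_box <- function(box, distance_km) {
  km_per_degree <- 111.32  # approximate length of one degree of latitude
  pad_lat <- distance_km / km_per_degree
  # A degree of longitude shrinks with latitude; use the edge farthest from
  # the equator so the longitude padding is not underestimated
  max_abs_lat <- max(abs(box[["ymin"]]), abs(box[["ymax"]]))
  pad_long <- distance_km / (km_per_degree * cos(max_abs_lat * pi / 180))
  c(xmin = box[["xmin"]] - pad_long, xmax = box[["xmax"]] + pad_long,
    ymin = box[["ymin"]] - pad_lat,  ymax = box[["ymax"]] + pad_lat)
}

box <- c(xmin = -74, xmax = -34, ymin = -34, ymax = 6)  # assumed limits
buffered <- expand_box(box, distance_km = 5)

# A point about 3.3 km north of the box edge: outside the box, inside the buffer
pt_lat <- 6.03
outside_original <- pt_lat > box[["ymax"]]
inside_buffered  <- pt_lat <= buffered[["ymax"]]
print(c(outside_original = outside_original, inside_buffered = inside_buffered))
```

With distance = 5, the box grows by roughly 0.045 degrees of latitude on each side, so small positional errors near a border no longer count as mismatches.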

The type of issue (or "correct") is recorded in a new column, country_issues. Records that match their assigned country after any correction are updated accordingly, while remaining mismatches are labeled "incorrect".

This function can be used internally by check_countries() to automatically identify and fix common coordinate errors.

Examples

# Load example data
data("occurrences", package = "RuHere")

# Standardize country names
occ_country <- standardize_countries(occ = occurrences,
                                     return_dictionary = FALSE)

# Check whether records fall within the assigned countries
occ_country_checked <- check_countries(occ = occ_country,
                                       country_column = "country_suggested")

# Fix inverted or swapped coordinates in records flagged as outside their assigned country
occ_country_fixed <- fix_countries(occ = occ_country_checked,
                                   country_column = "country_suggested")
