gatoRs: Geographic and Taxonomic Occurrence R-Based Scrubbing

Natalie N. Patten, Michelle L. Gaynor, Douglas E. Soltis, and Pamela S. Soltis

Overview

gatoRs (Geographic and Taxonomic Occurrence R-Based Scrubbing) provides users with tools for downloading and processing biodiversity data. A full user guide is also available.

Installation

install.packages("devtools")
devtools::install_github("nataliepatten/gatoRs")

Quick Start

Our package aims to streamline the downloading and processing of biodiversity specimen data. Here is a quick example of how to download and clean occurrence records with our package.

Step 1: Download

library(gatoRs)
galaxdf <- gators_download(synonyms.list = c("Galax urceolata", "Galax aphylla"),
                           write.file = FALSE,
                           gbif.match = "fuzzy",
                           idigbio.filter = TRUE)

Step 2: Clean

  • We do not recommend jumping straight to our full clean function. See our extended introduction first!

clean_data <- full_clean(galaxdf,
                         synonyms.list = c("Galax urceolata", "Galax aphylla"), 
                         digits = 3,
                         basis.list = c("Preserved Specimen","Physical specimen"), 
                         accepted.name = "Galax urceolata")
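
To give a rough sense of what cleaning steps of this kind involve, the following base R sketch applies a few of them to a toy data frame. It is not gatoRs code and does not reproduce what full_clean() does internally: the column names, the toy records, and the choice to treat coordinates at exactly (0, 0) as improbable are all assumptions made for illustration.

```r
# Toy occurrence records. Column names and values are illustrative
# assumptions, not the columns returned by gators_download().
occ <- data.frame(
  species   = rep("Galax urceolata", 5),
  latitude  = c(35.5912, 35.5912, NA, 0, 36.1024),
  longitude = c(-82.5515, -82.5515, -81.2000, 0, -81.6746),
  eventDate = c("1998-05-02", "1998-05-02", "2001-06-11",
                "2003-04-20", "2010-05-15"),
  stringsAsFactors = FALSE
)

# 1. Drop records with missing coordinates.
has_coords <- !is.na(occ$latitude) & !is.na(occ$longitude)
occ_clean <- occ[has_coords, ]

# 2. Drop records at exactly (0, 0), assumed here to be a placeholder value.
at_origin <- occ_clean$latitude == 0 & occ_clean$longitude == 0
occ_clean <- occ_clean[!at_origin, ]

# 3. Round coordinates to a fixed number of decimal places.
digits <- 3
occ_clean$latitude  <- round(occ_clean$latitude, digits)
occ_clean$longitude <- round(occ_clean$longitude, digits)

# 4. Keep one record per combination of coordinates and event date.
key_cols <- c("latitude", "longitude", "eventDate")
occ_clean <- occ_clean[!duplicated(occ_clean[, key_cols]), ]

print(occ_clean)
```

Running the steps one at a time and checking the number of rows after each makes it easier to see which filter removed which records, which is harder to do when everything happens inside a single wrapper call.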

Install

install.packages('gatoRs')

Monthly Downloads

664

Version

1.0.0

License

GPL-3

Maintainer

Natalie N. Patten

Last Published

July 5th, 2023

Functions in gatoRs (1.0.0)

get_gbif
Used in gators_download() - Download data from the Global Biodiversity Information Facility

matchColClasses
matchColClasses

need_to_georeference
Identify Missing Information - Find records which lack coordinate information

full_clean
Full Cleaning - Wrapper function to speed clean

needed_records
Identify Missing Information - Find records with redacted or missing data

process_flagged
Locality Cleaning - Find possibly problematic occurrence records

get_idigbio
Used in gators_download() - Download data from Integrated Digitized Biocollections

one_point_per_pixel
Spatial Correction - One point per pixel

%>%
Pipe operator

thin_points
Spatial Correction - Spatially thin records

gators_download
Download - Download specimen data from both iDigBio and GBIF

remove_skewed
Used in basic_locality_clean() - Removed skewed locality

remove_duplicates
Remove Duplicates - Remove records with identical event dates and coordinates

taxa_clean
Taxonomic Cleaning - Filter and resolve taxon names

suppress_output
Suppress print statements and messages

fixAfterPeriod
Fix taxonomic capitalization of a species name when there are periods involved.

filter_fix_names
Used in gators_download() - Filter iDigBio results by scientific name

basic_locality_clean
Locality Cleaning - Remove missing and improbable coordinates

fix_columns
Used in gators_download() - Fill out taxonomic name columns

data
Downloaded data from gators_download() for Galax urceolata with default settings and 'limit' set to 5: data <- gators_download(synonyms.list = c("Galax urceolata", "Galax aphylla"), limit = 5)

correct_class
gatoRs Download - Correct classes of data frame columns

fix_names
Used in gators_download() - Fix taxonomic name capitalization

data_chomp
Subset Data - Get species, longitude, and latitude columns

citation_bellow
Cite Data - Get GBIF citations

basis_clean
Basis Cleaning - Removes records with certain record basis