auk_split: Split an eBird data file by species

Description

Given an eBird Basic Dataset (EBD) and a list of species, split the file into multiple text files, one for each species. This function is typically used after auk_filter() has been applied if the resulting file is too large to be read in all at once.

Usage

auk_split(file, species, prefix = "", ext = "txt", sep = "\t",
  overwrite = FALSE)

Arguments

file

character; input file.

species

character; species to filter and split by, provided as scientific or English common names, or a mixture of both. These names must match the official eBird Taxonomy (ebird_taxonomy).

prefix

character; a file and directory prefix. For example, if splitting by species "A" and "B" and prefix = "data/ebd_", the resulting files will be "data/ebd_A.txt" and "data/ebd_B.txt".
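As a rough sketch of the naming pattern described above, the output paths can be pictured as the prefix, the species name, and the extension pasted together. This is an illustration only, not the package's implementation: the helper name sketch_output_names() is hypothetical, and auk_split() itself may normalize species names (for example, how spaces are handled) before building filenames.

```r
# Hypothetical helper: reproduces only the documented naming pattern,
# prefix + species + "." + ext. Not part of the auk package.
sketch_output_names <- function(species, prefix = "", ext = "txt") {
  paste0(prefix, species, ".", ext)
}

out <- sketch_output_names(c("A", "B"), prefix = "data/ebd_")
print(out)
```

Because the prefix is pasted on directly, it should include any trailing directory separator or filename stem, as in "data/ebd_".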

ext

character; file extension, typically "txt".

sep

character; the input field separator. eBird files are tab-separated by default. Must be a single character; space-delimited input is not allowed because spaces appear in many of the fields.

overwrite

logical; overwrite output files if they already exist.

Value

A vector of output filenames, one for each species.

Examples


# NOT RUN {
species <- c("Gray Jay", "Cyanocitta stelleri")
# get the path to the example data included in the package
# in practice, provide path to a filtered ebd file
# e.g. f <- "data/ebd_filtered.txt"
f <- system.file("extdata/ebd-sample.txt", package = "auk")
# output to a temporary directory for example
# in practice, provide the path to the output location
# e.g. prefix <- "output/ebd_"
prefix <- file.path(tempdir(), "ebd_")
species_files <- auk_split(f, species = species, prefix = prefix)
# }
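
Once the file has been split, each per-species file can be read separately, which is the point of splitting a file that is too large to load at once. The sketch below uses base R on two toy tab-separated files with made-up columns (they are stand-ins, not real EBD fields) so that it runs without the package; in practice the vector returned by auk_split() would take the place of toy_files, and the package's read_ebd() would likely be the more appropriate reader.

```r
# Toy stand-ins for two split files; columns are invented for illustration
toy_files <- file.path(tempdir(), c("ebd_A.txt", "ebd_B.txt"))
writeLines(c("name\tcount", "A\t2", "A\t5"), toy_files[1])
writeLines(c("name\tcount", "B\t1"), toy_files[2])

# Read one tab-separated file; quoting is disabled so stray quote
# characters in free-text fields do not break parsing
read_one <- function(path) {
  read.delim(path, sep = "\t", quote = "", stringsAsFactors = FALSE)
}

# One data frame per file, named by filename
split_data <- lapply(toy_files, read_one)
names(split_data) <- basename(toy_files)

# Number of rows read from each file
rows_per_file <- vapply(split_data, nrow, integer(1))
```

Keeping the results in a named list makes it easy to process one species at a time rather than holding everything in a single data frame.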
