neon_read: Read in NEON tabular data

Description

Read in NEON tabular data.

Usage

neon_read(
  table = NA,
  product = NA,
  site = NA,
  start_date = NA,
  end_date = NA,
  ext = NA,
  dir = neon_dir(),
  files = NULL,
  .id = NA,
  ...
)

Arguments

table

the name of a downloaded NEON table in the store, see neon_store

product

Include only files matching the given NEON productCode(s)

site

4-letter site code(s) to filter on. Leave as NA to search all.

start_date

Include only files as recent as start_date (YYYY-MM-DD). Leave as NA to include files from the earliest available date.

end_date

Include only files up to end_date (YYYY-MM-DD). Leave as NA to include files up to the most recent available data.

ext

only match files with the given file extension(s)

dir

Location where files should be downloaded. By default will use the appropriate applications directory for your system (see rappdirs::user_data_dir). This default can also be configured by setting the environment variable NEONSTORE_HOME; see Sys.setenv or Renviron.

files

optionally, specify a vector of file paths directly (e.g. as provided from neon_index) and specify table argument as NULL.

.id

add an additional id column with metadata from filename.

...

additional arguments to vroom::vroom, can usually be omitted.
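
As a small illustration of the dir argument's default, the sketch below sets NEONSTORE_HOME for the current R session using base R only. The store_path location is a hypothetical placeholder, not a path neonstore requires.

```r
# Hypothetical store location; substitute any directory you prefer.
store_path <- file.path(tempdir(), "neonstore")
dir.create(store_path, showWarnings = FALSE)

# Set the variable for this session only. To make it persistent, a line
# such as NEONSTORE_HOME=/path/to/store can be placed in ~/.Renviron.
Sys.setenv(NEONSTORE_HOME = store_path)

# Confirm the value that the session now sees.
Sys.getenv("NEONSTORE_HOME")
```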

Details

NEON's tabular data files are split into separate .csv files for each site and each month of sampling. In principle, each file has identical columns. vroom::vroom can read a data table that has been sharded into many files like this much faster than other parsers can read each file iteratively, and can thus greatly outperform the "stacking" methods in neonUtilities.
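
The multi-file read described above can be sketched with vroom directly. This is a toy illustration on synthetic shards written to a temporary directory, not neon_read's own code; the file names and the siteID, month and count columns are made up for the example.

```r
library(vroom)

# Write three small shards with identical columns (synthetic data).
shard_dir <- tempfile("shards")
dir.create(shard_dir)
for (month in c("2019-05", "2019-06", "2019-07")) {
  shard <- data.frame(siteID = "HARV", month = month, count = 1:2)
  write.csv(shard, file.path(shard_dir, paste0("HARV.", month, ".csv")),
            row.names = FALSE)
}
shard_files <- list.files(shard_dir, full.names = TRUE)

# vroom accepts a vector of paths and returns one combined table;
# `id` adds a column recording which file each row came from.
combined <- vroom::vroom(shard_files, delim = ",", id = "path",
                         show_col_types = FALSE)
```

Passing all paths in one call lets the parser handle the shards as a single table, rather than reading each file and binding the results in a loop.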

Unfortunately, not all datasets are entirely consistent in their use of columns. neon_read works around this by parsing such tables in groups of matching schema, which is still reasonably fast.
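
The idea of parsing in groups of matching schema can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not neonstore's implementation: it treats each file's header line as its schema, uses read.csv in place of vroom, and runs on synthetic files with made-up columns.

```r
# Create shards where one file carries an extra column (synthetic data).
shard_dir <- tempfile("mixed_shards")
dir.create(shard_dir)
write.csv(data.frame(siteID = "HARV", count = 1:2),
          file.path(shard_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(siteID = "BART", count = 3:4),
          file.path(shard_dir, "b.csv"), row.names = FALSE)
write.csv(data.frame(siteID = "ORNL", count = 5:6, remarks = "none"),
          file.path(shard_dir, "c.csv"), row.names = FALSE)
shard_files <- list.files(shard_dir, full.names = TRUE)

# Use each file's header line as a stand-in for its schema.
headers <- vapply(shard_files, readLines, character(1), n = 1)

# Group files with identical headers, then read each group as one batch.
groups <- split(shard_files, headers)
read_group <- function(paths) do.call(rbind, lapply(paths, read.csv))
batches <- lapply(groups, read_group)

# Pad columns missing from a batch with NA so the batches can be stacked.
all_cols <- unique(unlist(lapply(batches, names)))
pad <- function(df) {
  for (col in setdiff(all_cols, names(df))) df[[col]] <- NA
  df[all_cols]
}
result <- do.call(rbind, unname(lapply(batches, pad)))
```

Here the two files sharing a header are read together, the file with the extra remarks column forms its own group, and the final table has NA in remarks for rows from the first group.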

For convenience, neon_read takes the name of a table in the local store.

Examples

# NOT RUN {
neon_read("brd_countdata-expanded")

## Read in specific files from the neon_index():
files <- neon_index(table = "brd_countdata-expanded")$path
neon_read(files = files)

# }
