use_schema: Create a `schema` for a Darwin Core Archive

Description

A schema is an XML document that maps the files and field names in a Darwin Core Archive (DwCA). This map makes it easier to reconstruct one or more related datasets so that information is matched correctly. The function works by detecting column names in the CSV files in the publishing directory, which it assumes is named "data-publish"; all column names should be Darwin Core terms for the function to produce reliable results. It is primarily internal and is called by build_archive(), but is exported for clarity and debugging purposes.
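
As a rough illustration of what detecting column names in CSV files involves, the sketch below lists the header fields of every CSV file in a directory using base R only. It is not the package's implementation: `list_csv_fields()` and the demo directory are hypothetical names introduced here for illustration.

```r
# Hypothetical helper (not part of the package): return the column names
# of each CSV file in `dir`, keyed by file name.
list_csv_fields <- function(dir = "data-publish") {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  fields <- lapply(files, function(f) {
    # Read the header plus at most one row; keep column names unmodified
    names(utils::read.csv(f, nrows = 1, check.names = FALSE))
  })
  names(fields) <- basename(files)
  fields
}

# Build a small demo directory so the sketch runs on its own
demo_dir <- file.path(tempdir(), "schema-demo")
dir.create(demo_dir, showWarnings = FALSE)
utils::write.csv(
  data.frame(occurrenceID = c("a1", "a2"),
             scientificName = c("Eolophus roseicapilla", "Galaxias truttaceus")),
  file.path(demo_dir, "occurrences.csv"),
  row.names = FALSE)

csv_fields <- list_csv_fields(demo_dir)
csv_fields
```

Each element of the returned list corresponds to one file, which is the kind of file-to-field mapping a schema records.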

Usage

use_schema(overwrite = FALSE, quiet = FALSE)

Value

Does not return an object to the workspace; called for the side effect of building a schema file in the publication directory.

Arguments

overwrite: (logical) Should existing files be overwritten? Default is set to FALSE; i.e. use_schema() will not overwrite an existing schema file unless this is set to TRUE.
quiet: (logical) Should progress messages be suppressed? Default is set to FALSE; i.e. messages are shown.

Details

To be compliant with the Darwin Core Standard, the schema file must be called meta.xml, and this function enforces that.
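
When debugging, it can help to confirm that the schema file exists under that name and to look at its top-level structure. The following is a minimal sketch, assuming use_schema() has already been run in the current project and that the xml2 package is installed; `schema_path` and `schema_elements` are illustrative names, and the code makes no assumptions about what the file contains.

```r
# Path where the schema is expected, per the "data-publish" convention
schema_path <- file.path("data-publish", "meta.xml")

if (file.exists(schema_path)) {
  # Parse the schema and list the names of its top-level child elements
  schema <- xml2::read_xml(schema_path)
  schema_elements <- xml2::xml_name(xml2::xml_children(schema))
  print(schema_elements)
} else {
  message("No schema found at ", schema_path)
}
```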

Examples

# \dontshow{
# Note we use `setwd()` and `proj_set()` in place of 
# `usethis::local_project()` because, unlike 
# in /tests, these sections are wrapped in `dontshow` which exits the 
# temporary directory *before* any actual code is run.
.old_wd <- getwd()
temp_dir <- tempdir()
usethis::proj_set(path = temp_dir, force = TRUE)
setwd(temp_dir)
# }

# First build some data to add to our archive
df <- tibble::tibble(
  occurrenceID = c("a1", "a2"),
  species = c("Eolophus roseicapilla", "Galaxias truttaceus"))
  
use_data_occurrences(df, quiet = TRUE)

# Now we can build a schema document to describe that dataset
use_schema(quiet = TRUE)

# Check that specified files have been created
list.files("data-publish") 

# The publish directory now contains:
#  - "occurrences.csv" which contains data
#  - "meta.xml" which is the schema document

# \dontshow{
unlink("data-publish", recursive = TRUE)
usethis::proj_set(path = .old_wd, force = TRUE)
setwd(.old_wd)
# }