OAIHarvester (version 0.3-1)

harvest: OAI-PMH Harvester

Description

Harvest a repository using Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) requests.

Usage

oaih_harvest(baseurl, prefix = "oai_dc",
             from = NULL, until = NULL, set = NULL,
             transform = TRUE)

Arguments

baseurl

a character string giving the base URL of the repository.

prefix

a character vector with the formats in which metadata should be obtained, or NULL, indicating all available formats. The default ("oai_dc") corresponds to the mandatory OAI unqualified Dublin Core metadata schema.

from, until

character strings or Date or POSIXt date/time objects giving datestamps to be used as lower or upper bounds, respectively, for datestamp-based selective harvesting (i.e., only harvest records with datestamps in the given range). If character, dates and times must be encoded using ISO 8601 in either %F or %FT%TZ format (see strptime). The trailing Z must be used when including time. OAI-PMH implies UTC for date/time specifications.
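As an illustration of the two character encodings described above, the following base R sketch formats a date and a date/time as strings suitable for from or until. The variable names are illustrative only, and the example values are arbitrary.

```r
# Day granularity (%F): a Date formats as YYYY-MM-DD.
day <- as.Date("2024-03-01")
from_day <- format(day, "%F")

# Seconds granularity (%FT%TZ): format in UTC; the final "Z" in the format
# string is a literal character marking UTC, not a conversion specification.
stamp <- as.POSIXct("2024-03-01 13:30:00", tz = "UTC")
from_time <- format(stamp, "%FT%TZ", tz = "UTC")

print(from_day)
print(from_time)
```

Passing tz = "UTC" to format() also converts date/times held in another time zone to UTC before printing, which is what the trailing Z asserts.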

set

a character vector giving the sets to be used for selective harvesting (i.e., only harvest records in the given sets), or NULL.

transform

a logical indicating whether the OAI-PMH XML results should be transformed to “useful” R data structures via oaih_transform. Default: TRUE.

Value

If the OAI-PMH request was successful, the result of the request as XML or (default) transformed to “useful” R data structures.

Details

This is a high-level function for conveniently harvesting metadata from a repository. It allows several metadata formats or sets to be specified in a single call, and it maps datestamps given as R date or date/time objects to valid OAI-PMH datestamps according to the granularity of the repository.
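A minimal sketch of a typical call, using only the arguments listed under Usage. The wrapper name harvest_window and its harvester parameter are hypothetical and not part of the package; the parameter exists only so the call can be tried with a stand-in function and without network access. The URL in the comment is a placeholder, not a real repository.

```r
# Hypothetical helper: harvest unqualified Dublin Core records whose
# datestamps fall in a given window. `harvester` defaults to
# OAIHarvester::oaih_harvest (looked up only when actually called).
harvest_window <- function(baseurl, from, until, set = NULL,
                           harvester = OAIHarvester::oaih_harvest) {
  harvester(baseurl, prefix = "oai_dc",
            from = from, until = until, set = set,
            transform = TRUE)
}

# Example call (placeholder URL; requires the OAIHarvester package and
# a reachable repository):
# x <- harvest_window("https://example.org/oai",
#                     from = as.Date("2024-01-01"),
#                     until = as.Date("2024-01-31"))
```

Date objects are passed through unchanged here, since the function is documented to map them to datestamps matching the repository's granularity.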