dpla_bulk: Get bulk DPLA data

Description

Download bulk DPLA data dumps as compressed JSON files, or list the dumps that are available.

Usage

dpla_bulk(year, month, key, ...)
dpla_bulk_list(...)

Arguments

year

(character) a year between 2015 and the current year

month

(character) a month between 1 (January) and 12 (December)

key

(character) a dataset name key, see Details.

...

Curl options passed on to crul::HttpClient()

Value

dpla_bulk_list returns the partial paths of the JSON dataset dumps; prepend the base URL https://dpla-provider-export.s3.amazonaws.com to a partial path to get the full URL. dpla_bulk returns the path to the downloaded compressed JSON file.
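As a minimal sketch of the URL construction described above: the helper name dpla_full_url and the example partial path are hypothetical, and the exact form of the paths returned by dpla_bulk_list (for instance, whether they start with a slash) is an assumption, so the helper tolerates both.

```r
# Base URL taken from the Value section above.
dpla_base_url <- "https://dpla-provider-export.s3.amazonaws.com"

# Hypothetical helper: join the base URL and a partial path with exactly
# one slash, whether or not the partial path starts with one.
dpla_full_url <- function(partial_path, base = dpla_base_url) {
  paste0(sub("/+$", "", base), "/", sub("^/+", "", partial_path))
}

# Illustrative path only; real values come from dpla_bulk_list().
example_url <- dpla_full_url("2017/01/artstor.json.gz")
```

Because sub() and paste0() are vectorised, the same helper can be applied to a whole character vector of partial paths at once.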

Allowed Keys

all
artstor
bhl
cdl
david_rumsey
digital_commonwealth
digitalnc
esdn
georgia
getty
gpo
harvard
hathitrust
il
indiana
internet_archive
kdl
lc
maine
maryland
mdl
michigan
missouri_hub
mwdl
nara
nypl
pa
pennsylvania
scdl
smithsonian
tennessee
the_portal_to_texas_history
tn
uiuc
usc
virginia
washington
wisconsin

Details

All data in the DPLA repository are available for download as gzipped JSON files. These include the standard DPLA fields, as well as the complete record received from the partner.

See https://digitalpubliclibraryofamerica.atlassian.net/wiki/spaces/TECH/pages/5931056/Database+export+files for a description of the structure of the files.

dpla_bulk does not read the bulk JSON files into R because they can be quite large; reading them is left to the user.
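One possible way to inspect a large gzipped file without loading all of it is to read only its first few lines through a gzfile() connection. The sketch below uses base R only; the helper name peek_gz is hypothetical, and the demonstration runs on a small temporary file rather than on real DPLA data.

```r
# Hypothetical helper: return the first n lines of a gzipped text file.
peek_gz <- function(path, n = 5L) {
  con <- gzfile(path, open = "rt")
  on.exit(close(con))
  readLines(con, n = n, warn = FALSE)
}

# Self-contained demonstration on a small temporary file (not DPLA data).
tmp <- tempfile(fileext = ".json.gz")
out <- gzfile(tmp, open = "wt")
writeLines(c('{"id": "a"}', '{"id": "b"}', '{"id": "c"}'), out)
close(out)

first_two <- peek_gz(tmp, n = 2L)
```

Assuming dpla_bulk returns a character path as described under Value, the same helper could be called on that result, for example peek_gz(res). If a dump stores its records on one very long line, even a single line may be large, so check the file layout described at the link above before choosing a parsing strategy.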

Examples

# NOT RUN {
dpla_bulk(year = 2016, month = 4, key = "nypl")
res <- dpla_bulk(year = 2017, month = 1, key = "artstor")
# }