ckanr v0.3.0


Client for the Comprehensive Knowledge Archive Network ('CKAN') API

Client for the 'CKAN' API (<https://ckan.org/>). Includes interfaces to the 'CKAN' APIs to search, list, and show packages, organizations, and resources. In addition, provides an interface to the 'datastore' API.

Readme

ckanr

ckanr is an R client for the CKAN API.

Description

CKAN is an open source set of tools for hosting and providing data on the web. CKAN users include non-profits, museums, and local city or county governments, among others.

ckanr allows users to interact with those CKAN websites to create, modify, and manage datasets, as well as to search and download existing data and then use it in R for data analysis (statistics, plotting, etc.). It is meant to be as general as possible, allowing you to work with any CKAN instance.

Installation

Stable CRAN version

install.packages("ckanr")

Development version

install.packages("devtools")
devtools::install_github("ropensci/ckanr")
library('ckanr')

Note: the default base CKAN URL is http://data.techno-science.ca/. Functions that require write permissions in CKAN also need a privileged CKAN API key. You can set both defaults with ckanr_setup(), or override the URL for a single call with the url parameter that each function accepts. To set one or both, run:

ckanr_setup() # restores default CKAN url to http://data.techno-science.ca/
ckanr_setup(url = "http://data.techno-science.ca/")
ckanr_setup(url = "http://data.techno-science.ca/", key = "my-ckan-api-key")
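
To keep an API key out of scripts, one option is to read it from an environment variable. The following is a minimal sketch, not the package's prescribed workflow; the variable name CKAN_API_KEY is an assumption made for illustration.

```r
library(ckanr)

# CKAN_API_KEY is a hypothetical environment variable name; an unset
# variable yields an empty string.
ckanr_setup(
  url = "http://data.techno-science.ca/",
  key = Sys.getenv("CKAN_API_KEY")
)

# Confirm which base URL subsequent calls will use by default.
get_default_url()
```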

ckanr package API

There is a suite of CKAN objects (package, resource, etc.), each with its own set of functions in this package. Most functions for a given object return an S3 class, which can be passed to most other functions (this also facilitates piping). The list below gives each function group, the prefix of its functions, and the name of its S3 class:

  • Packages (aka datasets) - package_*() - ckan_package
  • Resources - resource_*() - ckan_resource
  • Related items - related_*() - ckan_related
  • Users - user_*() - ckan_user
  • Groups - group_*() - ckan_group
  • Tags - tag_*() - ckan_tag
  • Organizations - organization_*() - ckan_organization

The S3 class objects all look very similar; for example:

<CKAN Resource> 8abc92ad-7379-4fb8-bba0-549f38a26ddb
  Name: Data From Digital Portal
  Description:
  Creator/Modified: 2015-08-18T19:20:59.732601 / 2015-08-18T19:20:59.657943
  Size:
  Format: CSV

All classes state the type of object, have the ID to the right of the type, then have a varying set of key-value fields deemed important. This printed object is just a summary of an R list, so you can index to specific values (e.g., result$description). If you feel there are important fields left out of these printed summaries, let us know.
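
Because the printed summary sits on top of an ordinary R list, base R tools work on it directly. The sketch below uses a hand-built stand-in rather than a live API call, so it runs offline; the field names (id, name, description, format) mirror the summary above, and a real object returned by the package carries many more fields.

```r
# Hand-built stand-in for a ckan_resource object; real objects returned by
# the package contain many more fields than the four shown here.
res <- structure(
  list(
    id = "8abc92ad-7379-4fb8-bba0-549f38a26ddb",
    name = "Data From Digital Portal",
    description = "",
    format = "CSV"
  ),
  class = "ckan_resource"
)

names(res)         # fields available on the object
res$format         # index a single field by name
str(unclass(res))  # drop the class to inspect the underlying plain list
```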

Note: many of the examples below are abbreviated to keep this README short.

Packages

List packages

package_list(as = "table")
#>  [1] "artifact-data-agriculture"                                  
#>  [2] "artifact-data-aviation"                                     
#>  [3] "artifact-data-bookbinding"                                  
#>  [4] "artifact-data-chemistry"                                    
#>  [5] "artifact-data-communications"                               
#>  [6] "artifact-data-computing-technology"                         
#>  [7] "artifact-data-domestic-technology"                          
#>  [8] "artifact-data-energy-electric"                              
#>  [9] "artifact-data-exploration-and-survey"                       
#> [10] "artifact-data-fisheries"                                    
...

Show a package

package_show('34d60b13-1fd5-430e-b0ec-c8bc7f4841cf')
#> <CKAN Package> 34d60b13-1fd5-430e-b0ec-c8bc7f4841cf 
#>   Title: Artifact Data - Vacuum Tubes
#>   Creator/Modified: 2014-10-28T18:12:11.453636 / 2016-06-13T20:06:50.014352
#>   Resources (up to 5): Artifact Data - Vacuum Tubes (XML), Data Dictionary, Tips (English), Tips (French), Données d'artefact - Tubes Electronique (XML)
#>   Tags (up to 5): Vacuum Tubes
#>   Groups (up to 5): communications

Search for packages

x <- package_search(q = '*:*', rows = 2)
x$results
#> [[1]]
#> <CKAN Package> 99f457c9-ea24-48a1-87be-b52385825b6a 
#>   Title: Artifact Data - All Artifacts
#>   Creator/Modified: 2014-10-24T17:39:06.411039 / 2016-06-14T21:31:27.983485
#>   Resources (up to 5): Artifact Data - All Artifacts (XML), Data Dictonary, Tips (English), Tips (French), Données d'artefact - Tout les artefacts (XML)
#>   Tags (up to 5): Agriculture, Alimentation, Aviation, Espace, Food
#>   Groups (up to 5): everything
#> 
#> [[2]]
#> <CKAN Package> 443cb020-f2ae-48b1-be67-90df1abd298e 
#>   Title: Artifact Data - Location - Canada Aviation and Space Museum
#>   Creator/Modified: 2014-10-28T20:39:23.561940 / 2016-06-14T18:59:17.786219
#>   Resources (up to 5): Artifact Data - Location - Canada Aviation and Space Museum (XML), Data Dictionary, Tips (English), Tips (French), Jeux de données XML - Emplacements - Musée de l'aviation et de l'espace du Canada
#>   Tags (up to 5): Canada Aviation and Space Museum, Location
#>   Groups (up to 5): location
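
Search results are a list of package records, so individual fields can be collected into vectors for further analysis. This is a minimal sketch that uses a small stand-in list shaped like x$results (ids and titles copied from the output above) so that it runs without a server.

```r
# Stand-in for x$results: a list of package records, each itself a list.
# Only the id and title fields are filled in here.
results <- list(
  list(id = "99f457c9-ea24-48a1-87be-b52385825b6a",
       title = "Artifact Data - All Artifacts"),
  list(id = "443cb020-f2ae-48b1-be67-90df1abd298e",
       title = "Artifact Data - Location - Canada Aviation and Space Museum")
)

# Pull one field out of every record; vapply() guarantees a character vector.
ids <- vapply(results, function(pkg) pkg$id, character(1))
titles <- vapply(results, function(pkg) pkg$title, character(1))

# Combine the fields into a data frame.
pkg_df <- data.frame(id = ids, title = titles, stringsAsFactors = FALSE)
pkg_df
```

With a live server, passing x$results in place of the stand-in list should work the same way, provided each record has the fields being extracted.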

Resources

Search for resources

x <- resource_search(q = 'name:data', limit = 2)
x$results
#> [[1]]
#> <CKAN Resource> e179e910-27fb-44f4-a627-99822af49ffa 
#>   Name: Artifact Data - Exploration and Survey (XML)
#>   Description: XML Dataset
#>   Creator/Modified: 2014-10-28T15:50:35.374303 / 
#>   Size: 
#>   Format: XML
#> 
#> [[2]]
#> <CKAN Resource> ba84e8b7-b388-4d2a-873a-7b107eb7f135 
#>   Name: Data Dictionary
#>   Description: Data dictionary for CSTMC artifact datasets.
#>   Creator/Modified: 2014-11-03T18:01:02.094210 / 
#>   Size: 
#>   Format: XLS

Users

List users

user_list()[1:2]
#> [[1]]
#> <CKAN User> ee100ca6-2363-4db8-b24b-066e865c33ec 
#>   Name: CSTMC
#>   Display Name: CSTMC
#>   Full Name: 
#>   No. Packages: 
#>   No. Edits: 0
#>   Created: 2014-10-16T18:15:03.685929
#> 
#> [[2]]
#> <CKAN User> de64d5d4-86ab-4510-820b-f0bd86ea7a79 
#>   Name: default
#>   Display Name: default
#>   Full Name: 
#>   No. Packages: 
#>   No. Edits: 0
#>   Created: 2014-03-20T02:55:40.628968

Groups

List groups

group_list(as = 'table')[, 1:3]
#>                         display_name description
#> 1                     Communications            
#> 2 Domestic and Industrial Technology            
#> 3                         Everything            
#> 4                           Location            
#> 5                          Resources            
#> 6         Scientific Instrumentation            
#> 7                     Transportation            
#>                                title
#> 1                     Communications
#> 2 Domestic and Industrial Technology
#> 3                         Everything
#> 4                           Location
#> 5                          Resources
#> 6         Scientific Instrumentation
#> 7                     Transportation

Show a group

group_show('communications', as = 'table')$users
#>   openid about capacity     name                    created
#> 1     NA  <NA>    admin     marc 2014-10-24T14:44:29.885262
#> 2     NA          admin sepandar 2014-10-23T19:40:42.056418
#>                         email_hash sysadmin
#> 1 a32002c960476614370a16e9fb81f436    FALSE
#> 2 10b930a228afd1da2647d62e70b71bf8     TRUE
#>   activity_streams_email_notifications  state number_of_edits
#> 1                                FALSE active             516
#> 2                                FALSE active              44
#>   number_administered_packages display_name fullname
#> 1                           40         marc     <NA>
#> 2                            1     sepandar         
#>                                     id
#> 1 27778230-2e90-4818-9f00-bbf778c8fa09
#> 2 b50449ea-1dcc-4d52-b620-fc95bf56034b

Tags

List tags

tag_list('aviation', as = 'table')
#>   vocabulary_id                     display_name
#> 1            NA                         Aviation
#> 2            NA Canada Aviation and Space Museum
#>                                     id                             name
#> 1 cc1db2db-b08b-4888-897f-a17eade2461b                         Aviation
#> 2 8d05a650-bc7b-4b89-bcc8-c10177e60119 Canada Aviation and Space Museum

Show tags

tag_show('Aviation')$packages[[1]][1:3]
#> $owner_org
#> [1] "fafa260d-e2bf-46cd-9c35-34c1dfa46c57"
#> 
#> $maintainer
#> [1] ""
#> 
#> $relationships_as_object
#> list()

Organizations

List organizations

organization_list()
#> [[1]]
#> <CKAN Organization> fafa260d-e2bf-46cd-9c35-34c1dfa46c57 
#>   Name: cstmc
#>   Display name: CSTMC
#>   No. Packages: 
#>   No. Users: 0

ckanr's dplyr interface

ckanr implements a dplyr SQL interface to CKAN's datastore. You can access any resource in the datastore directly using only the CKAN resource ID.

Note: this will only work for resources which were uploaded successfully to the datastore - they will show the green "Data API" button in CKAN.

ckan <- ckanr::src_ckan("https://my.ckan.org/")
res_id <- "my-ckan-resource-id"
dplyr::tbl(src = ckan$con, from = res_id) %>% dplyr::as_tibble()

Examples of different CKAN APIs

See ckanr::servers() for a list of CKAN servers. There are 125 as of 2019-07-11.
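
The server list can be explored before choosing an instance to work with. The sketch below assumes servers() returns a character vector of URLs and that ping() accepts a url argument; the tryCatch() wrapper is an addition here so that a network failure is reported as FALSE instead of stopping the script.

```r
library(ckanr)

srv <- servers()  # known CKAN server URLs
length(srv)       # how many servers are listed
head(srv, 3)      # first few entries

# Check whether the first listed server responds; a network error becomes FALSE.
is_up <- tryCatch(ping(url = srv[1]), error = function(e) FALSE)
is_up
```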

The Natural History Museum

Website: https://data.nhm.ac.uk/

ckanr_setup(url = "https://data.nhm.ac.uk")
x <- package_search(q = '*:*', rows = 1)
x$results
#> [[1]]
#> <CKAN Package> d68e20f4-a56d-4a8a-a8d7-dc478ba64c76 
#>   Title: Wallace and Banks drawers
#>   Creator/Modified: 2018-08-15T13:31:33.694910 / 2019-07-08T13:13:01.096684
#>   Resources (up to 5): Drawer-level images
#>   Tags (up to 5): 
#>   Groups (up to 5):

The National Geothermal Data System

Website: http://geothermaldata.org/

ckanr_setup("http://search.geothermaldata.org")
x <- package_search(q = '*:*', rows = 1)
x$results
#> [[1]]
#> <CKAN Package> 71ffb979-c3c8-467c-ab67-fe9477f0abda 
#>   Title: Resource Analysis for Deep Direct-Use Feasibility Study in East Texas, Part 2 MEMO SMU DDU GeologicVariability-TravPeak29Jan2019.xlsx
#>   Creator/Modified: 2019-07-02T22:55:08.989201 / 2019-07-02T22:55:09.055454
#>   Resources (up to 5): MEMO SMU DDU GeologicVariability-TravPeak29Jan2019.xlsx
#>   Tags (up to 5): DDU, Deep direct-use, East Texas, Eastman Chemical, Heat flow
#>   Groups (up to 5):

Contributors

  • Scott Chamberlain
  • Imanuel Costigan
  • Sharla Gelfand
  • Florian Mayer
  • Wush Wu

Meta

  • Please report any issues or bugs at https://github.com/ropensci/ckanr/issues.
  • License: MIT
  • Get citation information for ckanr in R by running citation(package = 'ckanr')
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Functions in ckanr

Name Description
ds_create_dataset Datastore - create a new resource on an existing dataset
group_list List groups.
ds_search Datastore - search or get a dataset from CKAN datastore
group_delete Delete a group
ckanr-package R client for the CKAN API
group_update Update a group
ckanr_settings Get or set ckanr CKAN settings
license_list Return the list of licenses available for datasets on the site.
as.ckan_tag ckan_tag class helpers
ds_create Add a new table to a datastore
dashboard_count Number of new activities of an authorized user
group_patch Update a group's metadata
related_show Show a related item
related_list List related items
ckan_info Get information on a CKAN server
ckanr-deprecated Deprecated functions in ckanr
organization_show Show an organization
organization_list List organizations
related_update Update a related item
group_show Show a group
package_create Create a package
package_activity_list Return a list of the package's activity
%>% Pipe operator
ping Ping a CKAN server to test that it's up or down.
resource_update Update a resource's file attachment
revision_list Return a list of the IDs of the site's revisions.
resource_search Search for resources.
package_delete Delete a package
resource_create Create a resource
organization_create Create an organization
package_list_current List current packages with resources.
organization_delete Delete an organization
package_patch Update a package's metadata
package_show Show a package.
user_follower_count Return a user's follower count
user_follower_list Return a list of a user's followers
resource_show Show a resource.
ckanr_setup Configure default CKAN settings
dashboard_activity_list Authorized user's dashboard activity stream
ds_search_sql Datastore - search or get a dataset from CKAN datastore
package_search Search for packages.
related_delete Delete a related item.
resource_delete Delete a resource.
user_activity_list Return a list of a user's activities
package_revision_list Return a dataset (package's) revisions as a list of dictionaries.
related_create Create a related item
group_create Create a group
package_list List datasets.
tag_list List tags.
user_followee_count Return a user's followee count
user_delete Delete a user.
tag_create Create a tag
user_create Create a user.
package_update Update a package
resource_patch Update a resource's metadata
tag_search List tags.
servers CKAN server URLS and other info
tag_show Show a tag.
src_ckan Connect to CKAN with dplyr
user_list Return a list of the site's user accounts.
user_show Show a user.
ckan_fetch Download a file
as.ckan_resource ckan_resource class helpers
as.ckan_related ckan_related class helpers
as.ckan_group ckan_group class helpers
ckan_classes ckanr S3 classes
as.ckan_user ckan_user class helpers
as.ckan_package ckan_package class helpers
as.ckan_organization ckan_organization class helpers
changes Get an activity stream of recently changed datasets on a site.

Vignettes of ckanr

Name
ckanr.Rmd

Details

License MIT + file LICENSE
LazyData true
URL https://github.com/ropensci/ckanr
BugReports https://github.com/ropensci/ckanr/issues
VignetteBuilder knitr
Encoding UTF-8
RoxygenNote 6.1.1
X-schema.org-keywords database, open-data, ckan, api, data, dataset
X-schema.org-applicationCategory Data Access
X-schema.org-isPartOf "https://ropensci.org"
NeedsCompilation no
Packaged 2019-07-23 01:47:39 UTC; sckott
Repository CRAN
Date/Publication 2019-07-23 04:30:02 UTC
