rdatamarket

The rdatamarket package is an R client for the DataMarket.com API. It fetches both the contents and the metadata of datasets on DataMarket.com into R.

To install the package:

> install.packages("rdatamarket")

(If you are on Linux and get error messages involving RCurl, you may need to install a package called libcurl4-openssl-dev or similar, to get RCurl working.)

... and then load the package:

> library(rdatamarket)

Quick start

Just find the data you want on datamarket.com, then copy the URL from your browser (or a short URL to it) into dmlist or dmseries:

> plot(dmseries("http://datamarket.com/data/set/17tm/#ds=17tm!kqc=17.v.i"))
> plot(dmseries("http://data.is/nyFeP9"))
> l <- dmlist("http://data.is/nyFeP9")

If you need to go through an HTTP proxy, set it up this way:

> dmCurlOptions(proxy="http://outproxy.mycompany.com")

Reading metadata

Get a dataset object (find the ID in a datamarket URL, or just paste in the whole URL if you like):

> oil <- dminfo("17tm")
> oil <- dminfo("http://datamarket.com/data/set/17tm/#ds=17tm!kqc=17.v.i")
> print(oil)
Title: "Oil: Production tonnes"
Provider: "BP"
Dimensions:
  "Country" (60 values):
    "Algeria"
    "Angola"
    "Argentina"
    "Australia"
    "Azerbaijan"
    [...]

See all the values of the Country dimension:

> oil$dimensions[[1]]$values
  a  "Algeria"
 17  "Angola"
  d  "Argentina"
  z  "Australia"
 1l  "Azerbaijan"
 1b  "Brazil"
  v  "Brunei"
 1h  "Cameroon"
 13  "Canada"
 1o  "Chad"
[...]
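
The printout above suggests that the values come back as a named collection whose names are the value IDs and whose elements are the titles. Assuming that structure (it is read off the printed output, not checked against the package source), a title can be mapped to its ID with base R. In this sketch `country_values` is a hand-built stand-in for `oil$dimensions[[1]]$values`, and `id_for_title` is a hypothetical helper, not part of rdatamarket, so the code runs without network access.

```r
# Hand-built stand-in for oil$dimensions[[1]]$values.
# Assumption: names are the value IDs, elements are the titles.
country_values <- c(a = "Algeria", "17" = "Angola", d = "Argentina",
                    z = "Australia", "1l" = "Azerbaijan", "1b" = "Brazil")

# Hypothetical helper: return the ID for each requested title
# (NA where a title is not present).
id_for_title <- function(values, titles) {
  names(values)[match(titles, unlist(values))]
}

angola_id <- id_for_title(country_values, "Angola")
print(angola_id)
```

With the real object, the same call would presumably be `id_for_title(oil$dimensions[[1]]$values, "Angola")`.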

Here's a dataset with two dimensions (besides time):

> p <- dminfo("http://datamarket.com/data/set/12r9/male-population-thousands")
> print(p)
Title: "Male population (thousands)"
Provider: "United Nations" (citing "United Nations Population Division")
Dimensions:
  "Country or Area" (229 values):
    "Afghanistan"
    "Africa"
    "Albania"
    "Algeria"
    "Angola"
    [...]
  "Variant" (5 values):
    "Constant-fertility scenario"
    "Estimate variant"
    "High variant"
    "Low variant"
    "Medium variant" 

Reading data

From that last dataset, fetch the UN's male population projections for Somalia and Sweden in the constant-fertility scenario (the values are in thousands, as the “(thousands)” in the dataset title indicates):

> dmseries(p, 'Country or Area'=c("Somalia", "Sweden"),
           Variant="Constant-fertility scenario")
             Somalia   Sweden
2010-07-01  4642.070 4613.551
2015-07-01  5357.233 4725.918
2020-07-01  6211.305 4840.434
2025-07-01  7243.572 4942.865
2030-07-01  8490.929 5021.646
2035-07-01  9990.910 5083.680
2040-07-01 11793.524 5144.685
2045-07-01 13966.319 5211.212
2050-07-01 16597.110 5281.437

> dmlist(p, 'Country or Area'=c("Somalia", "Sweden"),
         Variant="Constant-fertility scenario")
   Country.or.Area                     Variant Year     Value
1          Somalia Constant-fertility scenario 2010  4642.070
2          Somalia Constant-fertility scenario 2015  5357.233
3          Somalia Constant-fertility scenario 2020  6211.305
4          Somalia Constant-fertility scenario 2025  7243.572
5          Somalia Constant-fertility scenario 2030  8490.929
6          Somalia Constant-fertility scenario 2035  9990.910
7          Somalia Constant-fertility scenario 2040 11793.524
8          Somalia Constant-fertility scenario 2045 13966.319
9          Somalia Constant-fertility scenario 2050 16597.110
10          Sweden Constant-fertility scenario 2010  4613.551
11          Sweden Constant-fertility scenario 2015  4725.918
12          Sweden Constant-fertility scenario 2020  4840.434
13          Sweden Constant-fertility scenario 2025  4942.865
14          Sweden Constant-fertility scenario 2030  5021.646
15          Sweden Constant-fertility scenario 2035  5083.680
16          Sweden Constant-fertility scenario 2040  5144.685
17          Sweden Constant-fertility scenario 2045  5211.212
18          Sweden Constant-fertility scenario 2050  5281.437
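
As a rough illustration of working with the long format, the sketch below rebuilds a few rows of the table above by hand, scales the values from thousands to persons, and pivots them into one column per country using only base R. The data frame `long` is a stand-in typed in from the printed output; the column types of a real `dmlist` result (for example whether `Year` is numeric) are an assumption here.

```r
# Stand-in for part of the dmlist() result above (values copied from the output).
long <- data.frame(
  Country.or.Area = rep(c("Somalia", "Sweden"), each = 3),
  Variant = "Constant-fertility scenario",
  Year = rep(c(2010, 2015, 2020), times = 2),
  Value = c(4642.070, 5357.233, 6211.305, 4613.551, 4725.918, 4840.434),
  stringsAsFactors = FALSE
)

# The dataset is expressed in thousands, so scale to individual persons.
long$Persons <- long$Value * 1000

# Pivot to one row per year and one column per country.
# Each (year, country) cell holds exactly one value, so sum() just passes it through.
wide <- tapply(long$Persons, list(long$Year, long$Country.or.Area), sum)
print(wide)
```

The result is a plain matrix with years as row names, which is close in shape to what `dmseries` prints above.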

The above demonstrates dimension filtering: to restrict a fetch to specific values of a dimension, name the dimension and its values by either their $id or their $title. If no filter is given, the whole dataset is fetched (careful: some datasets are enormous, and the DataMarket.com API may truncate extremely large responses).
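
Before fetching without a filter, it can help to estimate how many series a dataset holds. The sketch below multiplies the number of values in each dimension, relying only on the `$dimensions[[i]]$values` structure shown earlier. `p_mock` is a hand-built stand-in for the dataset object `p` with placeholder values, and `series_count` is a hypothetical helper, not an rdatamarket function.

```r
# Hand-built stand-in for the dataset object `p` above. Only the fields used
# here are mocked; the value names are placeholders, not real titles.
p_mock <- list(dimensions = list(
  list(title = "Country or Area", values = paste0("area", 1:229)),
  list(title = "Variant", values = paste0("variant", 1:5))
))

# Hypothetical helper: number of series an unfiltered fetch would cover,
# i.e. the product of the value counts of all dimensions.
series_count <- function(info) {
  prod(vapply(info$dimensions, function(d) length(d$values), integer(1)))
}

n_series <- series_count(p_mock)
print(n_series)  # 229 * 5 = 1145
```

If the count is large, filtering on one or more dimensions as shown above keeps the response small.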


Version: 0.6.5

License: MIT + file LICENSE

Last published: November 24th, 2014
