bigrquery v1.0.0

An Interface to Google's 'BigQuery' 'API'

Easily talk to Google's 'BigQuery' database from R.

Readme

bigrquery

The bigrquery package makes it easy to work with data stored in Google BigQuery by allowing you to query BigQuery tables and retrieve metadata about your projects, datasets, tables, and jobs. The bigrquery package provides three levels of abstraction on top of BigQuery:

  • The low-level API provides thin wrappers over the underlying REST API. All the low-level functions start with bq_, and mostly have the form bq_noun_verb(). This level of abstraction is most appropriate if you’re familiar with the REST API and you want to do something not supported in the higher-level APIs.

  • The DBI interface wraps the low-level API and makes working with BigQuery like working with any other database system. This is the most convenient layer if you want to execute SQL queries in BigQuery or upload smaller amounts (i.e. <100 MB) of data.

  • The dplyr interface lets you treat BigQuery tables as if they are in-memory data frames. This is the most convenient layer if you don’t want to write SQL, but instead want dbplyr to write it for you.

Installation

The current bigrquery release can be installed from CRAN:

install.packages("bigrquery")

The newest development release can be installed from GitHub:

# install.packages('devtools')
devtools::install_github("r-dbi/bigrquery")

Usage

Low-level API

library(bigrquery)
billing <- bq_test_project() # replace this with your project ID 
sql <- "SELECT year, month, day, weight_pounds FROM `publicdata.samples.natality`"

tb <- bq_project_query(billing, sql)
#> Auto-refreshing stale OAuth token.
bq_table_download(tb, max_results = 10)
#> # A tibble: 10 x 4
#>     year month   day weight_pounds
#>    <int> <int> <int>         <dbl>
#>  1  1969     1    20          6.44
#>  2  1969     1     9          6.38
#>  3  1969     1     9          7.19
#>  4  1969     1    11          8.13
#>  5  1969     1     3          7.25
#>  6  1969     1    15          5.06
#>  7  1969     1    25         NA   
#>  8  1969     1     4          7.06
#>  9  1969     1     6          7.19
#> 10  1969     1    26          3.53
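
As a small illustration of the bq_noun_verb() naming, the sketch below builds a reference to the same public table and then asks BigQuery about it. Creating the reference with bq_table() is purely local. The metadata calls contact the API, so they are wrapped in interactive() here on the assumption that you can complete authorization in that session.

```r
library(bigrquery)

# A table reference only records project, dataset and table names;
# no request is sent when it is created.
natality_ref <- bq_table("publicdata", "samples", "natality")

# The calls below contact BigQuery, so they are only run in an
# interactive session where authentication can be completed.
if (interactive()) {
  print(bq_table_exists(natality_ref))  # does the table exist?
  print(bq_table_nrow(natality_ref))    # number of rows in the table
  print(bq_table_fields(natality_ref))  # column names and types
}
```

The same pattern applies to the other nouns: bq_dataset() and bq_job() create references, and functions named bq_dataset_*() and bq_job_*() act on them.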

DBI

library(DBI)

con <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = billing
)
con 
#> <BigQueryConnection>
#>   Dataset: publicdata.samples
#>   Billing: bigrquery-examples

dbListTables(con)
#> [1] "github_nested"   "github_timeline" "gsod"            "natality"       
#> [5] "shakespeare"     "trigrams"        "wikipedia"

dbGetQuery(con, sql, n = 10)
#> # A tibble: 10 x 4
#>     year month   day weight_pounds
#>    <int> <int> <int>         <dbl>
#>  1  1969     1    20          6.44
#>  2  1969     1     9          6.38
#>  3  1969     1     9          7.19
#>  4  1969     1    11          8.13
#>  5  1969     1     3          7.25
#>  6  1969     1    15          5.06
#>  7  1969     1    25         NA   
#>  8  1969     1     4          7.06
#>  9  1969     1     6          7.19
#> 10  1969     1    26          3.53

dplyr

library(dplyr)

natality <- tbl(con, "natality")

natality %>%
  select(year, month, day, weight_pounds) %>% 
  head(10) %>%
  collect()
#> # A tibble: 10 x 4
#>     year month   day weight_pounds
#>    <int> <int> <int>         <dbl>
#>  1  1969    11    29          6.00
#>  2  1969     2     6          8.94
#>  3  1969     5    16          6.88
#>  4  1970     9     4          7.13
#>  5  1970     1    24          7.63
#>  6  1970     6     6          9.00
#>  7  1970    10    30          6.50
#>  8  1971     3    18          5.75
#>  9  1971     8    11          6.19
#> 10  1971     1    23          5.75
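
To see the SQL that dbplyr writes for you, you can build a pipeline and inspect it before anything is downloaded. The following is a minimal sketch, assuming your billing project ID is stored in the BIGQUERY_TEST_PROJECT environment variable; the variable name and the column name mean_weight are choices made for this example.

```r
library(DBI)
library(dplyr)

# Assumption: the billing project ID is available in this environment variable.
billing <- Sys.getenv("BIGQUERY_TEST_PROJECT")

if (nzchar(billing)) {
  con <- dbConnect(
    bigrquery::bigquery(),
    project = "publicdata",
    dataset = "samples",
    billing = billing
  )

  # Build a lazy query: average birth weight per year.
  avg_weight <- tbl(con, "natality") %>%
    group_by(year) %>%
    summarise(mean_weight = mean(weight_pounds, na.rm = TRUE)) %>%
    arrange(year)

  show_query(avg_weight)         # print the generated SQL without running it
  result <- collect(avg_weight)  # run the query and download the result
}
```

Only collect() sends the query for execution, so you can refine the pipeline and check the SQL first.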

Important details

Authentication

When using bigrquery interactively, you’ll be prompted to authorize bigrquery in the browser. Your credentials will be cached across sessions in .httr-oauth. For non-interactive usage, you’ll need to download a service token JSON file and use set_service_token().

Note that bigrquery requests permission to modify your data, but it will never do so unless you explicitly request it (e.g. by calling bq_table_delete() or bq_table_upload()).
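
For a script or scheduled job, the non-interactive setup might look like the sketch below. The environment variable name BIGQUERY_SERVICE_KEY is hypothetical and only used for this example; point it at wherever you saved the service token JSON file.

```r
library(bigrquery)

# Hypothetical environment variable holding the path to the service token file.
key_path <- Sys.getenv("BIGQUERY_SERVICE_KEY", unset = "")

if (nzchar(key_path) && file.exists(key_path)) {
  # Use the service token instead of the browser-based flow.
  set_service_token(key_path)
} else {
  message("No service token found; interactive authorization will be used.")
}
```

Keeping the path in an environment variable rather than in the script avoids committing the location of your credentials to version control.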

Billing project

If you just want to play around with the BigQuery API, it’s easiest to start with Google’s free sample data. You’ll still need to create a project, but if you’re just playing around, it’s unlikely that you’ll go over the free limit (1 TB of queries / 10 GB of storage).

To create a project:

  1. Open https://console.cloud.google.com/ and create a project. Make a note of the “Project ID” in the “Project info” box.

  2. Click on “APIs & Services”, then “Dashboard” in the left menu.

  3. Click on “Enable Apis and Services” at the top of the page, then search for “BigQuery API” and “Cloud storage”.

Use your project ID as the billing project whenever you work with free sample data, and as the project when you work with your own data.
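
The difference between the two roles can be sketched with the DBI interface. The project ID "my-project-id" and the dataset name "my_dataset" are placeholders, and the sketch assumes that creating a connection only records these settings without contacting BigQuery.

```r
library(DBI)

my_project <- "my-project-id"  # placeholder; replace with your own project ID

# Free sample data: the tables live in "publicdata",
# but queries are billed to your project.
con_public <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = my_project
)

# Your own data: your project holds the tables, and billing
# defaults to the same project when it is not given.
con_own <- dbConnect(
  bigrquery::bigquery(),
  project = my_project,
  dataset = "my_dataset"  # placeholder dataset name
)
```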

Functions in bigrquery

Name                Description
table-dep           Table API (deprecated)
list_projects       List projects (deprecated)
insert_upload_job   Create a new upload job (deprecated)
wait_for            Wait for a job to complete (deprecated)
list_tabledata      Retrieve data from a table (deprecated)
src_bigquery        A BigQuery data source for dplyr
list_datasets       List datasets (deprecated)
insert_query_job    Create a new query job (deprecated)
query_exec          Run an asynchronous query and retrieve results (deprecated)
bq_field            BigQuery field (and fields) class
bigrquery-package   bigrquery: An Interface to Google's 'BigQuery' 'API'
bq_projects         List available projects
bigquery            BigQuery DBI driver
api-job             BigQuery job: retrieve metadata
api-dataset         BigQuery datasets
api-table           BigQuery tables
api-project         BigQuery project methods
api-perform         BigQuery jobs: perform a job
DBI                 DBI methods
get_job             Check status of a job (deprecated)
id-dep              Table/dataset objects (deprecated)
insert_extract_job  Create a new extract job (deprecated)
bq_table_download   Download table data
bq_test_project     Project to use for testing bigrquery
bq_query            Submit query to BigQuery
bq_refs             S3 classes that reference remote BigQuery datasets, tables and jobs
dataset-dep         Dataset API (deprecated)
get_access_cred     Get and set access credentials

Details

License GPL-3 | file LICENSE
URL https://github.com/rstats-db/bigrquery
BugReports https://github.com/rstats-db/bigrquery/issues
Encoding UTF-8
LinkingTo progress, rapidjsonr, Rcpp
LazyData true
RoxygenNote 6.0.1
Collate 'RcppExports.R' 'auth.R' 'bigrquery.R' 'bq-dataset.R' 'bq-download.R' 'bq-field.R' 'bq-job.R' 'bq-param.R' 'bq-parse.R' 'bq-perform.R' 'bq-project.R' 'bq-projects.R' 'bq-query.R' 'bq-refs.R' 'bq-request.R' 'bq-table.R' 'bq-test.R' 'camelCase.R' 'dbi-driver.R' 'dbi-connection.R' 'dbi-result.R' 'dplyr.R' 'gs-object.R' 'old-dataset.R' 'old-id.R' 'old-job-extract.R' 'old-job-query.R' 'old-job-upload.R' 'old-job.R' 'old-project.R' 'old-projects.R' 'old-query.R' 'old-table.R' 'old-tabledata.R' 'secret.R' 'utils.R' 'zzz.R'
NeedsCompilation yes
Packaged 2018-04-24 14:44:14 UTC; hadley
Repository CRAN
Date/Publication 2018-04-24 17:41:28 UTC
