bigQueryR (version 0.2.0)

bqr_query_asynch: BigQuery query asynchronously

Description

Use for queries with large results (more than ~10,000 rows) that write to their own destinationTableId.

Usage

bqr_query_asynch(projectId, datasetId, query, destinationTableId, writeDisposition = c("WRITE_EMPTY", "WRITE_TRUNCATE", "WRITE_APPEND"))

Arguments

projectId
projectId to be billed.
datasetId
datasetId of where query will execute.
query
The BigQuery query as a string.
destinationTableId
Id of table the results will be written to.
writeDisposition
Behaviour if destination table exists. See Details.

Value

A Job object that can be polled via bqr_get_job.

Details

For bigger queries, asynchronous queries save the results to another BigQuery table. You can check the progress of the job via bqr_get_job.
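A minimal polling loop can be sketched as follows. This is an illustration only: the project, dataset, table, and query names are placeholders, and it assumes you have already authenticated with bqr_auth().

```r
library(bigQueryR)

## launch the asynchronous query (placeholder names)
job <- bqr_query_asynch("your_project",
                        "your_dataset",
                        "SELECT * FROM your_table",
                        destinationTableId = "resultTable")

## poll until the job reports DONE
repeat {
  job <- bqr_get_job("your_project", job$jobReference$jobId)
  if (job$status$state == "DONE") break
  Sys.sleep(5)
}
```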

You may now want to download this data. For large datasets, this is best done via extracting the BigQuery result to Google Cloud Storage, then downloading the data from there.

You can read how to create a Google Cloud Storage bucket at https://cloud.google.com/storage/docs/cloud-console

writeDisposition - behaviour if destinationTable already exists:

  • WRITE_TRUNCATE: BigQuery overwrites the table data.
  • WRITE_APPEND: BigQuery appends the data to the table.
  • WRITE_EMPTY: If the destination table already contains data, a 'duplicate' error is returned.
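For example, to overwrite any existing data in the destination table, pass WRITE_TRUNCATE explicitly (a sketch with placeholder project, dataset, and table names):

```r
job <- bqr_query_asynch("your_project",
                        "your_dataset",
                        "SELECT * FROM your_table",
                        destinationTableId = "resultTable",
                        writeDisposition = "WRITE_TRUNCATE")
```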

See Also

Other BigQuery asynch query functions: bqr_download_extract, bqr_extract_data, bqr_get_job, bqr_grant_extract_access, bqr_wait_for_job

Examples


## Not run: 
# library(bigQueryR)
# 
# ## Auth with a project that has at least BigQuery and Google Cloud Storage scope
# bqr_auth()
# 
# ## make a big query
# job <- bqr_query_asynch("your_project", 
#                         "your_dataset",
#                         "SELECT * FROM blah LIMIT 9999999", 
#                         destinationTableId = "bigResultTable")
#                         
# ## poll the job to check its status
# ## it's done when job$status$state == "DONE"
# bqr_get_job("your_project", job$jobReference$jobId)
# 
# ## once done, the query results are in "bigResultTable"
# ## extract that table to GoogleCloudStorage:
# # Create a bucket at Google Cloud Storage at 
# # https://console.cloud.google.com/storage/browser
# 
# job_extract <- bqr_extract_data("your_project",
#                                 "your_dataset",
#                                 "bigResultTable",
#                                 "your_cloud_storage_bucket_name")
#                                 
# ## poll the extract job to check its status
# ## it's done when job$status$state == "DONE"
# bqr_get_job("your_project", job_extract$jobReference$jobId)
# 
# ## to download via a URL and not logging in via Google Cloud Storage interface:
# ## Use an email that is Google account enabled
# ## Requires scopes:
# ##  https://www.googleapis.com/auth/devstorage.full_control
# ##  https://www.googleapis.com/auth/cloud-platform
# ## set via options("bigQueryR.scopes") and reauthenticate if needed
# 
# download_url <- bqr_grant_extract_access(job_extract, "your@email.com")
# 
# ## download_url may be multiple if the data is > 1GB
# 
# ## End(Not run)

