bigQueryR (version 0.2.0)

bqr_extract_data: Extract data asynchronously

Description

Use this instead of bqr_query for large datasets. The table is exported to a Google Cloud Storage bucket, so you must first create one at https://console.cloud.google.com/storage/browser

Usage

bqr_extract_data(projectId, datasetId, tableId, cloudStorageBucket,
  filename = paste0("big-query-extract-", gsub(" |:|-", "", Sys.time()), "-*.csv"),
  compression = c("NONE", "GZIP"),
  destinationFormat = c("CSV", "NEWLINE_DELIMITED_JSON", "AVRO"),
  fieldDelimiter = ",", printHeader = TRUE)

Arguments

projectId
The BigQuery project ID.
datasetId
The ID of a dataset within projectId.
tableId
The ID of the table you wish to extract.
cloudStorageBucket
URI of the Google Cloud Storage bucket to extract into.
filename
Name of the extracted file(s). Include a wildcard (*) if the extract is expected to be larger than 1GB, in which case the output is split across multiple files (see the sketch after this argument list).
compression
Compression applied to the exported file: "NONE" (default) or "GZIP".
destinationFormat
Format of the exported file: "CSV" (default), "NEWLINE_DELIMITED_JSON" or "AVRO".
fieldDelimiter
Field delimiter of the exported file (defaults to ",").
printHeader
Whether to include a header row.
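
As an illustration of the filename and compression arguments, a call requesting a gzipped CSV extract might look like the sketch below; the project, dataset, table, bucket and file names are placeholders:

## Not run:
library(bigQueryR)

## request a gzipped CSV extract; the wildcard (*) lets BigQuery split
## the output into multiple files if it exceeds 1GB
job_extract <- bqr_extract_data("your_project",
                                "your_dataset",
                                "your_table",
                                "your_bucket",
                                filename = "my-extract-*.csv.gz",
                                compression = "GZIP",
                                destinationFormat = "CSV")
## End(Not run)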

Value

A Job object that can be polled via bqr_get_job.

See Also

https://cloud.google.com/bigquery/exporting-data-from-bigquery

Other BigQuery asynch query functions: bqr_download_extract, bqr_get_job, bqr_grant_extract_access, bqr_query_asynch, bqr_wait_for_job
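
The related functions listed above can streamline the workflow shown in the Examples below: bqr_wait_for_job blocks until a job finishes instead of polling manually, and bqr_download_extract fetches the finished extract from Cloud Storage. A minimal sketch, assuming bqr_wait_for_job accepts the Job object returned by bqr_extract_data and bqr_download_extract takes the completed extract job (check those functions' own help pages for exact arguments):

## Not run:
library(bigQueryR)

job_extract <- bqr_extract_data("your_project",
                                "your_dataset",
                                "bigResultTable",
                                "your_cloud_storage_bucket_name")

## block until the extract job finishes (assumed to poll at intervals)
job_done <- bqr_wait_for_job(job_extract, wait = 5)

## download the extracted file(s) from Cloud Storage to the working directory
bqr_download_extract(job_done)
## End(Not run)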

Examples


## Not run: 
library(bigQueryR)

## Authenticate with a project that has at least BigQuery and Google Cloud Storage scope
bqr_auth()

## make a big query
job <- bqr_query_asynch("your_project",
                        "your_dataset",
                        "SELECT * FROM blah LIMIT 9999999",
                        destinationTableId = "bigResultTable")

## poll the job to check its status
## it is done when job$status$state == "DONE"
bqr_get_job("your_project", job$jobReference$jobId)

## once done, the query results are in "bigResultTable"
## extract that table to Google Cloud Storage:
## first create a bucket at
## https://console.cloud.google.com/storage/browser

job_extract <- bqr_extract_data("your_project",
                                "your_dataset",
                                "bigResultTable",
                                "your_cloud_storage_bucket_name")

## poll the extract job to check its status
## it is done when job_extract$status$state == "DONE"
bqr_get_job("your_project", job_extract$jobReference$jobId)

## to download via a URL rather than logging in to the Google Cloud Storage interface:
## use an email that is Google-account enabled
## requires the scopes:
##  https://www.googleapis.com/auth/devstorage.full_control
##  https://www.googleapis.com/auth/cloud-platform
## set these via options("bigQueryR.scopes") and re-authenticate if needed

download_url <- bqr_grant_extract_access(job_extract, "your@email.com")

## download_url may contain multiple URLs if the data is > 1GB

## End(Not run)
