bigQueryR (version 0.3.2)

bqr_upload_data: Upload data to BigQuery

Description

Upload data to BigQuery

Usage

bqr_upload_data(projectId = bq_get_global_project(),
  datasetId = bq_get_global_dataset(), tableId, upload_data,
  create = c("CREATE_IF_NEEDED", "CREATE_NEVER"), overwrite = FALSE,
  schema = NULL, sourceFormat = c("CSV", "DATASTORE_BACKUP",
  "NEWLINE_DELIMITED_JSON", "AVRO"), wait = TRUE, autodetect = FALSE)

Arguments

projectId

The BigQuery project ID.

datasetId

A datasetId within projectId.

tableId

The ID of the table the data will be loaded into.

upload_data

The data to upload: either a data.frame or one or more Google Cloud Storage URIs.

create

Whether to create the table if it does not already exist ("CREATE_IF_NEEDED") or to require that it already exists ("CREATE_NEVER").

overwrite

If TRUE, any existing table is deleted before the new data is uploaded.

schema

If upload_data is a Google Cloud Storage URI, supply the data schema. For CSV, a schema can be generated with the helper schema_fields() applied to a sample of the data.

sourceFormat

If upload_data is a Google Cloud Storage URI, supply the data format. The default is CSV.

wait

If uploading a data.frame, whether to wait for the upload to finish before returning.

autodetect

Experimental feature that auto-detects the schema of CSV and JSON files; see the sketch below.
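
As an illustrative sketch of the autodetect route, the call below loads a CSV already sitting in Cloud Storage without supplying a schema; the project, dataset, table and object names are placeholders.

 library(bigQueryR)

 ## let BigQuery infer the schema from the file instead of passing one
 bqr_upload_data(projectId = "your-project",
                 datasetId = "your-dataset",
                 tableId = "autodetected_table",
                 upload_data = "gs://your-project/data.csv",
                 sourceFormat = "CSV",
                 autodetect = TRUE)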

Value

TRUE if successful, FALSE if not.

Details

A temporary CSV file is created when uploading from a local data.frame.
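
As a minimal sketch of this simplest case, a local data.frame can be passed straight to upload_data; the project, dataset and table names are placeholders, and prior authentication with bqr_auth() is assumed.

 library(bigQueryR)

 ## a temporary CSV file is written and loaded behind the scenes
 bqr_upload_data(projectId = "your-project",
                 datasetId = "your-dataset",
                 tableId = "mtcars_local",
                 upload_data = mtcars,
                 create = "CREATE_IF_NEEDED",
                 wait = TRUE)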

For larger files (up to 5TB), first upload to Google Cloud Storage via gcs_upload, then supply the object URI of the form gs://project-name/object-name to the upload_data argument.

In that case you also need to supply a data schema, and the file should not have a header row.

See Also

https://cloud.google.com/bigquery/loading-data-post-request

Examples

# NOT RUN {
 library(googleCloudStorageR)
 library(bigQueryR)
 
 gcs_global_bucket("your-project")
 
 ## custom upload function to ignore quotes and column headers
 f <- function(input, output) {
   write.table(input, sep = ",", col.names = FALSE, row.names = FALSE,
               quote = FALSE, file = output, qmethod = "double")
 }

 ## upload files to Google Cloud Storage
 gcs_upload(mtcars, name = "mtcars_test1.csv", object_function = f)
 gcs_upload(mtcars, name = "mtcars_test2.csv", object_function = f)
 
 ## create the schema of the files you just uploaded
 user_schema <- schema_fields(mtcars)
 
 ## load files from Google Cloud Storage into BigQuery
 bqr_upload_data(projectId = "your-project", 
                datasetId = "test", 
                tableId = "from_gcs_mtcars", 
                upload_data = c("gs://your-project/mtcars_test1.csv", 
                                "gs://your-project/mtcars_test2.csv"),
                schema = user_schema)



# }
