rows-db: Updating database tables

Description

These methods provide a framework for manipulating individual rows in existing tables. All operations expect that both existing and new data are presented in two compatible tbl objects.

If y lives on a different data source than x, it can be copied automatically by setting copy = TRUE, just like for dplyr::left_join().

On mutable backends like databases, these operations manipulate the underlying storage. In contrast to all other operations, these operations may lead to irreversible changes to the underlying database. Therefore, in-place updates must be requested explicitly with in_place = TRUE. By default, an informative message is given. Unlike compute() or copy_to(), no new tables are created.

The sql_rows_*() functions return the SQL used for the corresponding rows_*() function with in_place = FALSE. y needs to be located on the same data source as x.

sql_returning_cols() and sql_output_cols() construct the SQL required to support the returning argument. Two methods are required, because the syntax for SQL Server (and some other databases) is vastly different from Postgres and other more standardized DBs.

Usage

# S3 method for tbl_dbi
rows_insert(
  x,
  y,
  by = NULL,
  ...,
  in_place = NULL,
  copy = FALSE,
  check = NULL,
  returning = NULL
)
# S3 method for tbl_dbi
rows_update(
  x,
  y,
  by = NULL,
  ...,
  in_place = NULL,
  copy = FALSE,
  check = NULL,
  returning = NULL
)
# S3 method for tbl_dbi
rows_patch(
  x,
  y,
  by = NULL,
  ...,
  in_place = NULL,
  copy = FALSE,
  check = NULL,
  returning = NULL
)
# S3 method for tbl_dbi
rows_upsert(
  x,
  y,
  by = NULL,
  ...,
  in_place = NULL,
  copy = FALSE,
  check = NULL,
  returning = NULL
)
# S3 method for tbl_dbi
rows_delete(
  x,
  y,
  by = NULL,
  ...,
  in_place = NULL,
  copy = FALSE,
  check = NULL,
  returning = NULL
)
sql_rows_insert(x, y, ..., returning_cols = NULL)
sql_rows_update(x, y, by, ..., returning_cols = NULL)
sql_rows_patch(x, y, by, ..., returning_cols = NULL)
sql_rows_delete(x, y, by, ..., returning_cols = NULL)
sql_returning_cols(x, returning_cols, ...)
sql_output_cols(x, returning_cols, output_delete = FALSE, ...)

Arguments

A pair of data frames or data frame extensions (e.g. a tibble). y must have the same columns of x or a subset.

An unnamed character vector giving the key columns. The key values must uniquely identify each row (i.e. each combination of key values occurs at most once), and the key columns must exist in both x and y.

By default, we use the first column in y, since the first column is a reasonable place to put an identifier variable.

...

Other parameters passed onto methods.

in_place

Should x be modified in place? This argument is only relevant for mutable backends (e.g. databases, data.tables).

When TRUE, a modified version of x is returned invisibly; when FALSE, a new object representing the resulting changes is returned.

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

check

Set to TRUE to always check keys, or FALSE to never check. The default is to check only if in_place is TRUE or NULL.

Currently these checks are no-ops and need yet to be implemented.

returning

<tidy-select> Columns to return of the inserted data. Note that also columns not in y but automatically created when inserting into x can be returned, for example the id column.

Due to upstream limitations, a warning is given if this argument is passed unquoted. To avoid the warning, quote the argument manually: use e.g. returning = quote(everything()) .

returning_cols

A character vector of unquote column names to return, created from the returning argument. Methods for database that do not support this should raise an error.

output_delete

For sql_output_cols(), construct the SQL for a DELETE operation.

Value

A tbl object of the same structure as x. If in_place = TRUE, the underlying data is updated as a side effect, and x is returned, invisibly. If return columns are specified with returning then the resulting tibble is stored in the attribute returned_rows. This can be accessed with get_returned_rows().

Examples

Run this code

# NOT RUN {
data <- dbplyr::memdb_frame(a = 1:3, b = letters[c(1:2, NA)], c = 0.5 + 0:2)
data

try(rows_insert(data, tibble::tibble(a = 4, b = "z")))
rows_insert(data, tibble::tibble(a = 4, b = "z"), copy = TRUE)
rows_update(data, tibble::tibble(a = 2:3, b = "w"), copy = TRUE, in_place = FALSE)
rows_patch(data, dbplyr::memdb_frame(a = 1:4, c = 0), in_place = FALSE)

rows_insert(data, dbplyr::memdb_frame(a = 4, b = "z"), in_place = TRUE)
data
rows_update(data, dbplyr::memdb_frame(a = 2:3, b = "w"), in_place = TRUE)
data
rows_patch(data, dbplyr::memdb_frame(a = 1:4, c = 0), in_place = TRUE)
data
# }

Run the code above in your browser using DataLab