DBItest (version 1.4)

DBIspec: DBI specification

Description

The DBI package defines the generic DataBase Interface for R. The connection to individual DBMS is made by packages that import DBI (so-called DBI backends). This document formalizes the behavior expected by the functions declared in DBI and implemented by the individal backends.

To ensure maximum portability and exchangeability, and to reduce the effort for implementing a new DBI backend, the DBItest package defines a comprehensive set of test cases that test conformance to the DBI specification. In fact, this document is derived from comments in the test definitions of the DBItest package. This ensures that an extension or update to the tests will be reflected in this document.

Arguments

Getting started

A DBI backend is an R package, which should import the DBI and methods packages. For better or worse, the names of many existing backends start with ‘R’, e.g., RSQLite, RMySQL, RSQLServer; it is up to the package author to adopt this convention or not.

Driver

Each DBI backend implements a driver class, which must be an S4 class and inherit from the DBIDriver class. This section describes the construction of, and the methods defined for, this driver class. Construction The backend must support creation of an instance of this driver class with a constructor function. By default, its name is the package name without the leading ‘R’ (if it exists), e.g., SQLite for the RSQLite package. For the automated tests, the constructor name can be tweaked using the constructor_name tweak. The constructor must be exported, and it must be a function that is callable without arguments. For the automated tests, unless the constructor_relax_args tweak is set to TRUE, an empty argument list is expected. Otherwise, an argument list where all arguments have default values is also accepted. dbDataType("DBIDriver", "ANY") The backend can override the DBI::dbDataType() generic for its driver class. This generic expects an arbitrary object as second argument and returns a corresponding SQL type as atomic character value with at least one character. As-is objects (i.e., wrapped by base::I()) must be supported and return the same results as their unwrapped counterparts. To query the values returned by the default implementation, run example(dbDataType, package = "DBI"). If the backend needs to override this generic, it must accept all basic R data types as its second argument, namely base::logical(), base::integer(), base::numeric(), base::character(), dates (see base::Dates()), date-time (see base::DateTimeClasses()), and base::difftime(). It also must accept lists of raw vectors and map them to the BLOB (binary large object) data type. The behavior for other object types is not specified.

Parametrized queries and statements

DBI supports parametrized (or prepared) queries and statements via the DBI::dbBind() generic. Parametrized queries are different from normal queries in that they allow an arbitrary number of placeholders, which are later substituted by actual values. Parametrized queries (and statements) serve two purposes:
  • The same query can be executed more than once with different values. The DBMS may cache intermediate information for the query, such as the execution plan, and execute it faster.
  • Separation of query syntax and parameters protects against SQL injection.
The placeholder format is currently not specified by DBI; in the future, a uniform placeholder syntax may be supported. Consult the backend documentation for the supported formats. For automated testing, backend authors specify the placeholder syntax with the placeholder_pattern tweak. Known examples are:
  • ? (positional matching in order of appearance) in RMySQL and RSQLite
  • $1 (positional matching by index) in RPostgres and RSQLite
  • :name and $name (named matching) in RSQLite
DBI clients execute parametrized statements as follows:
  1. Call DBI::dbSendQuery() or DBI::dbSendStatement() with a query or statement that contains placeholders, store the returned DBIResult object in a variable. Mixing placeholders (in particular, named and unnamed ones) is not recommended. It is good practice to register a call to DBI::dbClearResult() via on.exit() right after calling dbSendQuery(), see the last enumeration item.
  2. Construct a list with parameters that specify actual values for the placeholders. The list must be named or unnamed, depending on the kind of placeholders used. Named values are matched to named parameters, unnamed values are matched by position. All elements in this list must have the same lengths and contain values supported by the backend; a data.frame() is internally stored as such a list. The parameter list is passed a call to dbBind() on the DBIResult object.
  3. Retrieve the data or the number of affected rows from the DBIResult object.
    • For queries issued by dbSendQuery(), call DBI::dbFetch().
    • For statements issued by dbSendStatements(), call DBI::dbGetRowsAffected(). (Execution begins immediately after the dbBind() call, the statement is processed entirely before the function returns. Calls to dbFetch() are ignored.)
  4. Repeat 2. and 3. as necessary.
  5. Close the result set via DBI::dbClearResult().

Transactions

dbBegin("DBIConnection") and dbCommit("DBIConnection") Transactions are available in DBI, but actual support may vary between backends. A transaction is initiated by a call to DBI::dbBegin() and committed by a call to DBI::dbCommit(). Both generics expect an object of class DBIConnection and return TRUE (invisibly) upon success. The implementations are expected to raise an error in case of failure, but this is difficult to test in an automated way. In any way, both generics should throw an error with a closed connection. In addition, a call to DBI::dbCommit() without a call to DBI::dbBegin() should raise an error. Nested transactions are not supported by DBI, an attempt to call DBI::dbBegin() twice should yield an error. Data written in a transaction must persist after the transaction is committed. For example, a table that is missing when the transaction is started but is created and populated during the transaction must exist and contain the data added there both during and after the transaction. The behavior is not specified if other arguments are passed to these functions. In particular, RSQLite issues named transactions if the name argument is set. The transaction isolation level is not specified by DBI.