Learn R Programming

⚠️There's a newer version (2.5.0) of this package.Take me there.

dbplyr

Overview

dbplyr is the database backend for dplyr. It allows you to use remote database tables as if they are in-memory data frames by automatically converting dplyr code into SQL.

To learn more about why you might use dbplyr instead of writing SQL, see vignette("sql"). To learn more about the details of the SQL translation, see vignette("translation-verb") and vignette("translation-function").

Installation

# The easiest way to get dbplyr is to install the whole tidyverse:
install.packages("tidyverse")

# Alternatively, install just dbplyr:
install.packages("dbplyr")

# Or the the development version from GitHub:
# install.packages("devtools")
devtools::install_github("tidyverse/dbplyr")

Usage

dbplyr is designed to work with database tables as if they were local data frames. To demonstrate this I’ll first create an in-memory SQLite database and copy over a dataset:

library(dplyr, warn.conflicts = FALSE)

con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, mtcars)

Note that you don’t actually need to load dbplyr with library(dbplyr); dplyr automatically loads it for you when it sees you working with a database. Database connections are coordinated by the DBI package. Learn more at http://dbi.r-dbi.org/

Now you can retrieve a table using tbl() (see ?tbl_dbi for more details). Printing it just retrieves the first few rows:

mtcars2 <- tbl(con, "mtcars")
mtcars2
#> # Source:   table<mtcars> [?? x 11]
#> # Database: sqlite 3.25.3 [:memory:]
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
#> # … with more rows

All dplyr calls are evaluated lazily, generating SQL that is only sent to the database when you request the data.

# lazily generates query
summary <- mtcars2 %>% 
  group_by(cyl) %>% 
  summarise(mpg = mean(mpg, na.rm = TRUE)) %>% 
  arrange(desc(mpg))

# see query
summary %>% show_query()
#> <SQL>
#> SELECT `cyl`, AVG(`mpg`) AS `mpg`
#> FROM `mtcars`
#> GROUP BY `cyl`
#> ORDER BY `mpg` DESC

# execute query and retrieve results
summary %>% collect()
#> # A tibble: 3 x 2
#>     cyl   mpg
#>   <dbl> <dbl>
#> 1     4  26.7
#> 2     6  19.7
#> 3     8  15.1

Copy Link

Version

Install

install.packages('dbplyr')

Monthly Downloads

568,678

Version

1.4.0

License

MIT + file LICENSE

Issues

Pull Requests

Stars

Forks

Maintainer

Hadley Wickham

Last Published

April 23rd, 2019

Functions in dbplyr (1.4.0)

lahman

Cache and retrieve an src_sqlite of the Lahman baseball database.
tbl_sql

Create an SQL tbl (abstract)
testing

Infrastructure for testing dplyr
translate_sql

Translate an expression to sql.
win_over

Generate SQL expression for window functions
db_compute

More db generics
dbplyr-package

dbplyr: A 'dplyr' Back End for Databases
ident

Flag a character vector as SQL identifiers
named_commas

Provides comma-separated string out ot the parameters
in_schema

Refer to a table in a schema
simulate_dbi

Simulate database connections
sql

SQL escaping.
src_sql

Create a "sql src" object
nycflights13

Database versions of the nycflights13 data
tbl_lazy

Create a local lazy tibble
window_order

Override window order and frame
arrange.tbl_lazy

Arrange rows by variables in a remote database table
collapse.tbl_sql

Force computation of query
copy_to.src_sql

Copy a local data frame to a DBI backend.
memdb_frame

Create a database table in temporary in-memory database.
lazy_ops

Lazy operations
sql_substr

Create an sql translator
src_dbi

dplyr backend for any DBI-compatible database
build_sql

Build a SQL string.
sql_expr

Generate SQL from R expressions
sql_quote

Helper function for quoting sql elements.
do.tbl_sql

Perform arbitrary computation on remote backend
escape

Escape/quote a string.
partial_eval

Partially evaluate an expression.
remote_name

Metadata about a remote table
join_query

Build and render SQL from a sequence of lazy operations
sql_escape_logical

More SQL generics
join.tbl_sql

Join sql tbls.