taxadb (version 0.1.0)

mutate_db: Add new variables to a database

Description

dplyr::mutate() cannot apply arbitrary R functions over a database connection, because the expression has to be translated to SQL. mutate_db() works around this by querying the data in chunks, applying the function to each chunk in R, and appending each result to a temporary table in the database.
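The chunk-and-append idea can be sketched with plain DBI calls. This is a minimal illustration under stated assumptions, not taxadb's implementation: chunked_mutate() is a hypothetical helper, the backend is assumed to be an in-memory SQLite database via RSQLite, chunks are paged with LIMIT/OFFSET rather than a streaming DBI::dbFetch() cursor, and the output is written to a regular table rather than a temporary one.

```r
library(DBI)

# Hypothetical helper (not part of taxadb): apply `r_fn` to one column of
# `table`, `n` rows at a time, appending each processed chunk to `out_table`.
chunked_mutate <- function(con, table, r_fn, col, new_column,
                           out_table, n = 5000L, ...) {
  offset <- 0L
  repeat {
    # Page through the source table; `rowid` is SQLite-specific (assumption).
    sql <- sprintf("SELECT * FROM %s ORDER BY rowid LIMIT %d OFFSET %d",
                   as.character(dbQuoteIdentifier(con, table)), n, offset)
    chunk <- dbGetQuery(con, sql)
    if (nrow(chunk) == 0L) break

    # Run the R function locally on this chunk only.
    chunk[[new_column]] <- r_fn(chunk[[col]], ...)

    # Append the chunk to the output table (created on the first pass).
    dbWriteTable(con, out_table, chunk, append = TRUE)
    offset <- offset + nrow(chunk)
  }
  invisible(out_table)
}

# Toy data standing in for a provider table.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "common",
             data.frame(vernacularName = c("Steller's Jay", "Coopers Hawk",
                                           "Blue Jay")))

chunked_mutate(con, "common", toupper, "vernacularName",
               "vernacularNameUpper", out_table = "common_upper", n = 2L)

result <- dbReadTable(con, "common_upper")
dbDisconnect(con)
```

Only one chunk of at most n rows is held in R memory at a time, which is the reason for processing in chunks instead of collecting the whole table first.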

Usage

mutate_db(.data, r_fn, col, new_column, n = 5000L, ...)

Arguments

.data

A dplyr::tbl backed by a database connection (class tbl_dbi).

r_fn

Any R function that can be called on a vector (a column of the table).

col

The name of the column to which the R function is applied. Note that dplyr::mutate() can operate on an arbitrary list of columns, whereas this function currently operates on a single column only.

new_column

The name of the new column.

n

The number of rows included in each chunk; see DBI::dbFetch().

...

Named arguments passed on to r_fn.

Value

A dplyr tbl connection to the temporary table in the database.

Examples

# NOT RUN {
  ## All examples use a temporary directory
  Sys.setenv(TAXADB_HOME = tempdir())

  library(taxadb)
  library(dplyr)  # provides %>% and filter()

  ## Clean a list of messy common names
  names <- clean_names(c("Steller's jay", "coopers Hawk"),
                       binomial_only = FALSE, remove_sp = FALSE,
                       remove_punc = TRUE)

  ## Add a cleaned copy of a provider's common names, then search it
  ## for the cleaned names
  taxa_tbl("itis", "common") %>%
    mutate_db(clean_names, "vernacularName", "vernacularNameClean",
              binomial_only = FALSE, remove_sp = FALSE, remove_punc = TRUE) %>%
    filter(vernacularNameClean %in% names)
# }
