rlang 0.1

library("rlang")
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")

It is with great pleasure that we announce the first release of rlang. This package provides tools for working with core language features of R and the tidyverse. You can install it by running:

install.packages("rlang")

(rlang is not currently installed with the tidyverse package, but it will be in the near future.)

rlang includes a large number of tools, and we'll be working to describe and document them clearly in the future. In this blog post, we'll introduce the "tidy evaluation" framework, and discuss some of the design principles that underlie rlang. You can learn more at http://rlang.tidyverse.org.

Tidy evaluation

Tidy evaluation, or tidyeval for short, is a new approach to non-standard evaluation (NSE) that will be implemented in all tidyverse grammars, including dplyr, tidyr, and ggplot2.

Tidyeval is built on top of three key tools:

  • Quosures, a data structure that captures both an expression and its environment. Quosures are a subtype of formulas that have special support in tidyeval grammars. You create quosures with quo(), enquo() and quos().

  • Quasiquotation, a tool that lets you "unquote", or evaluate, values in the middle of expressions that are otherwise quoted.

  • Tools for evaluating expressions containing quosures: eval_tidy() and as_overscope(). These are what you will need to create your own grammars.
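To make these three pieces a little more concrete, here is a minimal sketch that uses only quo(), the !! operator, and eval_tidy(). The helper make_quosure() and the variables scale and n are made up for illustration; mtcars is the dataset that ships with R.

```r
library(rlang)

# A quosure bundles an expression with the environment it was written in.
make_quosure <- function() {
  scale <- 10          # local variable, only visible inside this function
  quo(cyl * scale)     # captures the expression and this environment
}
q <- make_quosure()

# eval_tidy() finds `cyl` in the data and `scale` in the quosure's environment.
scaled <- eval_tidy(q, data = mtcars)

# Quasiquotation: `!!` evaluates `n` right away and inlines its value,
# so the captured expression contains the number 2 rather than the name `n`.
n <- 2
q2 <- quo(cyl + !!n)
shifted <- eval_tidy(q2, data = mtcars)
```

Note that `scale` is never defined at the top level: it is found because the quosure remembers where it was created.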

The complete system is too much to describe in a blog post, so there are two places to learn more:

  • To learn how tidyeval will help you program with data analysis grammars, read programming with dplyr, a vignette that will be included in the upcoming dplyr release.

  • To learn more about the theory behind tidyeval, read the tidy evaluation vignette included in rlang.

Features and principles in rlang

Many rlang functions overlap with base R functions: the goal of rlang, like many tidyverse packages, is not to allow you to do fundamentally new things, but to do things with greater ease. One way that rlang makes your life easier is by adopting a consistent set of principles that thread throughout the package.

We describe four important principles below:

  • Splicing and unquoting syntax.
  • Pattern-matching predicates.
  • Naming conventions.
  • Comprehensive documentation.

Splicing and unquoting syntax

All rlang functions taking ... support a special syntax for splicing and unquoting. For example, take the lang() function which creates unevaluated function calls (it's similar to base::call()). The first argument is the name of the function to call, and the subsequent arguments are the arguments to that function:

lang("foo", x = 1, y = "a", z = TRUE)

What happens if you already have the arguments in a list?

args <- list(x = 1, y = "a", z = TRUE)
lang("foo", args)

You can use the unquote-splice operator, !!!, to splice the contents of the list in:

lang("foo", !!! args)
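For comparison, here is a rough sketch of how the same call could be built from a list with base R alone. It uses only call(), as.call(), as.name(), and do.call(); the function name foo is a placeholder, as in the example above, and the call is never evaluated.

```r
args <- list(x = 1, y = "a", z = TRUE)

# Arguments written out by hand
direct <- call("foo", x = 1, y = "a", z = TRUE)

# Arguments taken from a list: build the call from its components
spliced <- as.call(c(list(as.name("foo")), args))

# Or go through do.call(), which requires knowing this idiom
via_do_call <- do.call(call, c(list("foo"), args))

deparse(spliced)
```

Both list-based versions work, but they ask the caller to assemble the call by hand, which is the bookkeeping that splicing is meant to remove.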

To use this in your own code, call dots_list():

capture_dots <- function(...) {
  dots_list(...)
}
str(capture_dots(a = 1, b = 2, c = 3))
str(capture_dots(!!! list(a = 1, b = 2), c = 3))

Using dots_list() means that you don't need to provide an extra argument that takes an explicit list, or rely on your users knowing how to correctly use do.call().
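As a small illustration of that trade-off, the sketch below defines a hypothetical helper, named_sum(), that collects its arguments with dots_list(). The helper and the variable names are invented for this example; only dots_list(), !!!, and base do.call() are real.

```r
library(rlang)

# Hypothetical helper: adds up every value passed through `...`
named_sum <- function(...) {
  values <- dots_list(...)
  sum(unlist(values))
}

extra <- list(c = 3, d = 4)

total_direct  <- named_sum(a = 1, b = 2)        # arguments typed by hand
total_spliced <- named_sum(a = 1, !!! extra)    # list spliced into the dots

# Without splicing support, the caller has to fall back on do.call():
total_do_call <- do.call(named_sum, c(list(a = 1), extra))
```

The spliced call and the do.call() version produce the same total, but the first reads like an ordinary function call.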

Pattern-matching predicates

rlang provides an extensive set of predicate functions like is_character() and is_list() that make it easy to check that arguments are the type that you expect.

There are two main differences compared to base R equivalents. Firstly, they are less surprising: for example, is_vector(factor("a")) returns TRUE and is_atomic(NULL) returns FALSE. Secondly, they have arguments that allow you to check other properties. For example, you can check that vectors have a given length:

is_list(mtcars)
is_list(mtcars, n = 10)
is_list(mtcars, n = 11)
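One place these predicates tend to pay off is argument checking. The sketch below uses is_character() with its n argument inside a hypothetical function, greet(), which is invented for this example, and then contrasts is.vector() with is_vector() on a factor.

```r
library(rlang)

# Hypothetical function: accepts exactly one string
greet <- function(name) {
  # Type and length are checked in a single call
  if (!is_character(name, n = 1)) {
    stop("`name` must be a character vector of length 1", call. = FALSE)
  }
  paste0("Hello, ", name, "!")
}

greeting <- greet("rlang")

# Base R rejects a factor because it carries attributes other than names;
# the rlang predicate only looks at the underlying type.
base_says  <- is.vector(factor("a"))
rlang_says <- is_vector(factor("a"))
```

Calling greet(c("a", "b")) or greet(1) would stop with the error message above.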

This is particularly useful for more complex types like calls, where you can check the number of arguments (n), the function name, or its namespace (ns):

call <- quote(base::foo(bar, baz))
is_lang(call, n = 3)
is_lang(call, n = 2)
is_lang(call, name = "bar")
is_lang(call, name = "foo")
is_lang(call, ns = "rlang")
is_lang(call, ns = "base")
is_lang(call, "foo", n = 2, ns = "base")

Consistent naming

rlang uses strong naming conventions to make it easier to remember what a function does, to support autocomplete, and to hopefully make it easier to guess the name of a function.

  • Prefixes and suffixes for input and output type:

    rlang tries to follow the general rule that prefixes designate the input type of a function while suffixes indicate the output type. For instance, env_bind() takes an environment while pkg_env() returns one.

  • Side-effects of setter functions: If an rlang setter starts with set_, it means it doesn't have side effects; it returns a modified object. If it starts with mut_, it changes its input in place.

  • Constructors. If a constructor takes dots, it is named after the output type:

    env(x = 1)
    chr(x = "a")
    lang("foo", x = NULL)

    On the other hand, if it takes components as formed objects, it is prefixed with new_:

    new_function(list(x = NULL), quote({ x }))

  • Scalar versus vectorised functions.

    What's the difference between has_name() and have_name()? The former is a scalar predicate while the latter is vectorised:

    has_name(mtcars, "cyl")
    have_name(mtcars)
    have_name(c(a = 1, 2))

    For that reason, is_na() is different from the base R function is.na(): it is a scalar predicate. On the other hand, are_na() is a vector predicate.

    x <- c(1L, 2L, NA, 3L)
    is_na(x)
    are_na(x)

    This consistency is a helpful hint to beginners as it's often hard to know if a function is vectorised.
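The prefix, suffix, and set_ conventions described above can be seen side by side in a short sketch. It relies only on set_names(), env(), env_bind(), env_get(), and pkg_env(); the variable names are arbitrary.

```r
library(rlang)

# set_ prefix: returns a modified copy and leaves the input alone
x <- c(1, 2)
y <- set_names(x, c("a", "b"))
names(x)   # still NULL
names(y)

# Constructor taking dots: named after the type it creates
e <- env()

# env_ prefix: the function takes an environment as input
env_bind(e, z = 10)
env_get(e, "z")

# _env suffix: the function returns an environment
base_env <- pkg_env("base")
```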

Comprehensive documentation

rlang's documentation is intended to be didactic and introduce mid-level R programmers to deeper concepts and features of the language. For instance:

  • ?env provides an introduction to scoping issues in R.

  • ?lang and ?pairlist explain the structure of R expressions.

  • ?cnd_signal, ?with_handlers, and ?exiting go over the condition system in R.

Writing good documentation is hard, so expect these to get better over time.