vadr v0.01

by Peter Meilstrup

Macros, metaprogramming, and pleasant programming idioms.

This package implements workalikes for the author's (and perhaps your) favorite features from other languages, making R programs shorter and more expressive. Destructurimg bind lets you perform parallel assignments and unpack nested lists. A chain macro lets you quickly string several operations together, similar to Unix pipelines. A collection of functions makes it easy to work with `...` arguments and missing values. Behind the scenes is an efficient macro utility, with a much improved alternative to the bquote() function.

Readme

vadr

The Death Star is blowing up. vadr is being split into a few smaller, more focused packages that will actually make it to CRAN.

* memo  * fexpr  * interp
|       |        |
+---+---+--------+
    |
    * macro
    |
    +-- * chain
    .
    .
    .
    +-- * match
    |   |
    |   * syntax
    |
    +-- * iter

(vadr's pieces)

  • memo: An in-memory cache (used for macro expansions)
  • fexpr: to !ols to manipulate R promises, unevaluated arguments, missing arguments and ... lists.
  • interp: Ruby-style string interpolation.
  • macro: Common Lisp style macros and quasiquotation operators for R.
  • chain: Pipeline-style computations, faster and more expressive than magrittr.

(future developments):

  • match: Python-style destructuring binds and ML_style pattern matching
  • iter: Python-style list comprehensions and iterators
  • syntax: Scheme-style hygeinic(ish) macros

Original documentation follows...

vadr

R has been seduced by the dark S of the Force. It is more PHP now than Lisp. Its mind is twisted and evil.

But there is good in R, I can feel it. I can save it. I have to try.

R is a curious language. At its core is a Lisp interpreter with first-class environments and lazy evaluation implemented in terms of underlying fexprs. It's a language whose core was made flexible enough to reimplement a weird old language like S-PLUS on top of.

Oddly, all the good bits of R seem to have been buried under an implementation of weird old S-PLUS.

I like the core language trapped underleath there. It's a bit like what John Shutt was talking about in his thesis on Kernel. I'd like to elevate the core above the S facade.

Luckily, R is one of the most syntactically malleable languages out there, if you look at it right.

Look at your favorite language. Then at R. Then back to your favorite language. Then back to R, Sadly, R isn't your favorite language. But it could smell like your favorite language. Look down. Now back to R. Where are you? You're writing code in the language your language could smell like. Anything is possible. R's on a horse.

This package implements workalikes for the author's (and perhaps your) favorite features from other languages, making R programs shorter and more expressive. Here are some samples of what you can do:

Destructuring bind

bind[] assigns to several variables at once, by unpacking a list. Say you have some data coming in with an awkward, messy format, and you want to extract and reorganize some of the data.

record <- list("Marilyn", "Monroe", dob=list("June", 1, 1926),
               profession="film star", "born Norma Jean Baker",
               fbi_file_no="FFIJ8SN_65",
               "1947 California Artichoke Queen",
               list("August", 5, 1962))

You could do the ordinary way:

first <- record[[1]]
last <- record[[2]]
bday <- record$dob[[2]]
bmonth <- record$dob[[1]]
byear <- record$dob[[3]]
dday <- record[[length(record)]][[2]]
dmonth <- record[[length(record)]][[1]]
dyear <- record[[length(record)]][[3]]
record[["fbi_file_no"]] <- NULL
notes <- #....uhh, everything else, somehow?

My eyes glaze over. Or you could use bind[]:

bind[first, last, dob=bind[bmonth, bday, byear],
     fbi_file_no=, ...=notes, bind[dmonth, dday, dyear]] <- record

Chains

You ever take some data and pass it through a function, then pass the result theough another function, then pass that through another function, in a series of steps? I do that all the time.

You basically have two options for how to write such code: assign every intermediate result to a var, probably reusing the same variable name, which I hate because I don't want to give a name to data until it actually is what it's name suggests; or do it all at once in a deeply nested function call, which gets you Dagwood Sandwich Code.

Example: Let's compute the perimeter of the 137-gon inscribed in the unit circle.

If you are comfortable with an array-oriented language (such as R), you might see this task and think: "Ok, so get the (x,y) coordinates of the vertices, then difference them to get edge lengths, then add lengths up for the perimiter."

You could write it serial assignment style, until you run out of patience for naming things:

n <- 137
radians <- seq(0, 2*pi, length=n+1)
coords <- cbind(sin(radians), cos(radians))
differences <- apply(coords, 2, diff)
segment.lengths <- sqrt(rowSums(differences^2))
perimeter <- sum(segment.lengths)

Or you could write inscrutable Dagwood sandwich style, where, for example, the 2 and the ^2 wind up an enormous distance from the functions (apply and rowSums) they are argument or subsequent to:

n <- 137
sum(sqrt(rowSums(apply(sapply(c("sin", "cos"), do.call,
                              list(seq(0, 2*pi, length=137))),
                       2, diff)^2)))

This package provides an alternative for this kind of code, chain. Here's chain style. It's a bit like a Unix pipeline, and a bit more like the -> macro in Clojure. It is compact and reads well; things start at the beginning and you read along to the end, no jumping around, the 2 is right next to apply where it belongs and it's not junked up with a bunch of temporary names.

n <- 137
perimeter <- chain(n,
                   seq(length=.+1, 0, 2*pi),
                   cbind(sin(.), cos(.)),
                   apply(2, diff),
                   .^2, rowSums, sqrt, sum)

You can narrate this left to right. "Start with your number of sides. Sample that many times (plus one, oh fenceposts) over [0, 2*pi]. Sine and cosine of that gives you coordinates. Take differences and apply Pythagoras, squaring, summing and rooting to get the length of each side. Add it all up and you have your perimiter."

Easy string substitution

The "%#%" operator splices data into strings much like Python's "%" or Ruby's "#".

".(x), .(y)!" %#% c(x="Hello", y="World!")

But this being R, we can do it on collections of strings too. Here's a "99 bottles" implementation:

bottles <- interply(
  ".(ifelse(n%%100>0, n%%100, 'no more')) bottle.('s'[n%%100!=1]) of beer")
initcap <- function(x) {substr(x, 1, 1) <- toupper(substr(x, 1, 1)); x}
verse <- interply(
  paste0(".(initcap(bottles(n=n))) on the wall, .(bottles(n=n)).\n",
         ".(c('Go to the store and buy some more,',",
         "    'Take one down and pass it around,')[(n%%100!=0)+1])",
         " .(initcap(bottles(n=n-1))) on the wall."))
cat(verse(n=99:0), sep="\n\n")

Partial application (currying)

If you ever want to provide default arguments to a function before handing it off somewhere else, or other such tricks, this package provide both "leftward" and "rightward" partial application functions, as well as %()%, a full-apply which can be less tricksy than do.call().

printReport <- cat %<<<% "Message: " %<<% c(sep="\n", "-----\n")
printReport %()% c("message one", "message two", "message three")

These partial application utilities are fully integrated with good handling for dot-dot-dot lists mentioned in the next section.

Dot-Dot-Dot lists and missing values

Variadic arguments (...) and missing values are two of the trickiest spots of R's semantics, and there are very few tools to work with them -- besides missing there's substitute and do.call, both of which are hairy and mostly serve other purposes. Mostly people treat ... as an opaque block to pass along to another function. This package contains a number of functions that let you work explicitly with ... lists, concatenating and subsetting them, while still allowing R's lazy-evaluation semantics to do the right thing. So a function using dots can decide whether and when to evaluate each of its unnamed arguments:

inSomeOrder <- function(...) invisible(list %()% sample(dots(...)))
inSomeOrder(print("Boing!"), print("Boom"), print("Tschack!"), print("Ping"),
            print("Zong"), print("Pssh"))
# [1] "Boing!"
# [1] "Zong"
# [1] "Ping"
# [1] "Boom"
# [1] "Pssh"
# [1] "Tschack!"

For a more pointed example, consider switch. Switch takes its first argument and uses it to decide which if its subsequent arguments to evaluate.

Consider trying to implement an R function that has the behavior of switch properly (not as a C function, and not inspecting the stack using match.call() or parent.frame() which are evil.) This is doable in pure R but wacky and slow -- the only way I can see to selectively evaluate one named argument is to build a function that takes that argument:

switch2 <- function(expr, ...) {
  n <- names(substitute(list(...)))[-1]
  if (!is.null(n))
      arglist <- as.pairlist(structure(
          rep(list(quote(expr=)), length(n)),
          names=n))
  else
      (arglist <- as.pairlist(alist(...=)))

  if (is.numeric(expr))
      body <- as.name(paste0("..", expr))
  else
      body <- as.name(expr)
  f <- eval(substitute(`function`(arglist, body),
                         list(arglist=arglist, body=body)))
  f(...)
}

But with a direct interface to manipulate dotlists, switch is easy:

switch3 <- function(expr, ...) {
  dots(...)[[expr]]
}

You may also use dots_unpack() to inspect the contents of as-yet-unevaluated dots objects, exposing R's promise mechanism:

x <- 1
y <- 2
d <- dots(a=x, b=y, c=x+y)
unpack(d)
#   name         envir  expr value
# a    a <environment>     x  NULL
# b    b <environment>     y  NULL
# c    c <environment> x + y  NULL
# > y <- 3
(function(b, ...) b) %()% d #force the "b" slot to evaluate
# [1] 3
unpack(d)
#   name         envir  expr value
# a    a <environment>     x  NULL
# b    b          NULL     y     3
# c    c <environment> x + y  NULL
c %()% d
# a b c
# 1 3 4
> unpack(d)
#   name envir  expr value
# a    a  NULL     x     1
# b    b  NULL     y     3
# c    c  NULL x + y     4

Quasiquotation

qq implements quaqsiquotation, which is a way to build expressions and code out of data. qq is more capable than substitute or bquote and faster than the latter. Think 'string substitution' as above, but for syntactically correct code. See the qq vignette for more details.

Macros

Many of the facilities in vadr are implemented in terms of macros. Macros work similarly to the computing-on-the-language facilities you may be familiar with, but much of their work can be memoized so they can be faster. Vignette to follow.

Functions in vadr

Name Description
expressions Extract unevaluated expressions.
arg_env Get environment or expression from a named argument.
ammoc Evaluate all arguments in order, but return the first.
%()% Partially and fully apply arguments to functions.
arg_dots Fetch promises associated with named arguments.
dots_environments Extract or manipulate environments contained in dots lists.
%<~% Modify a variable.
bind Unpack a list and assign to multiple variables.
as.dots Convert a list of expressions into a ... object (a list of promises.)
chain Chain the output of one expression into the input of another.
index.data.frame index.data.frame
dots2env Convert an a ... object into an environment, without forcing promises.
env2dots Convert an environment into a ... object, without forcing promises.
dots Capture a list of ... arguments as an object.
expand_macros Expand any macros in the quoted expression.
dots_names Extract or change the argument names of ... arguments.
dots_unpack Show information about a ... object.
find_macros List all macros.
interpolate Evaluate expressions within strings.
fun A compact way to define a function.
macro Turn an expression-substituting function into a nonstandard-evaluating function.
dots_missing Detect missing arguments in link... arguments
qq Quasiquotation. Perform template substitutions on a quoted R expressions.
qqply Repeatedly expand an expression against sequences of values.
make_unique_names Modify some character strings unique with respect with an existing set of (unmodified) character strings.
missing_value Return an empty symbol.
quote_args Quote all arguments, like alist. But when bare words are given as arguments, interpret them as the argument name, rather than the argument value. Return a pairlist. This emulates the syntax used to specify function arguments and defaults.
put Modify part of a value.
quoting.env Given a list of names, build an environment such that evaluating any expression using those names just gets you the expression back.
run_as_command run_as_command Interpret command line arguments and invokes some function with them.
shortcutting-or Evaluate the first argument; if null, evaluate and return the second argument.
vadr Make R the language you wish R was like.
with_arg Inject named arguments into several calls and evaluate those calls.
mply Alternative to mapply with a cleaner calling convention.
No Results!

Details

Type Package
Date 2012-09-21
License MIT + file LICENSE
Collate 'ammoc.R' 'macro.R' 'anaphora.R' 'bind.R' 'chain.R' 'dots.R' 'fun.R' 'getpromise.R' 'indexing.R' 'mply.R' 'onLoad.R' 'parse.R' 'programming.R' 'qq_fast.R' 'quasiquote.R' 'resources.R' 'run_as_script.R' 'vadr-description.R' 'with_args.R'
VignetteBuilder knitr

Include our badge in your README

[![Rdoc](http://www.rdocumentation.org/badges/version/vadr)](http://www.rdocumentation.org/packages/vadr)