purrrlyr (version 0.0.5)

by_row: Apply a function to each row of a data frame

Description

by_row() and invoke_rows() apply ..f to each row of .d. If ..f's output is neither a data frame nor an atomic vector, a list-column is created. In all cases, by_row() and invoke_rows() create a data frame in tidy format.

Usage

by_row(.d, ..f, ..., .collate = c("list", "rows", "cols"),
  .to = ".out", .labels = TRUE)

invoke_rows(.f, .d, ..., .collate = c("list", "rows", "cols"),
  .to = ".out", .labels = TRUE)

Arguments

.d

A data frame.

...

Further arguments passed to ..f.

.collate

If "list", the results are returned as a list-column. Alternatively, if the results are data frames or atomic vectors, you can collate on "cols" or on "rows". Column collation requires vectors of equal length or data frames with the same number of rows.

.to

Name of output column.

.labels

If TRUE, the returned data frame is prepended with the labels of the slices (the columns in .d used to define the slices). They are recycled to match the output size in each slice if necessary.

.f, ..f

A function to apply to each row. If ..f does not return a data frame or an atomic vector, a list-column is created under the name given by .to (".out" by default). If it returns a data frame, it should have the same number of rows within groups and the same number of columns between groups.

Value

A data frame.

Details

By default, the whole row is appended to the result to serve as an identifier (set .labels to FALSE to prevent this). In addition, if ..f returns a multi-row data frame or a non-scalar atomic vector, a .row column is appended to identify the row number in the original data frame.
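As a rough illustration of the labelling behaviour described above, the sketch below applies a function that returns a length-3 vector to a small, made-up data frame (the data and the object names are assumptions, not part of the package documentation). With "rows" collation, the label columns and a .row column would be expected alongside the output column; with .labels = FALSE the label columns should be dropped.

```r
library(purrrlyr)

# Hypothetical toy data, used only for illustration
df <- data.frame(a = 1:2, b = c(0.5, 1.5))

# ..f returns a non-scalar vector, so each input row expands to 3 output rows
with_labels <- by_row(df, function(row) 1:3, .collate = "rows")

# Same call, but without the identifying label columns a and b
no_labels <- by_row(df, function(row) 1:3, .collate = "rows", .labels = FALSE)

names(with_labels)
names(no_labels)
```

Comparing the two sets of column names shows which columns serve purely as identifiers.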

invoke_rows() is intended to provide a version of pmap() for data frames. With .collate = "cols", it is equivalent to mdply() from the plyr package. Note that invoke_rows() follows the signature pattern of the invoke family of functions and takes .f as its first argument.

The distinction between by_row() and invoke_rows() is that the former passes a data frame to ..f while the latter maps the columns to its function call. This is essentially like using invoke() with each row. Another way to view this is that invoke_rows() is equivalent to using by_row() with a function lifted to accept dots (see lift()).
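The difference in calling convention can be sketched on a small, made-up data frame (the data and the helper names add_from_row and add_from_args are hypothetical). In the first call the function receives each row as a one-row data frame; in the second, the columns x and y are matched to the function's arguments by name.

```r
library(purrrlyr)

# Hypothetical toy data, used only for illustration
df <- data.frame(x = 1:3, y = c(10, 20, 30))

# by_row(): ..f gets the row as a data frame and extracts columns itself
add_from_row <- function(row) row$x + row$y
res_by_row <- by_row(df, add_from_row, .collate = "cols")

# invoke_rows(): the columns become the arguments x and y of .f
add_from_args <- function(x, y) x + y
res_invoke <- invoke_rows(add_from_args, df, .collate = "cols")

res_by_row
res_invoke
```

Both calls should compute the same per-row sums; which form is more convenient depends on whether the function is written against a data frame or against named arguments.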

See Also

by_slice()

Examples

# NOT RUN {
# ..f should be able to work with a list or a data frame. As it
# happens, sum() handles data frames, so the following works:
mtcars %>% by_row(sum)

# Other functions such as mean() may need to be adjusted with one
# of the lift_xy() helpers:
mtcars %>% by_row(purrr::lift_vl(mean))

# To run a function with invoke_rows(), make sure it is variadic (that
# it accepts dots) or that .f's signature is compatible with the
# column names
mtcars %>% invoke_rows(.f = sum)
mtcars %>% invoke_rows(.f = purrr::lift_vd(mean))

# invoke_rows() with cols collation is equivalent to plyr::mdply()
p <- expand.grid(mean = 1:5, sd = seq(0, 1, length.out = 10))
p %>% invoke_rows(.f = rnorm, n = 5, .collate = "cols")

# The plyr equivalent, for comparison (requires the plyr package):
p %>% plyr::mdply(rnorm, n = 5) %>% tibble::as_tibble()

# To integrate the result as part of the data frame, use rows or
# cols collation:
mtcars[1:2] %>% by_row(function(x) 1:5)
mtcars[1:2] %>% by_row(function(x) 1:5, .collate = "rows")
mtcars[1:2] %>% by_row(function(x) 1:5, .collate = "cols")
# }
