by_slice: Apply a function to slices of a data frame

Description

by_slice() applies ..f to each group of a data frame. Groups should be set with slice_rows() or group_by().

Usage

by_slice(.d, ..f, ..., .collate = c("list", "rows", "cols"), .to = ".out",
  .labels = TRUE)

Arguments

.d

A sliced data frame.

..f

A function to apply to each slice. If ..f does not return a data frame or an atomic vector, a list-column is created under the name .out. If it returns a data frame, it should have the same number of rows within groups and the sa

...

Further arguments passed to ..f.

.collate

If "list", the results are returned as a list-column. Alternatively, if the results are data frames or atomic vectors, you can collate on "cols" or on "rows". Column collation requires vectors of equal length or data frames with the same number of rows.

.to

Name of output column.

.labels

If TRUE, the returned data frame is prepended with the labels of the slices (the columns in .d used to define the slices). They are recycled to match the output size in each slice if necessary.

Value

A data frame.
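
As a minimal sketch of how .to and .labels shape the returned data frame, consider the toy example below. It assumes that by_slice() and slice_rows() are loaded from the purrrlyr package (older purrr releases exported them directly); the data frame scores and its columns g and x are made up for illustration.

```r
library(purrrlyr)  # assumption: by_slice() and slice_rows() are exported here

# Hypothetical toy data: slice "a" has two rows, slice "b" has one.
scores <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3))
sliced <- slice_rows(scores, "g")

# Default list collation: one row per slice, with the results stored in a
# list-column named by .to and the slice labels (column g) prepended.
with_labels <- by_slice(sliced, function(d) sum(d$x), .to = "total")

# Same call, but without the label columns.
no_labels <- by_slice(sliced, function(d) sum(d$x), .to = "total",
  .labels = FALSE)

names(with_labels)
names(no_labels)
```

With .labels = FALSE only the output column remains, which is convenient when the result is passed straight to flatten() or map().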

Details

by_slice() provides equivalent functionality to dplyr's do() function. In combination with map(), by_slice() is equivalent to summarise_each() and mutate_each(). The distinction between mutating and summarising operations is not as important as in dplyr because we do not act on the columns separately. The only constraint is that the mapped function must return the same number of rows for each variable mapped on.

Examples

# Here we fit a regression model inside each slice defined by the
# unique values of the column "cyl". The fitted models are returned
# in a list-column.
mtcars %>%
  slice_rows("cyl") %>%
  by_slice(partial(lm, mpg ~ disp))

# by_slice() is especially useful in combination with map().

# To modify the contents of a data frame, use rows collation. Note
# that, unlike in dplyr, mutating and summarising operations can be
# used interchangeably.

# Mutating operation:
df <- mtcars %>% slice_rows(c("cyl", "am"))
df %>% by_slice(dmap, ~ .x / sum(.x), .collate = "rows")

# Summarising operation:
df %>% by_slice(dmap, mean, .collate = "rows")

# Note that mapping columns within slices is best handled by dmap():
df %>% dmap(~ .x / sum(.x))
df %>% dmap(mean)

# If you don't need the slicing variables as identifiers, switch
# .labels to FALSE:
mtcars %>%
  slice_rows("cyl") %>%
  by_slice(partial(lm, mpg ~ disp), .labels = FALSE) %>%
  flatten() %>%
  map(coef)