f_nest_by: Create a subset of data for each group

Description

A faster alternative to dplyr's nest_by().

Usage

f_nest_by(
  data,
  ...,
  .add = FALSE,
  .order = df_group_by_order_default(data),
  .by = NULL,
  .cols = NULL,
  .drop = df_group_by_drop_default(data)
)

Value

A row-wise grouped_df of the corresponding data of each group.

Arguments

data: A data frame.
...: Variables to group by.
.add: Should groups be added to existing groups? Default is FALSE.
.order: Should groups be ordered? If FALSE, groups are ordered by first appearance. Setting .order to FALSE is typically faster.
.by: (Optional) A selection of columns to group by for this operation. Columns are specified using tidyselect.
.cols: (Optional) Alternative to ... that accepts a named character vector or numeric vector. Recommended when speed is a priority.
.drop: Should unused factor levels be dropped? Default is TRUE.
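
The grouping arguments above can be sketched as follows. This is a minimal illustration, not part of the package's own examples: the toy data frame `df` and its columns `g` and `x` are hypothetical, and the calls use only the arguments listed in the Usage section.

```r
library(fastplyr)

# Hypothetical toy data: group labels deliberately appear out of sorted order
df <- data.frame(
  g = c("b", "a", "b", "c", "a"),
  x = 1:5
)

# Group via `...` using a bare column name (default ordering)
by_dots <- f_nest_by(df, g)

# Group via `.cols` using a character vector of column names
by_cols <- f_nest_by(df, .cols = "g")

# Explicitly request ordered groups
sorted <- f_nest_by(df, g, .order = TRUE)

# Keep groups in order of first appearance instead
first_seen <- f_nest_by(df, g, .order = FALSE)

# Compare the group order of the two results
sorted$g
first_seen$g
```

Each result should have one row per distinct value of `g`. Comparing `sorted$g` with `first_seen$g` shows the effect of `.order` on the row order of the output.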

Examples

library(dplyr)
library(fastplyr)

# Stratified linear-model example

models <- iris %>%
  f_nest_by(Species) %>%
  mutate(model = list(lm(Sepal.Length ~ Petal.Width + Petal.Length, data = first(data))),
         summary = list(summary(first(model))),
         r_sq = first(summary)$r.squared)
models
models$summary

# dplyr's `nest_by()` is admittedly more convenient,
# as it performs a double-bracket subset `[[` on list elements for you,
# which we have emulated above by using `first()`.
# `f_nest_by()` is faster when many groups are involved.

# The equivalent dplyr version:

models <- iris %>%
  nest_by(Species) %>%
  mutate(model = list(lm(Sepal.Length ~ Petal.Width + Petal.Length, data = data)),
         summary = list(summary(model)),
         r_sq = summary$r.squared)
models$summary

models$summary[[1]]
