A faster reframe() with per-group optimisations
f_reframe(.data, ..., .by = NULL, .order = group_by_order_default(.data))

Value: A data frame of specified results.

Arguments:
.data: A data frame.
...: Name-value pairs of summary functions. Expressions with
across() are also accepted.
.by: (Optional) A selection of columns to group by for this operation. Columns are specified using tidy-select.
.order: Should the groups be returned in sorted order?
If FALSE, this will return the groups in order of first appearance,
and in many cases is faster.
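As a minimal sketch of the call described above, the following assumes that fastplyr is installed and that f_reframe() accepts multi-row results per group in the same way as dplyr::reframe(). The data frame df and the result names are made up for illustration.

```r
library(fastplyr)

# Hypothetical toy data: two groups with three values each
df <- data.frame(
  g = c("b", "b", "b", "a", "a", "a"),
  x = c(10, 20, 30, 1, 2, 3)
)

# Two quantiles per group, so each group contributes two rows
res_sorted <- f_reframe(
  df,
  q = quantile(x, c(0.25, 0.75), names = FALSE),
  .by = g
)

# .order = FALSE asks for groups in order of first appearance
res_first_seen <- f_reframe(
  df,
  q = quantile(x, c(0.25, 0.75), names = FALSE),
  .by = g,
  .order = FALSE
)
```

Both calls should return four rows here (two per group); only the order of the groups is expected to differ.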
fastplyr data-masking functions like f_mutate and f_summarise operate
very similarly to their dplyr counterparts but with some crucial
differences.
Optimisations for by-group operations kick in for
common statistical functions, which are listed below.
When an expression is optimised, a message is printed; this can be
disabled by running options(fastplyr.inform = FALSE).
Optimised expressions no longer obey the data-masking rules for
sequential, dependent evaluation: a column created by one expression
is not visible to later expressions in the same call.
For example,
the pseudocode
f_summarise(data, mean = mean(x), mean2 = round(mean), .by = g)
will not work when optimised, because the new column mean is not visible
to the expression that defines mean2.
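One way to avoid depending on a column created earlier in the same call is sketched below. It assumes fastplyr is installed; the data frame and column names are illustrative only, and this is a possible workaround rather than a documented recommendation.

```r
library(fastplyr)

# Hypothetical toy data
data <- data.frame(
  g = c("a", "a", "b", "b"),
  x = c(1.2, 2.4, 10.6, 20.9)
)

# Option 1: nest the calls so that no expression refers to a new column
res_nested <- f_summarise(data, mean2 = round(mean(x)), .by = g)

# Option 2: split the work into two calls, so the column `mean`
# already exists in the data when round() is applied
res_two_step <- f_mutate(
  f_summarise(data, mean = mean(x), .by = g),
  mean2 = round(mean)
)
```

Either form keeps each expression independent of the others within a single call.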
One can disable fastplyr optimisations
globally by running options(fastplyr.optimise = FALSE).
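The two options mentioned above can be set temporarily and then restored. The sketch below uses only base R's options() and getOption(); the option names are taken from the text, and the variable names are illustrative.

```r
# options() returns the previous values, so they can be restored later
old <- options(fastplyr.inform = FALSE, fastplyr.optimise = FALSE)

# While these settings are active, the option reads back as FALSE
during <- getOption("fastplyr.optimise")

# ... fastplyr calls made here would see the temporary settings ...

# Restore whatever was set before (unset options are removed again)
options(old)
after <- getOption("fastplyr.optimise")
```

Saving and restoring the previous values avoids changing the settings for the rest of the session.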
Some functions are internally optimised using 'collapse' fast statistical functions. This makes execution on many groups very fast.
For fast quantiles (percentiles) by group, see tidy_quantiles
List of currently optimised functions
dplyr::n -> <custom_expression>
dplyr::row_number -> <custom_expression> (only for f_mutate)
dplyr::cur_group -> <custom_expression>
dplyr::cur_group_id -> <custom_expression>
dplyr::cur_group_rows -> <custom_expression> (only for f_mutate)
dplyr::lag -> <custom_expression> (only for f_mutate)
dplyr::lead -> <custom_expression> (only for f_mutate)
base::sum -> collapse::fsum
base::prod -> collapse::fprod
base::min -> collapse::fmin
base::max -> collapse::fmax
base::mean -> collapse::fmean
stats::median -> collapse::fmedian
stats::sd -> collapse::fsd
stats::var -> collapse::fvar
dplyr::first -> collapse::ffirst
dplyr::last -> collapse::flast
dplyr::n_distinct -> collapse::fndistinct
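To illustrate what one of these mappings means, the sketch below compares a grouped mean computed with base R against collapse::fmean() called directly. It assumes the collapse package is installed and does not show how fastplyr performs the substitution internally; the data are made up.

```r
library(collapse)

# Hypothetical toy data: values and a grouping factor
x <- c(1, 2, 3, 10, 20, 30)
g <- factor(c("a", "a", "a", "b", "b", "b"))

# Grouped mean in a single vectorised call
fast_means <- fmean(x, g = g)

# The same computation with base R, applying mean() group by group
base_means <- tapply(x, g, mean)

# The two results should agree once names and attributes are dropped
same <- isTRUE(all.equal(as.numeric(fast_means), as.numeric(base_means)))
```

The grouped call avoids evaluating mean() once per group, which is where the speed-up on many groups comes from.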