pick_top_k

source: source to select from.

...: force later arguments to bind by name.

partitionby: partitioning (window function) column names.

orderby: character, ordering (in window function) column names.

reverse: character, reverse ordering (in window function) of these column names.

k: integer, number of rows to limit to in each group.

order_expression: character, command to compute row-order/rank.

order_column: character, column name to write per-group rank in (no ties).

keep_order_column: logical, if TRUE retain the order column in the result.

env: environment to look for values in.
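The arguments above can be tried on a table description, without touching a database. The following is a minimal sketch, assuming the 'rquery' package is installed. The table name `survey_table`, its column names, and the rank column name `rank_in_group` are hypothetical and chosen only for illustration.

```r
library(rquery)

# Hypothetical table description: names only, no data is read here.
d <- mk_td("survey_table",
           c("subjectID", "surveyCategory", "probability"))

# Keep up to the 2 highest-probability rows per subjectID.
# Listing a column in `reverse` sorts it in descending order.
ops <- pick_top_k(d,
                  partitionby = "subjectID",
                  orderby = c("probability", "surveyCategory"),
                  reverse = "probability",
                  k = 2,
                  order_column = "rank_in_group",
                  keep_order_column = TRUE)

# The result is an operator tree that can be inspected before use.
cat(format(ops))
print(column_names(ops))
```

Because `keep_order_column` is TRUE in this sketch, the per-group rank column appears among the columns the pipeline reports producing.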

This is an example of building up a desired pre-prepared pipeline fragment from relop nodes.
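To illustrate what such a fragment might look like when assembled by hand, the sketch below combines two relop nodes: a windowed `extend_se()` that numbers the rows in each group, followed by a `select_rows_se()` that keeps the first k of them. This is an approximation for illustration, not necessarily how `pick_top_k` is implemented. The table name and column names are hypothetical.

```r
library(rquery)

# Hypothetical table description: names only, no data is read here.
d <- mk_td("survey_table",
           c("subjectID", "surveyCategory", "probability"))

# Step 1: number the rows within each subjectID group,
# highest probability first.
ranked <- extend_se(d,
                    c(row_number = "row_number()"),
                    partitionby = "subjectID",
                    orderby = "probability",
                    reverse = "probability")

# Step 2: keep only the first 2 rows of each group.
top2 <- select_rows_se(ranked, "row_number <= 2")

cat(format(top2))
```

Wrapping these two steps in a function with parameters for the grouping columns, ordering columns, and k gives a reusable fragment of the kind described above.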

A piped query generator based on Edgar F. Codd's relational
algebra, and on production experience using 'SQL' and 'dplyr' at big data
scale.  The design represents an attempt to make 'SQL' more teachable by
denoting composition by a sequential pipeline notation instead of nested
queries or functions.   The implementation delivers reliable high
performance data processing on large data systems such as 'Spark',
databases, and 'data.table'. Package features include: data processing trees
or pipelines as observable objects (able to report both columns
produced and columns used), optimized 'SQL' generation as an explicit
user visible table modeling step, plus explicit query reasoning and checking.


pick_top_k: Build an optree pipeline that selects up to the top k rows from each group in the given order.
