normalize_cols

This is an example of building a pre-prepared pipeline fragment from relop nodes.

A piped query generator based on Edgar F. Codd's relational
algebra and on production experience using 'SQL' and 'dplyr' at big data
scale. The design represents an attempt to make 'SQL' more teachable by
denoting composition with a sequential pipeline notation instead of nested
queries or functions. The implementation delivers reliable, high-performance
data processing on large data systems such as 'Spark',
databases, and 'data.table'. Package features include: data processing trees
or pipelines as observable objects (able to report both columns
produced and columns used), optimized 'SQL' generation as an explicit
user-visible table modeling step, and explicit query reasoning and checking.

John Mount

rquery

Relational Query Generator for Data Manipulation at Scale

 Win-Vector LLC

normalize_cols function

<dl><dt>source</dt>
<dd>relop tree or data.frame source.</dd>
<dt>columns</dt>
<dd>character, columns to normalize.</dd>
<dt>...</dt>
<dd>force later arguments to bind by name.</dd>
<dt>partitionby</dt>
<dd>column names that define the partitions (window function partitioning).</dd>
<dt>env</dt>
<dd>environment to look for values in.</dd></dl>

Arguments

Build an optree pipeline that normalizes a set of columns so each column sums to one in each partition.



Examples
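
The normalization described on this page divides each listed column by its per-partition sum, so the column sums to one within every partition. The sketch below illustrates those semantics in base R only. It is not rquery's implementation and does not build an optree. The data frame `d` and its column names are hypothetical, and the variables `columns` and `partitionby` simply mirror the documented argument names.

```r
# Base R sketch of the normalization semantics (assumption: this mirrors
# the documented behavior; it is not how rquery itself is implemented).

# Hypothetical example data: one partition column g, two numeric columns.
d <- data.frame(
  g = c("a", "a", "b", "b", "b"),
  x = c(1, 3, 2, 2, 4),
  y = c(5, 5, 1, 2, 1)
)

columns <- c("x", "y")   # columns to normalize
partitionby <- "g"       # column that defines the partitions

normalized <- d
for (col in columns) {
  # ave() returns the per-partition sum, aligned row by row with d
  partition_sum <- ave(d[[col]], d[[partitionby]], FUN = sum)
  normalized[[col]] <- d[[col]] / partition_sum
}

print(normalized)

# Check: each normalized column sums to one within each partition.
check <- aggregate(normalized[columns], by = normalized[partitionby], FUN = sum)
print(check)
```

In rquery the same intent would be expressed as a pipeline step taking `columns` and `partitionby`, which lets the work be translated to 'SQL' window functions instead of being computed in memory as above.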