shard (version 0.1.0)

stream_reduce: Stream over row-groups/datasets and reduce

Description

Applies f() to each partition (row-group) and combines results with combine() into a single accumulator. This keeps peak memory bounded by the largest single partition (plus your accumulator).
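The streaming-fold semantics can be sketched in base R (this is an illustrative model, not the shard implementation): each element of `partitions` stands in for a row-group, and only one partition plus the accumulator is held at a time.

```r
# Stand-ins for two row-groups; in shard these would be read lazily.
partitions <- list(data.frame(x = 1:3), data.frame(x = 4:5))

# Minimal sketch of the stream_reduce contract.
stream_reduce_sketch <- function(parts, f, init, combine, ...) {
  acc <- init
  for (p in parts) {
    acc <- combine(acc, f(p, ...))  # fold each per-partition value into acc
  }
  acc
}

stream_reduce_sketch(partitions, f = nrow, init = 0L, combine = `+`)
# 5: 3 rows + 2 rows
```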

Usage

stream_reduce(x, f, init, combine, ...)

# S3 method for shard_row_groups
stream_reduce(x, f, init, combine, ...)

# S3 method for shard_dataset
stream_reduce(x, f, init, combine, ...)

Value

The final accumulator value after processing all partitions.

Arguments

x

A shard_row_groups or shard_dataset handle.

f

A function with signature f(chunk, ...) that returns a per-partition value.

init

Initial accumulator value.

combine

A function with signature combine(acc, value) that folds a per-partition value into the accumulator and returns the updated accumulator.

...

Additional arguments passed on to f().
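Because combine() receives whatever f() returns, the accumulator can be a composite value. A hedged sketch using base R's Reduce() (names and the list-of-data-frames partitions are illustrative, not shard API): carry a running sum and row count, then finish with one division to get a streaming mean.

```r
# Stand-ins for row-groups.
parts <- list(data.frame(x = c(1, 2)), data.frame(x = c(3, 4, 5)))

# Per-partition value: partial sum and row count.
f <- function(chunk) list(s = sum(chunk$x), n = nrow(chunk))

# Fold two partial results together.
combine <- function(acc, v) list(s = acc$s + v$s, n = acc$n + v$n)

acc <- Reduce(combine, lapply(parts, f), init = list(s = 0, n = 0L))
acc$s / acc$n  # streaming mean: 3
```

The same f/init/combine triple would plug directly into stream_reduce() over a shard handle.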

Examples

# \donttest{
s <- schema(x = float64())                       # one float64 column
sink <- table_sink(s, mode = "row_groups")       # write in row-group mode
table_write(sink, 1L, data.frame(x = rnorm(5)))  # one 5-row partition
rg <- table_finalize(sink)                       # shard_row_groups handle

# Count rows: nrow() per partition, summed by `+`, starting from 0L.
total <- stream_reduce(rg, f = nrow, init = 0L, combine = `+`)
total  # 5
# }