sdf_bind

sdf_bind_rows

sdf_bind_cols

Spark tbls to combine.
Each argument can either be a Spark DataFrame or a list of
 Spark DataFrames.
When row-binding, columns are matched by name, and any missing
 columns will be filled with NA.
When column-binding, rows are matched by position, so all data
 frames must have the same number of rows.
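The matching rules above can be illustrated with a minimal sketch. It assumes a local Spark installation is available for `spark_connect(master = "local")`; the data and the table names (`a_tbl`, `b_tbl`, `k_tbl`) are hypothetical and chosen only for illustration.

```r
library(sparklyr)

# Assumption: a local Spark installation exists (for example via spark_install()).
sc <- spark_connect(master = "local")

# Hypothetical data: `a` and `b` share only column `x`;
# `a` and `k` have the same number of rows.
a <- sdf_copy_to(sc, data.frame(x = 1:2, y = c("a", "b"), stringsAsFactors = FALSE),
                 name = "a_tbl", overwrite = TRUE)
b <- sdf_copy_to(sc, data.frame(x = 3:4, z = c(TRUE, FALSE)),
                 name = "b_tbl", overwrite = TRUE)
k <- sdf_copy_to(sc, data.frame(w = c(10, 20)),
                 name = "k_tbl", overwrite = TRUE)

# Row-binding: columns are matched by name; `y` and `z` are filled
# with NA in the rows that come from the input lacking them.
rows_local <- dplyr::collect(sdf_bind_rows(a, b))

# Column-binding: rows are matched by position, so both inputs
# must have the same number of rows.
cols_local <- dplyr::collect(sdf_bind_cols(a, k))

print(rows_local)
print(cols_local)

spark_disconnect(sc)
```

Collecting the results before disconnecting keeps local copies available for inspection after the Spark session is closed.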

Data frame identifier.
When <code>id</code> is supplied, a new column of identifiers is
 created to link each row to its original Spark DataFrame. The labels
 are taken from the named arguments to <code>sdf_bind_rows()</code>. When a
 list of Spark DataFrames is supplied, the labels are taken from the
 names of the list. If no names are found a numeric sequence is
 used instead.
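As a rough sketch of how the <code>id</code> labels are derived, the example below binds the same two Spark DataFrames three ways. It assumes a local Spark installation; the data, the table names, the labels `jan` and `feb`, and the column name `"source"` are all hypothetical.

```r
library(sparklyr)

# Assumption: a local Spark installation exists (for example via spark_install()).
sc <- spark_connect(master = "local")

# Hypothetical inputs with identical columns.
first  <- sdf_copy_to(sc, data.frame(x = 1:2), name = "first_tbl",  overwrite = TRUE)
second <- sdf_copy_to(sc, data.frame(x = 3:4), name = "second_tbl", overwrite = TRUE)

# Labels taken from the named arguments.
by_args <- dplyr::collect(sdf_bind_rows(jan = first, feb = second, id = "source"))

# Labels taken from the names of the list.
by_list <- dplyr::collect(sdf_bind_rows(list(jan = first, feb = second), id = "source"))

# No names supplied: a numeric sequence is used for the labels.
unnamed <- dplyr::collect(sdf_bind_rows(first, second, id = "source"))

print(by_args)
print(by_list)
print(unnamed)

spark_disconnect(sc)
```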

<code>sdf_bind_rows()</code> and <code>sdf_bind_cols()</code> are implementations of the common pattern of
<code>do.call(rbind, sdfs)</code> or <code>do.call(cbind, sdfs)</code> for binding many
Spark DataFrames into one.
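For readers unfamiliar with the base R pattern being mirrored, here is a small sketch that uses ordinary local data frames in place of Spark DataFrames. The data is hypothetical; with Spark DataFrames, the analogous calls would be `sdf_bind_rows(sdfs)` and `sdf_bind_cols(sdfs)`.

```r
# Hypothetical local data frames standing in for Spark DataFrames.
dfs <- list(
  data.frame(x = 1:2, y = c("a", "b"), stringsAsFactors = FALSE),
  data.frame(x = 3:4, y = c("c", "d"), stringsAsFactors = FALSE)
)

# Bind many data frames by row: do.call() spreads the list
# elements out as separate arguments to rbind().
combined_rows <- do.call(rbind, dfs)

# Bind by column: the inputs must have the same number of rows.
parts <- list(data.frame(x = 1:2), data.frame(z = c(TRUE, FALSE)))
combined_cols <- do.call(cbind, parts)

print(combined_rows)
print(combined_cols)
```

Unlike base `rbind()`, which requires matching column names for data frames, the row-binding behavior documented here fills missing columns with NA.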

R interface to Apache Spark, a fast and general engine for big data
processing, see <http://spark.apache.org>. This package supports connecting to
local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end,
and provides an interface to Spark's built-in machine learning algorithms.

Javier Luraschi

sparklyr

sdf_bind: Bind multiple Spark DataFrames by row and column

Description

Usage

Arguments

Value

Details