sdf_with_sequential_id

<code>x</code>: A <code>spark_connection</code>, <code>ml_pipeline</code>, or a <code>tbl_spark</code>.

<code>id</code>: The name of the column to host the generated IDs.

<code>from</code>: The starting value of the ID column.

Add a sequential ID column to a Spark DataFrame. The IDs are produced with
Spark's <code>zipWithIndex</code> function. Unlike
<code>sdf_with_unique_id</code>, the generated IDs are consecutive and do not
depend on how the data is partitioned.
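
The following is a minimal usage sketch rather than an official example. It assumes a local Spark installation is available (for instance one installed with <code>spark_install()</code>); the table name <code>mtcars_seq</code> and the column name <code>row_id</code> are illustrative choices, not part of the documented interface.

```r
library(sparklyr)

# Assumption: Spark is installed locally and can be started in local mode.
sc <- spark_connect(master = "local")

# Copy a small data frame to Spark and spread it over several partitions,
# so that the IDs are generated across more than one partition.
cars_tbl <- sdf_copy_to(
  sc, mtcars,
  name = "mtcars_seq",   # illustrative table name
  repartition = 4,
  overwrite = TRUE
)

# Add a sequential ID column named "row_id", starting at 1.
with_ids <- sdf_with_sequential_id(cars_tbl, id = "row_id", from = 1L)

# Bring the generated IDs back into R and sort them for inspection.
ids <- sort(dplyr::pull(with_ids, row_id))

spark_disconnect(sc)
```

With <code>from = 1L</code>, the sorted IDs should run from 1 up to the number of rows, whatever the number of partitions. The order in which rows receive their IDs follows Spark's internal row order, so it need not match the order of the original R data frame.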

R interface to Apache Spark, a fast and general engine for big data
processing, see <http://spark.apache.org>. This package supports connecting to
local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end,
and provides an interface to Spark's built-in machine learning algorithms.

Javier Luraschi

sparklyr

R Interface to Apache Spark

sdf_with_sequential_id: Add a Sequential ID Column to a Spark DataFrame
