spark_read_delta

path
The path to the Delta table. Must be accessible from the cluster.
Supports the <samp>"hdfs://"</samp>, <samp>"s3a://"</samp> and <samp>"file://"</samp> protocols.

name
The name to assign to the newly generated table.

version
The version of the Delta table to read.

timestamp
The timestamp of the Delta table to read. For example,
<code>"2019-01-01"</code> or <code>"2019-01-01T00:00:00.000Z"</code>.

options
A list of strings with additional options.

repartition
The number of partitions used to distribute the
generated table. Use 0 (the default) to avoid partitioning.

memory
Boolean; should the data be loaded eagerly into memory? (That
is, should the table be cached?)

overwrite
Boolean; should an existing table with the given name be
overwritten?

...
Optional arguments; currently unused.

Read from Delta Lake into a Spark DataFrame.
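
The following is a minimal sketch of how the arguments fit together, not an official example. It assumes a local Spark installation and that the Delta Lake package can be attached with <code>packages = "delta"</code> in <code>spark_connect()</code>. The path <code>delta_example</code> and the table names are hypothetical. It writes two versions of a small table with <code>spark_write_delta()</code>, then reads back the latest version and, through <code>version</code>, the first one.

```r
library(sparklyr)

# Assumption: Spark is installed locally and the Delta Lake package is
# available to the session via packages = "delta".
sc <- spark_connect(master = "local", packages = "delta")

# Hypothetical location for the example Delta table.
delta_path <- file.path(tempdir(), "delta_example")

# Write version 0 (3 rows), then overwrite it to create version 1 (5 rows).
sdf_v0 <- sdf_copy_to(sc, data.frame(id = 1:3), name = "src_v0", overwrite = TRUE)
spark_write_delta(sdf_v0, path = delta_path, mode = "overwrite")

sdf_v1 <- sdf_copy_to(sc, data.frame(id = 1:5), name = "src_v1", overwrite = TRUE)
spark_write_delta(sdf_v1, path = delta_path, mode = "overwrite")

# Read the latest version of the table and register it as "delta_latest".
latest <- spark_read_delta(sc, path = delta_path, name = "delta_latest")

# Read the first version of the same table by passing `version`.
first <- spark_read_delta(sc, path = delta_path, name = "delta_v0", version = 0L)

# Row counts of each snapshot, collected before the connection is closed.
n_latest <- sdf_nrow(latest)
n_first <- sdf_nrow(first)

spark_disconnect(sc)
```

Passing <code>timestamp</code> instead of <code>version</code> selects a snapshot by time rather than by version number. The two are alternative ways to pick a snapshot, so a call would normally set only one of them.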

