spark_write_parquet

x

A Spark DataFrame or dplyr operation.

path

The path to the file. It must be accessible from the cluster.
Supports the <samp>"hdfs://"</samp>, <samp>"s3n://"</samp> and <samp>"file://"</samp> protocols.

mode

Specifies the behavior when the data or table already exists. Supported values include <samp>"error"</samp>, <samp>"append"</samp>, <samp>"overwrite"</samp> and <samp>"ignore"</samp>.

options

A list of strings with additional options. See <a href="http://spark.apache.org/docs/latest/sql-programming-guide.html#configuration">http://spark.apache.org/docs/latest/sql-programming-guide.html#configuration</a>.

partition_by

Partitions the output by the given columns on the file system.

...

Optional arguments; currently unused.

Serialize a Spark DataFrame to the
<a href="https://parquet.apache.org/">Parquet</a> format.
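
The call can be sketched as follows. This is a minimal illustration rather than an official example: it assumes sparklyr and dplyr are installed and that a local Spark installation is available. The table names (`mtcars_tbl`, `mtcars_back`) and the output path are hypothetical choices made for the example.

```r
# Sketch only: assumes sparklyr, dplyr and a local Spark installation.
library(sparklyr)
library(dplyr)

# Connect to a local Spark instance and copy a small R data frame into it.
sc <- spark_connect(master = "local")
cars_tbl <- copy_to(sc, mtcars, "mtcars_tbl", overwrite = TRUE)

# Hypothetical output location; on a real cluster this would typically be
# an "hdfs://" or "s3n://" path that every node can reach.
out_path <- file.path(tempdir(), "mtcars_parquet")

# Write the Spark DataFrame as Parquet, replacing any existing output and
# creating one subdirectory per distinct value of `cyl`.
spark_write_parquet(
  cars_tbl,
  path = out_path,
  mode = "overwrite",
  partition_by = "cyl"
)

# Read the files back to confirm the round trip, then count the rows.
cars_back <- spark_read_parquet(sc, name = "mtcars_back", path = out_path)
sdf_nrow(cars_back)

spark_disconnect(sc)
```

Setting `mode = "overwrite"` keeps the sketch re-runnable; omit it to leave `mode` at its default instead.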

R interface to Apache Spark, a fast and general engine for big data
processing; see <http://spark.apache.org>. This package supports connecting to
local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end,
and provides an interface to Spark's built-in machine learning algorithms.

Javier Luraschi

sparklyr

R Interface to Apache Spark

Kevin Ushey

JJ Allaire

 RStudio

 The Apache Software Foundation


spark_write_parquet: Write a Spark DataFrame to a Parquet file
