write.df: Save the contents of a SparkDataFrame to a data source.

Description

The data source is specified by the source and a set of options (...). If source is not specified, the default data source configured by spark.sql.sources.default will be used.
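As a minimal sketch of the fallback described above, the following sets spark.sql.sources.default when the session is created and then calls write.df without a source. The output location `out` and the choice of "json" are assumptions made only for this illustration, and the sketch assumes a fresh session so that the configuration takes effect.

```r
library(SparkR)

# Assumption for illustration: make JSON the default data source for this session.
sparkR.session(sparkConfig = list(spark.sql.sources.default = "json"))

# Hypothetical output location in the R temporary directory.
out <- file.path(tempdir(), "write_df_default_source")

df <- createDataFrame(data.frame(id = 1:3))

# No source is given, so the configured default data source is used.
write.df(df, path = out, mode = "overwrite")

# Reading back with the same source confirms the format that was written.
roundtrip <- read.df(out, source = "json")
```

Passing source explicitly, as in write.df(df, out, source = "parquet"), takes precedence over the configured default.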

Usage

write.df(df, path = NULL, ...)
saveDF(df, path, source = NULL, mode = "error", ...)
# S4 method for SparkDataFrame
write.df(df, path = NULL, source = NULL,
  mode = "error", ...)
# S4 method for SparkDataFrame,character
saveDF(df, path, source = NULL,
  mode = "error", ...)

Arguments

df

a SparkDataFrame.

path

the path where the contents of the SparkDataFrame are saved.

...

additional argument(s) passed to the method.

source

a name for external data source.

mode

the save mode: one of 'append', 'overwrite', 'error', or 'ignore' ('error' by default).

Details

The mode argument specifies the behavior of the save operation when data already exists in the data source. There are four modes:

append: Contents of this SparkDataFrame are expected to be appended to existing data.
overwrite: Existing data is expected to be overwritten by the contents of this SparkDataFrame.
error: An exception is expected to be thrown if data already exists.
ignore: The save operation is expected to not save the contents of the SparkDataFrame and to not change the existing data.
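The four modes can be exercised against a single location, as in the sketch below. The path `out` and the three-row data frame are hypothetical and chosen only for illustration; the sketch assumes a local Spark installation that can write Parquet to the R temporary directory.

```r
library(SparkR)
sparkR.session()

# Hypothetical output location used only for this illustration.
out <- file.path(tempdir(), "write_df_modes")

df <- createDataFrame(data.frame(id = 1:3))

# overwrite: creates the data, or replaces whatever is already at `out`.
write.df(df, out, source = "parquet", mode = "overwrite")

# append: adds the rows of df to the data written above.
write.df(df, out, source = "parquet", mode = "append")

# ignore: data already exists, so nothing is written and nothing changes.
write.df(df, out, source = "parquet", mode = "ignore")

# error (the default): data already exists, so the save fails.
res <- tryCatch(
  {
    write.df(df, out, source = "parquet", mode = "error")
    "saved"
  },
  error = function(e) "failed"
)

# Number of rows now stored at `out`.
n <- nrow(read.df(out, source = "parquet"))
```

Wrapping the default mode in tryCatch keeps a script running when the target already holds data; whether to fall back to 'append' or 'overwrite' at that point depends on the use case.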

See Also

Other SparkDataFrame functions: SparkDataFrame-class, agg, arrange, as.data.frame, attach, cache, coalesce, collect, colnames, coltypes, createOrReplaceTempView, crossJoin, dapplyCollect, dapply, describe, dim, distinct, dropDuplicates, dropna, drop, dtypes, except, explain, filter, first, gapplyCollect, gapply, getNumPartitions, group_by, head, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate, ncol, nrow, persist, printSchema, randomSplit, rbind, registerTempTable, rename, repartition, sample, saveAsTable, schema, selectExpr, select, showDF, show, storageLevel, str, subset, take, union, unpersist, withColumn, with, write.jdbc, write.json, write.orc, write.parquet, write.text

Examples

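# NOT RUN {
library(SparkR)
sparkR.session()
path <- "path/to/file.json"
df <- read.json(path)
# Save as Parquet, replacing any data that already exists at "myfile".
write.df(df, "myfile", "parquet", "overwrite")
# parquetPath2, saveMode and mergeSchema are placeholders; define them before running.
saveDF(df, parquetPath2, "parquet", mode = saveMode, mergeSchema = mergeSchema)
# }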
