Applies an R function to a Spark object (typically, a Spark DataFrame).
spark_apply(x, f, columns = colnames(x), memory = TRUE, group_by = NULL,
  packages = TRUE, ...)

x: An object (usually a spark_tbl) coercible to a Spark DataFrame.
f: A function that transforms a data frame partition into a data frame. The function f has signature f(df, group1, group2, ...), where df is a data frame with the data to be processed and group1 to groupN contain the values of the group_by columns. When group_by is not specified, f takes only one argument.
columns: A vector of column names or a named vector of column types for the transformed object. Defaults to the names from the original object and adds indexed column names when not enough columns are specified.
memory: Boolean; should the table be cached into memory?
group_by: Column name used to group by data frame partitions.
packages: Boolean; distribute .libPaths() packages to nodes?
...: Optional arguments; currently unused.
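
The two shapes of f described above (one argument without group_by, an extra argument per grouping column with it) can be sketched as follows. This is a minimal illustration, not taken from the package documentation: the names add_ratio, mean_mpg_by_group, sample_df and run_on_spark are hypothetical, and the local split()/Map() step only imitates how partitions might be handed to f. The Spark calls are wrapped in a function that is never invoked here, since they assume an open connection sc and the sparklyr package.

```r
# Ungrouped case: f receives only the data frame partition.
add_ratio <- function(df) {
  df$ratio <- df$mpg / df$cyl
  df
}

# Grouped case: f receives the partition plus one value per group_by column.
mean_mpg_by_group <- function(df, cyl) {
  data.frame(cyl = cyl, mean_mpg = mean(df$mpg))
}

# Small local data frame standing in for a Spark DataFrame (hypothetical data).
sample_df <- data.frame(mpg = c(20, 24, 30), cyl = c(4, 4, 6))

# Local stand-in for the ungrouped call: apply f to the whole frame.
ungrouped_local <- add_ratio(sample_df)

# Local stand-in for the grouped call: split by cyl, then call f(df, cyl)
# once per group and bind the results together.
pieces <- split(sample_df, sample_df$cyl)
grouped_local <- do.call(
  rbind,
  Map(mean_mpg_by_group, pieces, as.numeric(names(pieces)))
)

# Sketch of the corresponding Spark calls. Assumes `sc` is an open
# sparklyr connection; defined but not called here.
run_on_spark <- function(sc) {
  cars_tbl <- sparklyr::sdf_copy_to(sc, sample_df, name = "cars_tbl",
                                    overwrite = TRUE)
  # columns names the output, since f adds a column to each partition.
  ungrouped <- sparklyr::spark_apply(cars_tbl, add_ratio,
                                     columns = c("mpg", "cyl", "ratio"))
  # With group_by, f is called with the group value as a second argument.
  # Output column naming is left at the default here and is an assumption.
  grouped <- sparklyr::spark_apply(cars_tbl, mean_mpg_by_group,
                                   group_by = "cyl")
  list(ungrouped = ungrouped, grouped = grouped)
}
```

Developing f against a plain data frame first, as in the local step, makes it easier to check the shape of the returned data frame before passing the same function to spark_apply.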