Construct a pivot table over a Spark DataFrame, using a syntax similar to that of reshape2::dcast().
sdf_pivot(x, formula, fun.aggregate = "count")
A spark_connection, ml_pipeline, or tbl_spark.
A two-sided R formula of the form x_1 + x_2 + ... ~ y_1.
The left-hand side of the formula indicates which variables are used for grouping,
and the right-hand side indicates which variable is used for pivoting. Currently,
only a single pivot column is supported.
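As a minimal sketch of the formula syntax, assuming a local Spark installation and using the built-in iris dataset (sparklyr replaces dots in column names with underscores, so Petal.Width becomes Petal_Width):

```r
library(sparklyr)

# Assumes Spark is installed locally
sc <- spark_connect(master = "local")
iris_tbl <- copy_to(sc, iris, "iris_spark", overwrite = TRUE)

# Group by Petal_Width (left-hand side) and pivot on Species
# (right-hand side). Each distinct Species value becomes a column,
# holding the row count for that group under the default aggregation.
sdf_pivot(iris_tbl, Petal_Width ~ Species)
```

With the default fun.aggregate = "count", the resulting table has one row per Petal_Width value and one column per Species level.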
How should the grouped dataset be aggregated? Can be: a length-one character vector, giving the name of a Spark aggregation function to be called; a named R list mapping column names to aggregation methods; or an R function that is invoked on the grouped dataset.
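The two non-default forms of fun.aggregate can be sketched as follows. This assumes sc is an open spark_connection and iris_tbl is the iris dataset already copied to Spark (these names, and the choice of countDistinct, are illustrative):

```r
# Named R list: aggregate Petal_Length by its mean within each
# pivoted cell.
sdf_pivot(iris_tbl, Petal_Width ~ Species,
          fun.aggregate = list(Petal_Length = "mean"))

# R function: invoked on the grouped dataset (a Spark
# RelationalGroupedDataset). Here it counts distinct Petal_Length
# values per cell via Spark SQL's countDistinct.
fun_agg <- function(gdf) {
  expr <- invoke_static(
    spark_connection(gdf),
    "org.apache.spark.sql.functions",
    "countDistinct",
    "Petal_Length"
  )
  gdf %>% invoke("agg", expr, list())
}
sdf_pivot(iris_tbl, Petal_Width ~ Species, fun.aggregate = fun_agg)
```

The function form gives full control: it receives the grouped dataset and must return an aggregated Spark DataFrame, so any JVM-side aggregation expression can be applied.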