sparklyr (version 1.5.0)

sdf_quantile: Compute (Approximate) Quantiles with a Spark DataFrame

Description

Given a numeric column in a Spark DataFrame, compute approximate quantiles to within a specified relative error.

Usage

sdf_quantile(
  x,
  column,
  probabilities = c(0, 0.25, 0.5, 0.75, 1),
  relative.error = 1e-05
)

Arguments

x

A spark_connection, ml_pipeline, or a tbl_spark.

column

The column(s) for which quantiles should be computed. Multiple columns are only supported in Spark 2.0+.

probabilities

A numeric vector of probabilities, each in [0, 1], for which quantiles should be computed. For example, 0 gives the minimum, 0.5 the median, and 1 the maximum.

relative.error

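The relative error to tolerate in the computed quantiles (must be >= 0). Lower values give more precise quantiles at the cost of more computation. In Spark's underlying approxQuantile, a value of 0 computes exact quantiles, which can be very expensive on large data.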
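
Examples

The arguments above can be exercised with a short session like the following. This is a minimal sketch rather than an official example: it assumes a local Spark installation is available (for instance via spark_install()), and the table name "mtcars_tbl" and the variable names are arbitrary choices made for illustration.

```r
library(sparklyr)

# Assumption: Spark is installed locally and can be started in local mode.
sc <- spark_connect(master = "local")

# Copy a built-in R data set into Spark; the table name is arbitrary.
mtcars_tbl <- copy_to(sc, mtcars, "mtcars_tbl", overwrite = TRUE)

# Default probabilities (0, 0.25, 0.5, 0.75, 1) and default relative error.
quartiles <- sdf_quantile(mtcars_tbl, "mpg")
print(quartiles)

# Only the median and the 90th percentile, with a looser error tolerance
# to trade precision for speed.
upper <- sdf_quantile(
  mtcars_tbl,
  "mpg",
  probabilities = c(0.5, 0.9),
  relative.error = 0.01
)
print(upper)

spark_disconnect(sc)
```

On a data set this small the choice of relative.error makes little practical difference; the looser tolerance is mainly useful on large tables, where it reduces the work Spark has to do.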