SparkR (version 2.1.2)

cume_dist: cume_dist

Description

Window function: returns the cumulative distribution of values within a window partition, i.e. the fraction of rows in the partition that are at or below the current row in the window ordering.

Usage

cume_dist(x = "missing")

# S4 method for missing
cume_dist()

Arguments

x

empty. Should be used with no argument.

Details

N = total number of rows in the partition
cume_dist(x) = number of values before (and including) x / N

This is equivalent to the CUME_DIST function in SQL.
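
As a rough illustration of the formula above, the following base R sketch computes the same fraction by hand for a single partition, without Spark. The helper manual_cume_dist and the sample vector hp are hypothetical names introduced here for illustration; they are not part of SparkR.

```r
# Hypothetical helper (not part of SparkR): cumulative distribution of a
# numeric vector treated as one window partition ordered by its own values.
manual_cume_dist <- function(values) {
  n <- length(values)  # N = total number of rows in the partition
  # For each value, count the rows at or before it in the ordering (ties included)
  vapply(values, function(v) sum(values <= v) / n, numeric(1))
}

hp <- c(93, 110, 110, 175)  # made-up sample values for illustration
result <- manual_cume_dist(hp)
print(result)
# [1] 0.25 0.75 0.75 1.00
```

The two tied values both get 0.75, because each counts the other among the values "before (and including)" it. The last row of a partition always gets 1.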

See Also

Other window_funcs: dense_rank, lag, lead, ntile, percent_rank, rank, row_number

Examples

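# NOT RUN {
  # Assumes SparkR is attached and a Spark session is active, e.g. via sparkR.session()
  df <- createDataFrame(mtcars)
  # Partition rows by transmission type (am) and order each partition by horsepower (hp)
  ws <- orderBy(windowPartitionBy("am"), "hp")
  # Compute the cumulative distribution of hp within each am partition
  out <- select(df, over(cume_dist(), ws), df$hp, df$am)
# }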
