dplyr (version 0.7.3)

ranking: Windowed rank functions.

Description

Six variations on ranking functions, mimicking the ranking functions described in SQL2003. They are currently implemented using the built-in rank() function and are provided mainly as a convenience when converting between R and SQL. All ranking functions map the smallest inputs to the smallest outputs; use desc() to reverse the direction.
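As a minimal sketch of reversing the direction with desc(), the snippet below ranks a small vector both ways. The vector x and the names asc and dsc are made up for illustration.

```r
library(dplyr)

x <- c(10, 30, 20, 20)  # illustrative data, not from the package

# Default direction: the smallest value gets rank 1.
asc <- min_rank(x)        # 1 4 2 2

# Reversed direction: wrap the input in desc() so the largest value gets rank 1.
dsc <- min_rank(desc(x))  # 4 1 2 2
```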

Usage

row_number(x)

ntile(x, n)

min_rank(x)

dense_rank(x)

percent_rank(x)

cume_dist(x)

Arguments

x

a vector of values to rank. Missing values are left as is. To treat them as the smallest or largest values, replace them with -Inf or Inf, respectively, before ranking.

n

number of groups to split x into.
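The handling of missing values described for x can be sketched as follows. This is only an illustration: the vector and the helper names na_kept, na_smallest and na_largest are assumptions, and replace() from base R is just one of several ways to substitute the missing values.

```r
library(dplyr)

x <- c(5, 1, NA, 3)  # illustrative data with one missing value

# Default: the missing value stays missing in the result.
na_kept <- min_rank(x)                                 # 3 1 NA 2

# Treat NA as the smallest value by replacing it with -Inf first.
na_smallest <- min_rank(replace(x, is.na(x), -Inf))    # 4 2 1 3

# Treat NA as the largest value by replacing it with Inf first.
na_largest <- min_rank(replace(x, is.na(x), Inf))      # 3 1 4 2
```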

Details

  • row_number(): equivalent to rank(ties.method = "first")

  • min_rank(): equivalent to rank(ties.method = "min")

  • dense_rank(): like min_rank(), but with no gaps between ranks

  • percent_rank(): a number between 0 and 1 computed by rescaling min_rank to [0, 1]

  • cume_dist(): a cumulative distribution function; the proportion of all values less than or equal to the current value.

  • ntile(): a rough rank, which breaks the input vector into n buckets.
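The relationships listed above can be checked by hand on a small vector. The sketch below assumes an input without missing values, and the *_by_hand expressions are illustrative reconstructions rather than the package's own implementation.

```r
library(dplyr)

x <- c(5, 1, 3, 2, 2)  # no missing values, to keep the comparison simple

# row_number() and min_rank() compared with base rank()
same_first <- all(row_number(x) == rank(x, ties.method = "first"))  # TRUE
same_min   <- all(min_rank(x) == rank(x, ties.method = "min"))      # TRUE

# dense_rank(): position among the sorted distinct values, so ranks have no gaps
dense_by_hand <- match(x, sort(unique(x)))           # 4 1 3 2 2

# percent_rank(): min_rank() rescaled to [0, 1]
pct_by_hand <- (min_rank(x) - 1) / (length(x) - 1)   # 1.00 0.00 0.75 0.25 0.25

# cume_dist(): share of values less than or equal to each value
cume_by_hand <- sapply(x, function(v) mean(x <= v))  # 1.0 0.2 0.8 0.6 0.6

# ntile(): ten ordered values split into five buckets
buckets <- ntile(1:10, 5)                            # 1 1 2 2 3 3 4 4 5 5
```

Comparing each hand-computed vector with the corresponding dplyr call, for example with all.equal(pct_by_hand, percent_rank(x)), is a quick way to confirm the definitions on your own data.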

Examples

library(dplyr)

x <- c(5, 1, 3, 2, 2, NA)
row_number(x)    # 5 1 4 2 3 NA
min_rank(x)      # 5 1 4 2 2 NA
dense_rank(x)    # 4 1 3 2 2 NA
percent_rank(x)  # 1.00 0.00 0.75 0.25 0.25 NA
cume_dist(x)     # 1.0 0.2 0.8 0.6 0.6 NA

# Break x into 2 buckets, then 100 random values into 10 buckets
ntile(x, 2)
ntile(runif(100), 10)

# row_number() can be used with single table verbs without specifying x
# (for data frames and databases that support windowing)
mutate(mtcars, row_number() == 1L)
mtcars %>% filter(between(row_number(), 1, 10))
