fastplyr (version 0.5.1)

group_id: Fast group and row IDs

Description

These are tidy-based functions for calculating group IDs and row IDs.

  • group_id() returns an integer vector of group IDs the same length as x.

  • row_id() returns an integer vector of row IDs.

  • f_consecutive_id() returns an integer vector of consecutive run IDs.

The add_ variants add a column of group IDs, row IDs or consecutive IDs to a data frame.

Usage

group_id(x, order = TRUE, ascending = TRUE, as_qg = FALSE)

row_id(x, ascending = TRUE)

f_consecutive_id(x)

Value

An integer vector, or a collapse "qG" object when as_qg is TRUE.

Arguments

x

A vector or data frame.

order

Should the groups be ordered? When order is TRUE (the default), group IDs are assigned following the sorted order of the groups. The output itself is not sorted: it stays aligned with x.
If FALSE, group IDs are assigned in order of first appearance.

ascending

Should the order be ascending or descending? The default is TRUE.
For row_id() this determines if the row IDs are in increasing or decreasing order.

as_qg

Should the group IDs be returned as a collapse "qG" class? The default (FALSE) always returns an integer vector.

Details

Note - When working with data frames it is highly recommended to use the add_ variants of these functions. Not only are they more intuitive to use, they also have optimisations for large numbers of groups.

group_id

This assigns an integer value to each unique element of a vector or each unique row of a data frame. It lets you compress the information in several columns into a single ID column and use that column in further operations.
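The ordering rules and the "one ID per unique row" idea can be illustrated with a small base R sketch. It only mimics the semantics described above and is not fastplyr's implementation; the helper sketch_group_id() and the data frame df are hypothetical names used for illustration.

```r
# Hypothetical helper: base R sketch of the group ID semantics described above.
sketch_group_id <- function(x, order = TRUE) {
  u <- unique(x)
  if (order) {
    u <- sort(u)  # IDs follow the sorted order of the unique values
  }
  match(x, u)     # output stays aligned with x; x itself is never sorted
}

x <- c("b", "a", "c", "a", "b")
ids_sorted <- sketch_group_id(x, order = TRUE)   # 2 1 3 1 2
ids_first  <- sketch_group_id(x, order = FALSE)  # 1 2 3 2 1

# Compress two columns into one ID (first-appearance order),
# then use that ID in a further operation.
df <- data.frame(g1 = c("a", "a", "b", "b", "a"),
                 g2 = c(1, 1, 2, 3, 1),
                 value = c(10, 20, 30, 40, 50))
key <- paste(df$g1, df$g2, sep = "\r")
df$id <- match(key, unique(key))                 # 1 1 2 3 1
totals <- tapply(df$value, df$id, sum)           # 80 30 40
```

With fastplyr loaded, the corresponding calls would be group_id(x) and group_id(x, order = FALSE), following the Usage section.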

row_id

This assigns a row number to each row within its group. To assign plain row numbers to a data frame one can use add_row_id(). This function can be used in rolling calculations, finding duplicates and more.
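As a rough illustration of within-group row numbering and its use for spotting duplicates, here is a base R sketch. The helper sketch_row_id() is hypothetical, and treating ascending = FALSE as a reversed count within each group is an assumption based on the argument description above, not a statement about fastplyr's internals.

```r
# Hypothetical helper: row number within each group, base R sketch.
sketch_row_id <- function(x, ascending = TRUE) {
  number_rows <- if (ascending) {
    function(i) seq_along(i)        # 1, 2, ..., group size
  } else {
    function(i) rev(seq_along(i))   # group size, ..., 2, 1
  }
  ave(seq_along(x), x, FUN = number_rows)
}

x <- c("b", "a", "b", "a", "b")
rid_up   <- sketch_row_id(x)                     # 1 1 2 2 3
rid_down <- sketch_row_id(x, ascending = FALSE)  # 3 2 2 1 1

# Finding duplicates: any occurrence after the first has a row ID above 1.
is_dup <- rid_up > 1L                            # FALSE FALSE TRUE TRUE TRUE
```

With fastplyr loaded, row_id(x) and row_id(x, ascending = FALSE) are the corresponding calls per the Usage section.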

f_consecutive_id

An alternative to dplyr::consecutive_id(), f_consecutive_id() creates an integer vector with values in the range [1, n], where n is the length of the vector or the number of rows of the data frame. The ID increments every time x[i] != x[i - 1], so it marks each change in value. f_consecutive_id() has very little call overhead, making it suitable for repeated calls.
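The increment rule x[i] != x[i - 1] can be sketched in base R as a cumulative sum over change points. The helper sketch_consecutive_id() is a hypothetical name, and the sketch assumes a non-empty atomic vector without missing values; it is not how fastplyr implements the function.

```r
# Hypothetical helper: run IDs via the x[i] != x[i - 1] rule, base R sketch.
# Assumes a non-empty atomic vector without missing values.
sketch_consecutive_id <- function(x) {
  changed <- x[-1L] != x[-length(x)]  # TRUE where a value differs from the previous one
  cumsum(c(TRUE, changed))            # the first element always starts run 1
}

x <- c(1, 1, 2, 2, 1, 3, 3)
run_ids <- sketch_consecutive_id(x)   # 1 1 2 2 3 4 4

# Cross-check against run-length encoding.
r <- rle(x)
run_ids_rle <- rep(seq_along(r$lengths), r$lengths)
```

Note that the value 1 gets two different IDs (1 and 3) because its occurrences are not adjacent, which is what distinguishes a consecutive ID from a group ID.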

See Also

add_group_id, add_row_id, add_consecutive_id