
groupid is an enhanced version of data.table::rleid for atomic vectors. It generates a run-length type group-id where consecutive identical values are assigned the same integer. It generalizes rleid in that it can be applied to unordered vectors (through an ordering vector), can generate group-ids starting from an arbitrary value, and can skip missing values.
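To make the idea of a run-length group-id concrete, here is a minimal base R sketch. The helper name run_length_id is hypothetical, it assumes x has no missing values, and it is not how groupid is implemented; it only shows that a new id starts wherever a value differs from its predecessor.

```r
# Hypothetical helper: a run-length id in base R.
# Assumes x is an atomic vector without missing values.
run_length_id <- function(x) {
  if (length(x) == 0L) return(integer(0))
  # TRUE marks the start of a new run; the cumulative sum numbers the runs.
  cumsum(c(TRUE, x[-1L] != x[-length(x)]))
}

x <- c("a", "a", "b", "b", "a")
run_length_id(x)
# 1 1 2 2 3
```

Note that the second run of "a" gets a new id (3), which is what distinguishes a run-length id from an ordinary grouping by unique value.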
groupid(x, o = NULL, start = 1L, na.skip = FALSE, check.o = TRUE)
x: an atomic vector of any type. Attributes are not considered.
o: an (optional) integer ordering vector specifying the order in which to pass through x.
start: integer. The starting value of the resulting group-id. The default is to start from 1. For C++ programmers, starting from 0 could be a better choice.
na.skip: logical. Skip missing values, i.e. if TRUE, something like groupid(c("a", NA, "a")) gives c(1, NA, 1), whereas FALSE gives c(1, 2, 3).
check.o: logical. Programmer's option: FALSE prevents checking that each element of o is in the range [1, length(x)]; only the length of o is checked. This gives some extra speed, but will terminate R if any element of o is too large or too small.
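The interaction of o, start and na.skip can be sketched in base R as follows. This is an illustration of the documented behaviour only, not the package's implementation: the function name run_id_sketch is hypothetical, it always validates o (roughly the kind of check described for check.o = TRUE), and it assumes that with na.skip = FALSE a missing value is treated as an ordinary value.

```r
# Hypothetical base R sketch of the documented semantics of groupid().
run_id_sketch <- function(x, o = NULL, start = 1L, na.skip = FALSE) {
  n <- length(x)
  if (is.null(o)) o <- seq_len(n)
  # Validate the ordering vector: correct length, every element in [1, n].
  stopifnot(length(o) == n, all(o >= 1L & o <= n))
  v <- x[o]                                  # x visited in the order given by o
  keep <- if (na.skip) !is.na(v) else rep(TRUE, n)
  codes <- match(v[keep], unique(v[keep]))   # integer codes; NA matches NA
  ids <- rep(NA_integer_, n)
  if (length(codes)) {
    change <- c(TRUE, codes[-1L] != codes[-length(codes)])
    ids[keep] <- cumsum(change) + as.integer(start) - 1L
  }
  out <- integer(n)
  out[o] <- ids                              # scatter ids back to the positions in x
  out
}

run_id_sketch(c("a", NA, "a"))                  # 1 2 3
run_id_sketch(c("a", NA, "a"), na.skip = TRUE)  # 1 NA 1
x <- c(2, 1, 2, 1)
run_id_sketch(x, order(x))                      # 2 1 2 1
run_id_sketch(c(5, 5, 6), start = 0)            # 0 0 1
```

In the ordered case the ids are computed along x[o] and then written back to the original positions, so equal values receive the same id even though they are not adjacent in x.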
An integer vector of class 'qG'. See qG.
library(collapse)

## Run-length group-id of an ordered variable
groupid(airquality$Month)
groupid(airquality$Month, start = 0)
groupid(wlddev$country)

## Same result as qG(), since country is alphabetically ordered (groupid is faster)
all.equal(groupid(wlddev$country), qG(wlddev$country, na.exclude = FALSE))

## When the data is unordered, the group-id can be generated through an ordering
uo <- order(rnorm(fnrow(airquality)))  # random permutation of the rows
monthuo <- airquality$Month[uo]        # Month in shuffled order
o <- order(monthuo)                    # ordering that sorts the shuffled vector
groupid(monthuo, o)
identical(groupid(monthuo, o)[o], unattrib(groupid(airquality$Month)))