rsplit (recursively) splits a vector or data frame into subsets according to combinations of (multiple) vectors / factors and returns a (nested) list. If flatten = TRUE, the list is flattened, yielding the same result as split. rsplit is implemented as a wrapper around gsplit, and is significantly faster than split.
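As a rough illustration of the nested versus flattened output described above, the following sketch splits mtcars$mpg by two grouping columns. It assumes the collapse package is installed; the object names (nested, flat, base_split) are made up for this example, and the comparison with split only looks at the number of groups, not at element order.

```r
library(collapse)

# Nested result: one list level per grouping column (cyl, then am)
nested <- rsplit(mtcars$mpg, mtcars[c("cyl", "am")])

# Flattened result: a single-level list, one element per cyl/am combination
flat <- rsplit(mtcars$mpg, mtcars[c("cyl", "am")], flatten = TRUE)

# Base R counterpart; drop = TRUE removes unused combinations
base_split <- split(mtcars$mpg, mtcars[c("cyl", "am")], drop = TRUE)

length(nested)      # number of cyl groups
length(flat)        # number of cyl/am combinations
length(base_split)  # same number of combinations as the flattened result
```

The nested form is convenient when each level is processed separately; the flattened form is closer to what split returns.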
rsplit(x, ...)

# S3 method for default
rsplit(x, fl, drop = TRUE, flatten = FALSE, use.names = TRUE, ...)

# S3 method for data.frame
rsplit(x, by, drop = TRUE, flatten = FALSE, cols = NULL,
       keep.by = FALSE, simplify = TRUE, use.names = TRUE, ...)
Returns a (nested) list containing the subsets of x.
x: a vector, data.frame or list.
fl: a GRP object, or a (list of) vector(s) / factor(s) (internally converted to GRP object(s)) used to split x.
by: data.frame method: Same as fl, but also allows one- or two-sided formulas, i.e. ~ group1 or var1 + var2 ~ group1 + group2. See Examples.
drop: logical. TRUE removes unused levels or combinations of levels from factors before splitting; FALSE retains those combinations, yielding empty list elements in the output.
flatten: logical. If fl is a list of vectors / factors, TRUE calls GRP on the list, creating a single grouping used for splitting; FALSE yields recursive splitting.
use.names: logical. TRUE returns a named list (like split); FALSE returns a plain list.
cols: data.frame method: Select columns to split using a function, column names, indices or a logical vector. Note: cols is ignored if a two-sided formula is passed to by.
keep.by: logical. If a formula is passed to by, then TRUE preserves the splitting (right-hand-side) variables in the data frame.
simplify: data.frame method: Logical. TRUE calls rsplit.default if a single column is split, e.g. rsplit(data, col1 ~ group1) becomes the same as rsplit(data$col1, data$group1).
...: further arguments passed to GRP. Sensible choices would be sort = FALSE, decreasing = TRUE or na.last = FALSE. Note that these options only apply if fl is not already a (list of) factor(s).
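To make the drop and simplify arguments more concrete, here is a small sketch. It assumes the collapse package is installed; the toy factor f, the vector x and the result names are invented for illustration only.

```r
library(collapse)

# A factor with an unused level "c"
f <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))
x <- c(1, 2, 3)

kept_used <- rsplit(x, f)                # drop = TRUE (default): unused level removed
kept_all  <- rsplit(x, f, drop = FALSE)  # unused level retained as an empty element

# simplify: a single column selected with a two-sided formula
as_vectors <- rsplit(mtcars, mpg ~ cyl)                    # list of numeric vectors
as_frames  <- rsplit(mtcars, mpg ~ cyl, simplify = FALSE)  # list of one-column data frames

lengths(kept_all)       # group sizes, including the empty group
class(as_vectors[[1]])
class(as_frames[[1]])
```

Setting simplify = FALSE can be useful when downstream code expects data frames regardless of how many columns were selected.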
See also: gsplit, rapply2d, unlist2d, List Processing, Collapse Overview
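library(collapse)

# Split a vector by a single grouping vector
rsplit(mtcars$mpg, mtcars$cyl)
# Split a data frame by a vector, or recursively by several columns
rsplit(mtcars, mtcars$cyl)
rsplit(mtcars, mtcars[.c(cyl, vs, am)])
rsplit(mtcars, ~ cyl + vs + am, keep.by = TRUE) # Same thing
# One-sided formula: grouping columns are dropped unless keep.by = TRUE
rsplit(mtcars, ~ cyl + vs + am)
rsplit(mtcars, ~ cyl + vs + am, flatten = TRUE)
# Two-sided formula: split selected columns; a single column is simplified to a vector
rsplit(mtcars, mpg ~ cyl)
rsplit(mtcars, mpg ~ cyl, simplify = FALSE)
rsplit(mtcars, mpg + hp ~ cyl + vs + am)
rsplit(mtcars, mpg + hp ~ cyl + vs + am, keep.by = TRUE)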
# Split this sectoral data, first by Variable (Employment and Value Added), then by Country
GGDCspl <- rsplit(GGDC10S, ~ Variable + Country, cols = 6:16)
str(GGDCspl)
# The nested list can be reassembled using unlist2d()
head(unlist2d(GGDCspl, idcols = .c(Variable, Country)))
rm(GGDCspl)
# Another example with mtcars (not as clean because of row.names)
nl <- rsplit(mtcars, mpg + hp ~ cyl + vs + am)
str(nl)
unlist2d(nl, idcols = .c(cyl, vs, am), row.names = "car")
rm(nl)