separate_rows
Separate a collapsed column into multiple rows.
If a variable contains observations with multiple delimited values, this separates the values and places each one in its own row.
Usage
separate_rows(data, ..., sep = "[^[:alnum:].]+", convert = FALSE)
Arguments
- data
A data frame.
- ...
A selection of columns. If empty, all variables are selected. You can supply bare variable names, select all variables between x and z with x:z, or exclude y with -y. For more options, see the dplyr::select() documentation. See also the section on selection rules below.
- sep
Separator delimiting collapsed values.
- convert
If TRUE, automatically runs type.convert() on the key column. This is useful if the column types are actually numeric, integer, or logical.
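As a minimal sketch of how sep and convert interact, the example below uses a made-up data frame (the names df, id, and value are illustrative assumptions, not part of the original page). It contrasts the default separator, which leaves "." intact, with an explicit regular expression combined with convert = TRUE.

```r
library(tidyr)

# Hypothetical data: `value` holds semicolon-delimited decimal numbers
df <- data.frame(
  id = 1:2,
  value = c("1.5; 2.5", "3"),
  stringsAsFactors = FALSE
)

# Default sep splits on runs of characters that are neither
# alphanumeric nor ".", so the decimal points are preserved
default_split <- separate_rows(df, value)

# Explicit regex separator; convert = TRUE runs type conversion on the
# separated column, so `value` is no longer character
converted <- separate_rows(df, value, sep = ";\\s*", convert = TRUE)
str(converted)
```

Because the default pattern excludes "." from the separator set, values such as decimals or dotted names stay in one piece unless sep is overridden.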
Rules for selection
Arguments for selecting columns are passed to tidyselect::vars_select() and are treated specially. Unlike other verbs, selecting functions make a strict distinction between data expressions and context expressions.
A data expression is either a bare name like x or an expression like x:y or c(x, y). In a data expression, you can only refer to columns from the data frame.
Everything else is a context expression in which you can only refer to objects that you have defined with <-.
For instance, col1:col3 is a data expression that refers to data columns, while seq(start, end) is a context expression that refers to objects from the context.
If you really need to refer to contextual objects from a data expression, you can unquote them with the tidy eval operator !!. This operator evaluates its argument in the context and inlines the result in the surrounding function call. For instance, c(x, !! x) selects the x column within the data frame and the column referred to by the object x defined in the context (which can contain either a column name as string or a column position).
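The distinction between data expressions and unquoted context objects can be sketched as follows. The data frame and the variable name target are illustrative assumptions; the point is only that bare names refer to columns, while !! inlines a value defined outside the data frame.

```r
library(tidyr)

# Hypothetical data with two collapsed columns
df <- data.frame(
  x = 1:2,
  y = c("a,b", "c"),
  z = c("1,2", "3"),
  stringsAsFactors = FALSE
)

# Data expression: bare names refer to columns of `df`
by_name <- separate_rows(df, y, z)

# Context object: column names stored as strings in a variable
# defined with `<-`, then unquoted with `!!` inside the selection
target <- c("y", "z")
by_unquote <- separate_rows(df, !!target)

# Both calls select the same columns
all.equal(by_name, by_unquote)
```

Unquoting is mainly useful when the column names are computed or passed in from elsewhere, for example inside a wrapper function.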
Examples
library(tidyr)

df <- data.frame(
  x = 1:3,
  y = c("a", "d,e,f", "g,h"),
  z = c("1", "2,3,4", "5,6"),
  stringsAsFactors = FALSE
)
# Split y and z in parallel; convert = TRUE turns z into integers
separate_rows(df, y, z, convert = TRUE)
Community examples
## Separate a column into rows by splitting its values on a delimiter
library(dplyr)
library(tidyr)
library(stringr)

game <- c('Cricket', 'Hockey')
player <- c('Dhruv, Gurnoor and Rachpal', 'Pargat, Sardar and Dhyan.Chand')
roster_df <- data.frame(Game = game, Player = player, stringsAsFactors = FALSE)

# Option 1
roster_df %>% separate_rows(Player, sep = ", | and ", convert = FALSE)

# Option 2
roster_df %>% mutate(Player = str_split(Player, ", | and ")) %>% unnest(Player)

# Option 3 (older named-argument form of unnest(); newer tidyr releases prefer Option 2)
roster_df %>% unnest(Player = str_split(Player, ", | and "))