group_by


Group a tbl by one or more variables.

Most data operations are most useful when done on groups defined by variables in the dataset. The group_by function takes an existing tbl and converts it into a grouped tbl where operations are performed "by group".

Usage
group_by(.data, ..., add = FALSE)

group_by_(.data, ..., .dots, add = FALSE)

Arguments
.data

a tbl

...

variables to group by. All tbls accept variable names, some will also accept functions of variables. Duplicated groups will be silently dropped.

add

By default, when add = FALSE, group_by will override existing groups. To instead add to the existing groups, use add = TRUE.

.dots

Used to work around non-standard evaluation. See vignette("nse") for details.
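As a rough sketch of what .dots is for, the snippet below groups by column names held in a character vector, which is the situation where the standard-evaluation variant group_by_ helps. It assumes the dplyr version documented here (0.5.0); later releases deprecate the underscored verbs, so this may emit warnings there. The variable names group_cols, by_vs_am, group_names, and counts are illustrative only.

```r
library(dplyr)

# Column names chosen at run time (illustrative; any mtcars columns would do)
group_cols <- c("vs", "am")

# Standard-evaluation grouping: the names are passed as strings via .dots
by_vs_am <- group_by_(mtcars, .dots = group_cols)

# groups() returns the grouping variables; convert them to plain strings
group_names <- vapply(groups(by_vs_am), as.character, character(1))
group_names

# The grouped tbl behaves like one produced by group_by(mtcars, vs, am)
counts <- summarise(by_vs_am, n = n())
counts
```

This is handy inside functions, where the grouping columns arrive as arguments rather than being typed literally.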

Tbl types

group_by is an S3 generic with methods for the three built-in tbls. See the help for the corresponding classes and their manip methods for more details.
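For the data frame case, a minimal check of what the generic returns might look like the following. It only inspects the class of the result and assumes a local data frame such as mtcars; other tbl types will report different classes.

```r
library(dplyr)

# Grouping a plain data frame dispatches to the data frame method
by_cyl <- group_by(mtcars, cyl)

# The result carries an extra class marking it as grouped,
# while still being a data frame underneath
class(by_cyl)
inherits(by_cyl, "grouped_df")
inherits(by_cyl, "data.frame")
```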

See Also

ungroup for the inverse operation, groups for accessors that don't do special evaluation.
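A short sketch of how these two helpers relate to group_by, using mtcars as a stand-in dataset. The names by_vs_am and flat are illustrative. The exact printed form of an empty grouping differs between dplyr versions, so the example only relies on its length.

```r
library(dplyr)

by_vs_am <- group_by(mtcars, vs, am)

# groups() lists the current grouping variables
groups(by_vs_am)
length(groups(by_vs_am))

# ungroup() removes the grouping but leaves the rows untouched
flat <- ungroup(by_vs_am)
length(groups(flat))
nrow(flat)
```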

Aliases
  • group_by
  • group_by_
  • regroup
Examples
library(dplyr)

by_cyl <- group_by(mtcars, cyl)
summarise(by_cyl, mean(disp), mean(hp))
filter(by_cyl, disp == max(disp))

# summarise peels off a single layer of grouping
by_vs_am <- group_by(mtcars, vs, am)
by_vs <- summarise(by_vs_am, n = n())
by_vs
summarise(by_vs, n = sum(n))
# use ungroup() to remove if not wanted
summarise(ungroup(by_vs), n = sum(n))

# You can group by expressions: this is just short-hand for
# a mutate/rename followed by a simple group_by
group_by(mtcars, vsam = vs + am)
group_by(mtcars, vs2 = vs)

# You can also group by a constant, but it's not very useful
group_by(mtcars, "vs")

# By default, group_by sets groups. Use add = TRUE to add groups
groups(group_by(by_cyl, vs, am))
groups(group_by(by_cyl, vs, am, add = TRUE))

# Duplicate groups are silently dropped
groups(group_by(by_cyl, cyl, cyl))
Documentation reproduced from package dplyr, version 0.5.0, License: MIT + file LICENSE

Community examples

Kaleema.bi@gmail.com at Nov 23, 2017 dplyr v0.7.3

library(dplyr)

by_cyl <- mtcars %>% group_by(cyl)

# grouping doesn't change how the data looks (apart from listing
# how it's grouped):
by_cyl

# It changes how it acts with the other dplyr verbs:
by_cyl %>% summarise(
  disp = mean(disp),
  hp = mean(hp)
)
by_cyl %>% filter(disp == max(disp))

# Each call to summarise() removes a layer of grouping
by_vs_am <- mtcars %>% group_by(vs, am)
by_vs <- by_vs_am %>% summarise(n = n())
by_vs
by_vs %>% summarise(n = sum(n))

# To remove grouping, use ungroup
by_vs %>% ungroup() %>% summarise(n = sum(n))

# You can group by expressions: this is just short-hand for
# a mutate/rename followed by a simple group_by
mtcars %>% group_by(vsam = vs + am)

# By default, group_by overrides existing grouping
by_cyl %>% group_by(vs, am) %>% group_vars()

# Use add = TRUE to instead append
by_cyl %>% group_by(vs, am, add = TRUE) %>% group_vars()