bind: Efficiently bind multiple data frames by row and column

Description

This is an efficient implementation of the common pattern of do.call(rbind, dfs) or do.call(cbind, dfs) for binding many data frames into one.

combine() acts like c() or unlist() but uses consistent dplyr coercion rules.

Usage

bind_rows(..., .id = NULL)
bind_cols(...)
combine(...)

Arguments

...

Data frames to combine.

Each argument can either be a data frame, a list that could be a data frame, or a list of data frames.

When row-binding, columns are matched by name, and any missing columns will be filled with NA.

When column-binding, rows are matched by position, so all data frames must have the same number of rows. To match by value, not position, see join.

.id

Data frame identifier.

When .id is supplied, a new column of identifiers is created to link each row to its original data frame. The labels are taken from the named arguments to bind_rows(). When a list of data frames is supplied, the labels are taken from the names of the list. If no names are found a numeric sequence is used instead.

Value

bind_rows() and bind_cols() return the same type as the first input, either a data frame, tbl_df, or grouped_df.

Deprecated functions

rbind_list() and rbind_all() have been deprecated. Instead use bind_rows().

Details

The output of bind_rows() will contain a column if that column appears in any of the inputs.

If combine() it is called with exactly one list argument, the list is simplified (similarly to unlist(recursive = FALSE). NULL arguments are ignored. If the result is empty, logical() is returned.

Examples

Run this code

# NOT RUN {
one <- mtcars[1:4, ]
two <- mtcars[11:14, ]

# You can supply data frames as arguments:
bind_rows(one, two)

# The contents of lists is automatically spliced:
bind_rows(list(one, two))
bind_rows(split(mtcars, mtcars$cyl))
bind_rows(list(one, two), list(two, one))


# In addition to data frames, you can supply vectors. In the rows
# direction, the vectors represent rows and should have inner
# names:
bind_rows(
  c(a = 1, b = 2),
  c(a = 3, b = 4)
)

# You can mix vectors and data frames:
bind_rows(
  c(a = 1, b = 2),
  data_frame(a = 3:4, b = 5:6),
  c(a = 7, b = 8)
)


# Note that for historical reasons, lists containg vectors are
# always treated as data frames. Thus their vectors are treated as
# columns rather than rows, and their inner names are ignored:
ll <- list(
  a = c(A = 1, B = 2),
  b = c(A = 3, B = 4)
)
bind_rows(ll)

# You can circumvent that behaviour with explicit splicing:
bind_rows(!!!ll)


# When you supply a column name with the `.id` argument, a new
# column is created to link each row to its original data frame
bind_rows(list(one, two), .id = "id")
bind_rows(list(a = one, b = two), .id = "id")
bind_rows("group 1" = one, "group 2" = two, .id = "groups")

# Columns don't need to match when row-binding
bind_rows(data.frame(x = 1:3), data.frame(y = 1:4))
# }
# NOT RUN {
# Rows do need to match when column-binding
bind_cols(data.frame(x = 1), data.frame(y = 1:2))
# }
# NOT RUN {
bind_cols(one, two)
bind_cols(list(one, two))

# combine applies the same coercion rules
f1 <- factor("a")
f2 <- factor("b")
c(f1, f2)
unlist(list(f1, f2))

combine(f1, f2)
combine(list(f1, f2))
# }

Run the code above in your browser using DataLab