complete: Complete a data frame with missing combinations of data

Description

Turns implicit missing values into explicit missing values. This is a wrapper around expand(), dplyr::left_join() and replace_na() that's useful for completing missing combinations of data.
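
As a rough illustration of the wrapper described above, the sketch below rebuilds the result of complete() by hand with expand(), dplyr::left_join() and replace_na(). The data frame `sales` and its columns are hypothetical, and the manual pipeline only approximates what complete() does; it is not the function's actual implementation.

```r
library(tidyr)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data: store 2 has no row for product "a"
sales <- tibble(
  store = c(1, 1, 2),
  product = c("a", "b", "b"),
  units = c(5, 3, 7)
)

# Manual version: build all combinations, join the data back, fill the NAs
manual <- sales %>%
  expand(store, product) %>%
  left_join(sales, by = c("store", "product")) %>%
  replace_na(list(units = 0))

# The same steps through the wrapper
wrapped <- sales %>%
  complete(store, product, fill = list(units = 0))
```

Both `manual` and `wrapped` should contain one row per store-product pair, with the previously absent pair filled in.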

Usage

complete(data, ..., fill = list())

Arguments

data

A data frame.

...

Specification of columns to expand. Columns can be atomic vectors or lists.

To find all unique combinations of x, y and z, including those not present in the data, supply each variable as a separate argument: expand(df, x, y, z).
To find only the combinations that occur in the data, use nesting: expand(df, nesting(x, y, z)).
You can combine the two forms. For example, expand(df, nesting(school_id, student_id), date) would produce a row for each present school-student combination for all possible dates.
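
The three forms above can be compared on a small example. The `attendance` data frame and its values are made up for illustration; only expand() and nesting() come from tidyr.

```r
library(tidyr)

# Hypothetical attendance records
attendance <- tibble(
  school_id = c(1, 1, 2),
  student_id = c(10, 11, 10),
  date = as.Date(c("2024-01-08", "2024-01-09", "2024-01-08"))
)

# Every combination of the three columns: 2 schools x 2 students x 2 dates
all_combos <- expand(attendance, school_id, student_id, date)

# Only the combinations that occur in the data
observed <- expand(attendance, nesting(school_id, student_id, date))

# Each school-student pair that occurs, crossed with every date
mixed <- expand(attendance, nesting(school_id, student_id), date)
```

Comparing nrow() of the three results shows how nesting() restricts the expansion to combinations that are actually present.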

When used with factors, expand() uses the full set of levels, not just those that appear in the data. If you want to use only the values seen in the data, use forcats::fct_drop().
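
A minimal sketch of the factor behavior, assuming the forcats package is installed. The `survey` data frame and its levels are hypothetical.

```r
library(tidyr)

# Hypothetical survey: the level "high" is declared but never observed
survey <- tibble(
  rating = factor(c("low", "mid", "low"), levels = c("low", "mid", "high"))
)

# Uses all declared levels, including the unobserved "high"
all_levels <- expand(survey, rating)

# Uses only the levels seen in the data
seen_levels <- expand(survey, rating = forcats::fct_drop(rating))
```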

When used with continuous variables, you may need to fill in values that do not appear in the data. To do so, use expressions like year = 2010:2020 or year = full_seq(year, 1).
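
The following sketch shows both ways of filling gaps in a continuous variable. The `readings` data frame is an assumption made for the example.

```r
library(tidyr)

# Hypothetical yearly measurements with 2011 missing
readings <- tibble(
  year = c(2010, 2012, 2013),
  value = c(1.5, 2.5, 3.0)
)

# Fill every year between the observed minimum and maximum
by_range <- complete(readings, year = full_seq(year, 1))

# Or spell out the years, which can extend past the observed range
by_explicit <- complete(readings, year = 2010:2015)
```

full_seq() stays within the range of the data, while an explicit sequence lets you add years the data never reaches.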

fill

A named list that supplies, for each variable, a single value to use instead of NA for missing combinations.

Details

If you supply fill, these values will also replace existing explicit missing values in the data set.
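
To make this concrete, the sketch below uses a hypothetical `stock` data frame that already contains one explicit NA and is also missing one combination. With fill supplied, both end up with the fill value.

```r
library(tidyr)

# Hypothetical inventory: one explicit NA, and no row for warehouse 2, item "b"
stock <- tibble(
  warehouse = c(1, 1, 2),
  item = c("a", "b", "a"),
  count = c(10, NA, 30)
)

# The fill value replaces the newly created NA and the pre-existing one
filled <- complete(stock, warehouse, item, fill = list(count = 0))
```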

Examples


# NOT RUN {
library(tidyr)
library(dplyr, warn.conflicts = FALSE)
df <- tibble(
  group = c(1:2, 1),
  item_id = c(1:2, 2),
  item_name = c("a", "b", "b"),
  value1 = 1:3,
  value2 = 4:6
)
df %>% complete(group, nesting(item_id, item_name))

# You can also choose to fill in missing values
df %>% complete(group, nesting(item_id, item_name), fill = list(value1 = 0))
# }
