tidyverse: Tidyverse methods for tsibble

Description

arrange(): a warning is likely to be issued if the key and index are not arranged in past-to-future order.
slice(): a warning is likely to be issued if the row numbers are not in ascending order.
select(): keeps the variables you mention as well as the index.
transmute(): keeps the variable you operate on, as well as the index and key.
summarise(): does not collapse on the index variable.
Column-wise verbs, including select(), transmute(), summarise() and mutate(), preserve the time context. That is, the index variable cannot be dropped from a tsibble. If any key variable is changed, the result is validated internally to check that it is still a tsibble. Use as_tibble() to drop the time context.
unnest() requires argument key = id() to get back to a tsibble.
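
The behaviour described above can be sketched on a small example. This is a minimal illustration, not taken from the package documentation: the `daily` data set and its column names are hypothetical, and it assumes the tsibble and dplyr packages are installed. The tsibble has no key, so the same code should not depend on how keys are declared in a given tsibble version.

```r
library(tsibble)
library(dplyr)

# Hypothetical toy data: a single daily series without a key
daily <- tsibble(
  date = as.Date("2019-01-01") + 0:4,
  value = c(3, 1, 4, 1, 5),
  index = date
)

# select() keeps the index even though only `value` is mentioned
kept <- daily %>% select(value)
names(kept)

# summarise() on a tsibble does not collapse the index: one row per date
per_index <- daily %>% summarise(total = sum(value))
nrow(per_index)

# Dropping the time context first gives the usual single-row summary
collapsed <- daily %>%
  as_tibble() %>%
  summarise(total = sum(value))
collapsed
```

The last call shows why as_tibble() is the way out: once the time context is gone, summarise() behaves as it does on an ordinary tibble.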

Usage

# S3 method for tbl_ts
arrange(.data, ...)
# S3 method for grouped_ts
arrange(.data, ..., .by_group = FALSE)
# S3 method for tbl_ts
filter(.data, ...)
# S3 method for tbl_ts
slice(.data, ...)
# S3 method for tbl_ts
select(.data, ..., .drop = FALSE)
# S3 method for tbl_ts
rename(.data, ...)
# S3 method for tbl_ts
mutate(.data, ..., .drop = FALSE)
# S3 method for tbl_ts
transmute(.data, ..., .drop = FALSE)
# S3 method for tbl_ts
summarise(.data, ..., .drop = FALSE)
# S3 method for tbl_ts
summarize(.data, ..., .drop = FALSE)
# S3 method for tbl_ts
group_by(.data, ..., add = FALSE)
# S3 method for grouped_ts
ungroup(x, ...)
# S3 method for tbl_ts
left_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)
# S3 method for tbl_ts
right_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)
# S3 method for tbl_ts
inner_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)
# S3 method for tbl_ts
full_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)
# S3 method for tbl_ts
semi_join(x, y, by = NULL, copy = FALSE, ...)
# S3 method for tbl_ts
anti_join(x, y, by = NULL, copy = FALSE, ...)
# S3 method for tbl_ts
gather(data, key = "key", value = "value", ...,
  na.rm = FALSE, convert = FALSE, factor_key = FALSE)
# S3 method for tbl_ts
spread(data, key, value, fill = NA, convert = FALSE,
  drop = TRUE, sep = NULL)
# S3 method for tbl_ts
nest(data, ..., .key = "data")
# S3 method for lst_ts
unnest(data, ..., key = id(), .drop = NA,
  .id = NULL, .sep = NULL, .preserve = NULL)
# S3 method for tbl_ts
unnest(data, ..., key = id(), .drop = NA,
  .id = NULL, .sep = NULL, .preserve = NULL)
# S3 method for grouped_ts
fill(data, ..., .direction = c("down", "up"))

Arguments

.data

A tbl_ts.

...

The same arguments as accepted by the corresponding dplyr generic.

.by_group

If TRUE, will sort first by grouping variable. Applies to grouped data frames only.

.drop

Deprecated; use as_tibble() instead of .drop = TRUE. FALSE returns a tsibble, like the input. TRUE drops the tsibble class and returns a tibble.

add

When add = FALSE, the default, group_by() will override existing groups. To add to the existing groups, use add = TRUE.

x

A tbl().

y

tbls to join.

by

A character vector of variables to join by. If NULL, the default, *_join() will do a natural join, using all variables with common names across the two tables. A message lists the variables so that you can check they're right (to suppress the message, explicitly list the variables that you want to join).

To join by different variables on x and y use a named vector. For example, by = c("a" = "b") will match x.a to y.b.
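
As a rough sketch of joining by differently named variables, the example below joins a tsibble to a plain lookup tibble. The `readings` and `stations` objects and all of their columns are hypothetical and exist only for illustration; the sketch assumes the tsibble and dplyr packages are installed.

```r
library(tsibble)
library(dplyr)

# Hypothetical key-less daily tsibble carrying a station code
readings <- tsibble(
  date = as.Date("2019-01-01") + 0:3,
  code = c("a", "b", "a", "c"),
  value = c(10, 20, 30, 40),
  index = date
)

# Hypothetical lookup table whose identifier column has a different name
stations <- tibble(
  id = c("a", "b"),
  label = c("Alpha", "Beta")
)

# Match readings$code to stations$id; rows without a match get NA
joined <- readings %>%
  left_join(stations, by = c("code" = "id"))
joined
```

Because every row of `readings` is kept and the index is untouched, the result is expected to remain a tsibble.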

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

suffix

If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

data

A data frame.

key

Unquoted variables to create the key (via id) after unnesting.

value

Names of new key and value columns, as strings or symbols.

This argument is passed by expression and supports quasiquotation (you can unquote strings and symbols). The name is captured from the expression with rlang::ensym() (note that this kind of interface where symbols do not represent actual objects is now discouraged in the tidyverse; we support it here for backward compatibility).

na.rm

If TRUE, will remove rows from output where the value column is NA.

convert

If TRUE, will automatically run type.convert() on the key column. This is useful if the column types are actually numeric, integer, or logical.

factor_key

If FALSE, the default, the key values will be stored as a character vector. If TRUE, will be stored as a factor, which preserves the original ordering of the columns.

fill

If set, missing values will be replaced with this value. Note that there are two types of missingness in the input: explicit missing values (i.e. NA), and implicit missings, rows that simply aren't present. Both types of missing value will be replaced by fill.

drop

If FALSE, will keep factor levels that don't appear in the data, filling in missing combinations with fill.

sep

If NULL, the column names will be taken from the values of key variable. If non-NULL, the column names will be given by "<key_name><sep><key_value>".

.key

The name of the new column, as a string or symbol.

.id

Data frame identifier - if supplied, will create a new column with name .id, giving a unique identifier. This is most useful if the list column is named.

.sep

If non-NULL, the names of unnested data frame columns will combine the name of the original list-col with the names from nested data frame, separated by .sep.

.preserve

Optionally, list-columns to preserve in the output. These will be duplicated in the same way as atomic vectors. This has dplyr::select semantics so you can preserve multiple variables with .preserve = c(x, y) or .preserve = starts_with("list").

.direction

Direction in which to fill missing values. Currently either "down" (the default) or "up".
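
The two directions can be compared on a small series. This is a minimal sketch with hypothetical data (`gappy` and its columns are invented for illustration), assuming the tsibble and tidyr packages are installed. For brevity it uses an ungrouped, key-less tsibble, whereas the method documented here is registered for grouped tsibbles; the filling logic is the same within each group.

```r
library(tsibble)
library(tidyr)

# Hypothetical daily series with explicit missing values
gappy <- tsibble(
  date = as.Date("2019-01-01") + 0:4,
  value = c(1, NA, NA, 4, NA),
  index = date
)

# "down" carries the last observed value forward
filled_down <- gappy %>% fill(value, .direction = "down")
filled_down$value

# "up" carries the next observed value backward; the trailing NA stays
filled_up <- gappy %>% fill(value, .direction = "up")
filled_up$value
```

Note that fill() only replaces explicit NA values; rows that are absent from the data altogether are not created by it.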

Examples


# NOT RUN {
# Sum over sensors ----
pedestrian %>%
  summarise(Total = sum(Count))
# Back to tibble
pedestrian %>%
  as_tibble() %>%
  summarise(Total = sum(Count))
# example from tidyr
stocks <- tsibble(
  time = as.Date('2009-01-01') + 0:9,
  X = rnorm(10, 0, 1),
  Y = rnorm(10, 0, 2),
  Z = rnorm(10, 0, 4)
)
stocks %>% gather(stock, price, -time)
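# Reshape to long form, then back to wide form
stocksm <- stocks %>% gather(stock, price, -time)
stocksm %>% spread(stock, price)
# Nest every column except stock
nested_stock <- stocksm %>%
  nest(-stock)
# Nest by group instead
stocksm %>%
  group_by(stock) %>%
  nest()
# unnest() needs the key argument to get back to a tsibble
nested_stock %>%
  unnest(key = id(stock))
# Quantiles of price per stock over 3-day windows, stored as list-columns
stock_qtl <- stocksm %>%
  group_by(stock) %>%
  index_by(day3 = lubridate::floor_date(time, unit = "3 day")) %>%
  summarise(
    value = list(quantile(price)),
    qtl = list(c("0%", "25%", "50%", "75%", "100%"))
  )
# Unnest the list-columns back into a tsibble
unnest(stock_qtl, key = id(qtl))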
# }
