tsibble-tidyverse: Tidyverse methods for tsibble

Description

arrange(): a warning is likely to be issued if the rows are not arranged by key and index in past-to-future order.
slice(): a warning is likely to be issued if the row numbers are not in ascending order.
select(): keeps the variables you mention as well as the index.
transmute(): keeps the variable you operate on, as well as the index and key.
summarise(): reduces the data to a sequence of values over time (one summary per time point) instead of a single summary, and drops empty keys/groups.
unnest(): uses the key argument (key = NULL by default) to get back to a tsibble.

Usage

# S3 method for tbl_ts
arrange(.data, ...)
# S3 method for tbl_ts
filter(.data, ..., .preserve = FALSE)
# S3 method for tbl_ts
slice(.data, ..., .preserve = FALSE)
# S3 method for tbl_ts
select(.data, ...)
# S3 method for tbl_ts
rename(.data, ...)
# S3 method for tbl_ts
mutate(.data, ...)
# S3 method for tbl_ts
transmute(.data, ...)
# S3 method for tbl_ts
summarise(.data, ...)
# S3 method for tbl_ts
left_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)
# S3 method for tbl_ts
right_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)
# S3 method for tbl_ts
inner_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)
# S3 method for tbl_ts
full_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)
# S3 method for tbl_ts
semi_join(x, y, by = NULL, copy = FALSE, ...)
# S3 method for tbl_ts
anti_join(x, y, by = NULL, copy = FALSE, ...)
# S3 method for tbl_ts
gather(data, key = "key", value = "value", ...,
  na.rm = FALSE, convert = FALSE, factor_key = FALSE)
# S3 method for tbl_ts
spread(data, key, value, ...)
# S3 method for tbl_ts
nest(data, ..., .key = "data")
# S3 method for tbl_ts
unnest(data, ..., key = NULL, .drop = NA,
  .id = NULL, .sep = NULL, .preserve = NULL)

Arguments

.data

A tbl_ts.

...

The same arguments as accepted by the corresponding dplyr generic.

.preserve

Optionally, list-columns to preserve in the output. These will be duplicated in the same way as atomic vectors. This has dplyr::select semantics so you can preserve multiple variables with .preserve = c(x, y) or .preserve = starts_with("list").

x, y

tbls to join

by

a character vector of variables to join by. If NULL, the default, *_join() will do a natural join, using all variables with common names across the two tables. A message lists the variables so that you can check they're right (to suppress the message, explicitly list the variables that you want to join by).

To join by different variables on x and y use a named vector. For example, by = c("a" = "b") will match x.a to y.b.

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

suffix

If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

data

A data frame.

key

In unnest(), unquoted variables to create the key after unnesting.

value

In gather(), the names of the new key and value columns, as strings or symbols.

This argument is passed by expression and supports quasiquotation (you can unquote strings and symbols). The name is captured from the expression with rlang::ensym() (note that this kind of interface where symbols do not represent actual objects is now discouraged in the tidyverse; we support it here for backward compatibility).

na.rm

If TRUE, will remove rows from output where the value column is NA.

convert

If TRUE will automatically run type.convert() on the key column. This is useful if the column types are actually numeric, integer, or logical.

factor_key

If FALSE, the default, the key values will be stored as a character vector. If TRUE, will be stored as a factor, which preserves the original ordering of the columns.

.key

The name of the new column, as a string or symbol.

.drop

Should additional list columns be dropped? By default, unnest will drop them if unnesting the specified columns requires the rows to be duplicated.

.id

Data frame identifier - if supplied, will create a new column with name .id, giving a unique identifier. This is most useful if the list column is named.

.sep

If non-NULL, the names of unnested data frame columns will combine the name of the original list-col with the names from nested data frame, separated by .sep.
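
As a rough illustration of the by argument described above, the sketch below joins a tsibble to a plain tibble lookup table using a named vector. The data, the column names (sensor, id, site), and the use of bare column names for key are assumptions made for illustration only; older tsibble releases may expect a different key syntax.

```r
library(tsibble)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical two-sensor series, already ordered by key and then index
readings <- tsibble(
  time = rep(as.Date("2009-01-01") + 0:1, times = 2),
  sensor = rep(c("a", "b"), each = 2),
  count = c(10, 12, 7, 9),
  key = sensor, index = time
)

# Hypothetical lookup table whose identifier column has a different name
lookup <- tibble(id = c("a", "b"), site = c("north", "south"))

# Named vector: match readings$sensor to lookup$id
joined <- readings %>%
  left_join(lookup, by = c("sensor" = "id"))

joined
is_tsibble(joined)
```

Supplying by explicitly also avoids the message that a natural join would print.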

Details

Column-wise verbs, including select(), transmute(), summarise() and mutate(), retain the time context. That is, the index variable cannot be dropped from a tsibble. If any key variable is changed, the result is validated internally to check that it is still a tsibble. Use as_tibble() to drop the time context.
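
A minimal sketch of this behaviour, assuming a small single-series tsibble whose column names (time, x, y) are made up for illustration: select() on the tsibble retains the index, while converting with as_tibble() first allows it to be dropped.

```r
library(tsibble)
library(dplyr, warn.conflicts = FALSE)

# Illustrative daily series without a key
daily <- tsibble(
  time = as.Date("2009-01-01") + 0:4,
  x = 1:5,
  y = 6:10,
  index = time
)

# Only `x` is requested, but the index `time` is kept as well
kept <- daily %>% select(x)

# After as_tibble() the time context is gone, so only `x` remains
dropped <- daily %>% as_tibble() %>% select(x)

names(kept)
names(dropped)
is_tsibble(kept)
is_tsibble(dropped)
```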

Examples

# NOT RUN {
library(tsibble)
library(dplyr, warn.conflicts = FALSE)
# Sum over sensors ----
pedestrian %>%
  summarise(Total = sum(Count))
# Back to tibble
pedestrian %>%
  as_tibble() %>%
  summarise(Total = sum(Count))
library(tidyr)
# example from tidyr
stocks <- tsibble(
  time = as.Date('2009-01-01') + 0:9,
  X = rnorm(10, 0, 1),
  Y = rnorm(10, 0, 2),
  Z = rnorm(10, 0, 4)
)
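# Gather the wide stock columns into long form, excluding the index `time`
(stocksm <- stocks %>% gather(stock, price, -time))
# Spread back to wide form
stocksm %>% spread(stock, price)
# Nest everything except `stock` into a list-column
nested_stock <- stocksm %>%
  nest(-stock)
# Nest by group instead
stocksm %>%
  group_by(stock) %>%
  nest()
# Unnest, passing `stock` as the key to get back to a tsibble
nested_stock %>%
  unnest(key = stock)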
# Price quantiles per stock over 3-day periods, stored as list-columns
stock_qtl <- stocksm %>%
  group_by(stock) %>% 
  index_by(day3 = lubridate::floor_date(time, unit = "3 day")) %>% 
  summarise(
    value = list(quantile(price)), 
    qtl = list(c("0%", "25%", "50%", "75%", "100%"))
  )
unnest(stock_qtl, key = qtl)
# }