timeplyr (version 1.1.0)

roll_lag: Fast rolling grouped lags and differences

Description

Inspired by 'collapse', roll_lag and roll_diff operate similarly to flag and fdiff.

Usage

roll_lag(x, n = 1L, ...)

# S3 method for default
roll_lag(x, n = 1L, g = NULL, fill = NULL, ...)

# S3 method for ts
roll_lag(x, n = 1L, g = NULL, fill = NULL, ...)

# S3 method for zoo
roll_lag(x, n = 1L, g = NULL, fill = NULL, ...)

roll_diff(x, n = 1L, ...)

# S3 method for default
roll_diff(x, n = 1L, g = NULL, fill = NULL, differences = 1L, ...)

# S3 method for ts
roll_diff(x, n = 1L, g = NULL, fill = NULL, differences = 1L, ...)

# S3 method for zoo
roll_diff(x, n = 1L, g = NULL, fill = NULL, differences = 1L, ...)

diff_(x, n = 1L, differences = 1L, order = NULL, run_lengths = NULL, fill = NULL)

Value

A vector the same length as x.

Arguments

x

A vector or data frame.

n

Lag. This will be recycled to match the length of x and can be negative.

...

Arguments passed on to the appropriate method.

g

Grouping vector. This can be a vector, data frame or GRP object.

fill

Value to fill the first n elements.

differences

Number of times to recursively apply the differencing algorithm. If length(n) == 1, i.e. the lag is a scalar integer, an optimised method is used which avoids recursion entirely. If length(n) != 1, simple recursion is used.
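To make the recursion concrete, here is a minimal base R sketch. It does not call timeplyr, and the helpers lag_diff() and recursive_diff() are hypothetical names introduced only for illustration. It assumes a positive scalar lag smaller than length(x), and it shows what repeated differencing means rather than how the package's optimised method works.

```r
# One lagged difference, padded with NA so the result has the same length as x
# (hypothetical helper; assumes a positive scalar lag n < length(x))
lag_diff <- function(x, n = 1L) {
  lagged <- c(rep(NA, n), head(x, -n))
  x - lagged
}

# Apply lag_diff() `differences` times in a row
recursive_diff <- function(x, n = 1L, differences = 1L) {
  for (i in seq_len(differences)) {
    x <- lag_diff(x, n)
  }
  x
}

x <- c(1, 4, 9, 16, 25)
d2 <- recursive_diff(x, n = 1L, differences = 2L)

# Closed form of the second difference: x[i] - 2 * x[i - 1] + x[i - 2]
closed <- c(NA, NA, x[3:5] - 2 * x[2:4] + x[1:3])
```

Both d2 and closed agree, which hints at why a scalar lag allows a non-recursive shortcut: the k-th difference is a fixed weighted sum of lagged values.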

order

Optionally specify an ordering with which to apply the lags/differences. This is useful for example when applying lags chronologically using an unsorted time variable.

run_lengths

Optional integer vector of run lengths that defines the size of each lag run. For example, supplying c(5, 5) applies lags to the first 5 elements and then essentially resets the bounds and applies lags to the next 5 elements as if they were an entirely separate and standalone vector.
This is particularly useful in conjunction with the order argument to perform a by-group lag.
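The interplay between an ordering and run lengths can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: lag_by_runs() and its argument ord are hypothetical, and the lag is fixed at 1.

```r
# Hypothetical helper: lag by 1 within consecutive runs of x[ord],
# then map the result back to the original positions of x.
lag_by_runs <- function(x, ord, run_lengths) {
  sorted <- x[ord]
  # Label each sorted element with the run it belongs to
  run_id <- rep(seq_along(run_lengths), times = run_lengths)
  lagged <- c(NA, head(sorted, -1))
  # The first element of each run has no predecessor inside its run
  is_run_start <- c(TRUE, run_id[-1] != run_id[-length(run_id)])
  lagged[is_run_start] <- NA
  # Undo the ordering so the output lines up with the input
  out <- lagged
  out[ord] <- lagged
  out
}

x <- c(10, 20, 30, 40, 50, 60)
g <- c("b", "a", "b", "a", "b", "a")

o <- order(g)               # positions that arrange x by group
rl <- as.integer(table(g))  # group sizes, in the same sorted order
res <- lag_by_runs(x, ord = o, run_lengths = rl)
```

Here each element of res holds the previous value from the same group, even though g is not sorted. Going by the argument descriptions above, supplying the same o and rl as order and run_lengths to diff_() is the analogous way to obtain a by-group difference; the sketch itself does not verify this against the package.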

Details

While these may not be as fast as the 'collapse' equivalents, they are adequately fast and efficient.
A key difference between roll_lag and flag is that g does not need to be sorted for the result to be correct.
Furthermore, a vector of lags can be supplied for a custom rolling lag.
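What a vector of lags means can be shown with a small base R sketch. The helper custom_lag() is hypothetical and only mirrors the behaviour described here (element i is taken from position i - n[i], with n recycled); it is not how roll_lag() is implemented.

```r
# Hypothetical helper: element i is taken from position i - n[i];
# positions that fall outside the vector give NA.
custom_lag <- function(x, n) {
  n <- rep_len(n, length(x))  # recycle the lags to the length of x
  idx <- seq_along(x) - n
  idx[idx < 1L | idx > length(x)] <- NA
  x[idx]
}

x <- 1:6
lag1 <- custom_lag(x, 1L)                             # ordinary lag
lead1 <- custom_lag(x, -1L)                           # negative lag, i.e. a lead
partial <- custom_lag(x, c(0L, 1L, 2L, 2L, 2L, 2L))   # lag that grows to 2
```

In the last call the lag starts at 0 and grows to 2, so the first three results all refer back to x[1]. This is the same pattern as the lag sequence used with roll_diff() in the Examples section.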

roll_diff() silently returns NA when there is integer overflow. Both roll_lag() and roll_diff() apply recursively to list elements.
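The overflow case is easy to reproduce in base R, which also yields NA for integer overflow but emits a warning. The sketch below uses only base R arithmetic. Converting to double before differencing is a suggestion that assumes roll_diff() performs double arithmetic on double input, which this page does not state explicitly.

```r
big <- .Machine$integer.max  # largest representable integer
x_int <- c(-big, big)

# Integer subtraction overflows: base R returns NA and warns
d_int <- suppressWarnings(x_int[2] - x_int[1])

# Converting to double first keeps the exact difference
d_dbl <- as.numeric(x_int[2]) - as.numeric(x_int[1])
```

Because the NA is returned silently by roll_diff(), it can be worth checking integer inputs with a wide range, or converting them with as.numeric(), before differencing.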

Examples

library(timeplyr)
# Use a single thread for data.table and collapse while the examples run
.n_dt_threads <- data.table::getDTthreads()
.n_collapse_threads <- collapse::get_collapse()$nthreads
data.table::setDTthreads(threads = 1L)
collapse::set_collapse(nthreads = 1L)
x <- 1:10

roll_lag(x) # Lag
roll_lag(x, -1) # Lead
roll_diff(x) # Lag diff
roll_diff(x, -1) # Lead diff

# Using cheapr::lag_sequence()
# Differences lagged at 5, first 5 differences are compared to x[1]
roll_diff(x, cheapr::lag_sequence(length(x), 5, partial = TRUE))

# Like diff() but x/y instead of x-y
quotient <- function(x, n = 1L){
  x / roll_lag(x, n)
}
# People often call this a growth rate
# but it's just a percentage difference
# See ?roll_growth_rate for growth rate calculations
quotient(1:10)
# Restore the original thread settings
data.table::setDTthreads(threads = .n_dt_threads)
collapse::set_collapse(nthreads = .n_collapse_threads)