Learn R Programming

⚠️There's a newer version (0.4.4) of this package.Take me there.

runner an R package for running operations.

About

Package contains standard running functions (aka. rolling) with additional options like varying window size, lagging, handling missings and windows depending on date. runner brings also rolling streak and rolling which, what extends beyond range of functions already implemented in R packages. This package can be successfully used to manipulate and aggregate time series or longitudinal data.

Installation

Install package from from GitHub or from CRAN.

# devtools::install_github("gogonzo/runner")
install.packages("runner")

Using runner

runner package provides functions applied on running windows. The most universal function is runner::runner which gives user possibility to apply any R function f in running window. In example below 4-months correlation is calculated lagged by 1 month.

library(runner)

x <- data.frame(
  date = seq.Date(Sys.Date(), Sys.Date() + 365, length.out = 20),
  a = rnorm(20),
  b = rnorm(20)
)

runner(
  x, 
  lag = "1 months",
  k = "4 months", 
  idx = x$date, 
  f = function(x) {
    cor(x$a, x$b)
  }
)

There are different kinds of running windows and all of them are implemented in runner.

Running windows

Following diagram illustrates what running windows are - in this case running windows of length k = 4. For each of 15 elements of a vector each window contains current 4 elements.

Window size

k denotes number of elements in window. If k is a single value then window size is constant for all elements of x. For varying window size one should specify k as integer vector of length(k) == length(x) where each element of k defines window length. If k is empty it means that window will be cumulative (like base::cumsum). Example below illustrates window of k = 4 for 10th element of vector x.

runner(1:15, k = 4)

Window lag

lag denotes how many observations windows will be lagged by. If lag is a single value than it is constant for all elements of x. For varying lag size one should specify lag as integer vector of length(lag) == length(x) where each element of lag defines lag of window. Default value of lag = 0. Example below illustrates window of k = 4 lagged by lag = 2 for 10-th element of vector x. Lag can also be negative value, which shifts window forward instead of backward.

runner(
  1:15, 
  k = 4, 
  lag = 2
)

Windows depending on date

Sometimes data points in dataset are not equally spaced (missing weekends, holidays, other missings) and thus window size should vary to keep expected time frame. If one specifies idx argument, than running functions are applied on windows depending on date. idx should be the same length as x of class Date or integer. Including idx can be combined with varying window size, than k will denote number of periods in window different for each data point. Example below illustrates window of size k = 5 lagged by lag = 2. In parentheses ranges for each window.

idx <- Sys.Date() + c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner(
  x = 1:15, 
  k = "5 days", 
  lag = "1 days", 
  idx = idx
)

running at

Runner by default returns vector of the same size as x unless one puts any-size vector to at argument. Each element of at is an index on which runner calculates function. Below illustrates output of runner for at = c(18, 27, 45, 31) which gives windows in ranges enclosed in square brackets. Range for at = 27 is [22, 26] which is not available in current indices.

idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner(
  x = idx, 
  k = 5, 
  lag = 1, 
  idx = idx, 
  at = c(18, 27, 48, 31)
)

NA padding

Using runner one can also specify na_pad = TRUE which would return NA for any window which is partially out of range - meaning that there is no sufficient number of observations to fill the window. By default na_pad = FALSE, which means that incomplete windows are calculated anyway. na_pad is applied on normal cumulative windows and on windows depending on date. In example below two windows exceed range given by idx so for these windows are empty for na_pad = TRUE. If used sets na_pad = FALSE first window will be empty (no single element within [-2, 3]) and last window will return elements within matching idx.

idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner(
  x = idx, 
  k = 5, 
  lag = 1, 
  idx = idx, 
  at = c(4, 18, 48, 51),
  na_pad = TRUE
)

Using runner with data.frame

User can also put data.frame into x argument and apply functions which involve multiple columns. In example below we calculate beta parameter of lm model on 1, 2, …, n observations respectively. On the plot one can observe how lm parameter adapt with increasing number of observation.

date <- Sys.Date() + cumsum(sample(1:3, 40, replace = TRUE)) # unequaly spaced time series
x <- cumsum(rnorm(40))
y <- 30 * x + rnorm(40)

df <- data.frame(date, y, x)

slope <- runner(
  df,
  k = 10,
  idx = "date",
  function(x) {
    coefficients(lm(y ~ x, data = x))[2]
  }
)

plot(slope)
abline(h = 30, col = "blue")

Build-in functions

With runner one can use any R functions, but some of them are optimized for speed reasons. These functions are:
- aggregating functions - length_run, min_run, max_run, minmax_run, sum_run, mean_run, streak_run
- utility functions - fill_run, lag_run, which_run

Copy Link

Version

Install

install.packages('runner')

Monthly Downloads

1,803

Version

0.3.7

License

GPL (>= 2)

Maintainer

Dawid Ka<c5><82><c4><99>dkowski

Last Published

May 17th, 2020

Functions in runner (0.3.7)

window_run

List of running windows
which_run

Running which
seq_at

Creates sequence for at as time-unit-interval
runner

Apply running function
max_run

Running maximum
streak_run

Running streak length
mean_run

Running mean
sum_run

Running sum
lag_run

Lag dependent on variable
length_run

Length of running windows
this_group

Access group data in mutate
k_by

Converts k and lag from time-unit-interval to int
fill_run

Fill NA with previous non-NA element
min_run

Running minimum
run_by

Set window parameters
reformat_k

Formats time-unit-interval to valid for runner
minmax_run

Running min/max