runner (version 0.3.7)

runner: Apply running function

Description

Applies custom function on running windows.

Usage

runner(
  x,
  f = function(x) x,
  k = integer(0),
  lag = integer(1),
  idx = integer(0),
  at = integer(0),
  na_pad = FALSE,
  type = "auto",
  ...
)

# S3 method for data.frame
runner(
  x,
  f = function(x) x,
  k = integer(0),
  lag = integer(1),
  idx = integer(0),
  at = integer(0),
  na_pad = FALSE,
  type = "auto",
  ...
)

# S3 method for default
runner(
  x,
  f = function(x) x,
  k = integer(0),
  lag = integer(1),
  idx = integer(0),
  at = integer(0),
  na_pad = FALSE,
  type = "auto",
  ...
)

Arguments

x

(vector, data.frame, matrix) Input from which windows are created and passed to the custom function f.

f

(function) Applied on windows created from x. This function is meant to summarize each window into a single element, but one can also specify a function that returns multiple elements (runner output will then be a list). By default runner returns windows as is (f = function(x) x).

k

(integer vector or single value) Size of the running window. If k is a single value, the window size is constant for all elements; if length(k) == length(x), each element gets its own window size. One can also specify k in the same way as by in seq.POSIXt. More in Details.

lag

(integer vector or single value) Window lag. If lag is a single value, the lag is constant for all elements; if length(lag) == length(x), each element gets its own lag. A negative value shifts the window forward. One can also specify lag in the same way as by in seq.POSIXt. More in Details.

idx

(integer, Date, POSIXt) Optional vector containing the index of each observation, sorted in ascending order. By default idx is an index incremented by one. The index may have varying increments and duplicated values. If specified, k and lag are interpreted in terms of idx. The length of idx must equal the length of x.

at

(integer, Date, POSIXt, character vector) Vector of any size and any values defining the output data points. The values of the vector define the indexes at which the function is computed. Can also be a POSIXt sequence increment, as in seq.POSIXt. More in Details.

na_pad

(logical single value) Whether an incomplete window should return NA (if na_pad = TRUE). A window is incomplete when some part of it falls out of range.

type

(character single value) Output type ("auto", "logical", "numeric", "integer", "character"). By default runner guesses the type automatically. If "auto" fails, specify the desired type.

...

(optional) other arguments passed to the function f.
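The interplay of k, lag and na_pad can be sketched in base R. This is a conceptual illustration only, not the package's implementation: window_positions() is a hypothetical helper, and it assumes that the window for element i covers positions i - lag - k + 1 through i - lag and that a window is incomplete when any of those positions falls outside 1..length(x).

```r
# Hypothetical helper (not part of runner): positions covered by each window
# when no idx is given, so indices are simply 1..n.
window_positions <- function(n, k, lag = 0, na_pad = FALSE) {
  k <- rep_len(k, n)      # a single value is recycled to every element
  lag <- rep_len(lag, n)
  lapply(seq_len(n), function(i) {
    upper <- i - lag[i]          # last position in the window
    lower <- upper - k[i] + 1    # first position in the window
    incomplete <- lower < 1 || upper > n
    if (na_pad && incomplete) return(NA_integer_)
    pos <- seq.int(lower, upper)
    pos[pos >= 1 & pos <= n]     # keep only positions that exist
  })
}

x <- 1:10

# Constant k = 3: early windows are truncated to the available elements.
w <- window_positions(length(x), k = 3)
means <- vapply(w, function(p) mean(x[p]), numeric(1))
print(means)      # 1.0 1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0

# Same windows, but incomplete ones are replaced by NA.
w_pad <- window_positions(length(x), k = 3, na_pad = TRUE)
means_pad <- vapply(
  w_pad,
  function(p) if (anyNA(p)) NA_real_ else mean(x[p]),
  numeric(1)
)
print(means_pad)  # NA NA 2 3 4 5 6 7 8 9

# Per-element k: one window size for each element of x.
k_vec <- c(1, 2, 2, 4, 5, 5, 5, 5, 5, 5)
w_var <- window_positions(length(x), k = k_vec)
n_unique <- vapply(w_var, function(p) length(unique(letters[p])), integer(1))
print(n_unique)   # 1 2 2 4 5 5 5 5 5 5
```

Under these assumptions the first call corresponds to runner(1:10, f = mean, k = 3) from the Examples section, and the last one to the varying window size example.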

Value

vector with aggregated values for each window. Length of output is the same as length(x) or length(at) if specified. Type of the output is taken from type argument.

Details

The function can apply any R function on running windows defined by x, k, lag, idx and at. Running windows can be calculated in several ways:

  • Cumulative windows: applied when k is not specified or when k = length(x), so that each window contains all elements available up to that point. Figure: cumulativewindows.png

  • Constant sliding windows: applied when k is a single constant value and idx and at are left unspecified. The lag argument shifts windows left (lag > 0) or right (lag < 0). Figure: incrementalindex.png

  • Windows depending on date: if idx is specified, the number of elements in each window can vary because the indexes may be unequally spaced. For example, a 5-period window differs from a 5-element window, because a 5-period window may contain any number of observations (a 7-day mean is not the same as a 7-element mean).

    Figure: runningdatewindows.png

  • Window at specific indices: runner by default returns a vector of the same length as x unless at is specified. Each element of at is an index at which runner evaluates the function, so the output has length equal to length(at). Note that one can change the index of x by specifying idx. The illustration below shows the output of runner for at = c(18, 27, 45, 31), which gives windows in the ranges enclosed in square brackets. The range for at = 27 is [22, 26], which contains none of the available indices. Figure: runnerat.png

    at can also be specified as an output interval, at = "<increment>", which produces output at the indices seq.POSIXt(min(idx), max(idx), by = "<increment>"). The increment is specified as in the seq.POSIXt function. Note that the increment can't be finer than the resolution of idx: for Date the finest time unit is "day", for POSIXt it is "sec".

    k and lag can also be specified as time sequence increments. Available time units are "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year". To increment by a number of units, use "<number> <unit>s", for example lag = "-2 days", k = "5 weeks".
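The increment strings follow the by argument of base R's seq methods for dates. The sketch below uses only base R to show what such increments resolve to. The dates are made up for illustration, and how runner converts these strings internally is not stated here, so treat this purely as a demonstration of the seq() semantics that the text refers to.

```r
# Made-up, unequally spaced Date index (illustration only).
idx <- as.Date("2024-01-01") + c(0, 3, 10, 40, 75, 120)

# Output points implied by an increment such as at = "1 month":
# a regular sequence between the first and the last index.
at <- seq(min(idx), max(idx), by = "1 month")
print(at)      # "2024-01-01" "2024-02-01" "2024-03-01" "2024-04-01"

# A "5 days" step always spans 5 days; a negative step moves backwards.
five_days <- as.integer(diff(seq(idx[1], by = "5 days", length.out = 2)))
back_two <- seq(idx[1], by = "-2 days", length.out = 2)
print(five_days)  # 5
print(back_two)   # "2024-01-01" "2023-12-30"

# A "1 month" step spans a different number of days depending on where
# it starts, so calendar-based windows need not have constant width.
widths <- vapply(
  seq_along(at),
  function(i) as.integer(diff(seq(at[i], by = "1 month", length.out = 2))),
  integer(1)
)
print(widths)  # 31 29 31 30
```

Calendar units such as "month" therefore do not map to a fixed number of days, which is one reason a time-based window can hold a varying number of observations.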

The cases above are not exhaustive: k and lag can also be vectors, which allows each window to be stretched and lagged or led freely in time (on indices).
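The window ranges described in this section can be reproduced with a short base R sketch. It is not the package's implementation: window_values() is a hypothetical helper, and it assumes that the window evaluated at index i covers the index range [i - lag - k + 1, i - lag], which is consistent with the range [22, 26] quoted above for at = 27, k = 5 and lag = 1. Mapping empty windows to NA is also an assumption of the sketch.

```r
# Hypothetical helper (not part of runner): values of x whose index falls
# inside the window evaluated at each element of `at`.
# k and lag may be single values or vectors with one entry per window.
window_values <- function(x, idx, at, k, lag = 0) {
  upper <- at - lag        # last index covered by each window
  lower <- upper - k + 1   # first index covered by each window
  Map(function(lo, hi) x[idx >= lo & idx <= hi], lower, upper)
}

idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
x <- 1:15
at <- c(18, 27, 45, 31)

# Index ranges: [13, 17], [22, 26], [40, 44], [26, 30]
windows <- window_values(x, idx, at, k = 5, lag = 1)

# Mean of each window; empty windows are mapped to NA (sketch assumption).
res <- vapply(
  windows,
  function(w) if (length(w) == 0) NA_real_ else mean(w),
  numeric(1)
)
print(res)  # 4.5  NA 12.5  9.0
```

Because lower and upper are computed elementwise, passing vectors for k or lag gives each window its own width and shift, which is the flexibility the last paragraph refers to.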

Examples

# NOT RUN {
# runner returns windows as is by default
runner(1:10)

# mean over windows of k = 3 elements
runner(1:10, f = mean, k = 3)

# mean over windows of k = 3 elements, with f written as an anonymous function
runner(1:10, k = 3, f = function(x) mean(x, na.rm = TRUE))

# concatenate two columns
runner(
  data.frame(
    a = letters[1:10],
    b = 1:10
  ),
  f = function(x) paste(paste0(x$a, x$b), collapse = "+"),
  type = "character"
)

# concatenate two columns with additional argument
runner(
  data.frame(
    a = letters[1:10],
    b = 1:10
  ),
  f = function(x, xxx) {
    paste(paste0(x$a, xxx, x$b), collapse = " + ")
  },
  xxx = "...",
  type = "character"
)

# number of unique values in each window (varying window size)
runner(letters[1:10],
       k = c(1, 2, 2, 4, 5, 5, 5, 5, 5, 5),
       f = function(x) length(unique(x)))

# concatenate only at selected indices
runner(letters[1:10],
       f = function(x) paste(x, collapse = "-"),
       at = c(1, 5, 8),
       type = "character")

# 5 days mean
idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner::runner(
  x = idx,
  k = "5 days",
  lag = 1,
  idx = Sys.Date() + idx,
  f = function(x) mean(x)
)

# mean over 5-index windows, evaluated at 4 indices
runner::runner(
  x = 1:15,
  k = 5,
  lag = 1,
  idx = idx,
  at = c(18, 27, 48, 31),
  f = mean
)
# }
