splitTools (version 1.0.1)

create_timefolds: Creates Folds for Time Series Data

Description

This function returns a list of in-sample and out-of-sample indices per fold for time series k-fold cross-validation; see Details.

Usage

create_timefolds(y, k = 5L, use_names = TRUE, type = c("extending", "moving"))

Value

A nested list with in-sample and out-of-sample indices per fold.

Arguments

y

Any vector of the same length as the data to be split.

k

Number of folds.

use_names

Should folds be named? Default is TRUE.

type

Should in-sample data be "extending" over the folds (default) or consist of a single block ("moving")? See Details.

Details

The data is first partitioned into \(k+1\) sequential blocks \(B_1\) to \(B_{k+1}\). Each fold consists of two index vectors: one with in-sample row numbers, the other with out-of-sample row numbers. The first fold uses \(B_1\) as in-sample and \(B_2\) as out-of-sample data. The second fold uses either \(B_2\) (if type = "moving") or \(\{B_1, B_2\}\) (if type = "extending") as in-sample data, and \(B_3\) as out-of-sample data, and so on. Finally, the \(k\)th fold uses \(\{B_1, ..., B_k\}\) ("extending") or \(B_k\) ("moving") as in-sample data, and \(B_{k+1}\) as out-of-sample data. This ensures that out-of-sample data always follows in-sample data.
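The block logic described above can be illustrated with a minimal base R sketch. It is not the package's implementation: sketch_timefolds() is a hypothetical helper, the rule used to cut the rows into blocks is an assumption (block boundaries may differ from those chosen by create_timefolds()), and the element names insample and outsample are chosen here for illustration only.

```r
# Hypothetical helper, NOT part of splitTools: illustrates the block logic only.
sketch_timefolds <- function(n, k = 5L, type = c("extending", "moving")) {
  type <- match.arg(type)
  stopifnot(n >= k + 1)  # every block needs at least one row

  idx <- seq_len(n)
  # Assumed blocking rule: k + 1 sequential blocks of roughly equal size.
  block_id <- ceiling(idx * (k + 1) / n)
  blocks <- split(idx, block_id)

  lapply(seq_len(k), function(i) {
    # "extending": blocks 1..i are in-sample; "moving": only block i.
    in_blocks <- if (type == "extending") seq_len(i) else i
    list(
      insample = unlist(blocks[in_blocks], use.names = FALSE),
      outsample = blocks[[i + 1L]]
    )
  })
}

ext <- sketch_timefolds(100, k = 5L, type = "extending")
mov <- sketch_timefolds(100, k = 5L, type = "moving")

# In-sample size grows over the folds for "extending", but not for "moving".
lengths(lapply(ext, `[[`, "insample"))
lengths(lapply(mov, `[[`, "insample"))

# In every fold, all in-sample rows precede all out-of-sample rows.
all(vapply(ext, function(f) max(f$insample) < min(f$outsample), logical(1)))
```

Both types share the same out-of-sample blocks; they differ only in how much history is used for fitting.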

See Also

partition(), create_folds()

Examples

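# Simulated series of length 100; only its length matters for the split
y <- runif(100)
# Default: 5 named folds with "extending" in-sample data
create_timefolds(y)
# Same folds, returned as an unnamed list
create_timefolds(y, use_names = FALSE)
# In-sample data consists of a single block per fold
create_timefolds(y, use_names = FALSE, type = "moving")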