mlr3spatiotempcv (version 0.1.1)

ResamplingSptCVCstf: Create Spatiotemporal Folds Using Predefined Groups

Description

Spatiotemporal resampling based on predefined groups, implementing CAST::CreateSpacetimeFolds().

Super class

mlr3::Resampling -> ResamplingSptCVCstf

Public fields

space_var

character(1) Column name identifying the spatial units.

time_var

character(1) Column name identifying the temporal units.

class

character(1) Column name identifying a class unit (e.g. land cover).

Active bindings

iters

integer(1) Returns the number of resampling iterations, depending on the values stored in the param_set.

Methods

Public methods

Method new()

Create a "Spacetime Folds" resampling instance.

Usage

ResamplingSptCVCstf$new(
  id = "sptcv_cstf",
  space_var = NULL,
  time_var = NULL,
  class = NULL
)

Arguments

id

character(1) Identifier for the resampling strategy.

space_var

character(1) Column name identifying the spatial units.

time_var

character(1) Column name identifying the temporal units.

class

character(1) Column name identifying a class unit (e.g. land cover).

Method instantiate()

Materializes fixed training and test splits for a given task.

Usage

ResamplingSptCVCstf$instantiate(task)

Arguments

task

Task A task to instantiate.

space_var

[character] Column name identifying the spatial units.

time_var

[character] Column name identifying the temporal units.

class

[character] Column name identifying a class unit (e.g. land cover).
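
The kind of split that instantiate() materializes when both space_var and time_var are set can be sketched in base R. This is only an illustration, not the package's or CAST's code: the toy data, the column names station and day, and the helper assign_folds() are hypothetical. The rule that training rows must differ from the held-out fold in both space and time is an assumption about how leave-location-and-time-out splitting works.

```r
set.seed(1)

# Hypothetical toy data: 6 stations, each observed on 6 days
obs = expand.grid(
  station = paste0("s", 1:6),
  day = paste0("d", 1:6),
  stringsAsFactors = FALSE
)

n_folds = 3

# Hypothetical helper: give every distinct unit one fold id (balanced, shuffled)
assign_folds = function(units, k) {
  u = unique(units)
  setNames(rep_len(seq_len(k), length(u))[sample.int(length(u))], u)
}

space_fold = assign_folds(obs$station, n_folds)
time_fold = assign_folds(obs$day, n_folds)

# Look up the fold of each row's spatial and temporal unit
obs$sf = unname(space_fold[obs$station])
obs$tf = unname(time_fold[obs$day])

# Assumed rule: test rows lie in fold i in space AND time,
# training rows lie outside fold i in space AND time
splits = lapply(seq_len(n_folds), function(i) {
  list(
    test = which(obs$sf == i & obs$tf == i),
    train = which(obs$sf != i & obs$tf != i)
  )
})

# No station and no day is shared between training and test rows of fold 1
intersect(obs$station[splits[[1]]$train], obs$station[splits[[1]]$test])
intersect(obs$day[splits[[1]]$train], obs$day[splits[[1]]$test])
```

Under this rule, rows that share only the station or only the day with the test fold are used in neither set, which is the price of keeping the held-out units unseen in both dimensions.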

Method clone()

The objects of this class are cloneable with this method.

Usage

ResamplingSptCVCstf$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Details

Using "class" is helpful when the data are clustered in space and the target is categorical. This is the case, for example, for land cover classifications where the training data come as training polygons. The data should then be split so that entire polygons are held back (space_var = "polygonID") while the distribution of classes stays similar across folds (class = "LUC").
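
The idea can be illustrated with a small base-R sketch. It is not the package's implementation: the toy data, the column names polygonID and LUC, and the round-robin assignment are assumptions chosen for illustration. Folds are assigned per polygon, separately within each class, and then propagated to the pixels.

```r
set.seed(42)

# Hypothetical toy data: 12 polygons in 3 land-cover classes, 5 pixels each
polygons = data.frame(
  polygonID = sprintf("p%02d", 1:12),
  LUC = rep(c("forest", "water", "urban"), each = 4),
  stringsAsFactors = FALSE
)
pixels = polygons[rep(seq_len(nrow(polygons)), each = 5), ]
rownames(pixels) = NULL

n_folds = 4

# Assign folds to polygons (not pixels), one class at a time,
# so that every fold receives polygons of every class
polygons$fold = NA_integer_
for (cl in unique(polygons$LUC)) {
  idx = which(polygons$LUC == cl)
  idx = idx[sample.int(length(idx))] # shuffle polygons within the class
  polygons$fold[idx] = rep_len(seq_len(n_folds), length(idx))
}

# Every pixel inherits the fold of its polygon
pixels$fold = polygons$fold[match(pixels$polygonID, polygons$polygonID)]

# Each polygon ends up in exactly one fold
folds_per_polygon = tapply(pixels$fold, pixels$polygonID,
  function(f) length(unique(f)))

# Class counts per fold are balanced
class_by_fold = table(pixels$LUC, pixels$fold)
class_by_fold
```

Holding out fold i then removes whole polygons from training while each class remains represented in both the training and the test set.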

References

Meyer H, Reudenbach C, Hengl T, Katurji M, Nauss T (2018). “Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation.” Environmental Modelling & Software, 101, 1-9. doi:10.1016/j.envsoft.2017.12.001.

Examples

# NOT RUN {
library(mlr3)
library(mlr3spatiotempcv) # provides the "cookfarm" task and the "sptcv_cstf" resampling
task = tsk("cookfarm")

# Instantiate Resampling
rcv = rsmp("sptcv_cstf",
  folds = 5,
  time_var = "Date", space_var = "SOURCEID")
rcv$instantiate(task)

# Individual sets:
rcv$train_set(1)
rcv$test_set(1)
# check that no obs are in both sets
intersect(rcv$train_set(1), rcv$test_set(1)) # good!

# Internal storage:
rcv$instance # table
# }