theftdlc

Analyse and Interpret Time Series Features

Installation

You can install the stable version of theftdlc from CRAN:

install.packages("theftdlc")

You can install the development version of theftdlc from GitHub using the following:

devtools::install_github("hendersontrent/theftdlc")

General purpose

The theft package for R facilitates user-friendly access to a structured analytical workflow for the extraction of time-series features from six different feature sets (and any number of individual user-supplied features): "catch22", "feasts", "kats", "tsfeatures", "tsfresh", and "tsfel".

theftdlc extends this feature-based ecosystem by providing a suite of functions for analysing, interpreting, and visualising time-series features calculated using theft. Functionality includes data quality assessments and normalisation methods, low-dimensional projections (linear and nonlinear), data matrix and feature distribution visualisations, time-series classification procedures, statistical hypothesis testing, and various other statistical and graphical tools.

A high-level overview of how the theft ecosystem for R is typically accessed by users is shown below. Many more functions and options for customisation are available within the packages.

What’s in a name?

The "dlc" in theftdlc stands for "downloadable content" (DLC) for theft, just like the DLC you get for video games.

Quick tour

theft and theftdlc combine to create an intuitive and efficient tidy feature-based workflow. Here is an example of a single code chunk that calculates features using catch22 and a custom set of mean and standard deviation, and projects the feature space into an interpretable two-dimensional space using principal components analysis:

library(dplyr)
library(theft)
library(theftdlc)

calculate_features(data = theft::simData, 
                   feature_set = "catch22",
                   features = list("mean" = mean, "sd" = sd)) %>%
  project(norm_method = "RobustSigmoid",
          unit_int = TRUE,
          low_dim_method = "PCA") %>%
  plot()

In that example, calculate_features comes from theft, while project and the plot generic come from theftdlc.
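To give a feel for what the pipeline above does conceptually, here is a minimal base-R sketch of the same three stages: computing a few features per series, normalising them, and projecting with PCA. It does not call theft or theftdlc. The simulated data, the feature list, and the helper robust_sigmoid are hypothetical, and the robust sigmoid formula (median and IQR in place of mean and standard deviation, followed by rescaling to the unit interval) is an assumption about what norm_method = "RobustSigmoid" with unit_int = TRUE roughly corresponds to.

```r
# Illustrative only: a base-R stand-in for "features -> normalise -> PCA".
set.seed(123)

# Hypothetical data: 20 white-noise series and 20 random walks, 100 points each
series <- c(
  replicate(20, rnorm(100), simplify = FALSE),
  replicate(20, cumsum(rnorm(100)), simplify = FALSE)
)
group <- rep(c("noise", "random_walk"), each = 20)

# A named list of feature functions, in the spirit of the `features` argument
feature_funs <- list(
  mean = mean,
  sd   = sd,
  iqr  = IQR,
  acf1 = function(x) acf(x, lag.max = 1, plot = FALSE)$acf[2]
)

# Build a matrix with one row per series and one column per feature
feature_matrix <- t(vapply(
  series,
  function(x) vapply(feature_funs, function(f) f(x), numeric(1)),
  numeric(length(feature_funs))
))
colnames(feature_matrix) <- names(feature_funs)

# Assumed form of an outlier-robust sigmoid, rescaled to [0, 1].
# Requires a non-zero IQR and at least two distinct values per feature.
robust_sigmoid <- function(x) {
  scaled <- 1 / (1 + exp(-(x - median(x)) / (IQR(x) / 1.35)))
  (scaled - min(scaled)) / (max(scaled) - min(scaled))
}

# Normalise each feature column, then keep the first two principal components
normalised <- apply(feature_matrix, 2, robust_sigmoid)
pca <- prcomp(normalised, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2]

head(scores)
```

Passing scores to plot() with points coloured by group gives a scatter comparable in spirit to the projection plot, although the packaged functions handle many more options and edge cases.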

We can also perform time-series classification with an equally simple workflow, here comparing the performance of catch22 against our custom set of the first two moments of the distribution:

calculate_features(data = theft::simData, 
                   feature_set = "catch22",
                   features = list("mean" = mean, "sd" = sd)) %>%
  classify(by_set = TRUE,
           n_resamples = 5,
           use_null = TRUE) %>%
  compare_features(by_set = TRUE,
                   hypothesis = "null") %>%
  head()
                hypothesis  feature_set   metric  set_mean null_mean t_statistic     p.value
1 All features != own null All features accuracy 0.8177778 0.1644444    4.759301 0.004454886
2         User != own null         User accuracy 0.7955556 0.1200000    6.159351 0.001763084
3      catch22 != own null      catch22 accuracy 0.7511111 0.1377778    4.866885 0.004119101

In this example, classify and compare_features come from theftdlc.
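The t_statistic and p.value columns come from comparing per-resample accuracies against an empirical null. As a rough sketch of how such a comparison can work, the base-R code below applies the corrected resampled t-test of Nadeau and Bengio to two accuracy vectors. Whether theftdlc uses exactly this form (including the one-sided alternative) is an assumption, and the accuracy values, sample sizes, and the helper resampled_t_test are hypothetical.

```r
# Illustrative only: a resampled t-test against an empirical null in base R.
# These accuracies are made-up values, not output from theftdlc.
model_acc <- c(0.84, 0.78, 0.80, 0.87, 0.76)  # one value per resample
null_acc  <- c(0.18, 0.13, 0.20, 0.11, 0.16)  # same resamples, shuffled labels

# Assumed train and test set sizes within each resample (hypothetical)
n_train <- 135
n_test  <- 45

resampled_t_test <- function(x, y, n_train, n_test) {
  d <- x - y
  n <- length(d)
  # Resamples share observations, so the plain variance term 1/n is too small;
  # the correction adds n_test / n_train to account for that overlap
  t_stat <- mean(d) / sqrt((1 / n + n_test / n_train) * var(d))
  p_val <- pt(t_stat, df = n - 1, lower.tail = FALSE)  # one-sided: x > y
  c(t_statistic = t_stat, p.value = p_val)
}

result <- resampled_t_test(model_acc, null_acc, n_train, n_test)
print(result)
```

The corrected statistic is always smaller in magnitude than the one from an ordinary paired t-test on the same values, which is the point of the correction: it guards against overconfident conclusions when resamples are not independent.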

Please see the vignette for more information and the full functionality of both packages.

Citation

If you use theft or theftdlc in your own work, please cite both the paper:

Trent Henderson and Ben D. Fulcher. Feature-Based Time-Series Analysis in R using the theft Package. arXiv (2022).

and the software:

To cite package 'theft' in publications use:

  Henderson T (2025). _theft: Tools for Handling Extraction of Features
  from Time Series_. R package version 0.8.2,
  <https://hendersontrent.github.io/theft/>.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {theft: Tools for Handling Extraction of Features from Time Series},
    author = {Trent Henderson},
    year = {2025},
    note = {R package version 0.8.2},
    url = {https://hendersontrent.github.io/theft/},
  }

To cite package 'theftdlc' in publications use:

  Henderson T (2025). _theftdlc: Analyse and Interpret Time Series
  Features_. R package version 0.2.1,
  <https://hendersontrent.github.io/theftdlc/>.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {theftdlc: Analyse and Interpret Time Series Features},
    author = {Trent Henderson},
    year = {2025},
    note = {R package version 0.2.1},
    url = {https://hendersontrent.github.io/theftdlc/},
  }

Monthly Downloads: 199

Version: 0.2.1

License: MIT + file LICENSE

Maintainer: Trent Henderson

Last Published: July 29th, 2025

Functions in theftdlc (0.2.1)

stat_test: Calculate p-values for feature sets or features relative to an empirical null or each other using resampled t-tests
plot.interval_calculations: Produce a plot for an interval_calculations object
rescale_zscore: Calculate z-score for all columns in a dataset using train set central tendency and spread
plot.feature_projection: Produce a plot for a feature_projection object
resample_data: Helper function to create a resampled dataset
project: Project a feature matrix into a two-dimensional representation using PCA, MDS, t-SNE, or UMAP ready for plotting
theftdlc: Analyse and Interpret Time Series Features
plot.feature_calculations: Produce a plot for a feature_calculations object
shrink: Use a cross-validated penalized maximum likelihood generalized linear model to perform feature selection
select_stat_cols: Helper function to select only the relevant columns for statistical testing
interval: Calculate interval summaries with a measure of central tendency of classification results
get_rescale_vals: Calculate central tendency and spread values for all numeric columns in a dataset
cluster: Perform cluster analysis of time series using their feature vectors
filter_duplicates: Remove duplicate features that exist in multiple feature sets and retain a reproducible random selection of one of them
fit_models: Fit classification model and compute key metrics
classify: Fit classifiers using time-series features using a resample-based approach and get a fast understanding of performance
filter_good_features: Filter resample data sets according to good feature list
make_title: Helper function for converting to title case
compare_features: Conduct statistical testing on time-series feature classification performance to identify top features or compare entire sets
find_good_features: Helper function to find features in both train and test set that are "good"
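
The descriptions of get_rescale_vals and rescale_zscore mention rescaling with train set central tendency and spread. The idea is to estimate the mean and standard deviation on the training rows only and reuse those values on the test rows, so that no information leaks from test to train. The base-R sketch below illustrates that idea. It does not call theftdlc, and the helper names get_vals and apply_zscore, along with the simulated data, are hypothetical.

```r
# Illustrative only: z-scoring a test set with train-set statistics.
set.seed(42)

# Hypothetical feature table with two numeric features on different scales
features <- data.frame(
  f1 = rnorm(30, mean = 5, sd = 2),
  f2 = runif(30, min = 0, max = 100)
)

# Random train/test split
train_idx <- sample(nrow(features), size = 20)
train <- features[train_idx, ]
test  <- features[-train_idx, ]

# Central tendency and spread for every column, estimated on one dataset
get_vals <- function(df) {
  list(
    centre = vapply(df, mean, numeric(1)),
    spread = vapply(df, sd, numeric(1))
  )
}

# Apply previously estimated values column by column
apply_zscore <- function(df, vals) {
  as.data.frame(Map(
    function(col, m, s) (col - m) / s,
    df, vals$centre, vals$spread
  ))
}

train_vals <- get_vals(train)
train_z <- apply_zscore(train, train_vals)
test_z  <- apply_zscore(test, train_vals)  # note: train values, not test values

round(colMeans(train_z), 10)
```

The training columns end up with mean zero and unit standard deviation by construction, while the test columns generally do not, which is expected when the scaling parameters come from the training data alone.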