classify: Fit classifiers on time-series features using a resample-based approach to get a fast understanding of performance

Description

Fits classifiers to time-series features over repeated train-test resamples to give a fast understanding of classification performance.

Usage

classify(
  data,
  classifier = NULL,
  train_size = 0.75,
  n_resamples = 30,
  by_set = TRUE,
  use_null = FALSE,
  seed = 123
)

tsfeature_classifier(
  data,
  classifier = NULL,
  train_size = 0.75,
  n_resamples = 30,
  by_set = TRUE,
  use_null = FALSE,
  seed = 123
)

Value

A list containing a named vector of train-test set sizes and a data.frame of classification performance results.

Arguments

data: feature_calculations object containing the raw feature matrix produced by theft::calculate_features
classifier: function specifying the classifier to fit. It should take two arguments, formula and data, and return a fitted model that is compatible with R's predict functionality. Note that classify z-scores the data before modelling, using the training set's mean and standard deviation, so disabling any default scaling in your function is recommended. Defaults to NULL, in which case the following linear SVM is fit: classifier = function(formula, data){mod <- e1071::svm(formula, data = data, kernel = "linear", scale = FALSE, probability = TRUE)}
train_size: numeric denoting the proportion of samples to use in the training set. Defaults to 0.75
n_resamples: integer denoting the number of resamples to calculate. Defaults to 30
by_set: Boolean specifying whether to compute classifiers for each feature set. Defaults to TRUE. If FALSE, the function will instead find the best individually-performing features
use_null: Boolean specifying whether to also fit null models, in which class labels are shuffled, to generate a null distribution against which performance on the correct class labels can be compared. Defaults to FALSE
seed: integer to fix R's random number generator to ensure reproducibility. Defaults to 123
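
The train_size, seed, and use_null arguments, together with the z-scoring note under classifier, describe a familiar resampling pattern. The block below is a minimal base-R sketch of one such resample, written for illustration only: it is not the package's internal code, the toy data frame and all variable names are hypothetical, and the plain random split is an assumption (the package may split differently, for example by stratifying on class).

```r
# Illustrative only: a single resample in base R, not the package's implementation.
set.seed(123)                      # plays the role of the `seed` argument

# Hypothetical feature table: 40 samples, 3 features, 2 classes
n <- 40
toy <- data.frame(
  group = factor(rep(c("A", "B"), each = n / 2)),
  f1 = rnorm(n, mean = rep(c(0, 1), each = n / 2)),
  f2 = rnorm(n, mean = 5, sd = 2),
  f3 = runif(n)
)

train_size <- 0.75                 # plays the role of the `train_size` argument
train_idx <- sample(seq_len(n), size = floor(train_size * n))
train <- toy[train_idx, ]
test <- toy[-train_idx, ]

# z-score both sets using the training set's mean and SD only,
# so no information from the test set leaks into the scaling
feature_cols <- setdiff(names(toy), "group")
mu <- vapply(train[feature_cols], mean, numeric(1))
sigma <- vapply(train[feature_cols], sd, numeric(1))
train[feature_cols] <- as.data.frame(scale(train[feature_cols], center = mu, scale = sigma))
test[feature_cols] <- as.data.frame(scale(test[feature_cols], center = mu, scale = sigma))

# Null-model input (cf. `use_null`): same features, shuffled class labels
train_null <- train
train_null$group <- sample(train_null$group)
```

Because the scaling parameters come from the training rows alone, the test features will generally not have exactly zero mean or unit variance, which is expected.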

Author

Trent Henderson

Examples

library(theft)

features <- theft::calculate_features(theft::simData,
  feature_set = "catch22")

classifiers <- classify(features,
  by_set = FALSE,
  n_resamples = 3)
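
As a sketch of the classifier argument in use, the function below swaps the default linear kernel for a radial one. The name rbf_classifier is hypothetical; the e1071::svm call uses the same arguments as the default classifier listed under Arguments, with only the kernel changed. The final call to classify is left commented out and is an assumed usage, not output from the package.

```r
# Hypothetical custom classifier: radial-kernel SVM instead of the linear default.
# scale = FALSE because classify is documented as z-scoring the data beforehand.
rbf_classifier <- function(formula, data) {
  e1071::svm(formula, data = data, kernel = "radial",
             scale = FALSE, probability = TRUE)
}

# Standalone sanity check on a built-in dataset (runs only if e1071 is installed)
if (requireNamespace("e1071", quietly = TRUE)) {
  mod <- rbf_classifier(Species ~ ., data = iris)
  preds <- predict(mod, newdata = iris)   # the model must work with predict()
  print(table(preds, iris$Species))
}

# Assumed usage with the `features` object created in the example above:
# classifiers <- classify(features, classifier = rbf_classifier, n_resamples = 3)
```

Any function with the same two-argument signature that returns a model usable with predict should fit the documented contract.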
