Fit classifiers to time-series features using a resample-based approach to get a fast understanding of performance
classify(
  data,
  classifier = NULL,
  train_size = 0.75,
  n_resamples = 30,
  by_set = TRUE,
  use_null = FALSE,
  seed = 123
)

tsfeature_classifier(
  data,
  classifier = NULL,
  train_size = 0.75,
  n_resamples = 30,
  by_set = TRUE,
  use_null = FALSE,
  seed = 123
)
list containing a named vector of train-test set sizes, and a data.frame of classification performance results
data: feature_calculations object containing the raw feature matrix produced by theft::calculate_features
classifier: function specifying the classifier to fit. Should be a function with 2 arguments, formula and data, that returns a model compatible with R's predict functionality. Note that classify z-scores the data prior to modelling using the train set's information, so disabling default scaling is recommended if your function applies it. Defaults to NULL, in which case the following linear SVM is fit: classifier = function(formula, data){mod <- e1071::svm(formula, data = data, kernel = "linear", scale = FALSE, probability = TRUE)}
train_size: numeric denoting the proportion of samples to use in the training set. Defaults to 0.75
n_resamples: integer denoting the number of resamples to calculate. Defaults to 30
by_set: Boolean specifying whether to compute classifiers for each feature set. Defaults to TRUE. If FALSE, the function instead evaluates features individually to find the best-performing ones
use_null: Boolean specifying whether to fit null models in which class labels are shuffled, generating a null distribution that can be compared with performance on the correct class labels. Defaults to FALSE
seed: integer to fix R's random number generator to ensure reproducibility. Defaults to 123
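The way train_size, n_resamples, use_null and seed interact can be illustrated with a minimal base-R sketch. This is not theft's internal code: the helper name resample_once, the per-resample seed offset, and the use of iris as stand-in data are all assumptions made purely for illustration. It shows one possible way to draw a seeded train-test split, z-score both sets using only the train set's mean and standard deviation, and optionally shuffle the class labels for a null model.

```r
# Hypothetical helper (NOT part of theft): one seeded resample with
# z-scoring based on the train set's information only.
resample_once <- function(x, y, train_size = 0.75, shuffle_labels = FALSE, seed = 123) {
  set.seed(seed)
  n <- nrow(x)

  # Null model: shuffle labels to break the feature-label relationship
  if (shuffle_labels) y <- sample(y)

  # Draw the training rows; the remainder forms the test set
  train_idx <- sample.int(n, size = floor(train_size * n))

  # Mean and standard deviation are estimated from the train set only
  mu <- colMeans(x[train_idx, , drop = FALSE])
  sigma <- apply(x[train_idx, , drop = FALSE], 2, stats::sd)
  scale_with_train <- function(m) sweep(sweep(m, 2, mu, "-"), 2, sigma, "/")

  list(
    train_x = scale_with_train(x[train_idx, , drop = FALSE]),
    test_x  = scale_with_train(x[-train_idx, , drop = FALSE]),
    train_y = y[train_idx],
    test_y  = y[-train_idx]
  )
}

# Stand-in data (assumption): iris instead of a real feature matrix
x <- as.matrix(iris[, 1:4])
y <- iris$Species

# Three resamples, each with its own seed (the offset scheme is an assumption)
n_resamples <- 3
resamples <- lapply(seq_len(n_resamples), function(i) resample_once(x, y, seed = 123 + i))
```

Because the test set is scaled with the train set's statistics, no information from the test samples leaks into the model, which is why a classifier's own default scaling is best switched off.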
Trent Henderson
library(theft)

features <- theft::calculate_features(theft::simData,
  feature_set = "catch22")

classifiers <- classify(features,
  by_set = FALSE,
  n_resamples = 3)
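
A custom classifier only needs to accept formula and data and return a model that works with predict. The sketch below is one possible example, assuming the e1071 package is installed: it swaps the default linear kernel for a radial one and keeps scale = FALSE because the data are expected to be z-scored already. The name radial_svm and the use of iris as stand-in data are illustrative assumptions, not part of theft.

```r
library(e1071)

# Custom classifier with the two required arguments: formula and data.
# scale = FALSE because the input is assumed to be z-scored beforehand.
radial_svm <- function(formula, data) {
  e1071::svm(formula, data = data, kernel = "radial", scale = FALSE, probability = TRUE)
}

# Stand-in data (assumption): check on iris that the fitted model
# is compatible with predict()
set.seed(123)
train_idx <- sample.int(nrow(iris), size = floor(0.75 * nrow(iris)))
mod <- radial_svm(Species ~ ., data = iris[train_idx, ])
preds <- predict(mod, newdata = iris[-train_idx, ])
```

A function of this shape could then be supplied through the classifier argument, for example classify(features, classifier = radial_svm, by_set = FALSE, n_resamples = 3).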