This function takes sequences of elements and uses a machine learning classifier to predict the next elements in the sequence. It supports n-gram tokenization and k-fold cross-validation. Optionally, it can upsample the training data.
transition_predictions(sequences, classifier = "nb", ngram = 2, upsample = TRUE, k = 10)
A list containing the mean accuracy, mean null accuracy, and a data frame of prediction errors.
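To make the two accuracy figures concrete, here is a minimal base-R sketch of how an accuracy and a null accuracy are commonly computed. It assumes that "null accuracy" means the accuracy of always guessing the most frequent class, which this page does not state; the vectors `actual` and `predicted` are made-up illustration data, not output of `transition_predictions()`.

```r
# Hypothetical held-out labels and predictions, for illustration only.
actual    <- c("b", "c", "c", "d", "c")
predicted <- c("b", "c", "d", "d", "c")

# Accuracy: share of predictions that match the actual next element.
accuracy <- mean(predicted == actual)

# Null accuracy (assumed definition): always guess the most frequent class.
# In practice the majority class would usually be taken from the training fold.
majority <- names(which.max(table(actual)))
null_accuracy <- mean(actual == majority)

print(c(accuracy = accuracy, null_accuracy = null_accuracy))
```

Under this reading, a classifier is only informative when its mean accuracy clearly exceeds the mean null accuracy.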
sequences: A list of character strings representing sequences of elements.
classifier: A character string specifying the classifier to use. Options are 'nb' for Naive Bayes and 'forest' for random forest.
ngram: An integer specifying the number of elements to consider in the n-gram tokenization. Default is 2.
upsample: A logical value indicating whether to upsample the training data to balance class distribution. Default is TRUE.
k: An integer specifying the number of folds for k-fold cross-validation. Default is 10.
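The roles of ngram, upsample, and k can be illustrated with a short base-R sketch. It is not the package's implementation: the helpers `make_pairs()`, `assign_folds()`, and `upsample_rows()` are hypothetical names introduced here, and the sketch assumes that elements are separated by single spaces and that the last element of each n-gram is the prediction target.

```r
# Illustrative base-R sketch; these helpers are hypothetical, not package functions.

# Turn each space-separated sequence into (context, target) rows, where the
# context is the first `ngram - 1` elements of an n-gram and the target is the last.
make_pairs <- function(sequences, ngram = 2) {
  stopifnot(ngram >= 2)
  rows <- lapply(sequences, function(s) {
    tokens <- strsplit(s, " ", fixed = TRUE)[[1]]
    if (length(tokens) < ngram) return(NULL)  # too short to form an n-gram
    starts <- seq_len(length(tokens) - ngram + 1)
    data.frame(
      context = vapply(starts, function(i) {
        paste(tokens[i:(i + ngram - 2)], collapse = " ")
      }, character(1)),
      target = tokens[starts + ngram - 1],
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, Filter(Negate(is.null), rows))
}

# Randomly assign n rows to k folds of near-equal size.
assign_folds <- function(n, k) {
  sample(rep(seq_len(k), length.out = n))
}

# Resample minority classes with replacement until every class
# has as many rows as the largest class.
upsample_rows <- function(df, class_col) {
  groups <- split(seq_len(nrow(df)), df[[class_col]])
  target_n <- max(lengths(groups))
  idx <- unlist(lapply(groups, function(g) {
    c(g, g[sample.int(length(g), target_n - length(g), replace = TRUE)])
  }))
  df[idx, , drop = FALSE]
}

set.seed(1)
pair_df <- make_pairs(list("a b c", "b c d", "c d e"), ngram = 2)
pair_df$fold <- assign_folds(nrow(pair_df), k = 3)

# Hold out fold 1 and balance only the remaining training rows.
train <- upsample_rows(pair_df[pair_df$fold != 1, ], "target")
print(table(train$target))
```

Upsampling is applied to the training rows only in this sketch, so duplicated rows never leak into the held-out fold.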
sequences <- list("a b c", "b c d", "c d e")
result <- transition_predictions(sequences, classifier = 'nb', ngram = 2, upsample = TRUE, k = 5)
print(result)