Creates synthetic cases to balance the data for training with TextEmbeddingClassifierNeuralNet. This is an auxiliary function for use with get_synthetic_cases() that allows the computations to run in parallel.
create_synthetic_units(embedding, target, k, max_k, method, cat, cat_freq)
Returns a list which contains the text embeddings of the
new synthetic cases as a named data.frame and their labels as a named
factor.
embedding: Named data.frame containing the text embeddings.
In most cases this object is taken from EmbeddedText$embeddings.
target: Named factor containing the labels/categories of the corresponding cases.
k: int. The number of nearest neighbors used during the sampling process.
max_k: int. The maximum number of nearest neighbors used during the sampling process.
method: vector containing strings of the requested methods for generating new cases.
Currently "smote", "dbsmote", and "adas" from the package smotefamily are available.
cat: string. The category for which new cases should be created.
cat_freq: Object of class "table" containing the absolute frequencies
of every category/label.
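To make the inputs described above more concrete, the following minimal sketch builds a toy embedding data.frame, a named factor of labels, and the frequency table, and then creates synthetic cases for one category by SMOTE-style interpolation between a case and one of its k nearest neighbors. The helper sketch_smote_units(), the toy data, and the names of the returned list elements are hypothetical and use base R only. This is not the package's implementation, which according to the description above relies on the methods from smotefamily, and the sketch ignores max_k and method.

```r
# Hypothetical toy data: 8 cases with 3 embedding dimensions (not real embeddings).
set.seed(42)
embedding <- as.data.frame(matrix(rnorm(8 * 3), nrow = 8))
rownames(embedding) <- paste0("case_", 1:8)
target <- factor(c("a", "a", "a", "a", "a", "b", "b", "b"))
names(target) <- rownames(embedding)

# Absolute frequencies of every category, in the form described for cat_freq.
cat_freq <- table(target)

# Hypothetical helper (not part of the package): SMOTE-style interpolation
# for a single category.
sketch_smote_units <- function(embedding, target, k, cat, cat_freq) {
  # Assumption: create enough cases to match the largest category.
  n_new <- max(cat_freq) - as.integer(cat_freq[cat])
  minority <- as.matrix(embedding[target == cat, , drop = FALSE])
  # A case cannot have more neighbors than there are other cases.
  k <- min(k, nrow(minority) - 1)
  if (n_new < 1 || k < 1) {
    return(list(embeddings = NULL, labels = NULL))
  }
  distances <- as.matrix(dist(minority))
  new_cases <- matrix(NA_real_, nrow = n_new, ncol = ncol(minority))
  for (i in seq_len(n_new)) {
    base_idx <- sample.int(nrow(minority), 1)
    # order() lists the case itself first (distance 0), so skip position 1.
    neighbors <- order(distances[base_idx, ])[2:(k + 1)]
    neighbor_idx <- neighbors[sample.int(length(neighbors), 1)]
    # The new case lies at a random point between the case and its neighbor.
    gap <- runif(1)
    new_cases[i, ] <- minority[base_idx, ] +
      gap * (minority[neighbor_idx, ] - minority[base_idx, ])
  }
  new_cases <- as.data.frame(new_cases)
  colnames(new_cases) <- colnames(embedding)
  rownames(new_cases) <- paste0("synthetic_", cat, "_", seq_len(n_new))
  new_labels <- factor(rep(cat, n_new), levels = levels(target))
  names(new_labels) <- rownames(new_cases)
  list(embeddings = new_cases, labels = new_labels)
}

result <- sketch_smote_units(embedding, target, k = 2, cat = "b",
                             cat_freq = cat_freq)
```

Here result$embeddings is a named data.frame with one row per synthetic case and result$labels is a named factor with matching names, which mirrors the shape of the return value described above.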
Other Auxiliary Functions:
array_to_matrix(),
calc_standard_classification_measures(),
check_embedding_models(),
clean_pytorch_log_transformers(),
create_iota2_mean_object(),
generate_id(),
get_coder_metrics(),
get_folds(),
get_n_chunks(),
get_stratified_train_test_split(),
get_synthetic_cases(),
get_train_test_split(),
is.null_or_na(),
matrix_to_array_c(),
split_labeled_unlabeled(),
summarize_tracked_sustainability(),
to_categorical_c()