Creates synthetic cases to balance the data used for training an object of the class TextEmbeddingClassifierNeuralNet.
get_synthetic_cases(
  embedding,
  times,
  features,
  target,
  method = c("smote"),
  max_k = 6
)

Returns a list with the following components:
syntetic_embeddings: Named data.frame containing the text embeddings of the synthetic cases.
syntetic_targets: Named factor containing the labels of the corresponding synthetic cases.
n_syntetic_units: table showing the number of synthetic cases for every label/category.
Arguments:
embedding: Named data.frame containing the text embeddings. In most cases, this object is taken from EmbeddedText$embeddings.
times: int for the number of sequences/times.
features: int for the number of features within each sequence.
target: Named factor containing the labels of the corresponding embeddings.
method: vector containing strings of the requested methods for generating new cases. Currently "smote", "dbsmote", and "adas" from the package smotefamily are available.
max_k: int for the maximum number of nearest neighbors used during the sampling process.
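
The role of nearest neighbors in SMOTE-style sampling can be illustrated with a minimal base R sketch. This is not the implementation used by smotefamily or by get_synthetic_cases; the function smote_sketch and its arguments x, n_new, and k are hypothetical names chosen for illustration. How max_k is applied internally is not stated here, so the sketch simply treats k as a fixed neighbor count.

```r
# Illustrative sketch only: SMOTE-style interpolation in base R.
# smote_sketch, x, n_new and k are hypothetical names, not part of any package.
smote_sketch <- function(x, n_new, k = 5) {
  x <- as.matrix(x)
  n <- nrow(x)
  stopifnot(n >= 2, n_new >= 1)
  k <- min(k, n - 1)        # cannot use more neighbors than other cases exist
  d <- as.matrix(dist(x))   # pairwise Euclidean distances between cases
  diag(d) <- Inf            # a case is never its own neighbor
  synthetic <- matrix(NA_real_, nrow = n_new, ncol = ncol(x),
                      dimnames = list(NULL, colnames(x)))
  for (i in seq_len(n_new)) {
    base <- sample.int(n, 1)                     # random minority case
    neighbors <- order(d[base, ])[seq_len(k)]    # its k nearest neighbors
    partner <- neighbors[sample.int(k, 1)]       # pick one neighbor at random
    gap <- runif(1)                              # position on the connecting line
    synthetic[i, ] <- x[base, ] + gap * (x[partner, ] - x[base, ])
  }
  synthetic
}

# Toy minority class: 5 cases with 4 features each (random values).
set.seed(42)
minority <- matrix(rnorm(20), nrow = 5, ncol = 4,
                   dimnames = list(paste0("case_", 1:5), paste0("f", 1:4)))
new_cases <- smote_sketch(minority, n_new = 10, k = 3)
dim(new_cases)
```

Each synthetic case lies on the line segment between an existing case and one of its k nearest neighbors, so a larger neighbor count spreads the new cases over a wider region of the minority class.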
Other Auxiliary Functions:
array_to_matrix(),
calc_standard_classification_measures(),
check_embedding_models(),
clean_pytorch_log_transformers(),
create_iota2_mean_object(),
create_synthetic_units(),
generate_id(),
get_coder_metrics(),
get_folds(),
get_n_chunks(),
get_stratified_train_test_split(),
get_train_test_split(),
is.null_or_na(),
matrix_to_array_c(),
split_labeled_unlabeled(),
summarize_tracked_sustainability(),
to_categorical_c()