This function creates synthetic cases to balance the training data used with an object of the class TextEmbeddingClassifierNeuralNet.
get_synthetic_cases(
  embedding,
  times,
  features,
  target,
  method = c("smote"),
  max_k = 6
)
Returns a list with the following components:
syntetic_embeddings: Named data.frame containing the text embeddings of the synthetic cases.
syntetic_targets: Named factor containing the labels of the corresponding synthetic cases.
n_syntetic_units: table showing the number of synthetic cases for every label/category.
embedding: Named data.frame containing the text embeddings. In most cases, this object is taken from EmbeddedText$embeddings.
times: int for the number of sequences/times.
features: int for the number of features within each sequence.
target: Named factor containing the labels of the corresponding embeddings.
method: vector containing strings of the requested methods for generating new cases. Currently "smote", "dbsmote", and "adas" from the package smotefamily are available.
max_k: int for the maximum number of nearest neighbors used during the sampling process.
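The arguments above can be illustrated with a small base R sketch. It builds toy inputs in the shape the descriptions suggest (one row per case and `times * features` columns, which is an assumption about the flattened layout) and then generates synthetic minority cases with a hand-written SMOTE-style interpolation. `smote_sketch()` is a hypothetical helper written only for this illustration; it is not the package's implementation, which relies on smotefamily.

```r
# Toy inputs shaped like the arguments described above (all values are made up).
set.seed(42)
times <- 2L      # number of sequences per case
features <- 3L   # number of features within each sequence
n_cases <- 12L

# Assumed layout: one row per case, times * features columns.
embedding <- as.data.frame(
  matrix(rnorm(n_cases * times * features), nrow = n_cases)
)
rownames(embedding) <- paste0("case_", seq_len(n_cases))

# Imbalanced labels: 9 cases of class "a", 3 cases of class "b".
target <- factor(rep(c("a", "b"), times = c(9L, 3L)))
names(target) <- rownames(embedding)

# Hypothetical helper: SMOTE-style interpolation within a single class.
smote_sketch <- function(x, n_new, k) {
  x <- as.matrix(x)
  stopifnot(nrow(x) >= 2L, n_new >= 1L, k >= 1L)
  # A case cannot have more neighbors than there are other cases in its class.
  k <- min(k, nrow(x) - 1L)
  d <- as.matrix(dist(x))
  new_cases <- matrix(NA_real_, nrow = n_new, ncol = ncol(x))
  for (i in seq_len(n_new)) {
    base <- sample.int(nrow(x), 1L)
    # k nearest neighbors of the base case, excluding the case itself.
    others <- setdiff(seq_len(nrow(x)), base)
    neighbors <- others[order(d[base, others])][seq_len(k)]
    partner <- neighbors[sample.int(k, 1L)]
    # New case lies on the line segment between the base case and its partner.
    gap <- runif(1L)
    new_cases[i, ] <- x[base, ] + gap * (x[partner, ] - x[base, ])
  }
  colnames(new_cases) <- colnames(x)
  as.data.frame(new_cases)
}

# Create as many minority cases as are missing to match the majority class.
minority <- embedding[target == "b", , drop = FALSE]
n_missing <- sum(target == "a") - sum(target == "b")
synthetic <- smote_sketch(minority, n_new = n_missing, k = 6L)
synthetic_targets <- factor(rep("b", n_missing), levels = levels(target))

dim(synthetic)
table(synthetic_targets)
```

Note that the requested `k = 6` is reduced inside the sketch because the minority class has only three cases. This is one reason an upper bound such as `max_k` cannot always be reached on small or strongly imbalanced data.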
Other Auxiliary Functions: array_to_matrix(), calc_standard_classification_measures(), check_embedding_models(), clean_pytorch_log_transformers(), create_iota2_mean_object(), create_synthetic_units(), generate_id(), get_coder_metrics(), get_folds(), get_n_chunks(), get_stratified_train_test_split(), get_train_test_split(), is.null_or_na(), matrix_to_array_c(), split_labeled_unlabeled(), summarize_tracked_sustainability(), to_categorical_c()