Function for creating synthetic cases in order to balance the data for training with TextEmbeddingClassifierNeuralNet. This is an auxiliary function for use with get_synthetic_cases to allow parallel computations.
create_synthetic_units(embedding, target, k, max_k, method, cat, cat_freq)
Returns a list which contains the text embeddings of the new synthetic cases as a named data.frame and their labels as a named factor.
embedding: Named data.frame containing the text embeddings. In most cases this object is taken from EmbeddedText$embeddings.
target: Named factor containing the labels/categories of the corresponding cases.
k: int The number of nearest neighbors used during the sampling process.
max_k: int The maximum number of nearest neighbors used during the sampling process.
method: vector containing strings of the requested methods for generating new cases. Currently "smote", "dbsmote", and "adas" from the package smotefamily are available.
cat: string The category for which new cases should be created.
cat_freq: Object of class "table" containing the absolute frequencies of every category/label.
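The inputs described above can be prepared with base R alone. The following is a minimal sketch, not taken from the package: the toy data, the case IDs, and the helper variable minority_cat are illustrative assumptions, and "named" is assumed to mean that case IDs are stored as row names of the data.frame and as names of the factor. The final call is left commented out because the values for k, max_k, and method are only examples.

```r
# Toy embeddings (assumption): 10 cases with 4 dimensions each,
# with case IDs stored as row names.
set.seed(42)
embedding <- as.data.frame(matrix(rnorm(10 * 4), nrow = 10, ncol = 4))
rownames(embedding) <- paste0("case_", seq_len(10))

# Labels as a named factor; the names match the row names of the embeddings.
target <- factor(c(rep("a", 7), rep("b", 3)))
names(target) <- rownames(embedding)

# Absolute frequencies of every category as an object of class "table".
cat_freq <- table(target)

# Hypothetical helper: the least frequent category, for which new cases
# would be requested.
minority_cat <- names(cat_freq)[which.min(cat_freq)]

# Illustrative call (not run; argument values are assumptions):
# new_cases <- create_synthetic_units(
#   embedding = embedding, target = target, k = 1, max_k = 2,
#   method = "smote", cat = minority_cat, cat_freq = cat_freq
# )
```

Deriving cat_freq with table(target) keeps the frequencies consistent with the labels that are actually passed to the function.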
Other Auxiliary Functions: array_to_matrix(), calc_standard_classification_measures(), check_embedding_models(), clean_pytorch_log_transformers(), create_iota2_mean_object(), generate_id(), get_coder_metrics(), get_folds(), get_n_chunks(), get_stratified_train_test_split(), get_synthetic_cases(), get_train_test_split(), is.null_or_na(), matrix_to_array_c(), split_labeled_unlabeled(), summarize_tracked_sustainability(), to_categorical_c()