This function creates synthetic cases to balance the training data used with an object of the class TEClassifierRegular or TEClassifierProtoNet.
get_synthetic_cases_from_matrix(
matrix_form,
times,
features,
target,
sequence_length,
method = c("smote"),
min_k = 1,
max_k = 6
)
A list with the following components:
syntetic_embeddings: Named data.frame containing the text embeddings of the synthetic cases.
syntetic_targets: Named factor containing the labels of the corresponding synthetic cases.
n_syntetic_units: table showing the number of synthetic cases for every label/category.
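To give a sense of what a per-label count such as n_syntetic_units describes, the base R sketch below counts how many synthetic cases each label would need to reach the size of the largest class. The balancing rule (match the majority class), the labels, and the object names are assumptions made for illustration; the function may apply a different rule.

```r
# Hypothetical labels for six cases; the names identify the cases.
target <- factor(
  c("a", "a", "a", "a", "b", "b"),
  levels = c("a", "b")
)
names(target) <- paste0("case_", seq_along(target))

# Cases per label, and the gap to the largest class
# (assumed balancing rule, not taken from the package).
label_counts <- table(target)
n_needed <- max(label_counts) - label_counts
print(n_needed)
```

Under this assumed rule, a label that already has as many cases as the largest class needs no synthetic cases.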
matrix_form: Named matrix containing the text embeddings in matrix form.
times: int for the number of sequences/times.
features: int for the number of features within each sequence.
target: Named factor containing the labels of the corresponding embeddings.
sequence_length: int for the length of the text embedding sequences.
method: vector containing strings of the requested methods for generating new cases. Currently "smote", "dbsmote", and "adas" from the package smotefamily are available.
min_k: int for the minimum number of nearest neighbors used during the sampling process.
max_k: int for the maximum number of nearest neighbors used during the sampling process.
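The inputs and the nearest-neighbor idea can be sketched in base R. The example below flattens a hypothetical cases x times x features array into one row per case and then builds a single synthetic case by SMOTE-style interpolation between a minority case and one of its k nearest neighbors. The layout with times * features columns, the column order, all object names, and the hand-written interpolation are assumptions for illustration only; the function itself relies on the methods from smotefamily, which differ in detail.

```r
set.seed(42)

times <- 2
features <- 3
n_cases <- 5

# Hypothetical embeddings: cases x times x features.
embeddings <- array(
  rnorm(n_cases * times * features),
  dim = c(n_cases, times, features),
  dimnames = list(paste0("case_", seq_len(n_cases)), NULL, NULL)
)

# Flatten to one row per case with times * features columns.
# Assumption: this column order is not necessarily the package's.
matrix_form <- matrix(
  as.vector(embeddings),
  nrow = n_cases,
  ncol = times * features,
  dimnames = list(dimnames(embeddings)[[1]], NULL)
)

# Named factor of labels, aligned with the rows of matrix_form.
target <- factor(c("a", "a", "a", "b", "b"))
names(target) <- rownames(matrix_form)

# SMOTE-style step for the minority label "b".
minority <- matrix_form[target == "b", , drop = FALSE]
k <- 1 # neighbors considered; compare min_k and max_k

base_case <- minority[1, ]
others <- minority[-1, , drop = FALSE]

# Euclidean distance from the base case to every other minority case.
distances <- sqrt(rowSums(sweep(others, 2, base_case)^2))
nearest <- order(distances)[seq_len(min(k, nrow(others)))]

# Pick one of the nearest neighbors at random (by position, to avoid
# the sample() pitfall with a length-one vector).
chosen <- nearest[sample.int(length(nearest), 1)]
neighbor <- others[chosen, ]

# The synthetic case lies on the line segment between the two cases.
gap <- runif(1)
synthetic_case <- base_case + gap * (neighbor - base_case)
```

A larger k widens the pool of neighbors that can be paired with the base case, which spreads the synthetic cases over a larger region of the minority class.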
Other data_management_utils:
create_synthetic_units_from_matrix(),
get_n_chunks()