Function for creating synthetic cases in order to balance the data for training with TEClassifierRegular or TEClassifierProtoNet. This is an auxiliary function for use with get_synthetic_cases_from_matrix to allow parallel computations.
create_synthetic_units_from_matrix(
matrix_form,
target,
required_cases,
k,
method,
cat,
k_s,
max_k
)
Returns a list which contains the text embeddings of the new synthetic cases as a named data.frame and
their labels as a named factor.
matrix_form: Named matrix containing the text embeddings in matrix form. In most cases this object is taken
from EmbeddedText$embeddings.
target: Named factor containing the labels/categories of the corresponding cases.
required_cases: int Number of cases necessary to fill the gap between the frequency of the class under
investigation and the major class.
k: int The number of nearest neighbors used during the sampling process.
method: vector containing strings of the requested methods for generating new cases. Currently
"smote", "dbsmote", and "adas" from the package smotefamily are available.
cat: string The category for which new cases should be created.
k_s: int Number of values of k used in the complete generation process.
max_k: int The maximum number of nearest neighbors used during the sampling process.
Other data_management_utils:
get_n_chunks(),
get_synthetic_cases_from_matrix()
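
The inputs described above can be prepared with base R alone. The following is a minimal sketch, not taken from the package: the toy embeddings, the labels, and the variable names class_freq and category are assumptions made for illustration. It builds a named matrix and a named factor in the shape expected for matrix_form and target, and derives required_cases as the gap between each class and the major class.

```r
# Hypothetical toy data: 10 cases with 4-dimensional embeddings.
set.seed(42)
matrix_form <- matrix(
  rnorm(10 * 4),
  nrow = 10,
  dimnames = list(paste0("case_", 1:10), paste0("dim_", 1:4))
)

# Named factor of labels; "a" is the major class, "b" the minority class.
target <- factor(c(rep("a", 7), rep("b", 3)))
names(target) <- rownames(matrix_form)

# Frequency of every class as a named integer vector.
class_freq <- tabulate(target, nbins = nlevels(target))
names(class_freq) <- levels(target)

# Gap between each class and the major class.
required_cases <- max(class_freq) - class_freq

# Gap for the category under investigation (the value passed as `cat`).
category <- "b"
required_cases[[category]]
```

In this sketch the minority class "b" has 3 cases and the major class "a" has 7, so 4 synthetic cases would be needed to balance them. The number of nearest neighbors is presumably bounded by the number of available minority cases, so a small class may also limit the values that make sense for k and max_k.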