Object of class R6 which stores the text embeddings
generated by an object of class TextEmbeddingModel via the method
embed().
Returns an object of class EmbeddedText. These objects are used
for storing and managing the text embeddings created with objects of class TextEmbeddingModel.
Objects of class EmbeddedText serve as input for classifiers of class
TextEmbeddingClassifierNeuralNet. The main aim of this class is to provide a structured link between
embedding models and classifiers. Because objects of this class store information about
the text embedding model that created the embeddings, they ensure that only
embeddings generated with the same embedding model are combined. Furthermore, the stored information allows
classifiers to check whether embeddings from the correct text embedding model are used for
training and prediction.
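The idea of tying embeddings to their generating model can be illustrated with a small base R sketch. It is not the package's implementation: `make_embedding()` and `combine_if_same_model()` are hypothetical helpers, and a plain list stands in for an EmbeddedText object.

```r
# Hypothetical stand-in for an EmbeddedText object: a plain list that keeps
# the embeddings together with the name of the model that produced them.
make_embedding <- function(model_name, embeddings) {
  list(model_name = model_name, embeddings = embeddings)
}

# Hypothetical helper: row-bind two embedding sets only if both come from
# the same embedding model; otherwise stop with an error.
combine_if_same_model <- function(a, b) {
  if (!identical(a$model_name, b$model_name)) {
    stop("Embeddings were created by different embedding models.")
  }
  make_embedding(a$model_name, rbind(a$embeddings, b$embeddings))
}

a <- make_embedding("model_A", data.frame(dim_1 = c(0.1, 0.2), dim_2 = c(0.3, 0.4)))
b <- make_embedding("model_A", data.frame(dim_1 = 0.5, dim_2 = 0.6))
c_other <- make_embedding("model_B", data.frame(dim_1 = 0.7, dim_2 = 0.8))

# Same model: the embeddings are combined into one data.frame.
combined <- combine_if_same_model(a, b)

# Different models: the error is caught and its message is stored.
mismatch <- tryCatch(
  combine_if_same_model(a, c_other),
  error = function(e) conditionMessage(e)
)
```

Keeping the model name next to the data is what makes the check possible; a bare data.frame carries no record of where its values came from.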
embeddings('data.frame()')
data.frame containing the text embeddings for all chunks. Documents are
in the rows. Embedding dimensions are in the columns.
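As a rough illustration of this layout, the following base R sketch builds a data.frame with three documents in the rows and four embedding dimensions in the columns. The values are random and the row and column names are assumptions made for the example; real embeddings come from a TextEmbeddingModel.

```r
# Hypothetical embeddings: 3 documents (rows) x 4 embedding dimensions (columns).
set.seed(42)
embeddings <- as.data.frame(matrix(rnorm(3 * 4), nrow = 3, ncol = 4))
rownames(embeddings) <- c("doc_1", "doc_2", "doc_3")
colnames(embeddings) <- paste0("dim_", 1:4)

# Number of documents and number of embedding dimensions.
dim(embeddings)

# Embedding of a single document, selected by its row name.
embeddings["doc_2", ]
```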
new()
Creates a new object representing text embeddings.
EmbeddedText$new(
model_name = NA,
model_label = NA,
model_date = NA,
model_method = NA,
model_version = NA,
model_language = NA,
param_seq_length = NA,
param_chunks = NULL,
param_overlap = NULL,
param_emb_layer_min = NULL,
param_emb_layer_max = NULL,
param_emb_pool_type = NULL,
param_aggregation = NULL,
embeddings
)
model_name
string. Name of the model that generated this embedding.
model_label
string. Label of the model that generated this embedding.
model_date
string. Date when the model that generated this embedding was created.
model_method
string. Method of the underlying embedding model.
model_version
string. Version of the model that generated this embedding.
model_language
string. Language of the model that generated this embedding.
param_seq_length
int. Maximum number of tokens that the generating model processes per chunk.
param_chunks
int. Maximum number of chunks supported by the generating model.
param_overlap
int. Number of tokens that this model added at the beginning of the sequence for the next chunk.
param_emb_layer_min
int or string. Determines the first layer to be included
in the creation of embeddings.
param_emb_layer_max
int or string. Determines the last layer to be included
in the creation of embeddings.
param_emb_pool_type
string. Determines the method for pooling the token embeddings
within each layer.
param_aggregation
string. Aggregation method of the hidden states. Deprecated. Only included
for backward compatibility.
embeddings
data.frame containing the text embeddings.
Returns an object of class EmbeddedText which stores the text embeddings produced by an object of class TextEmbeddingModel. The object serves as input for objects of class TextEmbeddingClassifierNeuralNet.
get_model_info()
Method for retrieving information about the model that generated this embedding.
EmbeddedText$get_model_info()
list containing all saved information about the underlying
text embedding model.
get_model_label()
Method for retrieving the label of the model that generated this embedding.
EmbeddedText$get_model_label()
string. Label of the corresponding text embedding model.
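The constructor and the two accessor methods can be exercised together as in the sketch below. It assumes that the aifeducation package is installed, that it exports EmbeddedText, and that the installed version matches the signature documented on this page. All argument values (model name, label, date, method, and so on) are made up for illustration, and in practice objects of this class are normally produced by the embed() method of a TextEmbeddingModel rather than built by hand.

```r
# Assumes the aifeducation package is installed and exports EmbeddedText
# with the constructor signature documented on this page.
library(aifeducation)

# Hypothetical embeddings: 2 documents (rows) x 2 embedding dimensions (columns).
emb <- data.frame(
  dim_1 = c(0.1, 0.2),
  dim_2 = c(0.3, 0.4),
  row.names = c("doc_1", "doc_2")
)

# All model details below are illustrative values, not a real model.
embedded <- EmbeddedText$new(
  model_name = "demo_model",
  model_label = "Demo model",
  model_date = "2024-01-01",
  model_method = "bert",
  model_version = "0.0.1",
  model_language = "english",
  param_seq_length = 512,
  param_chunks = 4,
  param_overlap = 30,
  embeddings = emb
)

# Label of the model that generated the embeddings.
embedded$get_model_label()

# Structure of the list with all stored model information.
str(embedded$get_model_info())

# The stored embeddings are available through the public field.
embedded$embeddings
```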
clone()
The objects of this class are cloneable with this method.
EmbeddedText$clone(deep = FALSE)
deep
Whether to make a deep clone.
Other Text Embedding:
TextEmbeddingModel,
combine_embeddings()