An implementation of the WordPiece algorithm
tok::tok_model -> tok_model_wordpiece
new()
Constructor for the WordPiece tokenizer.
model_wordpiece$new(
vocab = NULL,
unk_token = NULL,
max_input_chars_per_word = NULL
)
vocab: A dictionary of string keys and their corresponding ids. Default: NULL.
unk_token: The unknown token to be used by the model. Default: NULL.
max_input_chars_per_word: The maximum number of characters to allow in a single word. Default: NULL.
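To illustrate how the three arguments interact, here is a minimal base-R sketch of the greedy longest-match-first lookup usually associated with WordPiece. It is not the implementation used by this package. The helper name wordpiece_sketch, the "##" continuation prefix, and the default values "[UNK]" and 100 are assumptions borrowed from the common BERT convention, not values stated on this page.

```r
# Illustrative only: `wordpiece_sketch` is a hypothetical helper, not part of tok.
# `vocab` is assumed to be a named integer vector (token string -> id).
wordpiece_sketch <- function(word, vocab, unk_token = "[UNK]",
                             max_input_chars_per_word = 100L,
                             prefix = "##") {
  chars <- strsplit(word, "", fixed = TRUE)[[1]]
  n <- length(chars)
  # A word longer than the limit maps straight to the unknown token.
  if (n > max_input_chars_per_word) return(unk_token)
  pieces <- character(0)
  start <- 1L
  while (start <= n) {
    end <- n
    hit <- NULL
    # Try the longest remaining substring first, then shrink it.
    while (end >= start) {
      candidate <- paste(chars[start:end], collapse = "")
      # Pieces that do not start the word carry the continuation prefix.
      if (start > 1L) candidate <- paste0(prefix, candidate)
      if (candidate %in% names(vocab)) {
        hit <- candidate
        break
      }
      end <- end - 1L
    }
    # If no piece matches, the whole word becomes the unknown token.
    if (is.null(hit)) return(unk_token)
    pieces <- c(pieces, hit)
    start <- end + 1L
  }
  pieces
}

vocab <- c("[UNK]" = 0L, "un" = 1L, "##aff" = 2L, "##able" = 3L)
pieces <- wordpiece_sketch("unaffable", vocab)  # split into known pieces
ids <- unname(vocab[pieces])                    # look up each piece's id
too_long <- wordpiece_sketch("unaffable", vocab, max_input_chars_per_word = 3L)
```

In this sketch, vocab supplies the set of allowed pieces and their ids, unk_token is returned when a word cannot be covered by those pieces, and max_input_chars_per_word short-circuits the search for overly long words.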
clone()
The objects of this class are cloneable with this method.
model_wordpiece$clone(deep = FALSE)
deep: Whether to make a deep clone.
Other model: model_bpe, model_unigram, tok_model