
tok (version 0.2.0)

model_wordpiece: An implementation of the WordPiece algorithm

Description

An implementation of the WordPiece algorithm



Super class

tok::tok_model -> tok_model_wordpiece

Methods


Method new()

Constructor for the WordPiece model.

Usage

model_wordpiece$new(
  vocab = NULL,
  unk_token = NULL,
  max_input_chars_per_word = NULL
)

Arguments

vocab

A dictionary mapping token strings to their corresponding ids. Default: NULL.

unk_token

The unknown token to be used by the model. Default: NULL.

max_input_chars_per_word

The maximum number of characters allowed in a single word; longer words are mapped to the unknown token. Default: NULL.
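
To illustrate how the three arguments interact, here is a minimal base R sketch of the greedy longest-match-first lookup commonly used for WordPiece. It is not the tok implementation. The helper name wordpiece_word, the "##" continuation prefix, the toy vocabulary, and the limit of 100 characters are all assumptions made for this example.

```r
# Sketch only: greedy longest-match-first WordPiece lookup for one word.
# `vocab` is a named integer vector here (an assumption for this sketch);
# `prefix` marks pieces that continue a word (assumed to be "##").
wordpiece_word <- function(word, vocab, unk_token = "[UNK]",
                           max_input_chars_per_word = 100L,
                           prefix = "##") {
  n <- nchar(word)
  # Overlong words map straight to the unknown token.
  if (n > max_input_chars_per_word) return(unk_token)
  pieces <- character(0)
  start <- 1L
  while (start <= n) {
    end <- n
    hit <- NULL
    # Try the longest remaining substring first, then shrink from the right.
    while (end >= start) {
      candidate <- substr(word, start, end)
      if (start > 1L) candidate <- paste0(prefix, candidate)
      if (candidate %in% names(vocab)) {
        hit <- candidate
        break
      }
      end <- end - 1L
    }
    # If nothing matches at this position, the whole word is unknown.
    if (is.null(hit)) return(unk_token)
    pieces <- c(pieces, hit)
    start <- end + 1L
  }
  pieces
}

# Hypothetical toy vocabulary: token strings as names, ids as values.
vocab <- c("[UNK]" = 0L, "un" = 1L, "##aff" = 2L, "##able" = 3L,
           "play" = 4L, "##ing" = 5L)

wordpiece_word("unaffable", vocab)  # "un" "##aff" "##able"
wordpiece_word("xyz", vocab)        # "[UNK]"
wordpiece_word("playing", vocab, max_input_chars_per_word = 3L)  # "[UNK]"
```

In this sketch, a word with no matching piece at some position collapses to the unknown token as a whole, and the pieces can be turned into ids with vocab[pieces].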


Method clone()

The objects of this class are cloneable with this method.

Usage

model_wordpiece$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
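
The deep argument follows the general R6 cloning rules. The sketch below uses two hypothetical toy classes, Inner and Outer, rather than model_wordpiece itself, to show the difference between a shallow and a deep clone. Whether a deep clone of a tok model also copies its underlying tokenizer state is not stated on this page.

```r
library(R6)

# Hypothetical toy classes, used only to illustrate R6 clone semantics.
Inner <- R6Class("Inner", public = list(value = 1))
Outer <- R6Class("Outer", public = list(
  inner = NULL,
  initialize = function() {
    self$inner <- Inner$new()
  }
))

a <- Outer$new()
shallow <- a$clone()           # fields holding R6 objects are shared
deep <- a$clone(deep = TRUE)   # fields holding R6 objects are cloned too

a$inner$value <- 2
shallow$inner$value  # 2, because the inner object is shared with `a`
deep$inner$value     # 1, because the inner object was copied
```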

See Also

Other model: model_bpe, model_unigram, tok_model