There's a newer version (0.2.1) of this package.
tok
tok provides bindings to the [
Available versions: 0.2.1, 0.2.0, 0.1.4, 0.1.3, 0.1.2, 0.1.1, 0.1.0
Install
install.packages('tok')
Monthly Downloads: 21,328
Version: 0.2.0
License: MIT + file LICENSE
Issues: 0
Pull Requests: 1
Stars: 46
Forks: 2
Repository: https://github.com/mlverse/tok
Maintainer: Daniel Falbel
Last Published: September 30th, 2025
Functions in tok (0.2.0)
tok_trainer
Generic training class
tok_processor
Generic class for processors
trainer_bpe
BPE trainer
tokenizer
Tokenizer
tok_decoder
Generic class for decoders
processor_byte_level
Byte Level post processor
trainer_wordpiece
WordPiece tokenizer trainer
tok_normalizer
Generic class for normalizers
tok_model
Generic class for tokenization models
trainer_unigram
Unigram tokenizer trainer
normalizer_nfc
NFC normalizer
pre_tokenizer_byte_level
Byte level pre tokenizer
model_bpe
BPE model
normalizer_nfkc
NFKC normalizer
model_wordpiece
An implementation of the WordPiece algorithm
model_unigram
An implementation of the Unigram algorithm
pre_tokenizer_whitespace
This pre-tokenizer simply splits using the following regex: \w+|[^\w\s]+
pre_tokenizer
Generic class for tokenizers
encoding
Encoding
decoder_byte_level
Byte level decoder
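The regex listed for pre_tokenizer_whitespace can be tried out directly in base R to see what kind of split it produces. The sketch below does not call tok at all: it applies the same pattern with gregexpr() and regmatches(), and the helper name split_like_whitespace is hypothetical. The library's own regex engine may treat non-ASCII text differently, so this should be read as an approximation for plain ASCII input only.

```r
# Illustrative only: applies the documented pattern with base R, not with tok.
# `split_like_whitespace` is a hypothetical helper, not part of the package.
split_like_whitespace <- function(text) {
  # Runs of word characters, or runs of characters that are
  # neither word characters nor whitespace (i.e. punctuation).
  pattern <- "\\w+|[^\\w\\s]+"
  # perl = TRUE so that \w and \s are accepted inside the bracket expression.
  regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
}

pieces <- split_like_whitespace("Hello, world! It's 2025.")
print(pieces)
```

Whitespace is discarded, while each run of punctuation becomes its own piece, so "It's" is split into three pieces rather than kept as one word.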