tok

tok provides bindings to the Hugging Face tokenizers library.

Install

install.packages('tok')
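
Once the package is installed, a first round trip might look like the sketch below. It assumes network access, because `from_pretrained()` downloads the tokenizer definition, and it uses the "gpt2" identifier purely as an illustration; the sample sentence is made up.

```r
library(tok)

# Download a pretrained tokenizer definition (assumes network access;
# "gpt2" is an illustrative identifier, not a requirement of the package).
tk <- tokenizer$from_pretrained("gpt2")

# Encode a sentence into integer token ids.
ids <- tk$encode("Hello world! You can use tokenizers from R")$ids
print(ids)

# Decode the ids back into text.
roundtrip <- tk$decode(ids)
print(roundtrip)
```

Printing `ids` next to `roundtrip` is a quick way to confirm that encoding and decoding agree before building anything on top of the tokenizer.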

Monthly Downloads

21,328

Version

0.2.1

License

MIT + file LICENSE

Maintainer

Daniel Falbel

Last Published

September 30th, 2025

Functions in tok (0.2.1)

trainer_wordpiece

WordPiece tokenizer trainer
trainer_bpe

BPE trainer
tok_model

Generic class for tokenization models
tok_trainer

Generic training class
tok_processor

Generic class for processors
tok_decoder

Generic class for decoders
tok_normalizer

Generic class for normalizers
trainer_unigram

Unigram tokenizer trainer
processor_byte_level

Byte-level post-processor
tokenizer

Tokenizer
model_bpe

BPE model
pre_tokenizer_whitespace

This pre-tokenizer simply splits using the following regex: \w+|[^\w\s]+
pre_tokenizer

Generic class for pre-tokenizers
pre_tokenizer_byte_level

Byte-level pre-tokenizer
normalizer_nfc

NFC normalizer
decoder_byte_level

Byte-level decoder
model_wordpiece

An implementation of the WordPiece algorithm
model_unigram

An implementation of the Unigram algorithm
normalizer_nfkc

NFKC normalizer
encoding

Encoding
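
To make the regex listed under pre_tokenizer_whitespace concrete, the sketch below first applies `\w+|[^\w\s]+` with base R alone, then shows one way the listed components might be combined to train a small BPE tokenizer. The sample texts are invented, and the tok calls (`tokenizer$new()` taking a model, assigning to `pre_tokenizer`, `train_from_memory()`) reflect an assumed reading of the package interface rather than a confirmed recipe; that part is skipped when tok is not installed.

```r
# The whitespace pre-tokenizer regex as an R string (backslashes doubled).
pattern <- "\\w+|[^\\w\\s]+"

# Base R only: extract every match of the pattern from a sample sentence.
text <- "Hello, world! It's 2025."
pieces <- regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
print(pieces)
# [1] "Hello" ","     "world" "!"     "It"    "'"     "s"     "2025"  "."

# Optional: compose the listed tok components (assumed call shapes).
if (requireNamespace("tok", quietly = TRUE)) {
  library(tok)

  # Start from an empty BPE model and split input with the regex above.
  tk <- tokenizer$new(model_bpe$new())
  tk$pre_tokenizer <- pre_tokenizer_whitespace$new()

  # Train on a tiny in-memory corpus with default trainer settings.
  corpus <- c("hello world", "hello there, world!")
  tk$train_from_memory(corpus, trainer_bpe$new())

  # Encode a sentence with the freshly trained tokenizer.
  enc <- tk$encode("hello world")
  print(enc$ids)
}
```

The base R part shows that runs of word characters and runs of punctuation become separate pieces, while whitespace is dropped. A model such as BPE then operates within each piece.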