whisper (version 0.1.0)

whisper_tokenizer: Whisper BPE Tokenizer

Description

Byte-pair encoding (BPE) tokenizer for Whisper models. Loads or creates a Whisper tokenizer from HuggingFace vocab files.

Usage

whisper_tokenizer(model = "tiny")

Value

A tokenizer object: a list containing encode and decode functions.

Arguments

model

Model name used for vocab lookup. Defaults to "tiny".

Examples

# \donttest{
# Load tokenizer (requires prior model download)
if (model_exists("tiny")) {
  tok <- whisper_tokenizer("tiny")
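  # Encode text to token IDs
  tok$encode("Hello world")
  # Decode token IDs back to text. In the multilingual Whisper vocabulary
  # these IDs are the special tokens <|startoftranscript|>, <|en|>,
  # <|transcribe|> and <|notimestamps|>.
  tok$decode(c(50258, 50259, 50359, 50363))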
}
# }
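
For readers unfamiliar with the byte-pair encoding mentioned in the Description, the following is a minimal base R sketch of the merge step, not the package's implementation. The function bpe_merge and the merge table are hypothetical and exist only for this illustration. The sketch works on characters, whereas Whisper's tokenizer applies merges to byte-level symbols using the merge list shipped with the model's vocab files.

```r
# Toy byte-pair encoding merge loop (illustrative only, not the whisper package code).
# `merges` is a hypothetical hand-written merge table: each entry is "left right",
# and earlier entries have higher priority. Symbols are assumed to contain no spaces.
bpe_merge <- function(symbols, merges) {
  repeat {
    if (length(symbols) < 2) break
    # Build every adjacent pair as a "left right" key
    pairs <- paste(symbols[-length(symbols)], symbols[-1])
    # Look up each pair's priority; NA means the pair is not mergeable
    ranks <- match(pairs, merges)
    if (all(is.na(ranks))) break
    # Position of the highest-priority pair (which.min skips NA values)
    i <- which.min(ranks)
    # Replace symbols i and i + 1 with their concatenation
    symbols <- c(head(symbols, i - 1),
                 paste0(symbols[i], symbols[i + 1]),
                 tail(symbols, -(i + 1)))
  }
  symbols
}

# Hypothetical merge table for the demo
merges <- c("l l", "h e", "he ll")
pieces <- bpe_merge(strsplit("hello", "")[[1]], merges)
print(pieces)
```

A real tokenizer would then map each resulting piece to an integer ID through its vocabulary, which is the lookup that encode and decode perform in opposite directions.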
