Byte Level post processor
Super class: tok::tok_processor -> tok_processor_byte_level
new()
Initializes the byte level post processor.
processor_byte_level$new(trim_offsets = TRUE)
trim_offsets: Whether to trim whitespace from the produced offsets.

clone()
The objects of this class are cloneable with this method.
processor_byte_level$clone(deep = FALSE)
deep: Whether to make a deep clone.
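As a minimal sketch of the two methods, the snippet below constructs the processor with both values of `trim_offsets` and makes a copy with `clone()`. It assumes the tok package is installed and that `processor_byte_level` is reachable as `tok::processor_byte_level`; the variable names are illustrative only.

```r
# Sketch only: runs the constructor calls if the tok package is available.
if (requireNamespace("tok", quietly = TRUE)) {
  # Default: whitespace is trimmed from the produced offsets.
  trimmed <- tok::processor_byte_level$new(trim_offsets = TRUE)

  # Keep whitespace in the produced offsets instead.
  untrimmed <- tok::processor_byte_level$new(trim_offsets = FALSE)

  # Shallow copy of an existing processor (deep = FALSE is the default).
  copy <- trimmed$clone()
}
```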
This post-processor trims the offsets. By default, ByteLevel BPE may include whitespace in the produced tokens. If you don't want the offsets to cover that whitespace, use this post-processor.
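To make the effect of trimming concrete, here is a small base R illustration that does not use the tok package at all. The helper `trim_offsets_sketch()` is hypothetical, and it assumes 0-based, end-exclusive offsets; it only mimics the idea of shrinking an offset pair until it no longer covers surrounding whitespace.

```r
# Hypothetical helper, not part of tok: shrink a (start, end) offset pair so
# that it no longer covers leading or trailing whitespace.
# Assumption: offsets are 0-based and end-exclusive.
trim_offsets_sketch <- function(text, start, end) {
  chars <- strsplit(text, "", fixed = TRUE)[[1]]
  # i is a 0-based position, so the R index is i + 1.
  is_space <- function(i) grepl("^\\s$", chars[i + 1])
  while (start < end && is_space(start)) start <- start + 1
  while (end > start && is_space(end - 1)) end <- end - 1
  c(start = start, end = end)
}

text <- "hello world"
# A byte-level token that carries its leading space covers " world",
# i.e. the untrimmed offsets (5, 11).
trimmed <- trim_offsets_sketch(text, 5, 11)

# Convert back to R's 1-based, inclusive indexing to inspect the span.
substr(text, trimmed[["start"]] + 1, trimmed[["end"]])
```

With untrimmed offsets the span would include the space before "world"; after trimming, the span starts at the first non-whitespace character.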
Other processors:
tok_processor