tokenizers (version 0.1.0)
Description
Convert natural language text into tokens. The tokenizers have a
consistent interface, and because they are built on the 'stringi'
package, they handle Unicode correctly. Includes tokenizers for
shingled n-grams, skip n-grams, words, word stems, sentences,
paragraphs, characters, lines, and regular expressions.