Note: a newer version (0.3.0) of this package is available.

tokenizers (version 0.1.0)

Tokenize Text

Description

Convert natural language text into tokens. The tokenizers have a consistent interface and are compatible with Unicode, thanks to being built on the 'stringi' package. Includes tokenizers for shingled n-grams, skip n-grams, words, word stems, sentences, paragraphs, characters, lines, and regular expressions.
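A minimal usage sketch of the tokenizers described above (assuming the exported names tokenize_words(), tokenize_sentences(), and tokenize_ngrams(), and their default arguments in this version):

```r
library(tokenizers)

text <- "The quick brown fox jumps over the lazy dog. It barked."

# Each tokenizer takes a character vector of documents and returns a
# list with one element per input document.
words <- tokenize_words(text)       # lowercased, punctuation stripped
sents <- tokenize_sentences(text)   # split on sentence boundaries
bigrams <- tokenize_ngrams(text, n = 2)  # shingled bigrams
```

Because every tokenizer shares this vector-in, list-out interface, they can be swapped interchangeably in a text-processing pipeline.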

Install

install.packages('tokenizers')

Monthly Downloads

34,594

Version

0.1.0

License

MIT + file LICENSE

Last Published

April 2nd, 2016

Functions in tokenizers (0.1.0)