tokenize

A character vector of texts to be tokenized.

A character string giving the language of <code>s</code>.
    This argument is used only to select a default model when
    <code>model</code> is <code>NULL</code>.
    At the moment, languages <samp>en</samp> (English), <samp>es</samp> (Spanish),
    <

language

model

file

An interface to openNLP (http://opennlp.sourceforge.net/),
        a collection of natural language processing tools including a
        sentence detector, tokenizer, part-of-speech tagger, shallow and
        full syntactic parsers, and named-entity detector. openNLP uses
        the Maxent Java package for training and applying maximum
        entropy models.

tokenize: Tokenizer

Description

Usage

Arguments

Value

Details

References

Examples