An array of text strings to use as documents. The type of the array must be character.
stoplist.file
The name of a file containing stopwords (words to ignore), one per line. If the file is not in the current working directory, you may need to include a full path.
preserve.case
A logical value. By default, the input text is converted to all lowercase; set this to TRUE to keep the original case.
token.regexp
A quoted string representing a regular expression that defines a token. The default is one or more Unicode letters: "[\\p{L}]+". Note that each backslash in the expression must be doubled, because the expression is written as an R string.
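The effect of token.regexp, lowercasing, and a stoplist can be approximated in base R without loading the package. The sketch below is illustrative only: it uses R's PCRE engine (perl = TRUE) in place of the Java regular expression engine used by MALLET, and the names docs, tokenize, and stoplist.path, as well as the alternative pattern "[\\p{L}']+", are hypothetical choices made for this example.

```r
# Two small example documents (hypothetical data).
docs <- c("The quick brown fox.", "It's 2 o'clock!")

# Mimic the default behaviour: convert the text to lowercase first.
lowered <- tolower(docs)

# Helper (hypothetical): return every match of `pattern` in each string.
tokenize <- function(text, pattern) {
  regmatches(text, gregexpr(pattern, text, perl = TRUE))
}

# Default pattern: runs of one or more Unicode letters.
# Digits and punctuation, including apostrophes, split or drop tokens.
default.tokens <- tokenize(lowered, "[\\p{L}]+")

# Assumed alternative: also allow apostrophes inside a token.
apostrophe.tokens <- tokenize(lowered, "[\\p{L}']+")

# A stoplist file holds one stopword per line.
stoplist.path <- tempfile(fileext = ".txt")
writeLines(c("the", "it"), stoplist.path)
stopwords <- readLines(stoplist.path)

# Drop stopwords from the default tokens.
kept.tokens <- lapply(default.tokens, function(tok) tok[!tok %in% stopwords])

print(default.tokens)
print(apostrophe.tokens)
print(kept.tokens)
```

Under the default pattern, "it's" is split into "it" and "s", which is why a pattern that admits internal punctuation is sometimes preferred for English text. Edge cases may differ between PCRE and Java, so a pattern should be checked against real documents before it is relied on.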
See Also
mallet.word.freqs returns term and document frequencies, which may be useful in selecting stopwords.