mallet.import

id.array

An array of document IDs.

text.array

An array of text strings to use as documents. The type of the array must be <code>character</code>.

stoplist.file

The name of a file containing stopwords (words to ignore), one per line. If the file is not in the current working directory, you may need to include a full path.

preserve.case

By default, the input text is converted to all lowercase. Set this to TRUE to keep the original case.

token.regexp

A quoted string representing a regular expression that defines a token. The default is one or more Unicode letters: "[\\p{L}]+". Note that backslashes in the pattern must be doubled.
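The effect of lowercasing, the default token pattern, and a one-word-per-line stoplist can be approximated in base R. This is only a sketch: Mallet does its own tokenization in Java, so edge cases may differ, and the sample text, the stopword choices, and the variable names are made up for illustration.

```r
# Illustrative input; not taken from any real corpus.
text <- "Topic models, version 2: LDA & friends!"

# The R string "[\\p{L}]+" holds the regular expression [\p{L}]+
# (each doubled backslash is a single backslash in the pattern).
pattern <- "[\\p{L}]+"

# Default behaviour described above: convert to lowercase first.
lowered <- tolower(text)

# Extract every run of letters; perl = TRUE enables \p{L} in base R.
tokens <- regmatches(lowered, gregexpr(pattern, lowered, perl = TRUE))[[1]]
print(tokens)
# [1] "topic"   "models"  "version" "lda"     "friends"

# A stoplist file is plain text with one word per line.
stoplist.path <- tempfile(fileext = ".txt")
writeLines(c("the", "and", "version"), stoplist.path)

# Drop the tokens that appear in the stoplist.
kept <- tokens[!tokens %in% readLines(stoplist.path)]
print(kept)
# [1] "topic"   "models"  "lda"     "friends"
```

Digits and punctuation are dropped because they are not matched by the letter class, which is why "2", ":" and "&" do not appear among the tokens.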


This function takes an array of document IDs and an array of document texts (both as character strings) and converts them into a Mallet instance list.


This package allows you to train topic models in Mallet and load the results directly into R.

mallet.import: Import text documents into Mallet format

Description

Usage

Arguments

See Also

Examples
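A minimal sketch of a call, assuming the mallet package and a working Java installation are available. The three-document corpus and the stoplist contents are hypothetical. Arguments are passed positionally, in the order they are documented on this page, on the assumption that the function accepts them in that order.

```r
library(mallet)

# Hypothetical toy corpus: one ID and one text string per document.
docs <- data.frame(
  id = c("doc1", "doc2", "doc3"),
  text = c(
    "The cat sat on the mat.",
    "Dogs and cats are common pets.",
    "Topic models find themes in text."
  ),
  stringsAsFactors = FALSE  # text.array must be of type character
)

# Write a small stoplist file, one stopword per line.
stoplist.path <- tempfile(fileext = ".txt")
writeLines(c("the", "on", "and", "are", "in"), stoplist.path)

# Convert the documents into a Mallet instance list.
instances <- mallet.import(
  docs$id,        # id.array
  docs$text,      # text.array
  stoplist.path,  # stoplist.file
  FALSE,          # preserve.case: FALSE keeps the default lowercasing
  "[\\p{L}]+"     # token.regexp: the documented default
)
```

Using a temporary file for the stoplist sidesteps the working-directory issue noted for stoplist.file, since tempfile() returns a full path.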