TermDocMatrix: Term-document matrix
Description
Constructs a term-document matrix.
Usage
## S3 method for class 'TextDocCol':
TermDocMatrix(object, weighting = "tf", stemming = FALSE,
              language = "english", minWordLength = 3, minDocFreq = 1,
              stopwords = NULL)
Arguments
object
a text document collection
weighting
the weighting mode for the term-document matrix. Possible settings are:
tf
Term frequency
tf-idf
Term frequency inverse document frequency
bin
Binary frequency
logical
stemming
if TRUE, words are stemmed before the term-document matrix is built.
language
the language that determines which stemming rules are applied
minWordLength
words shorter than this number of characters are discarded from
the term-document matrix.
minDocFreq
words that appear less often in documents than this
number are discarded for the term-document matrix.
stopwords
a plain text file with all stopwords
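To make the weighting and minWordLength arguments more concrete, the following base R sketch builds a small term-document matrix by hand. It does not call TermDocMatrix. The toy corpus, the variable names, and in particular the tf-idf formula (term frequency times log2 of the number of documents divided by the number of documents containing the term) are assumptions chosen for illustration and may differ from what the package actually computes.

```r
# Hypothetical toy corpus; not taken from the package.
docs <- c(d1 = "the cat sat on the mat",
          d2 = "the dog sat",
          d3 = "a cat and a dog")

# Split on spaces and drop words shorter than min_word_length characters,
# mimicking the documented effect of minWordLength.
min_word_length <- 3
tokens <- lapply(strsplit(docs, " ", fixed = TRUE),
                 function(w) w[nchar(w) >= min_word_length])

# Raw term frequencies ("tf"): rows are terms, columns are documents.
terms <- sort(unique(unlist(tokens)))
tf <- vapply(tokens,
             function(w) as.integer(table(factor(w, levels = terms))),
             integer(length(terms)))
rownames(tf) <- terms

# Binary weighting ("bin"): 1 if the term occurs in the document, else 0.
bin <- (tf > 0) * 1

# One common tf-idf variant (an assumption, see the note above).
doc_freq <- rowSums(tf > 0)               # number of documents containing each term
tfidf <- tf * log2(ncol(tf) / doc_freq)   # each row is scaled by its own idf
```

In this sketch the two-letter word "on" and the one-letter word "a" never reach the matrix, and a term that occurs in every document would receive a tf-idf weight of zero under the assumed formula.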
Value
An S4 object of class TermDocMatrix which extends the class matrix
containing a term-document matrix. The following slots contain useful
information:
Weighting
The weighting mode applied to the term-document matrix
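As a rough illustration of what "an S4 object which extends the class matrix" with a Weighting slot means in practice, the sketch below defines a stand-in class. The class name TermDocMatrixSketch and the toy counts are hypothetical; the real class definition in the package may contain further slots and validity checks.

```r
library(methods)

# Hypothetical stand-in class; the real TermDocMatrix class is defined by the package.
setClass("TermDocMatrixSketch",
         representation(Weighting = "character"),
         contains = "matrix")

# Toy counts: rows are terms, columns are documents.
counts <- matrix(c(1L, 0L, 2L, 1L), nrow = 2,
                 dimnames = list(c("cat", "dog"), c("d1", "d2")))

# The unnamed argument supplies the matrix data; the slot is set by name.
tdm <- new("TermDocMatrixSketch", counts, Weighting = "tf")

tdm@Weighting       # slots are read with the @ operator
is(tdm, "matrix")   # the object is also a matrix
dim(tdm)            # so ordinary matrix functions apply to it
```

Because the class extends matrix, an object like this can be passed to functions that expect a matrix, while the Weighting slot records how the entries were computed.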