tm (version 0.1-1)

TermDocMatrix: Term-document matrix

Description

Constructs a term-document matrix.

Usage

## S3 method for class 'TextDocCol':
TermDocMatrix(object, weighting = "tf", stemming = FALSE,
              language = "english", minWordLength = 3, minDocFreq = 1,
              stopwords = NULL)

Arguments

object
a text document collection
weighting
the weighting mode for the term-document matrix. Possible settings are
  • tf: Term frequency
  • tf-idf: Term frequency-inverse document frequency
  • bin: Binary frequency
  • logical
stemming
if TRUE, stems words before building the term-document matrix.
language
the language that determines the stemming rules.
minWordLength
words shorter than this number of characters are discarded from the term-document matrix.
minDocFreq
words that appear in fewer documents than this number are discarded from the term-document matrix.
stopwords
a plain text file with all stopwords

Value

An S4 object of class TermDocMatrix, which extends the class matrix and contains a term-document matrix. The following slots contain useful information:
  • Weighting: The weighting mode applied to the term-document matrix
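
Examples

The weighting and filtering arguments described above can be illustrated with a minimal base R sketch. It does not call tm; the toy corpus and the variable names (docs, min_word_length, min_doc_freq) are assumptions made for illustration. The tf-idf formula shown (term frequency times log2 of the inverse document frequency) is one common variant and may differ from what this version of tm computes.

```r
# Toy corpus: three tiny documents (illustrative data, not from tm).
docs <- c(d1 = "cats chase mice",
          d2 = "dogs chase cats and cats",
          d3 = "mice eat cheese")

min_word_length <- 3  # plays the role of minWordLength
min_doc_freq    <- 1  # plays the role of minDocFreq

# Tokenise on whitespace and drop words shorter than min_word_length.
tokens <- lapply(strsplit(tolower(docs), "[[:space:]]+"),
                 function(w) w[nchar(w) >= min_word_length])

# Raw term frequencies ("tf"): one row per document, one column per term.
terms <- sort(unique(unlist(tokens)))
tf <- t(vapply(tokens,
               function(w) tabulate(match(w, terms), nbins = length(terms)),
               integer(length(terms))))
colnames(tf) <- terms

# Drop terms that occur in fewer than min_doc_freq documents.
doc_freq <- colSums(tf > 0)
keep     <- doc_freq >= min_doc_freq
tf       <- tf[, keep, drop = FALSE]
doc_freq <- doc_freq[keep]

# Binary weighting ("bin"): 1 if the term occurs in the document, else 0.
bin <- (tf > 0) * 1L

# tf-idf weighting (assumed variant): scale each term column by
# log2(number of documents / document frequency).
idf   <- log2(nrow(tf) / doc_freq)
tfidf <- sweep(tf, 2, idf, `*`)
```

Inspecting tf, bin, and tfidf side by side shows how the same counts change under each weighting mode: a term that appears in every document gets a tf-idf weight of zero under this variant, while bin ignores how often a term repeats within a document.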