tm (version 0.7-8)

findMostFreqTerms: Find Most Frequent Terms

Description

Find most frequent terms in a document-term or term-document matrix, or a vector of term frequencies.

Usage

findMostFreqTerms(x, n = 6L, ...)
# S3 method for DocumentTermMatrix
findMostFreqTerms(x, n = 6L, INDEX = NULL, ...)
# S3 method for TermDocumentMatrix
findMostFreqTerms(x, n = 6L, INDEX = NULL, ...)

Value

For the document-term or term-document matrix methods, a list with one element per document (or group), each a named vector giving the frequencies of up to n of its most frequent terms. Otherwise, a single such vector of most frequent terms.

Arguments

x

A DocumentTermMatrix or TermDocumentMatrix, or a vector of term frequencies as obtained by termFreq().

n

A single integer giving the maximal number of terms.

INDEX

An object specifying a grouping of documents for rollup, or NULL (the default), in which case each document is considered individually.

...

Arguments to be passed to or from methods.
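
As a rough illustration of what grouping documents via INDEX amounts to, the sketch below uses base R only. It assumes that rollup sums term counts within each group; the toy matrix, the group vector, and the use of rowsum() as a stand-in are all illustrative assumptions, not tm's actual implementation.

```r
# Toy document-term counts (made-up data): rows are documents, columns are terms.
m <- matrix(c(2L, 0L, 1L,
              0L, 3L, 0L,
              1L, 1L, 0L,
              0L, 0L, 4L),
            nrow = 4L, byrow = TRUE,
            dimnames = list(paste0("doc", 1:4), c("oil", "price", "opec")))

# Hypothetical grouping: documents 1-2 form group 1, documents 3-4 form group 2.
group <- rep(1:2, each = 2L)

# Sum the counts within each group (assumed equivalent of the rollup step).
grouped <- rowsum(m, group)

# For each group, keep positive counts only, sort them, and take at most n.
n <- 2L
top_by_group <- lapply(
  setNames(rownames(grouped), rownames(grouped)),
  function(g) {
    f <- grouped[g, ]
    head(sort(f[f > 0], decreasing = TRUE), n)
  }
)
top_by_group
```

The result is a list with one named frequency vector per group, which mirrors the shape described under Value. How ties are ordered in this sketch may differ from tm.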

Details

Only terms with positive frequencies are included in the results.
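
The effect of this rule can be sketched in base R. The helper top_terms() below is hypothetical and not part of tm; it only mimics the documented behaviour on a made-up frequency vector, and its handling of ties may differ from the package.

```r
# Hypothetical helper (not part of tm): drop non-positive counts,
# sort in decreasing order, and return at most n terms.
top_terms <- function(freq, n = 6L) {
  freq <- freq[freq > 0]
  head(sort(freq, decreasing = TRUE), n)
}

# Made-up term frequencies; "opec" has a zero count.
tf <- c(oil = 5L, price = 3L, opec = 0L, barrel = 1L)

# Even with n larger than the number of terms, "opec" is never returned.
top_terms(tf, n = 10L)
```

Because zero counts are removed before truncation, the result can contain fewer than n terms.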

Examples

library(tm)
data("crude")

## Term frequencies:
tf <- termFreq(crude[[14L]])
findMostFreqTerms(tf)

## Document-term matrices:
dtm <- DocumentTermMatrix(crude)
## Most frequent terms for each document:
findMostFreqTerms(dtm)
## Most frequent terms for the first 10 and the second 10 documents,
## respectively:
findMostFreqTerms(dtm, INDEX = rep(1 : 2, each = 10L))
