tm (version 0.7-7)

findMostFreqTerms: Find Most Frequent Terms

Description

Find most frequent terms in a document-term or term-document matrix, or a vector of term frequencies.

Usage

findMostFreqTerms(x, n = 6L, ...)
# S3 method for DocumentTermMatrix
findMostFreqTerms(x, n = 6L, INDEX = NULL, ...)
# S3 method for TermDocumentMatrix
findMostFreqTerms(x, n = 6L, INDEX = NULL, ...)

Arguments

x

A DocumentTermMatrix or TermDocumentMatrix, or a vector of term frequencies as obtained by termFreq().

n

A single integer giving the maximum number of terms to return.

INDEX

An object specifying a grouping of documents for rollup, or NULL (default), in which case each document is considered individually.

...

Arguments to be passed to or from methods.

Value

For the document-term or term-document matrix methods, a list with one element per document (or per group, if INDEX is given), each a named vector of frequencies for up to n of the most frequent terms. Otherwise, a single such named vector of most frequent terms.
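
As a rough illustration of the list shape when documents are grouped, the sketch below builds a three-document toy corpus and rolls the first two documents into one group. The texts and the group labels "a" and "b" are made up for illustration, and the tm package is assumed to be installed.

```r
library("tm")

## Toy corpus; the texts and group labels are illustrative assumptions.
docs <- c("apple banana", "apple cherry", "cherry cherry")
dtm <- DocumentTermMatrix(VCorpus(VectorSource(docs)))

## Documents 1 and 2 form group "a", document 3 forms group "b".
top <- findMostFreqTerms(dtm, n = 1L, INDEX = c("a", "a", "b"))

## Inspect the result: one list element per group,
## each a named vector of term frequencies.
str(top)
```

With n = 1L, each group should contribute its single most frequent term, counted over all documents in that group.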

Details

Only terms with positive frequencies are included in the results.
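
One consequence is that a document containing fewer than n distinct terms yields a vector shorter than n. A minimal sketch, using a made-up two-document corpus and assuming the tm package is installed:

```r
library("tm")

## Toy corpus; the texts are illustrative assumptions.
## The second document contains only one distinct term.
docs <- c("apple apple banana", "cherry")
dtm <- DocumentTermMatrix(VCorpus(VectorSource(docs)))

## Ask for up to two terms per document.
top <- findMostFreqTerms(dtm, n = 2L)

## Number of terms actually returned for each document.
lengths(top)
```

Here the second document should return a single term rather than being padded with zero-frequency entries.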

Examples

# NOT RUN {
library("tm")
data("crude")

## Term frequencies:
tf <- termFreq(crude[[14L]])
findMostFreqTerms(tf)

## Document-term matrices:
dtm <- DocumentTermMatrix(crude)
## Most frequent terms for each document:
findMostFreqTerms(dtm)
## Most frequent terms for the first 10 and the second 10 documents,
## respectively:
findMostFreqTerms(dtm, INDEX = rep(1 : 2, each = 10L))
# }
