TermDocFreq: Get term frequencies and document frequencies from a document term matrix.
Description
This function takes a document term matrix as input and
returns a data frame with columns for term frequency, document frequency,
and inverse document frequency.
Usage
TermDocFreq(dtm)
Value
Returns a data.frame or tibble with 4 columns.
The first column, term, is a vector of token labels.
The second column, term_freq, is the number of times term
appears in the entire corpus. The third column, doc_freq, is the
number of documents in which term appears.
The fourth column, idf, is the log-weighted
inverse document frequency of term.
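To make the four columns concrete, the following is a minimal base R sketch of how such a table could be derived from a small dense matrix. It is not the package's implementation: the helper name sketch_term_doc_freq and the toy matrix toy_dtm are hypothetical, and the idf formula used here (natural log of the number of documents divided by doc_freq) is an assumption about what "log-weighted" means.

```r
# Toy document term matrix: 3 documents (rows) by 3 terms (columns).
# The counts are made up purely for illustration.
toy_dtm <- matrix(
  c(2, 0, 1,
    0, 1, 1,
    3, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"),
                  c("apple", "banana", "cherry"))
)

# Hypothetical helper, NOT the package's code: it only mirrors the
# column layout described in the Value section.
sketch_term_doc_freq <- function(dtm) {
  # Number of documents (rows) with a non-zero count for each term
  doc_freq <- colSums(dtm > 0)
  data.frame(
    term      = colnames(dtm),
    term_freq = colSums(dtm),                 # total count across the corpus
    doc_freq  = doc_freq,
    idf       = log(nrow(dtm) / doc_freq),    # assumed: ln(N / doc_freq)
    stringsAsFactors = FALSE
  )
}

toy_freq <- sketch_term_doc_freq(toy_dtm)
print(toy_freq)
```

Under this assumed formula, a term that appears in every document (cherry in the toy data) gets an idf of 0, while rarer terms get larger values.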
Examples
# TermDocFreq and the sample data ship with the textmineR package
library(textmineR)
# Load a pre-formatted dtm and topic model
data(nih_sample_dtm)
data(nih_sample_topic_model)
# Get the term frequencies
term_freq_mat <- TermDocFreq(nih_sample_dtm)
str(term_freq_mat)