This function creates an inverse-document-frequency (IDF)
scaling matrix from a document-term matrix. The IDF is defined as follows:
idf = log(# documents in the corpus / (# documents where the term
appears + 1))
get_idf(dtm, log_scale = log, smooth_idf = TRUE)
dtm: a document-term matrix of class dgCMatrix or dgTMatrix.
smooth_idf: logical. If TRUE, smooth the IDF weights by adding one to the
document frequencies, as if an extra document had been seen containing every
term in the collection exactly once. This prevents division by zero.
Returns a ddiMatrix: a diagonal sparse matrix of IDF scaling weights.
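
The calculation described above can be sketched with the Matrix package alone. This is an illustration under stated assumptions, not the package's source: idf_sketch and the toy dtm are hypothetical names, and the formula is read as the log of the ratio of the corpus size to the (optionally smoothed) document frequency.

```r
library(Matrix)

# Hypothetical re-implementation for illustration only.
idf_sketch <- function(dtm, log_scale = log, smooth_idf = TRUE) {
  # Document frequency: number of documents in which each term occurs.
  doc_freq <- colSums(dtm != 0)
  # Smoothing adds one to every document frequency.
  if (smooth_idf) doc_freq <- doc_freq + 1
  idf <- log_scale(nrow(dtm) / doc_freq)
  # Diagonal() with a numeric vector yields a ddiMatrix.
  Diagonal(x = unname(idf))
}

# Toy document-term matrix: 3 documents (rows) by 3 terms (columns).
dtm <- sparseMatrix(
  i = c(1, 1, 2, 3, 3),
  j = c(1, 2, 1, 1, 3),
  x = c(2, 1, 1, 1, 4),
  dims = c(3, 3),
  dimnames = list(c("d1", "d2", "d3"), c("apple", "banana", "cherry"))
)

idf_mat <- idf_sketch(dtm)

# Right-multiplying by the diagonal matrix rescales each term column.
dtm_idf <- dtm %*% idf_mat
```

Because the result is diagonal, multiplying the document-term matrix by it on the right scales each column by that term's IDF weight. Under this reading of the formula, a term that appears in every document receives a negative weight when smoothing is on, since the denominator then exceeds the number of documents.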