# get_idf

##### Inverse document-frequency scaling matrix

This function creates an inverse-document-frequency (IDF) scaling matrix from a document-term matrix. The IDF of a term is defined as follows: idf = log(# documents in the corpus / (# documents where the term appears + 1))
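The definition above can be sketched by hand on a toy matrix. This is only an illustration of the formula using the Matrix package, not the function's actual source; the matrix contents and the names `n_docs`, `doc_freq`, and `idf` are made up for the example.

```r
library(Matrix)

# Toy document-term matrix: 3 documents (rows) x 4 terms (columns).
dtm <- sparseMatrix(
  i = c(1, 1, 2, 2, 3, 3),
  j = c(1, 2, 1, 3, 1, 4),
  x = c(2, 1, 1, 3, 1, 1),
  dims = c(3, 4),
  dimnames = list(paste0("doc", 1:3), c("the", "cat", "dog", "bird"))
)

# Number of documents in the corpus.
n_docs <- nrow(dtm)

# Document frequency: in how many documents each term appears at least once.
doc_freq <- colSums(dtm > 0)

# IDF with the "+ 1" smoothing term from the definition above.
idf <- log(n_docs / (doc_freq + 1))
```

Note that with the "+ 1" in the denominator, a term that appears in every document ("the" in this toy example) gets a negative weight, because the ratio inside the logarithm falls below one.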

##### Usage
get_idf(dtm, log_scale = log, smooth_idf = TRUE)
##### Arguments
dtm
a document-term matrix of class dgCMatrix or dgTMatrix.
log_scale
function to use in calculating the IDF matrix. Usually log is used, but it might be worth trying log2.
smooth_idf
logical; if TRUE, smooth the IDF weights by adding one to the document frequencies, as if an extra document had been seen containing every term in the collection exactly once. This prevents division by zero.
##### Value

A ddiMatrix: a diagonal sparse matrix holding the IDF scaling weights.
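A diagonal scaling matrix of this kind is typically applied by right-multiplying the document-term matrix, which rescales each term column by its weight. The sketch below builds the diagonal matrix by hand with `Matrix::Diagonal` rather than calling `get_idf`, so it is an assumption about how the result is meant to be used, not a transcript of the package's behaviour; the names `idf_mat` and `tfidf` are illustrative.

```r
library(Matrix)

# Toy document-term matrix: 3 documents (rows) x 4 terms (columns).
dtm <- sparseMatrix(
  i = c(1, 1, 2, 2, 3, 3),
  j = c(1, 2, 1, 3, 1, 4),
  x = c(2, 1, 1, 3, 1, 1),
  dims = c(3, 4)
)

# Smoothed IDF weights, computed by hand (one weight per term column).
idf <- log(nrow(dtm) / (colSums(dtm > 0) + 1))

# Diagonal sparse matrix with the IDF weights on the diagonal.
idf_mat <- Diagonal(x = unname(idf))

# Right-multiplication scales column j of dtm by idf[j].
tfidf <- dtm %*% idf_mat
```

Keeping the weights in a diagonal sparse matrix means the same scaling can be reused on any other document-term matrix with the same term columns, for example one built from new documents.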