There are many local and global weighting functions. In this package, local weighting functions are prefixed with lw_ and
global weighting functions with gw_, so users can define their own weighting functions by following the same naming convention.
Local weighting functions (i.e. weighting every cell in the matrix):
lw_tf Term frequency: f(x) = x.
lw_raw Raw frequency, which is the same as the term frequency: f(x) = x.
lw_log Logarithm: f(x) = log(x + 1).
lw_bin Binary: f(x) = 1 if x > 0 and 0 otherwise.
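The local formulas above can be sketched in base R as follows. This is only an illustration of the formulas, not the package's own source: the *_sketch function names and the toy matrix m are hypothetical.

```r
# Illustrative base-R versions of the local weighting formulas listed above.
# The *_sketch names are hypothetical; they are not the package's functions.
lw_tf_sketch  <- function(x) x             # term frequency: f(x) = x
lw_log_sketch <- function(x) log(x + 1)    # logarithm: f(x) = log(x + 1)
lw_bin_sketch <- function(x) (x > 0) * 1   # binary: 1 if x > 0, else 0

# Toy document-term matrix (assumed data): 3 documents (rows), 3 terms (columns).
m <- matrix(c(2, 0, 1,
              0, 3, 1,
              0, 0, 4),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("doc", 1:3), c("apple", "pear", "plum")))

# Each local weighting is applied to every cell and keeps the matrix shape.
lw_log_sketch(m)
lw_bin_sketch(m)
```

Because every local weighting works cell by cell, the result has the same dimensions and dimnames as the input matrix.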
Global weighting functions, which weight the columns of the matrix (hence, these weighting functions behave as expected for
a document-term matrix, i.e. with the documents as the rows and the terms as the columns):
gw_idf Inverse document frequency: f(x) = log(nrow(x) / n + 1), where n is the number of rows in which the column is > 0.
gw_idf_alt Alternative definition of the inverse document frequency: f(x) = log(nrow(x) / n) + 1, where n is the number of rows in which the column is > 0.
gw_gfidf Global frequency multiplied by inverse document frequency: f(x) = colSums(x) / n, where n is the number of rows in which the column is > 0.
gw_nor Normal(ized) frequency: f(x) = x / colSums(x^2).
gw_ent Entropy: f(x) = 1 + the relative Shannon entropy.
gw_bin Binary: f(x) = 1.
gw_raw Raw, which is the same as binary: f(x) = 1.
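As a rough illustration, some of the global formulas above can be written in base R like this. The *_sketch names and the toy matrix m are hypothetical, and combining the column weights with the matrix via sweep() is an assumption about typical usage, not a description of how the package applies them.

```r
# Illustrative base-R versions of some global weighting formulas listed above.
# The *_sketch names are hypothetical; they are not the package's functions.
# Each returns one weight per column (term) of a document-term matrix.
n_docs <- function(x) colSums(x > 0)   # rows in which the column is > 0

gw_idf_sketch     <- function(x) log(nrow(x) / n_docs(x) + 1)
gw_idf_alt_sketch <- function(x) log(nrow(x) / n_docs(x)) + 1
gw_gfidf_sketch   <- function(x) colSums(x) / n_docs(x)
gw_bin_sketch     <- function(x) rep(1, ncol(x))

# Toy document-term matrix (assumed data): 3 documents (rows), 3 terms (columns).
# Note: a column that is zero everywhere would give n = 0 and a division by zero.
m <- matrix(c(2, 0, 1,
              0, 3, 1,
              0, 0, 4),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("doc", 1:3), c("apple", "pear", "plum")))

idf_weights <- gw_idf_sketch(m)

# Assumption: scale each column of the matrix by its global weight.
weighted <- sweep(m, 2, idf_weights, `*`)
weighted
```

In this toy matrix, "plum" occurs in all three documents and therefore receives the smallest idf weight, while "apple" and "pear" each occur in a single document and receive the largest.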