`weightTfIdf(m, normalize = TRUE)`

`m`

A `TermDocumentMatrix` in term frequency format.

`normalize`

A Boolean value indicating whether the term frequencies should be normalized.

The weighted matrix.

Formally, this function is of class `WeightingFunction` with the
additional attributes `Name` and `Acronym`.

*Term frequency* $\mathit{tf}_{i,j}$ counts the number of
occurrences $n_{i,j}$ of a term $t_i$ in a document
$d_j$. In the case of normalization, the term frequency
$\mathit{tf}_{i,j}$ is divided by $\sum_k n_{k,j}$.

*Inverse document frequency* for a term $t_i$ is defined as
$$\mathit{idf}_i = \log_2 \frac{|D|}{|\{d \mid t_i \in d\}|}$$ where
$|D|$ denotes the total number of documents and where
$|\{d \mid t_i \in d\}|$ is the number of documents in which the term $t_i$
appears.
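
As a small worked illustration with assumed numbers (not taken from any real corpus): in a collection of $|D| = 8$ documents, a term $t_a$ that appears in 2 documents and a term $t_b$ that appears in all 8 would receive
$$\mathit{idf}_a = \log_2 \frac{8}{2} = 2, \qquad \mathit{idf}_b = \log_2 \frac{8}{8} = 0.$$
A term that occurs in every document therefore gets an inverse document frequency of zero.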

*Term frequency - inverse document frequency* is now defined as
$\mathit{tf}_{i,j} \cdot \mathit{idf}_i$.
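
The definitions above can be traced by hand on a small example. The following is a minimal base R sketch of the formulas only, not the package's own implementation; the count matrix `n` and its term and document names are made up for illustration, and normalization is assumed to be switched on.

```r
# Toy matrix of raw counts n[i, j]: rows are terms t_i, columns are documents d_j.
# All values and names here are hypothetical.
n <- matrix(c(2, 0, 1,
              1, 1, 1,
              0, 3, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("oil", "price", "market"),
                            c("d1", "d2", "d3")))

# Normalized term frequency: divide each column j by its total, sum_k n[k, j].
tf <- sweep(n, 2, colSums(n), "/")

# Inverse document frequency: log2(|D| / number of documents containing t_i).
idf <- log2(ncol(n) / rowSums(n > 0))

# tf-idf: row i of tf is scaled by idf[i]. The vector idf has one entry per
# row, so R's column-wise recycling lines it up with the terms.
tfidf <- tf * idf

print(round(tfidf, 3))
```

In this toy matrix the term `"price"` occurs in all three documents, so its inverse document frequency is $\log_2(3/3) = 0$ and its whole row of weights is zero.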