get_idf

dtm

a document-term matrix of class <code>dgCMatrix</code> or
<code>dgTMatrix</code>.

log_scale

<code>function</code> to use in calculating the IDF matrix.
Usually <code>log</code> is used, but it might be worth trying <code>log2</code>.

smooth_idf

<code>logical</code>; whether to smooth IDF weights by adding one to document
frequencies, as if an extra document had been seen containing every term in the
collection exactly once. This prevents division by zero.
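To make the effect of the two options concrete, here is a minimal base-R sketch. It works on a hand-written vector of document frequencies rather than on the package itself; the names <code>n_docs</code>, <code>doc_freq</code>, and the three terms are assumptions made up for illustration.

```r
# Hypothetical document frequencies in a 4-document corpus.
# "unseen" is a vocabulary term that occurs in none of the documents.
n_docs   <- 4
doc_freq <- c(common = 4, rare = 1, unseen = 0)

# Without smoothing, the unseen term divides by zero and its weight is Inf.
idf_raw <- log(n_docs / doc_freq)

# Adding one to every document frequency keeps all weights finite.
idf_smooth <- log(n_docs / (doc_freq + 1))

# Using log2 instead of log rescales every weight by the constant 1 / log(2),
# so the relative ordering of terms does not change.
idf_smooth_log2 <- log2(n_docs / (doc_freq + 1))

print(rbind(idf_raw, idf_smooth, idf_smooth_log2))
```

Note that with smoothing, a term that appears in every document receives a slightly negative weight, because its smoothed document frequency exceeds the number of documents.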


This function creates an inverse-document-frequency (IDF)
  scaling matrix from a document-term matrix. The IDF is defined as follows:
  <code>idf = log(# documents in the corpus / (# documents where the term
  appears + 1))</code>
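The definition above can be sketched in a few lines of R using the Matrix package. This is an illustration under stated assumptions, not the package's own implementation: <code>idf_sketch</code> and the toy matrix are hypothetical, and the sketch assumes the scaling matrix is a diagonal matrix with one IDF weight per term, applied by right-multiplying the document-term matrix.

```r
library(Matrix)  # recommended package shipped with R; provides sparse matrices

# Toy document-term matrix: 4 documents (rows) by 3 terms (columns).
# sparseMatrix() returns a dgCMatrix when numeric values are supplied.
dtm <- sparseMatrix(
  i = c(1, 2, 3, 4, 1, 2, 1),
  j = c(1, 1, 1, 1, 2, 2, 3),
  x = c(2, 1, 1, 3, 1, 1, 5),
  dims = c(4, 3),
  dimnames = list(paste0("doc", 1:4), c("the", "cat", "zebra"))
)

# Hypothetical helper mirroring the documented arguments.
idf_sketch <- function(dtm, log_scale = log, smooth_idf = TRUE) {
  n_docs   <- nrow(dtm)
  # Number of documents in which each term occurs at least once.
  doc_freq <- colSums(dtm != 0)
  if (smooth_idf) doc_freq <- doc_freq + 1
  idf <- log_scale(n_docs / doc_freq)
  # One weight per term on the diagonal.
  Diagonal(x = unname(idf))
}

idf <- idf_sketch(dtm)

# Scale each term column of the document-term matrix by its IDF weight.
dtm_idf <- dtm %*% idf
```

In this toy corpus the raw document frequencies are 4, 2, and 1, so the smoothed weights are <code>log(4/5)</code>, <code>log(4/3)</code>, and <code>log(4/2)</code>. Keeping the weights in a diagonal matrix means the scaling is a single sparse matrix product and the result stays sparse.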


Very fast and memory-friendly tools for text vectorization and
state-of-the-art word embeddings (GloVe). This package provides a
source-agnostic streaming API that lets researchers analyze
collections of documents much larger than available RAM. All
core functions are parallelized to benefit from multicore machines.

get_idf: Inverse document-frequency scaling matrix
