text2vec (version 0.2.0)

prune_vocabulary: Prunes vocabulary.

Description

Filters the input vocabulary, throwing out very frequent and very infrequent terms. Also keeps only the top `max_number_of_terms` terms (by count). See the examples in the vocabulary function.

Usage

prune_vocabulary(vocabulary, term_count_min = 1L, term_count_max = Inf,
  doc_proportion_min = 0, doc_proportion_max = 1,
  max_number_of_terms = Inf)

Arguments

vocabulary
vocabulary object, as returned by the vocabulary function.
term_count_min
minimum number of occurrences over all documents.
term_count_max
maximum number of occurrences over all documents.
doc_proportion_min
minimum proportion of documents that must contain the term.
doc_proportion_max
maximum proportion of documents that may contain the term.
max_number_of_terms
maximum number of terms in vocabulary.
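
The way these arguments combine can be sketched in base R. This is only an illustration of the filtering described above, not the text2vec implementation: the data frame layout, the column names (`term`, `term_count`, `doc_count`), the helper `prune_sketch`, the inclusive bounds, and the choice to apply `max_number_of_terms` after the other filters are all assumptions made for the example.

```r
# Hypothetical stand-in for a vocabulary: one row per term, with its total
# count and the number of documents it appears in (column names are assumed).
vocab <- data.frame(
  term       = c("the", "movie", "plot", "zebra", "acting"),
  term_count = c(120L, 40L, 15L, 1L, 22L),
  doc_count  = c(10L, 8L, 5L, 1L, 6L),
  stringsAsFactors = FALSE
)
n_docs <- 10L  # assumed total number of documents in the corpus

# Sketch of the pruning logic; defaults mirror the Usage section.
prune_sketch <- function(vocab, n_docs,
                         term_count_min = 1L, term_count_max = Inf,
                         doc_proportion_min = 0, doc_proportion_max = 1,
                         max_number_of_terms = Inf) {
  # Share of documents that contain each term.
  doc_proportion <- vocab$doc_count / n_docs

  # Keep terms whose count and document proportion fall inside the bounds
  # (treating the bounds as inclusive is an assumption).
  keep <- vocab$term_count >= term_count_min &
    vocab$term_count <= term_count_max &
    doc_proportion >= doc_proportion_min &
    doc_proportion <= doc_proportion_max
  kept <- vocab[keep, , drop = FALSE]

  # Sort by count, most frequent first, then keep at most the top N terms.
  kept <- kept[order(kept$term_count, decreasing = TRUE), , drop = FALSE]
  if (is.finite(max_number_of_terms)) {
    kept <- head(kept, max_number_of_terms)
  }
  kept
}

pruned <- prune_sketch(vocab, n_docs,
                       term_count_min = 2L,
                       doc_proportion_max = 0.9,
                       max_number_of_terms = 2L)
print(pruned$term)
```

Here "zebra" is dropped for being too rare (one occurrence), "the" for appearing in every document, and the cap of two terms then keeps the two most frequent survivors. Note that the sketch returns terms sorted by count; whether the real function preserves the original order is not stated in this page.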

See Also

vocabulary