This function filters the input vocabulary and removes very
frequent and very infrequent terms. See the examples for the
vocabulary function. The parameter max_number_of_terms can
also be used to limit the absolute size of the vocabulary to only the most
frequently used terms.
prune_vocabulary(vocabulary, term_count_min = 1L, term_count_max = Inf, doc_proportion_min = 0, doc_proportion_max = 1, max_number_of_terms = Inf)
- vocabulary: a vocabulary from the vocabulary function.
- term_count_min: minimum number of occurrences over all documents.
- term_count_max: maximum number of occurrences over all documents.
- doc_proportion_min: minimum proportion of documents which should contain the term.
- doc_proportion_max: maximum proportion of documents which should contain the term.
- max_number_of_terms: maximum number of terms in the vocabulary.
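
To make the interaction of these arguments concrete, here is a minimal base-R sketch of the filtering rules described above. It is not the package's implementation: the helper prune_sketch, the data frame layout (term, term_count, doc_count), and the sample counts are all hypothetical, and treating every bound as inclusive is an assumption.

```r
# Hypothetical term statistics for a 4-document corpus (illustrative data only).
vocab <- data.frame(
  term       = c("the", "model", "text", "rare", "vector"),
  term_count = c(40L, 6L, 5L, 1L, 3L),  # occurrences over all documents
  doc_count  = c(4L, 3L, 2L, 1L, 2L),   # number of documents containing the term
  stringsAsFactors = FALSE
)
n_docs <- 4L

# Sketch of the pruning rules; inclusive bounds are an assumption.
prune_sketch <- function(vocab, n_docs,
                         term_count_min = 1L, term_count_max = Inf,
                         doc_proportion_min = 0, doc_proportion_max = 1,
                         max_number_of_terms = Inf) {
  doc_prop <- vocab$doc_count / n_docs
  keep <- vocab$term_count >= term_count_min &
    vocab$term_count <= term_count_max &
    doc_prop >= doc_proportion_min &
    doc_prop <= doc_proportion_max
  out <- vocab[keep, , drop = FALSE]
  # If a size cap is given, keep only the most frequent remaining terms.
  if (is.finite(max_number_of_terms) && nrow(out) > max_number_of_terms) {
    out <- out[order(out$term_count, decreasing = TRUE), , drop = FALSE]
    out <- out[seq_len(max_number_of_terms), , drop = FALSE]
  }
  out
}

# Drop terms seen fewer than 2 times or in more than 75% of documents,
# then cap the vocabulary at the 2 most frequent terms.
pruned <- prune_sketch(vocab, n_docs,
                       term_count_min = 2L,
                       doc_proportion_max = 0.75,
                       max_number_of_terms = 2L)
print(pruned$term)
```

In this sketch the count and proportion filters run first and the size cap is applied last, so max_number_of_terms only selects among terms that already passed the other thresholds. Whether the package applies the cap in the same order should be checked against its source.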