This function filters the input vocabulary and throws out very
frequent and very infrequent terms. See the examples for the
vocabulary function. The parameter vocab_term_max can
also be used to limit the absolute size of the vocabulary to only the most
frequently used terms.
prune_vocabulary(vocabulary, term_count_min = 1L, term_count_max = Inf, doc_proportion_min = 0, doc_proportion_max = 1, doc_count_min = 1L, doc_count_max = Inf, vocab_term_max = Inf)
vocabulary: a vocabulary from the vocabulary function.
term_count_min: minimum number of occurrences over all documents.
term_count_max: maximum number of occurrences over all documents.
doc_proportion_min: minimum proportion of documents which should contain the term.
doc_proportion_max: maximum proportion of documents which should contain the term.
doc_count_min: the term will be kept if the number of documents containing it is larger than this value.
doc_count_max: the term will be kept if the number of documents containing it is smaller than this value.
vocab_term_max: maximum number of terms in the vocabulary.
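
The following is a minimal sketch of how these arguments might be combined, assuming the text2vec package is installed and that create_vocabulary is the current name of the vocabulary constructor. The toy documents and the threshold values are made up for illustration only.

```r
library(text2vec)

# Hypothetical toy corpus, invented for this illustration.
docs <- c(
  "the cat sat on the mat",
  "the dog sat on the log",
  "the bird flew over the cat",
  "a fish swam"
)

# Build a token iterator and a vocabulary from it.
it <- itoken(docs, preprocessor = tolower, tokenizer = word_tokenizer,
             progressbar = FALSE)
v <- create_vocabulary(it)

# Drop rare terms (fewer than 2 occurrences overall) and very common
# terms (present in more than 60% of the documents).
pruned <- prune_vocabulary(v, term_count_min = 2L, doc_proportion_max = 0.6)
print(pruned)

# Alternatively, cap the vocabulary at the 2 most frequent terms.
top2 <- prune_vocabulary(v, vocab_term_max = 2L)
print(top2)
```

The count and proportion filters narrow the vocabulary by how often and how widely a term is used, while vocab_term_max simply caps the size of the result at the most frequent terms.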