transform_filter_commons

Remove terms from a document-term matrix

This function removes very common and very uncommon words from a document-term matrix.

Usage
transform_filter_commons(dtm, term_freq = c(uncommon = 0.001, common = 0.975))
Arguments
dtm
a document-term matrix of class dgCMatrix or dgTMatrix.
term_freq
a numeric vector of two values between 0 and 1. The first element is the frequency threshold for uncommon words; the second element is the frequency threshold for common words. Terms observed less often than the first value or more often than the second are filtered out.
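The filtering rule described for term_freq can be illustrated with a small sketch. It does not call transform_filter_commons itself; it only mimics the rule with the Matrix package, under the assumption that "frequency" means the share of documents in which a term appears at least once. The toy matrix, the term names, and the thresholds are all hypothetical and chosen only so that the effect is visible on four documents.

```r
library(Matrix)

# Toy document-term matrix (hypothetical data): 4 documents x 3 terms,
# stored as a dgCMatrix, one of the classes the dtm argument accepts.
dtm <- sparseMatrix(
  i = c(1, 2, 3, 4, 1, 2, 3),
  j = c(1, 1, 1, 1, 2, 3, 3),
  x = c(2, 1, 1, 3, 1, 1, 1),
  dims = c(4, 3),
  dimnames = list(paste0("doc", 1:4), c("the", "rare", "mid"))
)

# Assumption: term frequency = share of documents containing the term.
doc_share <- colSums(dtm > 0) / nrow(dtm)

# Thresholds in the same shape as the term_freq argument.
term_freq <- c(uncommon = 0.3, common = 0.9)

# Keep terms that are neither too rare nor too common.
keep <- doc_share > term_freq[["uncommon"]] & doc_share < term_freq[["common"]]
filtered <- dtm[, keep, drop = FALSE]

colnames(filtered)
```

Here "the" appears in every document and "rare" in only one of four, so both fall outside the thresholds and only "mid" remains. With the default thresholds of 0.001 and 0.975, a corpus this small would lose only "the".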
See Also

prune_vocabulary, transform_tf, transform_tfidf, transform_binary

Aliases
  • transform_filter_commons
Documentation reproduced from package text2vec, version 0.3.0, License: MIT + file LICENSE
