A function to select terms for inclusion in a stylest2 model, based on a document-feature matrix of texts to predict and a specified cutoff.
stylest2_terms(dfm, cutoff)
A character vector of terms falling above the term frequency cutoff.
dfm
a quanteda dfm object.
cutoff
a single numeric value: the quantile of term frequency under which to drop terms.
library(stylest2)
data(novels_dfm)
# select a cutoff, then keep the terms that fall above it
best_cut <- stylest2_select_vocab(dfm = novels_dfm)
stylest2_terms(dfm = novels_dfm, cutoff = best_cut$cutoff_pct_best)
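To make the idea of a term-frequency quantile cutoff concrete, here is a minimal base R sketch on a toy count matrix. It is not the package's implementation: the helper `terms_above_cutoff`, the toy data, the percent scale for the cutoff, and the inclusive comparison are all assumptions made for illustration.

```r
# Toy document-feature counts (hypothetical data, not from the package)
counts <- matrix(
  c(5, 0, 1, 9,
    3, 1, 0, 7,
    4, 0, 0, 8),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"),
                  c("the", "whale", "parlour", "and"))
)

# Hypothetical helper: keep terms whose total frequency is at or above
# the requested quantile (cutoff given in percent, an assumption here)
terms_above_cutoff <- function(counts, cutoff_pct) {
  term_freq <- colSums(counts)                               # total count per term
  threshold <- quantile(term_freq, probs = cutoff_pct / 100) # frequency at the quantile
  names(term_freq)[term_freq >= threshold]
}

kept <- terms_above_cutoff(counts, cutoff_pct = 50)
```

With these toy counts the total frequencies are 12, 1, 1 and 24, so the median is 6.5 and only "the" and "and" are kept.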