dfm_trim

x

a dfm object

min_count, max_count

minimum/maximum count or percentile frequency of features across all documents, below/above which features will be removed

min_docfreq, max_docfreq

minimum/maximum number or fraction of documents in which a feature appears, below/above which features will be removed

sparsity

equivalent to 1 - min_docfreq, included for comparison with tm

verbose
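
The thresholds above can be illustrated with a minimal base R sketch on a plain numeric matrix. This is not quanteda's implementation: the helper `trim_sketch`, the toy matrix `m`, and the rule that a `min_docfreq` below 1 is read as a fraction of documents are all assumptions made for illustration, and the handling of boundary cases in `dfm_trim` itself may differ.

```r
# Toy document-feature matrix: 4 documents (rows) by 4 features (columns).
# All values are made up for illustration.
m <- matrix(
  c(2, 0, 1, 0,
    3, 0, 2, 0,
    1, 1, 0, 0,
    4, 0, 0, 5),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("the", "rare", "cat", "dog"))
)

# Hypothetical helper that mimics the minimum thresholds described above.
trim_sketch <- function(m, min_count = 1, min_docfreq = 1, sparsity = NULL) {
  # sparsity is treated as 1 - min_docfreq, as stated in the argument list
  if (!is.null(sparsity)) min_docfreq <- 1 - sparsity
  total_count <- colSums(m)      # occurrences of each feature across all documents
  doc_freq    <- colSums(m > 0)  # number of documents in which each feature appears
  # Assumption: a value below 1 is a fraction of the number of documents
  if (min_docfreq < 1) min_docfreq <- min_docfreq * nrow(m)
  keep <- total_count >= min_count & doc_freq >= min_docfreq
  m[, keep, drop = FALSE]
}

# Keep features that occur at least twice overall and in at least two documents
trimmed <- trim_sketch(m, min_count = 2, min_docfreq = 2)
colnames(trimmed)

# Same document-frequency threshold expressed as a sparsity of 0.5
sparse_trimmed <- trim_sketch(m, sparsity = 0.5)
colnames(sparse_trimmed)
```

In this toy data both calls keep `the` and `cat`: `rare` and `dog` each appear in only one document, so they fall below the document-frequency threshold even though `dog` has a high total count.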

Returns a document-feature matrix reduced in size based on document and
term frequency, usually in terms of minimum frequencies, but possibly also in
terms of maximum frequencies. Setting a combination of minimum and maximum
frequencies selects features whose frequencies fall within a range.
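
Range-based selection can be sketched the same way in base R. The matrix and the bounds below are hypothetical, and the code only illustrates the idea of combining a minimum and a maximum document frequency. It does not call `dfm_trim`.

```r
# Toy document-feature matrix; values are made up for illustration.
m <- matrix(
  c(2, 0, 1, 0,
    3, 0, 2, 0,
    1, 1, 0, 0,
    4, 0, 0, 5),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("the", "rare", "cat", "dog"))
)

doc_freq <- colSums(m > 0)  # number of documents containing each feature

# Keep features that appear in at least 2 but at most 3 documents
in_range <- doc_freq >= 2 & doc_freq <= 3
ranged <- m[, in_range, drop = FALSE]
colnames(ranged)
```

Here the lower bound drops `rare` and `dog` (one document each) and the upper bound drops `the` (all four documents), leaving only `cat`.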

A fast, flexible, and comprehensive framework for quantitative text analysis
in R. Provides functionality for corpus management, creating and manipulating
tokens and ngrams, exploring keywords in context, forming and manipulating
sparse matrices of documents by features and feature co-occurrences, analyzing
keywords, computing feature similarities and distances, applying content
dictionaries, applying supervised and unsupervised machine learning, visually
representing text and text analyses, and more.

Kenneth Benoit

quanteda

Quantitative Analysis of Textual Data

Kohei Watanabe

Paul Nulty

Adam Obeng

Haiyan Wang

Stefan Müller

Benjamin Lauderdale

Will Lowe


dfm_trim: Trim a dfm using frequency threshold-based feature selection
