quanteda (version 4.0.1)

corpus_subset: Extract a subset of a corpus

Description

Returns the documents of a corpus that meet specified conditions, including direct logical operations on docvars (document-level variables). corpus_subset() works like subset.data.frame(), using non-standard evaluation to evaluate conditions against the docvars in the corpus.

Usage

corpus_subset(x, subset, drop_docid = TRUE, ...)

Value

A corpus object containing the subset of documents (and their docvars) selected according to the arguments.

Arguments

x

corpus object to be subsetted.

subset

logical expression indicating the documents to keep; missing values are taken as FALSE.

drop_docid

if TRUE, the docids of documents removed by subsetting are dropped.

...

not used

Examples

summary(corpus_subset(data_corpus_inaugural, Year > 1980))
summary(corpus_subset(data_corpus_inaugural, Year > 1930 & President == "Roosevelt"))
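
The following is a minimal sketch of two behaviours described above: referring to docvars by bare name inside the condition, and the treatment of missing values as false. The toy corpus, its texts, and the docvar names `party` and `year` are invented for illustration and are not part of quanteda.

```r
library(quanteda)

# Hypothetical toy corpus: texts and docvars (party, year) are made up.
toy <- corpus(
  c(d1 = "Taxes must fall.",
    d2 = "Invest in schools.",
    d3 = "Protect the coast.",
    d4 = "Build more homes."),
  docvars = data.frame(
    party = c("A", "B", NA, "A"),
    year  = c(2001, 2005, 2009, 2013)
  )
)

# Docvars are referenced by bare name inside the condition
# (non-standard evaluation, as with subset.data.frame()).
recent_a <- corpus_subset(toy, party == "A" & year > 2004)
docnames(recent_a)

# d3 has a missing party, so `party != "B"` evaluates to NA for it;
# NA is taken as false and d3 is dropped along with d2.
not_b <- corpus_subset(toy, party != "B")
docnames(not_b)
```

Under these assumptions, the first call should keep only d4, and the second should keep d1 and d4. To retain documents with a missing docvar, the condition has to allow for them explicitly, for example `is.na(party) | party != "B"`.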
