quanteda.tidy (version 0.4)

distinct.corpus: Subset documents distinct/unique by document variables

Description

Select only the documents that are unique (distinct) with respect to the values of the specified document variables.

Usage

# S3 method for corpus
distinct(.data, ..., .keep_all = FALSE)

Value

A corpus containing only documents with unique combinations of the specified document variables.

Arguments

.data

a corpus object with document variables

...

comma-separated list of unquoted document variables, or expressions involving document variables

.keep_all

If TRUE, keep all document variables in .data. If a combination of the variables in ... occurs in more than one document, the values from the first such document are kept.
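
The effect of .keep_all can be sketched on a plain data frame standing in for a corpus's document variables. This is only an illustration of the "first row wins" rule in base R, not the package's implementation; the table docvars_tbl and its values are hypothetical.

```r
# Hypothetical document variables for four documents (illustrative data only).
docvars_tbl <- data.frame(
  doc_id    = c("d1", "d2", "d3", "d4"),
  President = c("Washington", "Washington", "Adams", "Jefferson"),
  Year      = c(1789, 1793, 1797, 1801),
  stringsAsFactors = FALSE
)

# TRUE for the first occurrence of each President value, FALSE for repeats.
first_seen <- !duplicated(docvars_tbl["President"])

# Analogue of .keep_all = FALSE: only the variable used for de-duplication.
distinct_only <- docvars_tbl[first_seen, "President", drop = FALSE]

# Analogue of .keep_all = TRUE: every variable, taken from the first matching row.
distinct_all <- docvars_tbl[first_seen, , drop = FALSE]

print(distinct_only)
print(distinct_all)
```

In this sketch the second "Washington" row (d2, 1793) is dropped, and with all variables kept the surviving row carries the Year of the first occurrence (1789).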

Examples

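library("quanteda")       # provides data_corpus_inaugural
library("quanteda.tidy")  # provides the distinct() method for corpus objects

# keep one document per President; only President is retained as a docvar
distinct(data_corpus_inaugural[1:5], President) %>%
  summary()
# same selection, but retain all document variables
distinct(data_corpus_inaugural[1:5], President, .keep_all = TRUE) %>%
  summary()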
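
A smaller, self-contained variant of the examples above may make the selection easier to follow. It is a sketch that assumes quanteda and quanteda.tidy are installed; the corpus toy_corpus and its party and year variables are made up for illustration.

```r
library("quanteda")
library("quanteda.tidy")

# Hypothetical four-document corpus with two document variables.
toy_corpus <- corpus(
  c(d1 = "First text.", d2 = "Second text.",
    d3 = "Third text.", d4 = "Fourth text."),
  docvars = data.frame(
    party = c("A", "A", "B", "A"),
    year  = c(2001, 2005, 2005, 2009)
  )
)

# One document per distinct value of party.
by_party <- distinct(toy_corpus, party)

# Same selection, keeping every document variable.
by_party_all <- distinct(toy_corpus, party, .keep_all = TRUE)

ndoc(by_party)               # number of documents retained
names(docvars(by_party))     # document variables retained by default
names(docvars(by_party_all)) # document variables retained with .keep_all = TRUE
```

Because party takes only two values here, both calls should return a two-document corpus; they differ in which document variables are carried along.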
