
textreuse (version 0.1.2)

pairwise_compare: Pairwise comparisons among documents in a corpus

Description

Given a TextReuseCorpus containing documents of class TextReuseTextDocument, this function applies a comparison function to every pairing of documents, and returns a matrix with the comparison scores.

Usage

pairwise_compare(corpus, f, ..., directional = FALSE,
  progress = interactive())

Arguments

  • corpus: A TextReuseCorpus.
  • f: The comparison function to apply to each pair of documents.
  • ...: Additional arguments passed to f.
  • directional: Logical. Set to TRUE for comparison functions where f(a, b) may differ from f(b, a), such as ratio_of_matches. Defaults to FALSE.
  • progress: Logical. Whether to display a progress bar while comparing documents. Defaults to interactive().

Value

  • A square matrix with dimensions equal to the length of the corpus, with row and column names taken from the names of the documents in the corpus. A value of NA in the matrix indicates that a comparison was not made. For directional comparisons, the value reported in each cell is f(row, column).
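
The layout of the returned matrix can be illustrated without the package itself. The following is a minimal base R sketch, not the package's implementation: `docs`, `toy_jaccard`, and `toy_pairwise` are hypothetical names, and the sketch assumes that a non-directional run computes each unordered pair once and never compares a document to itself, leaving the remaining cells as NA.

```r
# Toy stand-in for a corpus: a named list of token vectors (hypothetical data).
docs <- list(
  a = c("the", "quick", "brown", "fox"),
  b = c("the", "quick", "red", "fox"),
  c = c("a", "slow", "green", "turtle")
)

# Jaccard similarity on token sets: size of intersection over size of union.
toy_jaccard <- function(x, y) {
  length(intersect(x, y)) / length(union(x, y))
}

# Hypothetical helper that mimics the described output shape.
toy_pairwise <- function(docs, f, directional = FALSE) {
  n <- length(docs)
  m <- matrix(NA_real_, n, n, dimnames = list(names(docs), names(docs)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i == j) next                    # assumed: no self-comparisons
      if (!directional && i > j) next     # assumed: one comparison per unordered pair
      m[i, j] <- f(docs[[i]], docs[[j]])  # cell value is f(row, column)
    }
  }
  m
}

m <- toy_pairwise(docs, toy_jaccard)
print(m)
```

Under these assumptions, `m["a", "b"]` holds a score while `m["b", "a"]` and the diagonal stay NA. Passing `directional = TRUE` to the sketch fills both off-diagonal triangles.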

See Also

See the document comparison functions jaccard_similarity and ratio_of_matches.

Examples

library(textreuse)

# Load the sample legal documents that ship with the package
dir <- system.file("extdata/legal", package = "textreuse")
corpus <- TextReuseCorpus(dir = dir)
names(corpus) <- filenames(names(corpus))

# A non-directional comparison
pairwise_compare(corpus, jaccard_similarity)

# A directional comparison
pairwise_compare(corpus, ratio_of_matches, directional = TRUE)
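
A score matrix is often easier to inspect as a ranked list of pairs. The following base R sketch shows one way to do that. It uses a small hand-built matrix `m` with made-up scores as a stand-in for the output of a non-directional comparison, and the column names `a`, `b`, and `score` are illustrative choices rather than anything defined by the package.

```r
# Hypothetical score matrix shaped like a non-directional result.
ids <- c("a", "b", "c")
m <- matrix(NA_real_, 3, 3, dimnames = list(ids, ids))
m["a", "b"] <- 0.60
m["a", "c"] <- 0.05
m["b", "c"] <- 0.10

# Locate every cell where a comparison was made (non-NA).
idx <- which(!is.na(m), arr.ind = TRUE)

# Build one row per compared pair, then sort by score, highest first.
pairs <- data.frame(
  a = rownames(m)[idx[, "row"]],
  b = colnames(m)[idx[, "col"]],
  score = m[idx],
  stringsAsFactors = FALSE
)
pairs <- pairs[order(pairs$score, decreasing = TRUE), ]
print(pairs)
```

The same steps should apply to a directional result, where each row would then read as the score of document `a` compared against document `b`.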
