widyr (version 0.1.5)

pairwise_similarity: Cosine similarity of pairs of items

Description

Compute cosine similarity of all pairs of items in a tidy table.
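As a rough illustration of what cosine similarity means for a tidy table, the base R sketch below casts a toy table to an item-by-feature matrix, scales each row to unit length, and takes the dot products between rows. The data and the names toy, m, normed and sim are made up for this example; this is not necessarily how widyr computes the result internally.

```r
# Toy tidy table: one row per (item, feature) pair. All names are illustrative.
toy <- data.frame(
  item    = c("a", "a", "b", "b", "c"),
  feature = c("x", "y", "x", "y", "y"),
  value   = c(1, 2, 2, 4, 3)
)

# Cast to an item-by-feature matrix; pairs missing from the table become 0.
m <- with(toy, tapply(value, list(item, feature), sum))
m[is.na(m)] <- 0

# Scale every row to unit Euclidean length.
normed <- m / sqrt(rowSums(m^2))

# Cosine similarity of two items is the dot product of their unit-length rows.
sim <- normed %*% t(normed)
sim
```

Items a and b have proportional feature values, so their similarity is 1 even though the magnitudes differ; cosine similarity compares direction, not size.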

Usage

pairwise_similarity(tbl, item, feature, value, ...)

pairwise_similarity_(tbl, item, feature, value, ...)

Arguments

tbl

Table in tidy format, with one row per item-feature pair

item

Item to compare; will end up in item1 and item2 columns

feature

Column describing the feature that links one item to others

value

Column containing the numeric value for each item-feature pair, such as a count

...

Extra arguments passed on to squarely(), such as diag and upper

See Also

squarely()

Examples

library(janeaustenr)
library(dplyr)
library(tidytext)
library(widyr)

# Comparing Jane Austen novels
austen_words <- austen_books() %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  count(book, word) %>%
  ungroup()

# closest books to each other
closest <- austen_words %>%
  pairwise_similarity(book, word, n) %>%
  arrange(desc(similarity))

closest

closest %>%
  filter(item1 == "Emma")
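
The Arguments section notes that extra arguments such as diag and upper are passed on to squarely(). The sketch below shows one way they might be used, on a small made-up table (toy and its column names are hypothetical) so that it runs without the Austen data.

```r
library(widyr)

# Hypothetical toy data: three items described by two features.
toy <- data.frame(
  item    = c("a", "a", "b", "b", "c"),
  feature = c("x", "y", "x", "y", "y"),
  value   = c(1, 2, 2, 4, 3)
)

# Default call: pairs are reported in both orders (a-b and b-a).
both_orders <- pairwise_similarity(toy, item, feature, value)

# upper = FALSE is forwarded to squarely() and should drop the mirrored pairs.
each_pair_once <- pairwise_similarity(toy, item, feature, value, upper = FALSE)

both_orders
each_pair_once
```

Setting upper = FALSE can be handy before plotting or ranking, since each unordered pair then appears once rather than twice.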
