tidytext (version 0.1.0)

pair_count: Count pairs of items that co-occur within a group

Description

Count the number of times pairs of items co-occur within a group. This returns a table with one row for each pair of values (for example, a word-word pair) that occurs within a group, along with n, the number of groups in which the pair co-occurs. pair_count_ is the standard-evaluation version, intended for use when programming with the function.

Usage

pair_count(data, group, value, unique_pair = TRUE, self = FALSE,
  sort = FALSE)

pair_count_(data, group_col, value_col, unique_pair = TRUE, self = FALSE,
  sort = FALSE)

Arguments

data
A tbl
group, group_col
Column to count pairs within
value, value_col
Column containing values to count pairs of
unique_pair
Whether to return each pair only once, rather than once in each order (as value1, value2 and again as value2, value1). Setting this to FALSE is useful if you afterwards want to find the values most commonly paired with one of interest.
self
Whether to include an item as co-occurring with itself
sort
Whether to sort in decreasing order of frequency

Value

A data frame with three columns: value1, value2, and n.

Examples

library(janeaustenr)
library(dplyr)
library(tidytext)  # provides unnest_tokens(), stop_words and pair_count()

pride_prejudice_words <- data_frame(text = prideprejudice) %>%
  mutate(line = row_number()) %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word")

# find words that co-occur within lines
pride_prejudice_words %>%
  pair_count(line, word, sort = TRUE)

# when finding words most often occurring with a particular word,
# use unique_pair = FALSE
pride_prejudice_words %>%
  pair_count(line, word, sort = TRUE, unique_pair = FALSE) %>%
  filter(value1 == "elizabeth")
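
To make the counting rule from the Description concrete, here is a minimal base R sketch of the same idea on toy data. It is not the tidytext implementation: count_pairs_sketch and the toy data frame are hypothetical names introduced only for illustration, and the order in which value1 and value2 appear within a pair is an assumption of this sketch (alphabetical when unique_pair = TRUE).

```r
# Hypothetical toy data: three groups (lines) of words
toy <- data.frame(
  line = c(1, 1, 1, 2, 2, 3),
  word = c("a", "b", "c", "a", "b", "a"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, NOT part of tidytext: for each pair of distinct
# values, count the number of groups in which both values appear
count_pairs_sketch <- function(group, value, unique_pair = TRUE) {
  # keep distinct values per group so a pair counts once per group
  by_group <- lapply(split(value, group), unique)

  # build every ordered pair within each group, dropping self-pairs
  pairs <- do.call(rbind, lapply(by_group, function(v) {
    p <- expand.grid(value1 = v, value2 = v, stringsAsFactors = FALSE)
    p[p$value1 != p$value2, ]
  }))

  # assumption of this sketch: keep only the alphabetically ordered copy
  if (unique_pair) {
    pairs <- pairs[pairs$value1 < pairs$value2, ]
  }

  # sum one count per group for each remaining pair
  pairs$n <- rep(1L, nrow(pairs))
  out <- aggregate(n ~ value1 + value2, data = pairs, FUN = sum)

  # most frequent pairs first
  out[order(-out$n), ]
}

count_pairs_sketch(toy$line, toy$word)
count_pairs_sketch(toy$line, toy$word, unique_pair = FALSE)
```

In the toy data, "a" and "b" share lines 1 and 2, so that pair gets n = 2, while line 3 contains a single word and contributes no pairs. With unique_pair = FALSE each pair appears in both orders, which is what makes filtering on value1 for one word of interest return all of its partners.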
