compute_distance: Find minimum distance of each word to other groups
Description
Find minimum distance of each word to other groups
Usage
compute_distance(id, name, n = 1, method = "jw", p = 0.1, ...)
Arguments
id
a vector of identifiers
name
a vector of characters
n
number of words for combinations. Defaults to 1.
...
Other arguments to pass to stringdist. See the stringdist documentation.
Value
tab_accross
returns a data.frame of four columns. The first is id, the second corresponds to the unique combinations of words in each element of v with length lower than n (sorted alphabetically), the third is the count of these permutations within id, and the fourth is the count of these permutations across groups. When the count across groups is 1 and the count within the group is high, the element can be considered an identifier of the group.
Examples
library(stringdist)
id <- c(1, 1, 2, 2)
name <- c("coca cola company", "coca cola incorporated", "apple incorporated", "apple corp")
compute_distance(id, name, n = 0)
compute_distance(id, name, n = 1)
compute_distance(id, name, n = 2)
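
For intuition, here is a minimal sketch of how a minimum distance to other groups could be computed directly with stringdist. It is not the implementation of compute_distance: it compares whole names rather than word combinations, and the names d and min_dist are hypothetical, introduced only for this illustration. The method = "jw" and p = 0.1 values mirror the defaults shown in Usage.

```r
library(stringdist)

id <- c(1, 1, 2, 2)
name <- c("coca cola company", "coca cola incorporated",
          "apple incorporated", "apple corp")

# Pairwise Jaro-Winkler distances between all names
# (p is the prefix weight used by the Winkler adjustment).
d <- stringdistmatrix(name, name, method = "jw", p = 0.1)

# Mask pairs that belong to the same group, so only
# comparisons with other groups remain.
d[outer(id, id, "==")] <- NA

# Assumption for illustration: the quantity of interest is the smallest
# distance from each name to any name in a different group.
min_dist <- apply(d, 1, min, na.rm = TRUE)
data.frame(id, name, min_dist)
```

A small min_dist means a name closely resembles a name in another group. This sketch assumes every name has at least one name in a different group to compare against.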