count_combinations: Find best string combinations that identify an id
Description
Find the word combinations in name that best identify each id.
Usage
count_combinations(id, name, n = 1)
Arguments
id
a vector of identifiers
name
a character vector of names, one per element of id
n
number of words per combination. Defaults to 1.
Value
count_combinations returns a data.frame with four columns. The first is id. The second is each unique combination of at most n words found in an element of name (words sorted alphabetically). The third is the count of that combination within id. The fourth is the count of that combination across ids. Intuitively, when the count across ids is 1 and the count within the id is high, the combination can be considered an identifier of that id.
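
The within/across counting logic described above can be illustrated with a minimal base R sketch for the n = 1 case. This is not the package's implementation: the toy data, the column names (word, count_within, count_across), and the choice to count each word at most once per name are all assumptions made for illustration.

```r
# Hypothetical toy data: two ids, each with several name variants.
id   <- c(1, 1, 1, 2, 2)
name <- c("acme corp", "acme inc", "acme corp ltd", "zenith corp", "zenith")

# Split each name into its unique words (single-word combinations only, n = 1).
words <- lapply(strsplit(name, " ", fixed = TRUE), unique)

# One row per (id, word) occurrence.
long <- data.frame(
  id   = rep(id, lengths(words)),
  word = unlist(words),
  stringsAsFactors = FALSE
)

# Count within id: number of names of that id containing the word (assumed definition).
within <- aggregate(count_within ~ id + word,
                    data = transform(long, count_within = 1L),
                    FUN = sum)

# Count across ids: number of distinct ids in which the word appears.
across <- aggregate(id ~ word, data = unique(long), FUN = length)
names(across)[2] <- "count_across"

# Combine into a four-column table, mirroring the layout described above.
result <- merge(within, across, by = "word")
result <- result[order(result$id, result$word),
                 c("id", "word", "count_within", "count_across")]
print(result)
```

In this toy data, "acme" appears in three names of id 1 and in no other id, so it is a good identifier for id 1, whereas "corp" appears under both ids and is not.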