fedmatch (version 2.0.6)

wgt_jaccard_distance: Computing Weighted Jaccard Distance

Description

wgt_jaccard_distance computes the Weighted Jaccard Distance between two strings. It is vectorized and accepts exactly two character vectors of equal length.

Usage

wgt_jaccard_distance(string_1, string_2, corpus, nthreads = 1)

Value

numeric vector with the Weighted Jaccard distances for each element of string_1 and string_2.

Arguments

string_1

character vector

string_2

character vector

corpus

corpus data.table, constructed with fedmatch::build_corpus

nthreads

number of threads to use in the underlying C++ code
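
Examples

A minimal usage sketch, assuming the fedmatch package is installed and that build_corpus takes the two name vectors as its arguments. The company names are made up for illustration, and no output is shown because the values depend on the corpus.

```r
library(fedmatch)

# Made-up example names; both vectors must have the same length.
names_1 <- c("acme widget corp", "first national bank")
names_2 <- c("acme widgets corporation", "first national bank of ohio")

# Build the corpus from both name vectors (assumed call signature).
corpus <- build_corpus(names_1, names_2)

# One value per pair: names_1[i] is compared with names_2[i].
dists <- wgt_jaccard_distance(names_1, names_2, corpus, nthreads = 1)
```

The result is compared element by element, so dists[1] refers to the first pair and dists[2] to the second.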

Details

See the vignette fuzzy_matching for details on how the Weighted Jaccard similarity is computed.
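
For orientation, here is a rough sketch of a generic weighted Jaccard similarity over word sets: the summed weight of the shared words divided by the summed weight of all words in either string. The helper weighted_jaccard_sim and the weights are hypothetical and are not part of fedmatch. The package's own computation, including how the weights are derived from the corpus, is described in the vignette and may differ.

```r
# Hypothetical helper, not part of fedmatch: generic weighted Jaccard
# similarity between two space-separated strings.
weighted_jaccard_sim <- function(a, b, weights) {
  words_a <- unique(strsplit(a, " ", fixed = TRUE)[[1]])
  words_b <- unique(strsplit(b, " ", fixed = TRUE)[[1]])
  shared <- intersect(words_a, words_b)
  all_words <- union(words_a, words_b)
  # Weight of the intersection over weight of the union.
  sum(weights[shared]) / sum(weights[all_words])
}

# Made-up word weights; rarer words would normally carry more weight.
weights <- c(acme = 2.0, corp = 0.5, holdings = 1.5)

sim <- weighted_jaccard_sim("acme corp", "acme holdings", weights)
# shared weight 2.0 over total weight 4.0, so sim is 0.5
```

Because "acme" carries most of the weight in this toy example, the two strings score higher than they would under an unweighted Jaccard measure, where one shared word out of three would give 1/3.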