Computes string distances between query keys and a string column in a materialized block. Optionally uses exact-match blocking on a second column (e.g., genus) to reduce the search space.
block_fuzzy_lookup(
  block,
  column,
  keys,
  method = "dl",
  max_dist = 0.2,
  block_col = NULL,
  block_keys = NULL,
  n_threads = 4L
)

Returns a data.frame with columns query_idx (1-based position in keys), fuzzy_dist (normalized distance), plus all columns from the block.
block: A vectra_block from materialize().
column: Character scalar. Name of the string column to fuzzy-match against.
keys: Character vector. Query strings to match.
method: Character. Distance method: "dl" (Damerau-Levenshtein, default), "levenshtein", or "jw" (Jaro-Winkler).
max_dist: Numeric. Maximum normalized distance (default 0.2).
block_col: Optional character scalar. Column name for exact-match blocking (e.g., genus). When provided, only rows where block_col matches the corresponding block_keys value are compared.
block_keys: Optional character vector (same length as keys). Exact-match values for blocking. Required when block_col is provided.
n_threads: Integer. Number of OpenMP threads (default 4L).
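
The blocking and thresholding behaviour described above can be illustrated with a small base-R sketch. This is not the package's implementation: sketch_blocked_lookup and the species data frame are hypothetical names, plain Levenshtein distance from utils::adist() stands in for the "dl" method, and normalizing by the length of the longer string is an assumption about how the normalized distance is defined.

```r
# Illustrative base-R sketch only; NOT how block_fuzzy_lookup() is implemented.
# Assumptions: Levenshtein (utils::adist) replaces "dl", and distances are
# normalized by the length of the longer of the two strings.
sketch_blocked_lookup <- function(df, column, keys, max_dist = 0.2,
                                  block_col = NULL, block_keys = NULL) {
  if (!is.null(block_col)) {
    stopifnot(!is.null(block_keys), length(block_keys) == length(keys))
  }
  hits <- lapply(seq_along(keys), function(i) {
    # Exact-match blocking: only rows whose block_col equals block_keys[i]
    rows <- if (is.null(block_col)) {
      seq_len(nrow(df))
    } else {
      which(df[[block_col]] == block_keys[i])
    }
    if (length(rows) == 0L) return(NULL)
    candidates <- df[[column]][rows]
    raw <- as.vector(utils::adist(keys[i], candidates))
    norm <- raw / pmax(nchar(keys[i]), nchar(candidates))
    keep <- norm <= max_dist
    if (!any(keep)) return(NULL)
    # One output row per surviving match: query index, distance, source row
    cbind(
      data.frame(query_idx = i, fuzzy_dist = norm[keep]),
      df[rows[keep], , drop = FALSE]
    )
  })
  out <- do.call(rbind, hits)
  if (!is.null(out)) rownames(out) <- NULL
  out
}

# Hypothetical reference table and misspelled queries
species <- data.frame(
  genus = c("Quercus", "Quercus", "Pinus", "Pinus"),
  name  = c("Quercus robur", "Quercus rubra", "Pinus sylvestris", "Pinus nigra"),
  stringsAsFactors = FALSE
)
res <- sketch_blocked_lookup(
  species, "name",
  keys       = c("Quercus robar", "Pinus silvestris"),
  block_col  = "genus",
  block_keys = c("Quercus", "Pinus")
)
print(res)
```

In this sketch each query is compared only against rows of its own genus, so the number of distance computations scales with the block size rather than with the full table. Under the assumed normalization, a max_dist of 0.2 admits a single-character typo in a 13-character name (1/13) and rejects a three-edit difference (3/13).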