Controls for text data used in the blocking function (if representation = shingles), passed to tokenize_character_shingles.
controls_txt(
n_shingles = 2L,
n_chunks = 10L,
lowercase = TRUE,
strip_non_alphanum = TRUE
)
Returns a list with parameters.
n_shingles: length of shingles (default 2L),
n_chunks: passed to (default 10L),
lowercase: should the characters be made lower-case? (default TRUE),
strip_non_alphanum: should punctuation and white space be stripped? (default TRUE).
Maciej Beręsewicz
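
Examples

The sketch below illustrates what n_shingles, lowercase and strip_non_alphanum control. The helper char_shingles is hypothetical and written in base R for illustration only; it is not part of the package and is not the tokenizer the package actually calls. The final call to controls_txt assumes the blocking package is installed and is skipped otherwise.

```r
# Hypothetical helper (an assumption, not the package's code): character
# shingles of a single string, mimicking the options described above.
char_shingles <- function(x, n = 2L, lowercase = TRUE,
                          strip_non_alphanum = TRUE) {
  if (lowercase) x <- tolower(x)
  # Drop punctuation and white space, keeping letters and digits only
  if (strip_non_alphanum) x <- gsub("[^[:alnum:]]", "", x)
  n_chars <- nchar(x)
  # A string shorter than the shingle length yields no shingles
  if (n_chars < n) return(character(0))
  starts <- seq_len(n_chars - n + 1L)
  substring(x, starts, starts + n - 1L)
}

# Shingles of length 2 after lower-casing and stripping the space
char_shingles("Jan Kowalski")

# Build a control list with non-default settings, if the package is available
if (requireNamespace("blocking", quietly = TRUE)) {
  ctrl <- blocking::controls_txt(n_shingles = 3L, lowercase = FALSE)
  str(ctrl)
}
```

Larger values of n_shingles give fewer, more specific tokens per string, while the defaults (length 2, lower-cased, stripped) are more tolerant of typos and formatting differences.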