blocking (version 1.0.1)

controls_txt: Controls for processing character data

Description

Controls for processing text data in the blocking function (used when representation = "shingles"). The values are passed to tokenize_character_shingles.

Usage

controls_txt(
  n_shingles = 2L,
  n_chunks = 10L,
  lowercase = TRUE,
  strip_non_alphanum = TRUE
)

Value

Returns a list with parameters.
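
A minimal sketch of calling the function, assuming the blocking package is installed. Only the arguments listed under Usage are used. The element names of the returned list are not documented here, so the sketch inspects the result with str() rather than assuming them.

```r
# Sketch only: runs the calls when the blocking package is available.
if (requireNamespace("blocking", quietly = TRUE)) {
  # Defaults from the Usage section: n_shingles = 2L, n_chunks = 10L,
  # lowercase = TRUE, strip_non_alphanum = TRUE.
  txt_defaults <- blocking::controls_txt()

  # Override two settings: 3-character shingles, original case kept.
  txt_custom <- blocking::controls_txt(n_shingles = 3L, lowercase = FALSE)

  # Inspect the returned list of parameters.
  str(txt_custom)
}
```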

Arguments

n_shingles

length of shingles (default 2L),

n_chunks

passed to (default 10L),

lowercase

should the characters be made lower-case? (default TRUE),

strip_non_alphanum

should punctuation and white space be stripped? (default TRUE).
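
To illustrate what n_shingles, lowercase and strip_non_alphanum control, the base R sketch below builds character shingles for a single string. The helper char_shingles is hypothetical and written only for this illustration. It is not the code used by blocking or by tokenize_character_shingles, and its handling of edge cases may differ from theirs.

```r
# Hypothetical helper for illustration; takes a single string x.
char_shingles <- function(x, n = 2L, lowercase = TRUE,
                          strip_non_alphanum = TRUE) {
  # Optionally convert to lower case.
  if (lowercase) x <- tolower(x)
  # Optionally drop everything that is not a letter or a digit
  # (punctuation and white space included).
  if (strip_non_alphanum) x <- gsub("[^[:alnum:]]", "", x)
  n_char <- nchar(x)
  # A string shorter than n has no shingles of length n.
  if (n_char < n) return(character(0))
  # Slide a window of width n over the string, one character at a time.
  starts <- seq_len(n_char - n + 1L)
  substring(x, starts, starts + n - 1L)
}

char_shingles("Al-Ice")
# "al" "li" "ic" "ce"

char_shingles("Al-Ice", lowercase = FALSE, strip_non_alphanum = FALSE)
# "Al" "l-" "-I" "Ic" "ce"
```

With the defaults, "Al-Ice" and "alice" produce the same shingles, which is why lower-casing and stripping are usually left on when comparing noisy text.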

Author

Maciej Beręsewicz