A dataset containing regular expressions meant to match occupations that are commonly misread by OCR in directory entries. For each occupation, a replacement pattern is provided for use in substitution operations, together with a boolean flag indicating whether the corresponding regex is case sensitive.
globals_occupations
A data frame with 3 variables:
regex for occupation matching
replacement pattern for substitution operations
boolean flag indicating whether the corresponding regex is case sensitive
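The way the three variables work together in a substitution pass can be sketched as follows. This is a minimal illustration in Python, not the package's own code: the function `clean_occupation`, the field order (pattern, replacement, case-sensitivity flag), and the example rows are all hypothetical and are not taken from the dataset. It also assumes that a true flag means the regex is applied case-sensitively.

```python
import re

# Hypothetical rows mirroring the three variables described above:
# (regex, replacement, case_sensitive). Illustrative values only,
# not copied from the actual dataset.
OCCUPATION_RULES = [
    (r"\bba[kh]er\b", "baker", False),   # "baher" is a plausible OCR misread of "baker"
    (r"\bgr[o0]cer\b", "grocer", False), # digit zero read in place of the letter "o"
    (r"\bMD\b", "physician", True),      # only the upper-case abbreviation should match
]


def clean_occupation(text, rules=OCCUPATION_RULES):
    """Apply each (regex, replacement, case_sensitive) rule to text, in order."""
    for pattern, replacement, case_sensitive in rules:
        # Assumption: a true flag means the match is case sensitive.
        flags = 0 if case_sensitive else re.IGNORECASE
        text = re.sub(pattern, replacement, text, flags=flags)
    return text


print(clean_occupation("Smith John, Baher"))
print(clean_occupation("Brown Ann, gr0cer"))
```

Applying the rules in row order means an earlier replacement can feed a later pattern, so the order of rows would matter in a real table.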