This file contains helper functions for text processing and cleaning used in the 'sumup' package. The functions cover typo correction, text cleaning, dataset preparation, and abbreviation replacement.
Cleans narrative text columns (sterk, verbeter, feedback) in a dataset.
text_clean(data, corrections_file)
A cleaned dataset with standardized text.
data: A data.table or data.frame containing text data.
corrections_file: Path to a CSV file containing typo corrections.
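As a rough illustration of the two inputs, the sketch below builds a small data.frame with the three narrative columns and writes a temporary corrections CSV. The CSV layout (columns named `typo` and `correction`) and the sample values are assumptions made for this example only; the layout that text_clean() actually expects is not specified here.

```r
# Hypothetical input data holding the three narrative columns.
narratives <- data.frame(
  sterk    = c("Good  teamwork", "Clear explanation"),
  verbeter = c("Show more initative", NA),
  feedback = c("<p>Well done</p>", "Keep practising"),
  stringsAsFactors = FALSE
)

# Assumed corrections file layout: one typo per row with its replacement.
corrections_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(typo = "initative", correction = "initiative"),
  corrections_path,
  row.names = FALSE
)

# Read the file back to confirm what was written.
corrections <- read.csv(corrections_path, stringsAsFactors = FALSE)

# If text_clean() is accessible in the session, the call would mirror
# the usage line:
# cleaned <- text_clean(narratives, corrections_path)
```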
Joyce M.W. Moonen - van Loon
This file defines several functions:
text_clean(): Cleans multiple columns in a dataset.
correct_text(): Applies typo corrections using a predefined correction list.
clean_text(): Cleans and formats input text, handling HTML tags, spacing, punctuation, and typos.
get_corrections(): Reads a correction file and processes typo replacements.
create_dataset_narratives(): Prepares a dataset by restructuring and normalizing its textual content.
replace_abbr(): Replaces abbreviations in the text.
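The kind of cleaning described for clean_text() and correct_text() can be sketched in base R as follows. This is not the package's implementation: `apply_corrections()`, `clean_example()`, and the `example_corrections` vector are hypothetical names introduced for illustration, and the order of the steps is an assumption.

```r
# Hypothetical typo list: names are typos, values are their corrections.
example_corrections <- c(initative = "initiative", recieve = "receive")

# Replace each listed typo as a whole word (case-sensitive in this sketch).
apply_corrections <- function(text, corrections) {
  for (typo in names(corrections)) {
    pattern <- paste0("\\b", typo, "\\b")
    text <- gsub(pattern, corrections[[typo]], text, perl = TRUE)
  }
  text
}

# Strip HTML tags, tidy punctuation and spacing, then fix typos.
clean_example <- function(text, corrections) {
  text <- gsub("<[^>]+>", " ", text)           # replace HTML tags with a space
  text <- gsub("\\s+([.,;:!?])", "\\1", text)  # remove space before punctuation
  text <- gsub("\\s+", " ", text)              # collapse repeated whitespace
  text <- trimws(text)                         # drop leading/trailing spaces
  apply_corrections(text, corrections)
}

result <- clean_example(
  "<p>Shows  initative , but should recieve feedback .</p>",
  example_corrections
)
print(result)
# "Shows initiative, but should receive feedback."
```

Applying the corrections last means typos are matched against text that has already been stripped of tags and stray spacing, which keeps the word-boundary patterns simple.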