
sumup (version 1.0.1)

text_clean: Text Cleaning and Processing Functions

Description

This file contains the text-processing and cleaning helpers used in the 'sumup' package. They cover typo correction, text cleaning, dataset preparation, and abbreviation replacement.

text_clean() cleans the narrative text columns (sterk, verbeter, feedback) in a dataset. The first two column names are Dutch for "strong" and "improve".

Usage

text_clean(data, corrections_file)

Value

A cleaned dataset with standardized text.

Arguments

data

A data.table or data.frame containing text data.

corrections_file

Path to a CSV file containing typo corrections.
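
As a rough illustration of the two arguments, the sketch below builds a small data.frame with the three narrative columns and writes a corrections CSV to a temporary file. The CSV column names typo and correction, and all sample rows, are assumptions made for this example; the layout the package actually expects is not documented on this page.

```r
# Hypothetical input data with the three narrative columns named in the Description.
data <- data.frame(
  sterk    = c("Good comunication with patients.", "Works independently."),
  verbeter = c("Take more time for history taking.", NA),
  feedback = c("<p>Fine  handover</p>", "Clear explanation."),
  stringsAsFactors = FALSE
)

# Assumed layout of the corrections file: one typo and its replacement per row.
corrections <- data.frame(
  typo       = "comunication",
  correction = "communication",
  stringsAsFactors = FALSE
)
corrections_file <- tempfile(fileext = ".csv")
write.csv(corrections, corrections_file, row.names = FALSE)

# With the package loaded, the call from the Usage section would then be:
# cleaned <- text_clean(data, corrections_file)
```

The call itself is left as a comment because it depends on the installed package and on the real corrections file format.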

Author

Joyce M.W. Moonen - van Loon

This file defines several functions:

  • text_clean(): Cleans multiple columns in a dataset.

  • correct_text(): Applies typo corrections using a predefined correction list.

  • clean_text(): Cleans and formats input text, handling HTML tags, spacing, punctuation, and typos.

  • get_corrections(): Reads a correction file and processes typo replacements.

  • create_dataset_narratives(): Prepares a dataset by restructuring and normalizing its textual content.

  • replace_abbr(): Corrects abbreviations.
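
To make the division of labour between these functions more concrete, here is a minimal base R sketch of how such a pipeline could fit together. It is not the package's implementation: every function carrying the _sketch suffix is a hypothetical stand-in, the corrections CSV is assumed to have the columns typo and correction, and the cleaning rules shown (tag removal, whitespace and punctuation spacing) are only a guess at what "handling HTML tags, spacing, punctuation" might involve.

```r
# Hypothetical stand-ins for the helpers listed above; not the package's code.

# Read a corrections CSV (assumed columns: typo, correction) into a named vector.
get_corrections_sketch <- function(path) {
  tab <- read.csv(path, stringsAsFactors = FALSE)
  setNames(tab$correction, tab$typo)
}

# Replace each typo with its correction, matching whole words only.
# Assumes typos are plain words without regex metacharacters.
correct_text_sketch <- function(text, corrections) {
  for (typo in names(corrections)) {
    pattern <- paste0("\\b", typo, "\\b")
    text <- gsub(pattern, corrections[[typo]], text,
                 ignore.case = TRUE, perl = TRUE)
  }
  text
}

# Strip HTML tags, normalize spacing, then apply typo corrections.
clean_text_sketch <- function(text, corrections = character(0)) {
  text <- gsub("<[^>]+>", " ", text)           # replace HTML tags with a space
  text <- gsub("\\s+", " ", text)              # collapse runs of whitespace
  text <- gsub("\\s+([.,;:!?])", "\\1", text)  # drop space before punctuation
  text <- trimws(text)
  correct_text_sketch(text, corrections)
}

# Apply the cleaning to each narrative column that is present in the data.
text_clean_sketch <- function(data, corrections_file,
                              cols = c("sterk", "verbeter", "feedback")) {
  corrections <- get_corrections_sketch(corrections_file)
  for (col in intersect(cols, names(data))) {
    data[[col]] <- clean_text_sketch(as.character(data[[col]]), corrections)
  }
  data
}

# Small usage example with made-up data.
corrections_file <- tempfile(fileext = ".csv")
write.csv(data.frame(typo = "comunication", correction = "communication"),
          corrections_file, row.names = FALSE)
example <- data.frame(
  sterk    = "<p>Good  comunication with patients .</p>",
  verbeter = NA_character_,
  feedback = "Clear explanation",
  stringsAsFactors = FALSE
)
cleaned <- text_clean_sketch(example, corrections_file)
cleaned$sterk
```

Missing values pass through unchanged in this sketch, since gsub() and trimws() both return NA for NA input. Abbreviation replacement and dataset restructuring are left out because this page gives no detail on how replace_abbr() or create_dataset_narratives() behave.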