clean_lab_result: Clean and Standardize Laboratory Result Values

Description

This function cleans and standardizes laboratory result values. It adds the columns "clean_result" and "scale_type", together with "cleaning_comments", without altering the original result values. The function is part of an R package for cleaning laboratory datasets.

Usage

clean_lab_result(
  lab_data,
  raw_result,
  locale = "NO",
  report = TRUE,
  n_records = NA
)

Value

A modified `lab_data` data frame with additional columns:

* `clean_result`: Cleaned and standardized result values.
* `scale_type`: The scale type of result values (Quantitative, Ordinal, Nominal).
* `cleaning_comments`: Comments about the cleaning process for each record.

Arguments

lab_data: A data frame containing laboratory data.
raw_result: The column in `lab_data` that contains raw result values to be cleaned.
locale: A string representing the locale for the laboratory data. Defaults to "NO".
report: Logical. If TRUE, a report is written to the console. Defaults to TRUE.
n_records: If `lab_data` is a grouped list of distinct results, set `n_records` to the column that contains the frequency of each distinct result. Defaults to NA.
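
To illustrate the `n_records` use case, the sketch below uses base R to collapse raw records into one row per distinct result with a frequency column. The data and the column names `raw_result` and `frequency` are made up for this example. Passing column names as strings in the final call is an assumption that this page does not confirm.

```r
# Hypothetical raw data: one row per laboratory record.
records <- data.frame(
  raw_result = c("5.2", "5.2", "Negative", "<0.5", "Negative", "5.2"),
  stringsAsFactors = FALSE
)

# Collapse to one row per distinct result and count how often each occurs.
counts <- table(records$raw_result)
distinct_results <- data.frame(
  raw_result = names(counts),
  frequency = as.integer(counts),
  stringsAsFactors = FALSE
)

# Call the function only if it is available in the session.
# Assumption: column names are passed as strings.
if (exists("clean_lab_result")) {
  cleaned <- clean_lab_result(
    distinct_results,
    raw_result = "raw_result",
    n_records = "frequency"
  )
}
```

Working on the grouped table keeps the input small, and the frequency column carries the number of records each distinct value stands for.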

Author

Ahmed Zayed <ahmed.zayed@kuleuven.be>

Details

The function applies the following steps:

1. Clear Typos: Removes typographical errors and extraneous characters.
2. Handle Extra Variables: Identifies and separates extra variables from result values.
3. Detect and Assign Scale Types: Identifies and assigns the scale type using regular expressions.
4. Number Formatting: Standardizes number formats based on predefined rules and locale.
5. Mining Text Results: Identifies common words and patterns in text results.
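
The idea behind steps 3 and 4 can be sketched in base R as follows. This is a minimal illustration only: the helper names `classify_scale` and `standardize_number`, the regular expressions, and the example values are hypothetical and are not the rules the package actually applies.

```r
# Illustrative only: these patterns are NOT the package's actual rules.
classify_scale <- function(x) {
  x <- trimws(x)
  # Optional comparison operator, then a number with "." or "," as decimal mark.
  is_quant <- grepl("^(<=|>=|<|>)?[[:space:]]*[0-9]+([.,][0-9]+)?$", x)
  # Graded results such as "+", "++", or "1+".
  is_ordinal <- grepl("^[0-9]?[+]+$", x)
  ifelse(is_quant, "Quantitative", ifelse(is_ordinal, "Ordinal", "Nominal"))
}

# Convert a decimal comma to a decimal point (first occurrence only).
standardize_number <- function(x) {
  sub(",", ".", trimws(x), fixed = TRUE)
}

scale_types <- classify_scale(c("5,2", "< 0.5", "++", "yellow"))
clean_number <- standardize_number(" 5,2 ")
```

Anchoring each pattern with `^` and `$` means a value is classified only when the whole string matches, so free text that merely contains a number falls through to the nominal category.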