lab2clean (version 2.0.0)

clean_lab_result: Clean and Standardize Laboratory Result Values

Description

This function cleans and standardizes laboratory result values. It adds the columns "clean_result", "scale_type", and "cleaning_comments" to the data without altering the original result values. It is part of lab2clean, an R package for cleaning laboratory datasets.

Usage

clean_lab_result(
  lab_data,
  raw_result,
  locale = "NO",
  report = TRUE,
  n_records = NA
)
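A minimal call might look like the sketch below. The toy data frame and its values are invented for illustration, and passing the column name as a string is an assumption about the interface rather than something stated on this page. The call is guarded so the snippet still runs when lab2clean is not installed.

```r
# Toy input: one raw result per row (values are made up for illustration).
lab_data <- data.frame(
  raw_result = c("1.5", "<0.5", "Positive", "12,5"),
  stringsAsFactors = FALSE
)

# Only call the package function if lab2clean is available.
if (requireNamespace("lab2clean", quietly = TRUE)) {
  cleaned <- lab2clean::clean_lab_result(
    lab_data,
    raw_result = "raw_result",  # assumption: column is given by name
    report = FALSE              # suppress the console report
  )
  # The original column is kept; the new columns are appended.
  print(names(cleaned))
}
```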

Value

A modified `lab_data` data frame with additional columns:

* `clean_result`: Cleaned and standardized result values.
* `scale_type`: The scale type of result values (Quantitative, Ordinal, Nominal).
* `cleaning_comments`: Comments about the cleaning process for each record.

Arguments

lab_data

A data frame containing laboratory data.

raw_result

The column in `lab_data` that contains raw result values to be cleaned.

locale

A string representing the locale for the laboratory data. Defaults to "NO".

report

A logical value. If TRUE, a report is written to the console. Defaults to TRUE.

n_records

If `lab_data` is a grouped list of distinct results rather than one row per record, set `n_records` to the column that contains the frequency of each distinct result. Defaults to NA.
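As an illustration of the `n_records` use case, the sketch below collapses a record-level column into distinct results with a frequency column using base R. The column name `frequency` and the sample values are hypothetical, and passing column names as strings is an assumption about the interface.

```r
# Record-level data: the same raw result can appear many times.
records <- data.frame(
  raw_result = c("Positive", "1.5", "Positive", "<0.5", "1.5", "Positive"),
  stringsAsFactors = FALSE
)

# Collapse to one row per distinct result, with its frequency.
counts <- table(records$raw_result)
distinct_results <- data.frame(
  raw_result = names(counts),
  frequency  = as.integer(counts),  # hypothetical column name
  stringsAsFactors = FALSE
)

# Pass the frequency column so the function knows how many records each row stands for.
if (requireNamespace("lab2clean", quietly = TRUE)) {
  cleaned <- lab2clean::clean_lab_result(
    distinct_results,
    raw_result = "raw_result",
    report = TRUE,
    n_records = "frequency"
  )
}
```

Working on distinct values keeps the input small when a dataset contains many repeated results.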

Author

Ahmed Zayed <ahmed.zayed@kuleuven.be>

Details

The function applies the following steps:

1. Clear Typos: Removes typographical errors and extraneous characters.
2. Handle Extra Variables: Identifies and separates extra variables from result values.
3. Detect and Assign Scale Types: Identifies and assigns the scale type using regular expressions.
4. Number Formatting: Standardizes number formats based on predefined rules and locale.
5. Mining Text Results: Identifies common words and patterns in text results.
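To make the scale-type and number-formatting steps more concrete, here is a simplified base R sketch. It is not the package's implementation: `classify_scale()`, `normalize_number()`, the regular expressions, and the `decimal_mark` argument are all hypothetical and cover only a few easy cases.

```r
# Hypothetical helper: guess a scale type from the shape of the value.
classify_scale <- function(x) {
  x <- trimws(x)
  # Optional comparator, optional sign, digits, optional decimal part.
  is_quant <- grepl("^[<>]?=?\\s*[-+]?[0-9]+([.,][0-9]+)?$", x, perl = TRUE)
  # A few graded or binary result words and "+" grades.
  is_ord <- grepl("^(positive|negative|pos|neg|[0-9]\\+|\\+{1,4})$", x,
                  ignore.case = TRUE, perl = TRUE)
  ifelse(is_quant, "Quantitative", ifelse(is_ord, "Ordinal", "Nominal"))
}

# Hypothetical helper: rewrite a number so that "." is the decimal mark.
normalize_number <- function(x, decimal_mark = ",") {
  x <- trimws(x)
  if (decimal_mark == ",") {
    x <- gsub(".", "", x, fixed = TRUE)  # drop thousands separators
    x <- sub(",", ".", x, fixed = TRUE)  # decimal comma becomes a point
  } else {
    x <- gsub(",", "", x, fixed = TRUE)  # drop thousands separators
  }
  x
}

scales <- classify_scale(c("12,5", "<0.5", "Positive", "A Rh+"))
numbers <- c(normalize_number("1.234,5", decimal_mark = ","),
             normalize_number("1,234.5", decimal_mark = "."))
```

The second helper shows why a locale matters: the same string "1,234" is read differently depending on which character is treated as the decimal mark.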

Internal Datasets: The function uses an internal dataset, `common_words_languages.csv`, which contains common words in various languages and is used to identify patterns in text result values.

See Also

Function 2 for result validation.