redcap_column_sanitize: Sanitize to adhere to REDCap character encoding requirements

Description

Replace non-ASCII characters with legal characters that won't cause problems when writing to a REDCap project.

Usage

redcap_column_sanitize(
  d,
  column_names = colnames(d),
  encoding_initial = "latin1",
  substitution_character = "?"
)

Value

A data frame with the same columns, but whose character values have been sanitized.

Arguments

d: The base::data.frame() or tibble::tibble() containing the dataset used to update the REDCap project. Required.
column_names: An array of character values indicating the names of the variables to sanitize. Optional.
encoding_initial: A character value naming the encoding that the values are assumed to be in before conversion. Defaults to "latin1". Optional.
substitution_character: The character value that replaces any character that could not be transliterated. Defaults to "?". Optional.
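The optional arguments can be combined to narrow what gets sanitized. The following is a minimal sketch that uses only the arguments listed in Usage; the data frame and the choice of "_" as the substitution character are illustrative assumptions, and the call is guarded so the snippet still runs where REDCapR is not installed.

```r
# Illustrative data; the Latin-1 bytes are written as escapes so the
# source stays ASCII.
dirty <- data.frame(
  id    = 1:3,
  names = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher"),
  stringsAsFactors = FALSE
)

if (requireNamespace("REDCapR", quietly = TRUE)) {
  # Sanitize only the `names` column, and mark anything that cannot be
  # transliterated with "_" instead of the default "?".
  clean <- REDCapR::redcap_column_sanitize(
    d                      = dirty,
    column_names           = "names",
    encoding_initial       = "latin1",
    substitution_character = "_"
  )
  print(clean)
}
```

Columns left out of column_names are expected to come back unchanged.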

Author

Will Beasley

Details

Letters like an accented 'A' are replaced with a plain 'A'.

This is a thin wrapper around base::iconv(). The ASCII//TRANSLIT option does the actual transliteration work. As of R 3.1.0, operating systems use similar, but not identical, versions of the underlying conversion routines, so the same input can be transliterated slightly differently from one platform to another. Keep this in mind if you notice OS-dependent differences.
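To make the wrapper's behavior concrete, here is a rough sketch of the kind of base::iconv() call described above, applied to a single character vector. It is an illustration under the assumption that the defaults shown in Usage are passed straight through; it is not the package's actual source.

```r
# Latin-1 bytes written as escapes so this snippet itself stays ASCII.
x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")

# Convert from the assumed initial encoding to ASCII with transliteration.
# `sub` is used for any input that cannot be converted.
clean <- base::iconv(
  x    = x,
  from = "latin1",
  to   = "ASCII//TRANSLIT",
  sub  = "?"
)

clean
```

The exact replacements depend on the platform's conversion library, so no expected output is shown here.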

Examples

# Typical examples are not shown because they require non-ASCII encoding,
#   which makes the package documentation less portable.

dirty <- data.frame(
  id     = 1:3,
  names  = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
)

REDCapR::redcap_column_sanitize(dirty)