dittodb (version 0.1.8)

redact_columns: Redact columns from a dataframe with the default redactors

Description

This function redacts the columns named in columns from the dataframe given in data, using dittodb's standard redactors.

Usage

redact_columns(data, columns, ignore.case = TRUE, ...)

Value

data, with the columns specified in columns duly redacted

Arguments

data

a dataframe to redact

columns

character, the columns to redact

ignore.case

should case be ignored? (default: TRUE)

...

additional options to pass on to grep() when matching the column names

Details

The column names given in the columns argument are treated as regular expressions, but ^ and $ are always added to the beginning and end of each string. So if you would like to match any column that starts with the string sensitive (e.g. sensitive_name, sensitive_date), you could use "sensitive.*", and this would catch all of those columns (though it would not catch a column called most_sensitive_name).
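The anchoring behaviour described above can be sketched in base R. This is only an illustration, not dittodb's actual implementation: match_columns is a hypothetical helper, and the column names are made up for the example.

```r
# Hypothetical helper (not part of dittodb): wraps each pattern in ^ and $
# and returns the column names that match at least one pattern.
match_columns <- function(col_names, patterns, ignore.case = TRUE, ...) {
  anchored <- paste0("^", patterns, "$")
  hits <- lapply(
    anchored, grep,
    x = col_names, ignore.case = ignore.case, value = TRUE, ...
  )
  unique(unlist(hits))
}

# Made-up column names for illustration
cols <- c("sensitive_name", "sensitive_date", "most_sensitive_name", "Sensitive_ID")

# "sensitive.*" becomes "^sensitive.*$", so most_sensitive_name is not matched
matched <- match_columns(cols, "sensitive.*")

# Without the trailing ".*", only an exact (case-insensitive) name would match
exact <- match_columns(cols, "sensitive")
```

Because the pattern is anchored at both ends, a leading ".*" is needed to match columns that merely contain the string, e.g. ".*sensitive.*".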

The standard redactors replace all values in the column with the following values based on the column's type:

  • integer -- 9L

  • numeric -- 9

  • character -- "[redacted]"

  • POSIXct (date times) -- as.POSIXct("1988-10-11T17:00:00", tz = tzone)
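The type-based replacement listed above can be approximated in base R as follows. This is a minimal sketch under stated assumptions, not dittodb's source: redact_value is a hypothetical helper, the timezone is assumed to come from the column's tzone attribute, and columns of any other type are assumed to pass through unchanged.

```r
# Hypothetical helper (not part of dittodb): replaces every value in a vector
# with a fixed placeholder chosen by the vector's type.
redact_value <- function(x) {
  if (inherits(x, "POSIXct")) {
    # Assumption: reuse the column's own timezone, falling back to the local one
    tz <- attr(x, "tzone")
    if (is.null(tz)) tz <- ""
    rep(as.POSIXct("1988-10-11 17:00:00", tz = tz[1]), length(x))
  } else if (is.integer(x)) {
    rep(9L, length(x))
  } else if (is.numeric(x)) {
    rep(9, length(x))
  } else if (is.character(x)) {
    rep("[redacted]", length(x))
  } else {
    x  # assumption: other types are left untouched in this sketch
  }
}

# Made-up data for illustration
df <- data.frame(
  id = 1:3,
  delay = c(1.5, 2, 3),
  origin = c("EWR", "JFK", "LGA"),
  stringsAsFactors = FALSE
)
df$when <- as.POSIXct(c("2013-01-01 05:00:00", "2013-01-01 06:00:00",
                        "2013-01-01 07:00:00"), tz = "UTC")

# Redact every column, keeping the dataframe's shape
redacted <- df
redacted[] <- lapply(df, redact_value)
```

The POSIXct check comes first so that date-time columns are never handled by the numeric branch.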

Examples

if (check_for_pkg("nycflights13", message)) {
  small_flights <- head(nycflights13::flights)

  # with no columns specified, redacting does nothing
  redact_columns(small_flights, columns = NULL)

  # integer
  redact_columns(small_flights, columns = c("arr_time"))

  # numeric
  redact_columns(small_flights, columns = c("arr_delay"))

  # characters
  redact_columns(small_flights, columns = c("origin", "dest"))

  # datetimes
  redact_columns(small_flights, columns = c("time_hour"))
}
