DQA (version 0.1.0)

currency_check: Perform Currency Check for Data Frame Columns

Description

This function evaluates a source data frame (`S_data`) against a set of rules defined in a metadata data frame (`M_data`). It checks currency rules (temporal conditions) on the columns of a data frame, based on metadata specifications. It supports flexible rule definition, date literal handling, and customizable output.

Usage

currency_check(
  S_data,
  M_data,
  Result = FALSE,
  show_column = NULL,
  date_parser_fun = smart_to_gregorian_vec,
  var_select = "all",
  verbose = FALSE
)

Value

If Result = FALSE: a data.frame summary with columns:

  • VARIABLE: Name of the variable/rule.

  • Condition_Met: Number of rows where the rule is TRUE.

  • Condition_Not_Met: Number of rows where the rule is FALSE.

  • NA_Count: Number of rows with missing/indeterminate result.

  • Total_Applicable: Number of non-NA rows.

  • Total_Rows: Number of total rows.

  • Percent_Met: Percentage of applicable rows meeting the condition.

  • Percent_Not_Met: Percentage of applicable rows not meeting the condition.

  • Currency_Error_Type: Error type from metadata (if available).

If Result = TRUE: a data.frame with one column per rule (variable), each containing logical values for every row, plus optional columns from the source data.
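
The relationship between the summary columns can be illustrated with a short sketch in base R. The vector `rule_result` and the variable name are hypothetical stand-ins for one rule's row-by-row outcome; the arithmetic follows the column descriptions above and is not taken from the package source, so details such as rounding may differ.

```r
# Hypothetical row-by-row outcome for one rule: TRUE, FALSE, or NA per row.
rule_result <- c(TRUE, FALSE, TRUE, NA, TRUE, NA)

met        <- sum(rule_result, na.rm = TRUE)    # rows where the rule is TRUE
not_met    <- sum(!rule_result, na.rm = TRUE)   # rows where the rule is FALSE
na_count   <- sum(is.na(rule_result))           # missing/indeterminate rows
applicable <- met + not_met                     # non-NA rows
total_rows <- length(rule_result)               # all rows

summary_row <- data.frame(
  VARIABLE          = "Test_date",              # illustrative name only
  Condition_Met     = met,
  Condition_Not_Met = not_met,
  NA_Count          = na_count,
  Total_Applicable  = applicable,
  Total_Rows        = total_rows,
  Percent_Met       = 100 * met / applicable,
  Percent_Not_Met   = 100 * not_met / applicable,
  stringsAsFactors  = FALSE
)
print(summary_row)
```

Note that the percentages use Total_Applicable, not Total_Rows, as the denominator, so rows with NA results do not lower Percent_Met.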

Arguments

S_data

data.frame. The source data in which rules will be evaluated. Each column may be referenced by the rules.

M_data

data.frame. Metadata describing variables and their currency rules. Must include columns VARIABLE, Currency_Rule and TYPE. Optionally includes Currency_Error_Type.

Result

logical (default: FALSE). If TRUE, returns row-by-row evaluation results for each rule. If FALSE, returns a summary table for each rule.

show_column

character vector (default: NULL). Names of columns from S_data to include in the result when Result = TRUE. Ignored otherwise.

date_parser_fun

function (default: smart_to_gregorian_vec). Function used to convert date values or date literals to Date class; the default also converts Persian dates to Gregorian. Must accept character vectors and return Date objects.

var_select

character, numeric, or "all" (default: "all"). Subset of variables (rules) to check. Can be a character vector of variable names, numeric vector of row indices in M_data, or "all" to run all rules.

verbose

logical (default: FALSE). If TRUE, prints diagnostic messages during rule processing and evaluation.
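
As an illustration of the contract for date_parser_fun (character vector in, Date vector out), here is a minimal sketch of a replacement parser. The name `iso_date_parser` is hypothetical and not part of DQA. Unlike the default, it performs no Persian-to-Gregorian conversion: it only reads strings of the form YYYY-MM-DD as written, so a Persian date such as "1404-06-28" would not be converted.

```r
# Hypothetical parser (not part of DQA): accepts a character vector and
# returns a Date vector. Values that do not match YYYY-MM-DD become NA.
iso_date_parser <- function(x) {
  as.Date(as.character(x), format = "%Y-%m-%d")
}

parsed <- iso_date_parser(c("2021-01-10", "not a date", NA))
print(parsed)

# Assumed usage, based on the signature shown under Usage:
# currency_check(S_data, M_data, date_parser_fun = iso_date_parser)
```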

Details

The metadata data.frame (M_data) uses the following columns. VARIABLE, Currency_Rule, and TYPE are required; Currency_Error_Type is optional:

  • VARIABLE: Name of the variable in S_data to which the rule applies.

  • Currency_Rule: A logical rule provided as a string that defines temporal (date/time) conditions to be evaluated.

  • TYPE: Specifies the type of the variable (e.g., "numeric", "date", "character").

  • Currency_Error_Type (optional): The error type for each rule, reported in the summary output. It reflects the importance and severity of the rule and takes one of two values: "Warning" or "Error".
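
Before calling the function, it can help to confirm that the metadata has the required columns. The check below is a sketch in base R, not something the package is documented to expose; `required_cols` and `missing_cols` are illustrative names.

```r
# Small metadata table with the three required columns.
M_data <- data.frame(
  VARIABLE      = c("VisitDate", "Test_date"),
  Currency_Rule = c("", "VisitDate <= Test_date"),
  TYPE          = c("date", "date"),
  stringsAsFactors = FALSE
)

# Columns the Details section lists as required.
required_cols <- c("VARIABLE", "Currency_Rule", "TYPE")
missing_cols  <- setdiff(required_cols, names(M_data))

if (length(missing_cols) > 0) {
  stop("M_data is missing required columns: ",
       paste(missing_cols, collapse = ", "))
}

# Currency_Error_Type is optional, so only report whether it is present.
has_error_type <- "Currency_Error_Type" %in% names(M_data)
```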

For each variable described in M_data, the function:

  • Preprocesses the rule: replaces 'val' with the variable name, parses date literals and substitutes them with placeholders.

  • Converts referenced data columns to appropriate types (numeric, date) based on metadata.

  • Evaluates the rule for each row, either vectorized or row-wise if needed.
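
The steps above can be sketched in base R as follows. This is an illustration of the general technique, not the package's internal code: the quoted 'YYYY-MM-DD' literal syntax, the placeholder naming scheme, and the use of as.Date in place of date_parser_fun are all assumptions made for the example.

```r
# Illustrative data and rule (hypothetical values).
df <- data.frame(
  VisitDate = c("2021-01-10", "2021-03-05", NA),
  stringsAsFactors = FALSE
)
variable <- "VisitDate"
rule     <- "val >= '2021-02-01'"

# Step 1: replace the 'val' token with the variable name.
expr_txt <- gsub("\\bval\\b", variable, rule, perl = TRUE)

# Step 2: find quoted date literals and swap each for a placeholder
# bound to a Date value in a dedicated evaluation environment.
literal_pattern <- "'[0-9]{4}-[0-9]{2}-[0-9]{2}'"
literals <- regmatches(expr_txt, gregexpr(literal_pattern, expr_txt))[[1]]
env <- new.env()
for (i in seq_along(literals)) {
  placeholder <- paste0(".date_literal_", i)
  assign(placeholder, as.Date(gsub("'", "", literals[i])), envir = env)
  expr_txt <- sub(literals[i], placeholder, expr_txt, fixed = TRUE)
}

# Step 3: convert the referenced column to Date, as TYPE = "date" requests.
assign(variable, as.Date(df[[variable]]), envir = env)

# Step 4: evaluate the rewritten rule over the whole column at once.
rule_result <- eval(parse(text = expr_txt), envir = env)
print(rule_result)
```

Rows where the column is NA yield NA rather than FALSE, which is what feeds the NA_Count column in the summary output.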

If Result = FALSE, returns a summary table with counts and percentages of rows meeting/not meeting each condition. If Result = TRUE, returns a data.frame with boolean results for each rule, optionally including selected columns from the source data.

Examples

# Source data
S_data <- data.frame(
  VisitDate = c("2025-09-23", "2021-01-10", "2021-01-03", "1404-06-28", "1404-07-28", NA),
  Test_date = c("1404-07-01", "2021-01-09", "2021-01-14", "1404-06-29", "2025-09-19", NA)
)

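# Metadata
# VARIABLE, TYPE, and Currency_Error_Type have length 2 and are recycled by
# data.frame() to match the four entries of Currency_Rule.
M_data <- data.frame(
  VARIABLE = c("VisitDate", "Test_date"),
  Currency_Rule = c(
    "",
    "VisitDate<=Test_date",
    " Test_date-VisitDate <10 ",
    ""),
  TYPE = c("date", "date"),
  Currency_Error_Type = c("Error", "Warning"),
  stringsAsFactors = FALSE
)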

result <- currency_check(
  S_data = S_data,
  M_data = M_data,
  Result = TRUE,
  show_column = NULL
)

print(result)

result <- currency_check(
  S_data = S_data,
  M_data = M_data,
  Result = FALSE,
  var_select = c("VisitDate","Test_date")
)

print(result)