pointblank (version 0.5.2)

get_sundered_data: Sunder the data, splitting it into 'pass' and 'fail' pieces

Description

Validating data is one thing but, sometimes, you want to use the best part of the input dataset for something else. The get_sundered_data() function works with an agent object that has intel (i.e., post interrogate()) and returns either the 'pass' data piece (rows with no failing test units across all row-based validation functions) or the 'fail' data piece (rows with at least one failing test unit across the same series of validations). There are two caveats. First, only those validation steps with no preconditions are considered. Second, the validation steps used for this splitting must be of the row-based variety (e.g., the col_vals_*() functions or conjointly()).

Usage

get_sundered_data(
  agent,
  type = c("pass", "fail", "combined"),
  pass_fail = c("pass", "fail"),
  id_cols = NULL
)

Arguments

agent

An agent object of class ptblank_agent. It should have had interrogate() called on it, such that the validation steps were actually carried out.

type

The desired piece of data resulting from the splitting. Options for returning a single table are "pass" (the default) and "fail". Each of these options returns a single table: in the "pass" case, only the rows that passed across all validation steps (i.e., had no failing test units in any part of a row for any validation step); in the "fail" case, the complementary set of rows. Providing NULL returns both of the split data tables in a list (with the names "pass" and "fail"). The option "combined" applies a categorical (pass/fail) label (settable in the pass_fail argument) in a new .pb_combined flag column. In this case, the ordering of rows is fully retained from the input table.

pass_fail

A two-element vector for encoding the flag column with 'pass' and 'fail' values (in that order) when type = "combined". The default is c("pass", "fail") but other options could be c(TRUE, FALSE), c(1, 0), or c(1L, 0L).

id_cols

An optional specification of one or more identifying columns. Taken together, this single column or grouping of columns should be enough to distinguish rows from one another.

Value

A list of table objects if type is NULL, or a single table if a type is given.
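
As a rough illustration of the two return shapes described above, the following sketch requests both pieces at once with type = NULL. It assumes the pointblank package is installed and uses its bundled small_table dataset; the object names agent and pieces are chosen here for illustration only.

```r
library(pointblank)

# Build and interrogate an agent with two row-based validation steps
agent <-
  create_agent(tbl = small_table) %>%
  col_vals_gt(vars(d), 100) %>%
  col_vals_between(
    vars(c), left = vars(a), right = vars(d),
    na_pass = TRUE
  ) %>%
  interrogate()

# With `type = NULL`, both pieces come back in a named list
pieces <- get_sundered_data(agent, type = NULL)
names(pieces)

# Every input row should land in exactly one of the two pieces
nrow(pieces$pass) + nrow(pieces$fail) == nrow(small_table)
```

Each list element can then be handled as an ordinary table, for example by passing pieces$pass on to further analysis and inspecting pieces$fail separately.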

Function ID

5-4

See Also

Other Post-interrogation: all_passed(), get_agent_report(), get_agent_x_list(), get_data_extracts()

Examples

# NOT RUN {
# Create a series of three validation
# steps focused on testing row values
# in the `small_table` tibble object,
# then `interrogate()` immediately
agent <-
  create_agent(tbl = small_table) %>%
  col_vals_gt(vars(d), 100) %>%
  col_vals_equal(
    vars(d), vars(d),
    na_pass = TRUE
  ) %>%
  col_vals_between(
    vars(c), left = vars(a), right = vars(d),
    na_pass = TRUE
  ) %>%
  interrogate()

# Get the sundered data piece that
# contains only rows that passed all
# validation steps (the default piece)
agent %>% get_sundered_data()

# }
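
The example above retrieves only the default 'pass' piece. A short sketch of the other type options described in the Arguments section might look like the following. It assumes the pointblank package is installed and reuses the same three validation steps on small_table; the names failed_rows and flagged are illustrative only.

```r
library(pointblank)

# Same three row-based validation steps as in the example above
agent <-
  create_agent(tbl = small_table) %>%
  col_vals_gt(vars(d), 100) %>%
  col_vals_equal(
    vars(d), vars(d),
    na_pass = TRUE
  ) %>%
  col_vals_between(
    vars(c), left = vars(a), right = vars(d),
    na_pass = TRUE
  ) %>%
  interrogate()

# Get the complementary piece: rows with at
# least one failing test unit
failed_rows <- get_sundered_data(agent, type = "fail")

# Keep every input row and add the `.pb_combined`
# flag column, encoded here as TRUE (pass) and
# FALSE (fail) instead of the default labels
flagged <-
  get_sundered_data(
    agent,
    type = "combined",
    pass_fail = c(TRUE, FALSE)
  )
```

The "combined" form is convenient when the original row order matters, since the flag column can be used for filtering later without splitting the table first.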
