dm (version 0.0.6.9000)

examine_cardinality: Test if the relation between two tables of a data model meets the requirements

Description

All check_cardinality_?_?() functions test the following conditions:

  1. Is pk_column a unique key for parent_table?

  2. Is the set of values in fk_column of child_table a subset of the set of values of pk_column?

  3. Does the relation between the two tables of the data model meet the cardinality requirements?

examine_cardinality() also checks the first two points and subsequently determines the type of cardinality.
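
The first two conditions can be illustrated without the package. The following is a minimal base-R sketch, not the dm implementation: the helper names is_unique_key() and is_subset() are hypothetical, and columns are passed as strings here, whereas the dm functions take unquoted column names.

```r
# Hypothetical helpers (not part of dm): base-R versions of conditions 1 and 2.

# Condition 1: the column contains no duplicated values.
is_unique_key <- function(parent_table, pk_column) {
  anyDuplicated(parent_table[[pk_column]]) == 0
}

# Condition 2: every foreign-key value also occurs in the parent key column.
is_subset <- function(child_table, fk_column, parent_table, pk_column) {
  all(child_table[[fk_column]] %in% parent_table[[pk_column]])
}

d1 <- data.frame(a = 1:5)
d2 <- data.frame(c = c(1:5, 5L))

is_unique_key(d1, "a")        # TRUE
is_unique_key(d2, "c")        # FALSE, the value 5 appears twice
is_subset(d2, "c", d1, "a")   # TRUE
```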

Usage

check_cardinality_0_n(parent_table, pk_column, child_table, fk_column)

check_cardinality_1_n(parent_table, pk_column, child_table, fk_column)

check_cardinality_1_1(parent_table, pk_column, child_table, fk_column)

check_cardinality_0_1(parent_table, pk_column, child_table, fk_column)

examine_cardinality(parent_table, pk_column, child_table, fk_column)

Arguments

parent_table

Data frame.

pk_column

Column of parent_table that has to be one of its unique keys.

child_table

Data frame.

fk_column

Column of child_table that has to be a foreign key to pk_column in parent_table.

Value

For check_cardinality_?_?(): Returns parent_table invisibly if the check passes, so that the call can be used in a pipe. Otherwise, an error is thrown that explains which condition failed.

For examine_cardinality(): Returns a character string describing the type of relationship between the two columns.

Details

All cardinality functions accept a parent table (data frame), a column name of this table, a child table, and a column name of the child table. The given column of the parent table has to be one of its unique keys (no duplicates are allowed). Furthermore, in all cases, the set of values of the child table's column has to be a subset of the set of values of the parent table's column.

The cardinality specifications 0_n, 1_n, 0_1, 1_1 refer to the expected relation that the child table has with the parent table. The numbers 0, 1 and n refer to the number of values in the column of the child table that correspond to each value of the column of the parent table. n means "more than one" in this context, with no upper limit.

0_n means that each value of the pk_column has at least 0 and at most n corresponding values in the column of the child table (which translates to no further restrictions).

1_n means that each value of the pk_column has at least 1 and at most n corresponding values in the column of the child table. This means that there is a "surjective" mapping from the child table to the parent table w.r.t. the specified columns, i.e. for each parent table column value there exists at least one equal child table column value.

0_1 means that each value of the pk_column has at least 0 and at most 1 corresponding value in the column of the child table. This means that there is an "injective" mapping from the child table to the parent table w.r.t. the specified columns, i.e. no parent table column value is addressed multiple times. However, not all of the parent table column values have to be referenced.

1_1 means that each value of the pk_column has exactly 1 corresponding value in the column of the child table. This means that there is a "bijective" ("injective" AND "surjective") mapping between the child table and the parent table w.r.t. the specified columns, i.e. the sets of values of the two columns are equal and there are no duplicates in either of them.
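
The four specifications can be read as bounds on a per-key match count. The sketch below is an illustration in base R under that reading, not the package's code: count_children() and satisfies() are hypothetical names, columns are passed as strings, and it assumes the unique-key and subset conditions already hold.

```r
# Hypothetical helper (not part of dm): for each parent key value, count the
# child rows that carry the same value. Assumes the parent column is a unique key.
count_children <- function(parent_table, pk_column, child_table, fk_column) {
  keys <- parent_table[[pk_column]]
  counts <- table(factor(child_table[[fk_column]], levels = keys))
  as.integer(counts)
}

# Which cardinality specifications do the counts satisfy?
satisfies <- function(counts) {
  c(
    "0_n" = TRUE,               # no further restriction
    "1_n" = all(counts >= 1),   # every parent value is referenced
    "0_1" = all(counts <= 1),   # no parent value is referenced twice
    "1_1" = all(counts == 1)    # both of the above
  )
}

d1 <- data.frame(a = 1:5)
d2 <- data.frame(c = c(1:5, 5L))
d3 <- data.frame(c = 1:4)

counts_d2 <- count_children(d1, "a", d2, "c")  # 1 1 1 1 2
counts_d3 <- count_children(d1, "a", d3, "c")  # 1 1 1 1 0

satisfies(counts_d2)  # 0_n and 1_n hold; 0_1 and 1_1 do not
satisfies(counts_d3)  # 0_n and 0_1 hold; 1_n and 1_1 do not
```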

Finally, examine_cardinality() tests for and returns the nature of the relationship (injective, surjective, bijective, or none of these) between the two given columns. If either pk_column is not a unique key of parent_table or the values of fk_column are not a subset of the values in pk_column, the requirements for a cardinality test are not fulfilled. No error will be thrown, but the result will state which prerequisite was violated.
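
The decision logic described above could look roughly like the following base-R sketch. It is an assumption-laden illustration, not the dm implementation: describe_relation() is a hypothetical name, columns are passed as strings, and the returned strings are invented for this example and need not match what examine_cardinality() actually returns.

```r
# Hypothetical sketch (not part of dm): classify the relation and report a
# violated prerequisite as a string instead of throwing an error.
describe_relation <- function(parent_table, pk_column, child_table, fk_column) {
  pk <- parent_table[[pk_column]]
  fk <- child_table[[fk_column]]

  # Prerequisites: unique key in the parent, subset relation for the child.
  if (anyDuplicated(pk) > 0) {
    return("prerequisite violated: pk_column is not a unique key")
  }
  if (!all(fk %in% pk)) {
    return("prerequisite violated: fk_column is not a subset of pk_column")
  }

  injective <- anyDuplicated(fk) == 0   # no parent value referenced twice
  surjective <- all(pk %in% fk)         # every parent value referenced

  if (injective && surjective) {
    "bijective"
  } else if (injective) {
    "injective"
  } else if (surjective) {
    "surjective"
  } else {
    "none of these"
  }
}

d1 <- data.frame(a = 1:5)
d2 <- data.frame(c = c(1:5, 5L))
d3 <- data.frame(c = 1:4)

describe_relation(d1, "a", d2, "c")  # "surjective"
describe_relation(d1, "a", d3, "c")  # "injective"
describe_relation(d2, "c", d1, "a")  # reports the unique-key violation
```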

Examples

# NOT RUN {
d1 <- tibble::tibble(a = 1:5)
d2 <- tibble::tibble(c = c(1:5, 5))
d3 <- tibble::tibble(c = 1:4)
# This does not pass, `c` is not a unique key of d2:
try(check_cardinality_0_n(d2, c, d1, a))

# This passes, multiple values in d2$c are allowed:
check_cardinality_0_n(d1, a, d2, c)

# This does not pass, injectivity is violated (the value 5 appears twice in d2$c):
try(check_cardinality_1_1(d1, a, d2, c))

# This passes:
check_cardinality_0_1(d1, a, d3, c)
# }