dm
into a wide tableGather all information of interest in one place in a wide table (on a database-dm
a
temporary table will be created).
If referential integrity is given among the tables of the data model, the resulting
table of this function will contain as many rows as the table start
does (exceptions are
join = anti_join
(result is empty table with same columns as start
) and join = right_join
(number of rows equal to or larger than those of start
)).
For more information please refer to vignette("dm-joining")
.
cdm_flatten_to_tbl(dm, start, ..., join = left_join)cdm_squash_to_tbl(dm, start, ..., join = left_join)
A dm
object
Table to start from. From this table all outgoing foreign key relations are considered to establish a processing order for the joins. An interesting choice could be for example a fact table in a star schema.
Unquoted table names to include in addition to start
. The order of the tables here determines
the order of the joins. If empty, all tables that can be reached are included.
If this includes tables which aren't direct neighbours of start
,
it will only work with cdm_squash_to_tbl()
(given one of the allowed join-methods).
tidyselect
is supported, cf. dplyr::select()
.
The type of join to be performed, see dplyr::join()
A single table, resulting of consecutively joining
all tables involved to table start
.
With the ...
left empty, this function joins all the tables of your dm
object together, that can be reached from table start
in the direction of the foreign
key relations (pointing from child table to parent table), using the foreign key relations to
determine the parameter by
for the necessary joins.
The result is one table with unique column names.
Use the ...
if you want to control which tables should be joined to table start
.
How does filtering affect the result?
Case 1, either no filter conditions are set in the dm
, or only in a part unconnected to
table start
:
The necessary disambiguations of the column names are performed first. Then all
involved foreign tables are joined to table start
successively with the join function given in
parameter join
.
Case 2, filter conditions are set for at least one table connected to start
:
Disambiguation is performed initially if necessary. Table start
is calculated using tbl(dm, "start")
. This implies
that the effect of the filters on this table is taken into account. For right_join
, full_join
and nest_join
an error
is thrown in case filters are set, because the filters won't affect right hand side tables and thus the result will be
incorrect in general (and calculating the effects on all RHS-tables would be time-consuming and is not supported;
if desired call cdm_apply_filters()
first to achieve this effect.).
For all other join types filtering only start
is enough, since the effect is passed on by the
successive joins.
Mind, that calling cdm_flatten_to_tbl()
with join = right_join
and no table order determined in the ...
would not lead to a well-defined result, if two or more foreign tables are to be joined to start
. The resulting
table would depend on the order the tables are listed in the dm
. Therefore trying this results
in a warning.
Since join = nest_join()
does not make sense in this direction (LHS = child table, RHS = parent table: for valid key constraints
each nested column entry would be a tibble of 1 row), an error is thrown, if this method is chosen.
Other flattening functions: cdm_join_to_tbl
# NOT RUN {
cdm_nycflights13() %>%
cdm_select_tbl(-weather) %>%
cdm_flatten_to_tbl(flights)
# }
Run the code above in your browser using DataLab