dm

TL;DR

Are you using multiple data frames or database tables in R? Organize them with dm.

  • Use it for data analysis today.
  • Build data models tomorrow.
  • Deploy the data models to your organization’s RDBMS the day after.

Overview

dm bridges the gap in the data pipeline between individual data frames and relational databases. It’s a grammar of joined tables that provides a consistent set of verbs for consuming, creating, and deploying relational data models. For individual researchers, it broadens the scope of datasets they can work with and how they work with them. For organizations, it enables teams to quickly and efficiently create and share large, complex datasets.

dm objects encapsulate relational data models constructed from local data frames or lazy tables connected to an RDBMS. dm objects support the full suite of dplyr data manipulation verbs along with additional methods for constructing and verifying relational data models, including key selection, key creation, and rigorous constraint checking. Once a data model is complete, dm provides methods for deploying it to an RDBMS. This allows it to scale from datasets that fit in memory to databases with billions of rows.
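As a minimal sketch of the workflow described above, the following builds a dm from two local data frames, declares keys, and checks the constraints. The tables `customers` and `orders`, their columns, and the object name `shop_dm` are made up for illustration and are not part of the package.

```r
library(dm)

# Two small, made-up data frames (hypothetical example data)
customers <- data.frame(
  customer_id = c(1L, 2L, 3L),
  name = c("Ada", "Ben", "Cleo")
)
orders <- data.frame(
  order_id = c(10L, 11L, 12L),
  customer_id = c(1L, 1L, 3L),
  amount = c(25.0, 40.5, 12.0)
)

# Combine the data frames into a dm object
shop_dm <- dm(customers = customers, orders = orders)

# Declare the primary key of each table
shop_dm <- dm_add_pk(shop_dm, customers, customer_id)
shop_dm <- dm_add_pk(shop_dm, orders, order_id)

# Declare the foreign key from orders to customers
shop_dm <- dm_add_fk(shop_dm, orders, customer_id, customers)

# Check whether the declared keys hold in the data
constraint_report <- dm_examine_constraints(shop_dm)
constraint_report
```

Deploying to a database is not shown here; `copy_dm_to()` in the function index below is the entry point for that step.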

Features

dm makes it easy to bring an existing relational data model into your R session. Because a dm object behaves like a named list of tables, it takes little change to fit into existing workflows. The dm interface and behavior are modeled after dplyr, so you may already be familiar with many of its verbs. dm also offers:

  • visualization to help you understand relationships between entities represented by the tables
  • simpler joins that “know” how tables are related, including a “flatten” operation that automatically follows keys and performs column name disambiguation
  • consistency and constraint checks to help you understand (and fix) the limitations of your data

That’s just the tip of the iceberg. See Getting started to hit the ground running and explore all the features.

Installation

The latest stable version of the {dm} package can be installed from CRAN with install.packages("dm").

The latest development version of {dm} can be installed from GitHub.
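One common way to do this is with the devtools package. The sketch below assumes the repository lives at `cynkra/dm` on GitHub (inferred from the license holder) and that devtools is available.

```r
# Install devtools first if it is not available yet
# install.packages("devtools")

# Assumption: the development repository is cynkra/dm
devtools::install_github("cynkra/dm")
```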

Usage

Create a dm object (see Getting started for details).
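For example, the package ships a ready-made data model based on the nycflights13 data (see `dm_nycflights13` in the function index). The object name `flights_dm` is chosen here for illustration, and the nycflights13 package is assumed to be installed.

```r
library(dm)

# Build the example data model shipped with dm
# (assumes the nycflights13 package is installed)
flights_dm <- dm_nycflights13()

# Printing shows the tables, columns, and keys in the model
flights_dm
```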

dm is a named list of tables:
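A short sketch of list-style access, again using the nycflights13 example model under the illustrative name `flights_dm`:

```r
library(dm)

flights_dm <- dm_nycflights13()

# Table names, as for any named list
names(flights_dm)

# Extract a single table with $
airports <- flights_dm$airports
airports
```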

Visualize relationships at any time:
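A minimal sketch using `dm_draw()` on the example model. This assumes the DiagrammeR package, which dm relies on for drawing, is installed; `flights_dm` is an illustrative name.

```r
library(dm)

flights_dm <- dm_nycflights13()

# Draw a diagram of the tables and the key relations between them
# (assumes the DiagrammeR package is installed)
diagram <- dm_draw(flights_dm)
diagram
```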

Simple joins:
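As a sketch, `dm_flatten_to_tbl()` can join the `flights` table of the example model with its parent tables by following the declared foreign keys, so no join columns need to be spelled out. The names `flights_dm` and `flat` are illustrative.

```r
library(dm)

flights_dm <- dm_nycflights13()

# Join flights with the tables it references, following the foreign keys;
# clashing column names are disambiguated automatically
flat <- dm_flatten_to_tbl(flights_dm, flights)
flat
```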

Check consistency:
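A minimal sketch with `dm_examine_constraints()`, which reports for each declared key whether the data satisfy it. The names `flights_dm` and `constraint_report` are illustrative.

```r
library(dm)

flights_dm <- dm_nycflights13()

# Check every primary and foreign key in the model against the data
constraint_report <- dm_examine_constraints(flights_dm)
constraint_report
```

Rows where `is_key` is FALSE point to keys that the data violate, with details in the `problem` column.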

Learn more in the Getting started article.

Getting help

If you encounter a clear bug, please file an issue with a minimal reproducible example on GitHub. For questions and other discussion, please use community.rstudio.com.


License: MIT © cynkra GmbH.

Please note that the ‘dm’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Install: install.packages('dm')

Monthly Downloads: 2,933

Version: 0.2.8

License: MIT + file LICENSE

Maintainer: Kirill Müller

Last Published: April 8th, 2022

Functions in dm (0.2.8)

  • decompose_table: Decompose a table into two linked tables
  • check_key: Check if column(s) can be used as keys
  • db_schema_exists: Check for existence of a schema on a database
  • db_schema_create: Create a schema on a database
  • db_schema_list: List schemas on a database
  • db_schema_drop: Remove a schema from a database
  • check_set_equality: Check column values for set equality
  • check_subset: Check column values for subset
  • deprecated: Deprecated functions
  • copy_dm_to: Copy data model to data source
  • dm: Data model class
  • dm_examine_cardinalities: Learn about your data model
  • dm_get_filters: Get filter expressions
  • dm_examine_constraints: Validate your data model
  • dm_add_fk: Add foreign keys
  • dm_from_src: Load a dm from a remote data source
  • dm_flatten_to_tbl: Flatten a part of a dm into a wide table
  • dm_nrow: Number of rows
  • dm_nest_tbl: Nest a table inside its dm
  • dm_bind: Merge several dm
  • dm_add_pk: Add a primary key
  • dm_rm_pk: Remove a primary key
  • dm_rm_fk: Remove foreign keys
  • dm_filter: Filtering
  • dm_add_tbl
  • dm_unnest_tbl: Unnest columns from a wrapped table
  • dm_unpack_tbl: Unpack columns from a wrapped table
  • dm_financial: Creates a dm object for the Financial data
  • pull_tbl: Retrieve a table
  • pack_join: Pack Join
  • dm_zoom_to: Mark table for manipulation
  • dm_draw: Draw a diagram of the data model
  • dm_get_pk: Primary key column names
  • dm_enum_fk_candidates: Foreign key candidates
  • dm_disambiguate_cols: Resolve column name ambiguities
  • dm_get_referencing_tables: Get the names of referencing tables
  • dm_get_all_fks: Get foreign key constraints
  • dm_get_all_pks
  • dm_has_fk: Check if foreign keys exist
  • dm_paste: Create R code for a dm object
  • dm_mutate_tbl
  • materialize: Materialize
  • dm_join_to_tbl: Join two tables
  • get_returned_rows: Extract and check the RETURNING rows
  • dm_pixarfilms: Creates a dm object for the pixarfilms data
  • dm_ptype: Prototype for a dm object
  • dm_nycflights13: Creates a dm object for the nycflights13 data
  • dm_has_pk: Check for primary key
  • dm_is_referenced: Check foreign key reference
  • dm_set_colors: Color in database diagrams
  • dm_select_tbl: Select and rename tables
  • dplyr_join: dplyr join methods for zoomed dm objects
  • head.zoomed_dm: utils table manipulation methods for zoomed_dm objects
  • dm_wrap_tbl: Wrap dm into a single tibble dm
  • tidyr_table_manipulation: tidyr table manipulation methods for zoomed dm objects
  • dm_unwrap_tbl: Unwrap a single table dm
  • dm_rename: Rename columns
  • rows_truncate: Truncate all rows
  • enum_pk_candidates: Primary key candidate
  • examine_cardinality: Check table relations
  • rows-db: Updating database tables
  • dm_pack_tbl: dm_pack_tbl()
  • dm_rm_tbl: Remove tables
  • rows-dm: Modifying rows for multiple tables
  • dm_select: Select columns
  • dplyr_table_manipulation: dplyr table manipulation methods for zoomed dm objects
  • dplyr_src: dm as data source
  • reexports: Objects exported from other packages
  • reunite_parent_child: Merge two tables that are linked by a foreign key relation