OmopSketch

The goal of OmopSketch is to characterise and visualise an Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) instance to assess whether it meets the criteria needed to answer a specific clinical question and conduct a given study.

Installation

OmopSketch is available from CRAN:

install.packages("OmopSketch")

Or you can install the development version of OmopSketch from GitHub with:

# install.packages("remotes")
remotes::install_github("OHDSI/OmopSketch")

Working with OMOP

To be able to use this package you will need data mapped to the OMOP CDM.

The first step of any analysis is to create what we call the cdm_reference object, which is a reference to the OMOP CDM tables. If you want to learn more about OMOP or the cdm_reference object you can take a look at:

In general, you will create a cdm_reference object using the CDMConnector package. In our case we will use the Eunomia GiBleed mock dataset available through omock:

library(omock)

cdm <- mockCdmFromDataset(datasetName = "GiBleed", source = "duckdb")
#> ℹ Reading GiBleed tables.
#> ℹ Adding drug_strength table.
#> ℹ Creating local <cdm_reference> object.
#> ℹ Inserting <cdm_reference> into duckdb.
cdm
#> 
#> ── # OMOP CDM reference (duckdb) of GiBleed ────────────────────────────────────
#> • omop tables: care_site, cdm_source, concept, concept_ancestor, concept_class,
#> concept_relationship, concept_synonym, condition_era, condition_occurrence,
#> cost, death, device_exposure, domain, dose_era, drug_era, drug_exposure,
#> drug_strength, fact_relationship, location, measurement, metadata, note,
#> note_nlp, observation, observation_period, payer_plan_period, person,
#> procedure_occurrence, provider, relationship, source_to_concept_map, specimen,
#> visit_detail, visit_occurrence, vocabulary
#> • cohort tables: -
#> • achilles tables: -
#> • other tables: -

Sketching your cdm

library(OmopSketch)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

Once we have the cdm_reference object we can start characterising it. OmopSketch offers several functionalities; the main ones are described below.

Snapshot

We first create a snapshot of our database. This will allow us to track when the analysis has been conducted and capture details about the CDM version or the data release.

snapshot <- summariseOmopSnapshot(cdm = cdm)

tableOmopSnapshot(result = snapshot, type = "flextable")

Characterise the person table

Once we have collected the snapshot information, we can characterise the person table with summarisePerson():

result <- summarisePerson(cdm = cdm)

tablePerson(result = result, type = "flextable")
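
To give a feel for the kind of summary a person-table characterisation involves, here is a minimal base R sketch on a hypothetical toy person table. It is not OmopSketch's implementation: the data are invented, and only the column names (person_id, gender_concept_id, year_of_birth) and the standard gender concept ids 8507 (male) and 8532 (female) come from the OMOP CDM.

```r
# Hypothetical toy person table; values are made up for illustration.
person <- data.frame(
  person_id = 1:5,
  gender_concept_id = c(8507, 8532, 8532, 8507, 8532),
  year_of_birth = c(1950, 1962, 1975, 1988, 2001)
)

# Map the standard OMOP gender concept ids to readable labels.
sex <- factor(
  person$gender_concept_id,
  levels = c(8507, 8532),
  labels = c("Male", "Female")
)

# Number of subjects by sex.
subjects_by_sex <- table(sex)

# Distribution of year of birth (min, quartiles, max).
birth_year_summary <- quantile(
  person$year_of_birth,
  probs = c(0, 0.25, 0.5, 0.75, 1)
)

print(subjects_by_sex)
print(birth_year_summary)
```

On a real database these summaries would be computed by the package against the database tables rather than on an in-memory data frame.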

Characterise the observation period

We can then explore the observation period details. You can summarise the characteristics of each individual's observation periods using summariseObservationPeriod() and display them with tableObservationPeriod().

result <- summariseObservationPeriod(cdm = cdm)
#> Warning: ! There are 2649 individuals not included in the person table.

tableObservationPeriod(result = result, type = "flextable")

Or if visualisation is preferred, you can easily build a histogram to explore how many participants have more than one observation period.

plotObservationPeriod(result = result, colour = "observation_period_ordinal")
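
The idea of an observation period ordinal (first, second, ... period of the same person) can be illustrated with a small base R sketch. This is only an illustration under stated assumptions: the toy table is hypothetical, and counting both end points in the duration is a choice made here, not necessarily how the package computes it. The column names follow the OMOP CDM observation_period table.

```r
# Hypothetical toy observation_period table.
observation_period <- data.frame(
  person_id = c(1, 1, 2, 3),
  observation_period_start_date = as.Date(
    c("2000-01-01", "2005-01-01", "2010-06-01", "2012-03-15")
  ),
  observation_period_end_date = as.Date(
    c("2003-12-31", "2010-12-31", "2015-05-31", "2012-12-31")
  )
)

# Duration of each period in days, counting both the start and end day
# (an assumption of this sketch).
observation_period$duration_days <- as.numeric(difftime(
  observation_period$observation_period_end_date,
  observation_period$observation_period_start_date,
  units = "days"
)) + 1

# Order by person and start date, then number the periods within each person.
observation_period <- observation_period[
  order(
    observation_period$person_id,
    observation_period$observation_period_start_date
  ),
]
observation_period$ordinal <- ave(
  seq_along(observation_period$person_id),
  observation_period$person_id,
  FUN = seq_along
)

# Number of observation periods per person.
periods_per_person <- table(observation_period$person_id)

print(observation_period)
print(periods_per_person)
```

In this toy example only person 1 has a second observation period, so it is the only person contributing to ordinal 2.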

Characterise the clinical tables

Now, we can start characterising the clinical tables of the CDM. By using summariseClinicalRecords() and tableClinicalRecords(), we can easily visualise the main characteristics of specific clinical tables.

result <- summariseClinicalRecords(
  cdm = cdm, 
  omopTableName = c("condition_occurrence", "drug_exposure")
)
#> ℹ Adding variables of interest to condition_occurrence.
#> ℹ Summarising records per person in condition_occurrence.
#> ℹ Summarising subjects not in person table in condition_occurrence.
#> ℹ Summarising records in observation in condition_occurrence.
#> ℹ Summarising records with start before birth date in condition_occurrence.
#> ℹ Summarising records with end date before start date in condition_occurrence.
#> ℹ Summarising domains in condition_occurrence.
#> ℹ Summarising standard concepts in condition_occurrence.
#> ℹ Summarising source vocabularies in condition_occurrence.
#> ℹ Summarising concept types in condition_occurrence.
#> ℹ Summarising missing data in condition_occurrence.
#> ℹ Adding variables of interest to drug_exposure.
#> ℹ Summarising records per person in drug_exposure.
#> ℹ Summarising subjects not in person table in drug_exposure.
#> ℹ Summarising records in observation in drug_exposure.
#> ℹ Summarising records with start before birth date in drug_exposure.
#> ℹ Summarising records with end date before start date in drug_exposure.
#> ℹ Summarising domains in drug_exposure.
#> ℹ Summarising standard concepts in drug_exposure.
#> ℹ Summarising source vocabularies in drug_exposure.
#> ℹ Summarising concept types in drug_exposure.
#> ℹ Summarising concept class in drug_exposure.
#> ℹ Summarising missing data in drug_exposure.

tableClinicalRecords(result = result, type = "flextable")
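
The progress messages above name several checks (records per person, subjects not in the person table, end date before start date, missing data). As a rough illustration of what such checks mean, the following base R sketch computes them on hypothetical toy tables. It does not reproduce OmopSketch's code; the data are invented and only the column names follow the OMOP CDM.

```r
# Hypothetical toy tables.
person <- data.frame(person_id = c(1, 2, 3))

condition_occurrence <- data.frame(
  condition_occurrence_id = 1:5,
  person_id = c(1, 1, 2, 4, 3),
  condition_start_date = as.Date(
    c("2010-01-01", "2011-05-10", "2012-07-01", "2013-02-02", "2014-09-09")
  ),
  condition_end_date = as.Date(
    c("2010-01-10", "2011-05-01", "2012-07-15", NA, "2014-09-20")
  )
)

# Records per person.
records_per_person <- table(condition_occurrence$person_id)

# Records whose person_id does not appear in the person table.
not_in_person <- sum(!condition_occurrence$person_id %in% person$person_id)

# Records whose end date precedes the start date
# (records with a missing end date are ignored here).
end_before_start <- sum(
  condition_occurrence$condition_end_date <
    condition_occurrence$condition_start_date,
  na.rm = TRUE
)

# Number of missing values in each column.
missing_per_column <- colSums(is.na(condition_occurrence))

print(records_per_person)
print(c(not_in_person = not_in_person, end_before_start = end_before_start))
print(missing_per_column)
```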

Explore trends over time

After visualising the main characteristics of our clinical tables, we can also explore trends over time using summariseTrend().

result <- summariseTrend(
  cdm = cdm, 
  event = c("condition_occurrence", "drug_exposure"), 
  output = "record",  
  interval = "years"
)

plotTrend(result = result, facet = "omop_table", colour = "cdm_name")
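
Conceptually, a yearly record trend is a count of records grouped by the calendar year of their start date. The base R sketch below shows that idea on a hypothetical toy drug_exposure table; it is an illustration only, and keeping years with zero records in the output is a choice made in this sketch.

```r
# Hypothetical toy drug_exposure table; only the start date is needed here.
drug_exposure <- data.frame(
  drug_exposure_id = 1:6,
  drug_exposure_start_date = as.Date(c(
    "2019-03-01", "2019-11-20", "2020-01-05",
    "2020-06-30", "2020-12-31", "2022-02-14"
  ))
)

# Calendar year of each record.
drug_exposure$year <- as.integer(
  format(drug_exposure$drug_exposure_start_date, "%Y")
)

# Count records per year, keeping years with no records as explicit zeros.
all_years <- seq(min(drug_exposure$year), max(drug_exposure$year))
records_per_year <- table(factor(drug_exposure$year, levels = all_years))

print(records_per_year)
```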

Characterise the concepts

OmopSketch also provides functions to explore the concepts in the dataset.


result <- summariseConceptIdCounts(
  cdm = cdm, 
  omopTableName = "drug_exposure"
)

tableTopConceptCounts(result = result, type = "flextable")
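
A concept count can be read in two ways: the number of records carrying a concept id, and the number of distinct persons with at least one such record. The base R sketch below illustrates both on a hypothetical toy table. The concept ids 101, 202 and 303 are invented placeholders, and this is not the package's implementation.

```r
# Hypothetical toy drug_exposure table with placeholder concept ids.
drug_exposure <- data.frame(
  person_id = c(1, 1, 2, 2, 3, 3, 3),
  drug_concept_id = c(101, 101, 101, 202, 202, 303, 101)
)

# Number of records per concept id.
concept_counts <- as.data.frame(
  table(drug_concept_id = drug_exposure$drug_concept_id),
  responseName = "n_records",
  stringsAsFactors = FALSE
)

# Number of distinct persons per concept id.
unique_pairs <- unique(drug_exposure[, c("person_id", "drug_concept_id")])
person_counts <- table(unique_pairs$drug_concept_id)
concept_counts$n_persons <- as.vector(
  person_counts[concept_counts$drug_concept_id]
)

# Keep the two most frequently recorded concepts.
concept_counts <- concept_counts[order(-concept_counts$n_records), ]
top_concepts <- head(concept_counts, 2)

print(top_concepts)
```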

Characterise the cdm

To obtain and explore a complete characterisation of a cdm, you can use the OmopSketch functions databaseCharacteristics() and shinyCharacteristics(). These functions allow you to generate and interactively explore detailed summaries of your database. To see an example of the outputs produced, explore the characterisation of several synthetic datasets here.

As shown above, OmopSketch provides several functions that give a general overview of a database. It also includes further tools and arguments for deeper exploration, which help assess whether the database is suitable for a specific research study. For further information, please refer to the vignettes.

Install

install.packages('OmopSketch')

Monthly Downloads: 382
Version: 1.0.0
License: Apache License (>= 2)
Maintainer: Cecilia Campanile
Last Published: November 19th, 2025

Functions in OmopSketch (1.0.0)

tableRecordCount
Create a visual table from a summariseRecordCount() result

summariseConceptIdCounts
Summarise concept use in patient-level data

summariseConceptSetCounts
Summarise concept counts in patient-level data

tableTopConceptCounts
Create a visual table of the most common concepts from summariseConceptIdCounts() output

summariseMissingData
Summarise missing data in omop tables

summariseInObservation
Summarise the number of people in observation during a specific interval of time

tableMissingData
Create a visual table from a summariseMissingData() result

tableObservationPeriod
Create a visual table from a summariseObservationPeriod() result

tableInObservation
Create a visual table from a summariseInObservation() result

tablePerson
Visualise the results of summarisePerson() into a table

summariseOmopSnapshot
Summarise a cdm_reference object creating a snapshot with the metadata of the cdm_reference object

plotPerson
Visualise the output of summarisePerson()

tableOmopSnapshot
Create a visual table from a summarise_omop_snapshot result

summariseObservationPeriod
Summarise the observation period table getting some overall statistics in a summarised_result object

plotRecordCount
Create a ggplot of the records' count trend

tableTrend
Create a visual table from a summariseTrend() result

summarisePerson
Summarise person table

summariseRecordCount
Summarise record counts of an omop_table using a specific time interval

plotObservationPeriod
Create a plot from the output of summariseObservationPeriod()

mockOmopSketch
Creates a mock database to test OmopSketch package

plotInObservation
Create a ggplot2 plot from the output of summariseInObservation()

OmopSketch-package
OmopSketch: Characterise Tables of an OMOP Common Data Model Instance

clinicalTables
Tables in the cdm_reference that contain clinical information

databaseCharacteristics
Summarise Database Characteristics for OMOP CDM

plotConceptSetCounts
Plot the concept counts of a summariseConceptSetCounts output

summariseClinicalRecords
Summarise an omop table from a cdm object

summariseConceptCounts
Summarise concept counts in patient-level data

plotTrend
Create a ggplot2 plot from the output of summariseTrend()

summariseTrend
Summarise temporal trends in OMOP tables

tableConceptIdCounts
Create a visual table from a summariseConceptIdCounts() result

reexports
Objects exported from other packages

plot-doc
Helper for consistent documentation for plots.

consistent-doc
Helper for consistent documentation

dateRange-startDate
Helper for consistent documentation of dateRange.

tableClinicalRecords
Create a visual table from a summariseClinicalRecords() output

shinyCharacteristics
Generate an interactive Shiny application that visualises the results obtained from the databaseCharacteristics() function

style-table
Helper for consistent documentation of table arguments.