omopgenerics

Package overview

The omopgenerics package provides definitions of core classes and methods used by analytic pipelines that query the OMOP common data model.

#> Warning in citation("omopgenerics"): no date field in DESCRIPTION file of
#> package 'omopgenerics'
#> Warning in citation("omopgenerics"): could not determine year for
#> 'omopgenerics' from package DESCRIPTION file
#> 
#> To cite package 'omopgenerics' in publications use:
#> 
#>   Català M, Burn E (????). _omopgenerics: Methods and Classes for the
#>   OMOP Common Data Model_. R package version 0.3.1.900,
#>   <https://darwin-eu.github.io/omopgenerics/>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {omopgenerics: Methods and Classes for the OMOP Common Data Model},
#>     author = {Martí Català and Edward Burn},
#>     note = {R package version 0.3.1.900},
#>     url = {https://darwin-eu.github.io/omopgenerics/},
#>   }

If you find the package useful in supporting your research study, please consider citing this package.

Installation

You can install the development version of omopgenerics from GitHub with:

install.packages("pak")
pak::pkg_install("darwin-eu/omopgenerics")

And load it using the library command:

library(omopgenerics)
library(dplyr)

Core classes and methods

CDM Reference

A cdm reference is a single R object that represents OMOP CDM data. The tables in the cdm reference may be in a database, but a cdm reference may also contain OMOP CDM tables that are in dataframes/tibbles or in arrow. In the latter case the cdm reference would typically be a subset of an original cdm reference that has been derived as part of a particular analysis.

omopgenerics contains the class definition of a cdm reference and a dataframe implementation. For creating a cdm reference using a database, see the CDMConnector package (https://darwin-eu.github.io/CDMConnector/).

A cdm object can contain four types of tables:

  • Standard tables:
omopTables()
#>  [1] "person"                "observation_period"    "visit_occurrence"     
#>  [4] "visit_detail"          "condition_occurrence"  "drug_exposure"        
#>  [7] "procedure_occurrence"  "device_exposure"       "measurement"          
#> [10] "observation"           "death"                 "note"                 
#> [13] "note_nlp"              "specimen"              "fact_relationship"    
#> [16] "location"              "care_site"             "provider"             
#> [19] "payer_plan_period"     "cost"                  "drug_era"             
#> [22] "dose_era"              "condition_era"         "metadata"             
#> [25] "cdm_source"            "concept"               "vocabulary"           
#> [28] "domain"                "concept_class"         "concept_relationship" 
#> [31] "relationship"          "concept_synonym"       "concept_ancestor"     
#> [34] "source_to_concept_map" "drug_strength"         "cohort_definition"    
#> [37] "attribute_definition"  "concept_recommended"

Each of these tables has a set of required columns. For example, these are the required columns for the person table:

omopColumns(table = "person")
#>  [1] "person_id"                   "gender_concept_id"          
#>  [3] "year_of_birth"               "month_of_birth"             
#>  [5] "day_of_birth"                "birth_datetime"             
#>  [7] "race_concept_id"             "ethnicity_concept_id"       
#>  [9] "location_id"                 "provider_id"                
#> [11] "care_site_id"                "person_source_value"        
#> [13] "gender_source_value"         "gender_source_concept_id"   
#> [15] "race_source_value"           "race_source_concept_id"     
#> [17] "ethnicity_source_value"      "ethnicity_source_concept_id"
  • Cohort tables: we can see the cohort-related tables and their required columns.
cohortTables()
#> [1] "cohort"           "cohort_set"       "cohort_attrition" "cohort_codelist"
cohortColumns(table = "cohort")
#> [1] "cohort_definition_id" "subject_id"           "cohort_start_date"   
#> [4] "cohort_end_date"

In addition, cohorts are defined in terms of a cohort_table class. For more details on this class definition see the corresponding vignette.

  • Achilles tables: the Achilles R package generates descriptive statistics about the data contained in the OMOP CDM. Again, we can see the tables created and their required columns.
achillesTables()
#> [1] "achilles_analysis"     "achilles_results"      "achilles_results_dist"
achillesColumns(table = "achilles_results")
#> [1] "analysis_id" "stratum_1"   "stratum_2"   "stratum_3"   "stratum_4"  
#> [6] "stratum_5"   "count_value"
  • Other tables: these can have any format.

The tables in a cdm object have to fulfill four conditions:

  • All must share a common source.

  • The table names must be lowercase.

  • The column names of each table must be lowercase.

  • person and observation_period must be present.
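
The naming and presence conditions can be checked by hand on a plain named list of data frames. The sketch below is a base R illustration only: `checkCdmTables` is a hypothetical helper, not an omopgenerics function, and the common-source condition is left out because a plain list of local data frames carries no source information.

```r
# Hypothetical helper (not part of omopgenerics): returns a character vector
# describing which naming/presence conditions a named list of tables violates.
checkCdmTables <- function(tables) {
  problems <- character()
  nms <- names(tables)

  # Table names must be lowercase
  badTables <- nms[nms != tolower(nms)]
  if (length(badTables) > 0) {
    problems <- c(problems, paste0("table name not lowercase: ", badTables))
  }

  # Column names of each table must be lowercase
  for (nm in nms) {
    cols <- colnames(tables[[nm]])
    badCols <- cols[cols != tolower(cols)]
    if (length(badCols) > 0) {
      problems <- c(problems, paste0("column not lowercase: ", nm, "$", badCols))
    }
  }

  # person and observation_period must be present
  missingTables <- setdiff(c("person", "observation_period"), nms)
  if (length(missingTables) > 0) {
    problems <- c(problems, paste0("missing table: ", missingTables))
  }

  problems
}

# Invented example input with one bad table name and one bad column name
tables <- list(
  person = data.frame(person_id = 1L, Gender_Concept_Id = 0L),
  Observation_Period = data.frame(observation_period_id = 1L, person_id = 1L)
)
problems <- checkCdmTables(tables)
print(problems)
```

An empty result means none of the three checked conditions is violated; the package performs its own validation when a cdm object is built.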

Concept set

A concept set can be represented as either a codelist or a concept set expression. A codelist is a named list, with each item of the list containing specific concept IDs.

condition_codes <- list("diabetes" = c(201820, 4087682, 3655269),
                        "asthma" = 317009)
condition_codes <- newCodelist(condition_codes)
#> Warning: ! `codelist` contains numeric values, they are casted to integers.

condition_codes
#> 
#> ── 2 codelists ─────────────────────────────────────────────────────────────────
#> 
#> - asthma (1 codes)
#> - diabetes (3 codes)

Meanwhile, a concept set expression provides a high-level definition of concepts that, when applied to a specific OMOP CDM vocabulary version (by making use of the concept hierarchies and relationships), will result in a codelist.

condition_cs <- list(
  "diabetes" = dplyr::tibble(
    "concept_id" = c(201820, 4087682),
    "excluded" = c(FALSE, FALSE),
    "descendants" = c(TRUE, FALSE),
    "mapped" = c(FALSE, FALSE)
  ),
  "asthma" = dplyr::tibble(
    "concept_id" = 317009,
    "excluded" = FALSE,
    "descendants" = FALSE,
    "mapped" = FALSE
  )
)
condition_cs <- newConceptSetExpression(condition_cs)

condition_cs
#> 
#> ── 2 conceptSetExpressions ─────────────────────────────────────────────────────
#> 
#> - asthma (1 concept criteria)
#> - diabetes (2 concept criteria)
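
To make the step from a concept set expression to a codelist more concrete, the base R sketch below resolves one expression against a toy concept_ancestor table. Everything here is an assumption for illustration: `resolveExpression` is a hypothetical helper rather than a package function, the descendant IDs 900001 and 900002 are invented, and only the `excluded` and `descendants` flags are handled (`mapped` is ignored).

```r
# Toy vocabulary: each concept is its own ancestor, and 201820 has two
# invented descendants (900001, 900002)
concept_ancestor <- data.frame(
  ancestor_concept_id = c(201820L, 201820L, 201820L, 4087682L),
  descendant_concept_id = c(201820L, 900001L, 900002L, 4087682L)
)

# One concept set expression: 201820 with descendants, 4087682 without
expression <- data.frame(
  concept_id = c(201820L, 4087682L),
  excluded = c(FALSE, FALSE),
  descendants = c(TRUE, FALSE)
)

# Hypothetical helper: expand the included and excluded criteria separately,
# then remove the excluded concepts from the included ones
resolveExpression <- function(expression, concept_ancestor) {
  expand <- function(rows) {
    withDescendants <- rows$concept_id[rows$descendants]
    found <- concept_ancestor$descendant_concept_id[
      concept_ancestor$ancestor_concept_id %in% withDescendants
    ]
    unique(c(rows$concept_id, found))
  }
  included <- expand(expression[!expression$excluded, , drop = FALSE])
  excluded <- expand(expression[expression$excluded, , drop = FALSE])
  sort(setdiff(included, excluded))
}

codes <- resolveExpression(expression, concept_ancestor)
print(codes)
```

The same expression applied to a different vocabulary version could give a different set of codes, which is why the two representations are kept distinct.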

A cohort table

A cohort is a set of persons who satisfy one or more inclusion criteria for a duration of time. Once defined, a cohort is held in a cdm reference as a table with the cohort table class. Cohort tables carry associated attributes such as settings and attrition.

person <- tibble(
  person_id = 1, gender_concept_id = 0, year_of_birth = 1990,
  race_concept_id = 0, ethnicity_concept_id = 0
)
observation_period <- dplyr::tibble(
  observation_period_id = 1, person_id = 1,
  observation_period_start_date = as.Date("2000-01-01"),
  observation_period_end_date = as.Date("2023-12-31"),
  period_type_concept_id = 0
)
diabetes <- tibble(
  cohort_definition_id = 1, subject_id = 1,
  cohort_start_date = as.Date("2020-01-01"),
  cohort_end_date = as.Date("2020-01-10")
)

cdm <- cdmFromTables(
  tables = list(
    "person" = person,
    "observation_period" = observation_period,
    "diabetes" = diabetes
  ),
  cdmName = "example_cdm"
)
#> Warning: ! 5 column in person do not match expected column type:
#> • `person_id` is numeric but expected integer
#> • `gender_concept_id` is numeric but expected integer
#> • `year_of_birth` is numeric but expected integer
#> • `race_concept_id` is numeric but expected integer
#> • `ethnicity_concept_id` is numeric but expected integer
#> Warning: ! 3 column in observation_period do not match expected column type:
#> • `observation_period_id` is numeric but expected integer
#> • `person_id` is numeric but expected integer
#> • `period_type_concept_id` is numeric but expected integer
cdm$diabetes <- newCohortTable(cdm$diabetes)
#> Warning: ! 2 column in diabetes do not match expected column type:
#> • `cohort_definition_id` is numeric but expected integer
#> • `subject_id` is numeric but expected integer

cdm$diabetes
#> # A tibble: 1 × 4
#>   cohort_definition_id subject_id cohort_start_date cohort_end_date
#>                  <dbl>      <dbl> <date>            <date>         
#> 1                    1          1 2020-01-01        2020-01-10
settings(cdm$diabetes)
#> # A tibble: 1 × 2
#>   cohort_definition_id cohort_name
#>                  <int> <chr>      
#> 1                    1 cohort_1
attrition(cdm$diabetes)
#> # A tibble: 1 × 7
#>   cohort_definition_id number_records number_subjects reason_id reason          
#>                  <int>          <int>           <int>     <int> <chr>           
#> 1                    1              1               1         1 Initial qualify…
#> # ℹ 2 more variables: excluded_records <int>, excluded_subjects <int>
cohortCount(cdm$diabetes)
#> # A tibble: 1 × 3
#>   cohort_definition_id number_records number_subjects
#>                  <int>          <int>           <int>
#> 1                    1              1               1
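
As a rough illustration of what the counts above represent, the base R sketch below derives number_records and number_subjects from a toy cohort data frame. It assumes that number_records is the number of rows and number_subjects the number of distinct subject_id values for each cohort_definition_id. The data are invented and the code is not the package's implementation.

```r
# Invented cohort data: subject 1 enters cohort 1 twice, subject 2 enters cohort 2 once
cohort <- data.frame(
  cohort_definition_id = c(1L, 1L, 2L),
  subject_id = c(1L, 1L, 2L),
  cohort_start_date = as.Date(c("2020-01-01", "2021-01-01", "2020-06-01")),
  cohort_end_date = as.Date(c("2020-01-10", "2021-01-10", "2020-06-30"))
)

ids <- sort(unique(cohort$cohort_definition_id))

counts <- data.frame(
  cohort_definition_id = ids,
  # rows per cohort
  number_records = vapply(
    ids,
    function(id) sum(cohort$cohort_definition_id == id),
    integer(1)
  ),
  # distinct subjects per cohort
  number_subjects = vapply(
    ids,
    function(id) length(unique(cohort$subject_id[cohort$cohort_definition_id == id])),
    integer(1)
  )
)
print(counts)
```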

Summarised result

A summarised result provides a standard format for the results of an analysis performed against data mapped to the OMOP CDM.

For example, this format is used when we get a summary of the cdm as a whole:

summary(cdm) |> 
  dplyr::glimpse()
#> Rows: 13
#> Columns: 13
#> $ result_id        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
#> $ cdm_name         <chr> "example_cdm", "example_cdm", "example_cdm", "example…
#> $ group_name       <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ group_level      <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_name      <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_level     <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ variable_name    <chr> "snapshot_date", "person_count", "observation_period_…
#> $ variable_level   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ estimate_name    <chr> "value", "count", "count", "source_name", "version", …
#> $ estimate_type    <chr> "date", "integer", "integer", "character", "character…
#> $ estimate_value   <chr> "2024-11-01", "1", "1", "", NA, "5.3", "", "", "", ""…
#> $ additional_name  <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…

It is also used when we summarise a cohort:

summary(cdm$diabetes) |> 
  dplyr::glimpse()
#> Rows: 6
#> Columns: 13
#> $ result_id        <int> 1, 1, 2, 2, 2, 2
#> $ cdm_name         <chr> "example_cdm", "example_cdm", "example_cdm", "example…
#> $ group_name       <chr> "cohort_name", "cohort_name", "cohort_name", "cohort_…
#> $ group_level      <chr> "cohort_1", "cohort_1", "cohort_1", "cohort_1", "coho…
#> $ strata_name      <chr> "overall", "overall", "reason", "reason", "reason", "…
#> $ strata_level     <chr> "overall", "overall", "Initial qualifying events", "I…
#> $ variable_name    <chr> "number_records", "number_subjects", "number_records"…
#> $ variable_level   <chr> NA, NA, NA, NA, NA, NA
#> $ estimate_name    <chr> "count", "count", "count", "count", "count", "count"
#> $ estimate_type    <chr> "integer", "integer", "integer", "integer", "integer"…
#> $ estimate_value   <chr> "1", "1", "1", "1", "0", "0"
#> $ additional_name  <chr> "overall", "overall", "reason_id", "reason_id", "reas…
#> $ additional_level <chr> "overall", "overall", "1", "1", "1", "1"
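
In both outputs estimate_value is stored as character, with estimate_type recording the intended type. The base R sketch below shows one way such values could be converted back. `convertEstimate` is a hypothetical helper that handles only a few types, and the two rows are copied from the first two estimates shown in the cdm summary above. For real results, the pivotEstimates() and tidy() functions listed in the function index are the package's own tools for reshaping.

```r
# Two rows in the long result format, taken from the cdm summary above
result <- data.frame(
  variable_name = c("snapshot_date", "person_count"),
  estimate_name = c("value", "count"),
  estimate_type = c("date", "integer"),
  estimate_value = c("2024-11-01", "1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert one character value according to its type
convertEstimate <- function(value, type) {
  switch(type,
    integer = as.integer(value),
    numeric = as.numeric(value),
    date = as.Date(value),
    logical = as.logical(value),
    value # character and any unhandled type stay as text
  )
}

typed <- Map(convertEstimate, result$estimate_value, result$estimate_type)
names(typed) <- result$variable_name
str(typed)
```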

Install: install.packages('omopgenerics')

Monthly Downloads: 3,167

Version: 1.3.1

License: Apache License (>= 2)

Maintainer: Marti Catala

Last Published: September 18th, 2025

Functions in omopgenerics (1.3.1)

assertTable

Assert that an object is a table.
cdmFromTables

Create a cdm object from local tables
createLogFile

Create a log file
cdmReference

Get the cdm_reference of a cdm_table.
createTableIndex

Create a table index
cohortTables

Cohort tables that a cdm reference can contain in the OMOP Common Data Model.
collect.cdm_reference

Retrieves the cdm reference into a local cdm.
cdmSelect

Restrict the cdm object to a subset of tables.
emptyAchillesTable

Create an empty achilles table
cohortCodelist

Get codelist from a cohort_table object.
checkCohortRequirements

Check whether a cohort table satisfies requirements
cdmTableFromSource

This is an internal, developer-focused function that creates a cdm_table from a table that shares the source but is not a cdm_table. Please use insertTable if you want to insert a table into a cdm_reference object.
cdmVersion

Get the version of an object.
existingIndexes

Existing indexes in a cdm object
cdmName

Get the name of a cdm_reference associated object
estimateTypeChoices

Choices that can be present in estimate_type column.
cdmSourceType

Get the source type of a cdm_reference object.
cdmSource

Get the cdmSource of an object.
getCohortName

Get the cohort name of a certain cohort definition id
emptySummarisedResult

Empty summarised_result object.
cdmClasses

Separate the cdm tables in classes
emptyOmopTable

Create an empty omop table
getCohortId

Get the cohort definition id of a certain name
createIndexes

Create the missing indexes
insertFromSource

Convert a table that is not a cdm_table but has the same original source to a cdm_table. This function is not meant to be used to insert tables in the cdm; please use insertTable instead.
compute.cdm_table

Store results in a table.
dropSourceTable

Drop a table from a cdm object.
filterGroup

Filter the group_name-group_level pair in a summarised_result
filterAdditional

Filter the additional_name-additional_level pair in a summarised_result
emptyCdmReference

Create an empty cdm_reference
dropTable

omopColumns

Required columns that the standard tables in the OMOP Common Data Model must have.
numberSubjects

Count the number of subjects that a cdm_table has.
insertTable

Insert a table to a cdm object.
cohortColumns

Required columns for a generated cohort set.
cdmDisconnect

Disconnect from a cdm object.
cohortCount

Get cohort counts from a cohort_table object.
print.codelist_with_details

Print a codelist with details
logMessage

Log a message to a logFile
print.codelist

Print a codelist
listSourceTables

List tables that can be accessed through a cdm object.
emptyCodelistWithDetails

Empty codelist object.
getPersonIdentifier

Get the column name with the person identifier from a table (either subject_id or person_id). It throws an error if the table contains both or neither.
importConceptSetExpression

Import a concept set expression.
importCodelist

Import a codelist.
isTableEmpty

Check if a table is empty or not
exportConceptSetExpression

Export a concept set expression.
exportSummarisedResult

Export a summarised_result object to a csv file.
emptyCodelist

Empty codelist object.
sourceType

Get the source type of an object.
isResultSuppressed

To check whether an object is already suppressed to a certain min cell count.
newAchillesTable

Create an achilles table from a cdm_table.
groupColumns

Identify variables in group_name column
collect.cohort_table

To collect a cohort_table object.
filterSettings

Filter a <summarised_result> using the settings
combineStrata

Provide all combinations of strata levels.
emptyCohortTable

Create an empty cohort_table object
newSummarisedResult

'summarised_results' object constructor
numberRecords

Count the number of records that a cdm_table has.
omopTableFields

Return a table of omop cdm fields information
omopDataFolder

Check or set the OMOP_DATA_FOLDER where the OMOP related data is stored.
splitAdditional

Split additional_name and additional_level columns
settingsColumns

Identify settings columns of a <summarised_result>
splitGroup

Split group_name and group_level columns
tidyColumns

Identify tidy columns of a <summarised_result>
validateAchillesTable

Validate if a cdm_table is a valid achilles table.
settings.summarised_result

Get settings from a summarised_result object.
splitAll

Split all pairs name-level into columns.
splitStrata

Split strata_name and strata_level columns
statusIndexes

Status of the indexes
suppress.summarised_result

Function to suppress counts in result objects
suppress

Function to suppress counts in result objects
uniteGroup

Unite one or more columns in group_name-group_level format
toSnakeCase

Convert a character vector to snake case
tmpPrefix

Create a temporary prefix for tables, that contains a unique prefix that starts with tmp.
tidy.summarised_result

Turn a <summarised_result> object into a tidy tibble
uniteStrata

Unite one or more columns in strata_name-strata_level format
validateAgeGroupArgument

Validate the ageGroup argument. It must be a list of two integerish numbers, lower age and upper age; both of them must be greater than or equal to 0 and lower age must be lower than or equal to the upper age. If not named, automatic names will be given in the output list.
emptyConceptSetExpression

Empty concept_set_expression object.
newCdmReference

cdm_reference objects constructor
expectedIndexes

Expected indexes in a cdm object
omopgenerics-package

omopgenerics: Methods and Classes for the OMOP Common Data Model
omopTables

Standard tables that a cdm reference can contain in the OMOP Common Data Model.
newCodelist

'codelist' object constructor
filterStrata

Filter the strata_name-strata_level pair in a summarised_result
newLocalSource

A new local source for the cdm
settings.cohort_table

Get cohort settings from a cohort_table object.
newOmopTable

Create an omop table from a cdm table.
settings

Get settings from an object.
validateNameStyle

Validate nameStyle argument. If any of the elements in ... has length greater than 1 it must be contained in nameStyle. Note that snake case notation is used.
newCodelistWithDetails

'codelist' object constructor
print.conceptSetExpression

Print a concept set expression
resultColumns

Required columns that the result tables must have.
exportCodelist

Export a codelist object.
importSummarisedResult

Import a set of summarised results.
validateNewColumn

Validate a new column of a table
readSourceTable

Read a table from the cdm_source and add it to the cdm.
resultPackageVersion

Check if different package versions are used for a summarised_result object
validateNameArgument

Validate name argument. It must be a snake_case character vector. You can add a cdm object to check that name is not already used in that cdm.
[[.cdm_reference

Subset a cdm reference object.
strataColumns

Identify variables in strata_name column
validateNameLevel

Validate if two columns are valid Name-Level pair.
summary.cohort_table

Summarise a generated cohort set
insertCdmTo

Insert a cdm_reference object to a different source.
summary.cdm_source

Summarise a cdm_source object
summary.cdm_reference

Summarise a cdm reference
validateCdmArgument

Validate if an object is a valid cdm_reference.
validateCdmTable

Validate if a table is a valid cdm_table object.
summary.summarised_result

Summarise a summarised_result
transformToSummarisedResult

Create a <summarised_result> object from a data.frame, given a set of specifications.
tableName

Get the table name of a cdm_table.
tableSource

Get the table source of a cdm_table.
validateColumn

Validate whether a variable points to a certain existing column in a table.
validateConceptSetArgument

Validate conceptSet argument. It can either be a list, a codelist, a concept set expression or a codelist with details. The output will always be a codelist.
uniqueId

Get a unique Identifier with a certain number of characters and a prefix.
newCdmTable

Create a cdm table.
newCdmSource

Create a cdm source object.
print.cdm_reference

Print a CDM reference object
newConceptSetExpression

'concept_set_expression' object constructor
newCohortTable

cohort_table objects constructor.
recordCohortAttrition

Update cohort attrition.
reexports

Objects exported from other packages
pivotEstimates

Set estimates as columns
validateOmopTable

Validate an omop_table
validateResultArgument

Validate if an object is a valid 'summarised_result' object.
[[<-.cdm_reference

Assign a table to a cdm reference.
uniqueTableName

Create a unique table name
summariseLogFile

Summarise and extract the information of a log file into a summarised_result object.
validateStrataArgument

To validate a strata list. It makes sure that elements are unique and point to columns in table.
validateCohortIdArgument

Validate cohortId argument. CohortId can either be a cohort_definition_id value, a cohort_name or a tidyselect expression referring to cohort_names. If you want to support tidyselect expressions please use the function as: validateCohortIdArgument({{cohortId}}, cohort).
uniteAdditional

Unite one or more columns in additional_name-additional_level format
validateCohortArgument

Validate a cohort table input.
validateWindowArgument

Validate a window argument. It must be a list of two elements (window start and window end); both must be integerish and window start must be lower than or equal to window end.
achillesTables

Names of the tables that contain the results of achilles analyses
assertDate

Assert Date
assertClass

Assert that an object has a certain class.
assertCharacter

Assert that an object is a character and fulfills certain conditions.
assertList

Assert that an object is a list.
addSettings

Add settings columns to a <summarised_result> object
achillesColumns

Required columns for each of the achilles result tables
assertLogical

Assert that an object is a logical.
attrition.cohort_table

Get cohort attrition from a cohort_table object.
assertChoice

Assert that an object is within certain options.
bind

Bind two or more objects of the same class.
bind.summarised_result

Bind two or more summarised_result objects
bind.cohort_table

Bind two or more cohort tables
$.cdm_reference

Subset a cdm reference object.
$<-.cdm_reference

Assign a table to a cdm reference.
attrition

Get attrition from an object.
assertTrue

Assert that an expression is TRUE.
additionalColumns

Identify variables in additional_name column
assertNumeric

Assert that an object is a numeric.