CodelistGenerator

Installation

You can install CodelistGenerator from CRAN:

install.packages("CodelistGenerator")

Alternatively, you can install the development version of CodelistGenerator from GitHub:

install.packages("remotes")
remotes::install_github("darwin-eu/CodelistGenerator")

Example usage

library(dplyr)
library(CDMConnector)
library(CodelistGenerator)

For this example we’ll use the Eunomia dataset, which contains only a subset of the OMOP CDM vocabularies:

requireEunomia()
db <- DBI::dbConnect(duckdb::duckdb(), dbdir = eunomiaDir())
cdm <- cdmFromCon(db, 
                  cdmSchema = "main", 
                  writeSchema = "main", 
                  writePrefix = "cg_")

Exploring the OMOP CDM Vocabulary tables

OMOP CDM vocabularies are frequently updated, so it is worth checking which version of the vocabulary our Eunomia data uses:

vocabularyVersion(cdm = cdm)
#> [1] "v5.0 18-JAN-19"
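
Beyond the version, it can help to check which vocabularies and domains are present before building a codelist. The sketch below is illustrative only. It uses mockVocabRef(), the small example vocabulary database listed in the function index, so it does not depend on the Eunomia download. It assumes that the duckdb package is installed and that each function can be called with only the cdm argument. No output is shown because it depends on the vocabulary loaded.

```r
library(CodelistGenerator)

# Small example vocabulary database bundled with the package
# (assumption: the default backend needs the duckdb package installed)
cdm <- mockVocabRef()

# Version string of the vocabulary held in the cdm
vocab_version <- vocabularyVersion(cdm = cdm)

# Vocabularies and domains present in the vocabulary tables
vocabularies <- availableVocabularies(cdm = cdm)
domains <- availableDomains(cdm = cdm)

vocab_version
vocabularies
domains
```

Values returned by availableDomains() are the kind of strings passed to the domains argument of getCandidateCodes() later in this document.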

Vocabulary based codelists using CodelistGenerator

CodelistGenerator provides functions to extract codelists based on vocabulary hierarchies. One example is `getDrugIngredientCodes()`, which we can use to get the concept IDs that represent aspirin and diclofenac.

ing <- getDrugIngredientCodes(
  cdm = cdm,
  name = c("aspirin", "diclofenac"),
  nameStyle = "{concept_name}"
)
ing
#> 
#> - aspirin (2 codes)
#> - diclofenac (1 codes)
ing$aspirin
#> [1]  1112807 19059056
ing$diclofenac
#> [1] 1124300
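
When the exact ingredient names in a vocabulary are not known in advance, one option is to list them first and then request codes for a name taken from that list. The following is a minimal sketch under stated assumptions: it runs against the mockVocabRef() example database rather than Eunomia, it assumes availableDrugIngredients() accepts only the cdm argument and returns the names as a vector, and it picks the first name purely for illustration.

```r
library(CodelistGenerator)

# Example vocabulary database bundled with the package
# (assumption: requires the duckdb package)
cdm <- mockVocabRef()

# Names of the drug ingredients present in this vocabulary
ingredients <- availableDrugIngredients(cdm = cdm)
ingredients

# Codes for the first available ingredient, named after the ingredient
# (choosing the first element is arbitrary, for illustration only)
first_ingredient_codes <- getDrugIngredientCodes(
  cdm = cdm,
  name = ingredients[1],
  nameStyle = "{concept_name}"
)
first_ingredient_codes
```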

Systematic search using CodelistGenerator

CodelistGenerator can also run systematic searches of the vocabulary tables to support codelist development. Much like a systematic review, the idea is that, for a specified search strategy, CodelistGenerator identifies a set of concepts that may be relevant, and clinical experts then screen these to remove any irrelevant codes.

We can do a simple search for asthma:

asthma_codes1 <- getCandidateCodes(
  cdm = cdm,
  keywords = "asthma",
  domains = "Condition"
) 
asthma_codes1 |> 
  glimpse()
#> Rows: 2
#> Columns: 6
#> $ concept_id       <int> 317009, 4051466
#> $ found_from       <chr> "From initial search", "From initial search"
#> $ concept_name     <chr> "Asthma", "Childhood asthma"
#> $ domain_id        <chr> "Condition", "Condition"
#> $ vocabulary_id    <chr> "SNOMED", "SNOMED"
#> $ standard_concept <chr> "S", "S"

We may also want to exclude certain concepts as part of the search strategy. In this case we can pass the terms to exclude through the exclude argument:

asthma_codes2 <- getCandidateCodes(
  cdm = cdm,
  keywords = "asthma",
  exclude = "childhood",
  domains = "Condition"
) 
asthma_codes2 |> 
  glimpse()
#> Rows: 1
#> Columns: 6
#> $ concept_id       <int> 317009
#> $ found_from       <chr> "From initial search"
#> $ concept_name     <chr> "Asthma"
#> $ domain_id        <chr> "Condition"
#> $ vocabulary_id    <chr> "SNOMED"
#> $ standard_concept <chr> "S"
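
To see exactly which concepts the exclusion removed, the two results can be compared on concept_id. The sketch below uses only base R and rebuilds the two results from the output printed above as plain data frames, so it runs without a database connection. It illustrates the comparison and is not how the package performs it.

```r
# Concept IDs and names copied from the two search results printed above
asthma_codes1 <- data.frame(
  concept_id = c(317009L, 4051466L),
  concept_name = c("Asthma", "Childhood asthma")
)
asthma_codes2 <- data.frame(
  concept_id = 317009L,
  concept_name = "Asthma"
)

# Concepts returned by the first search but dropped by the exclusion term
removed_ids <- setdiff(asthma_codes1$concept_id, asthma_codes2$concept_id)
removed <- asthma_codes1[asthma_codes1$concept_id %in% removed_ids, ]
removed
```

The package also lists compareCodelists() for comparing the overlap between two sets of codes, which may be more convenient for larger results.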

Summarising code use

As well as functions for finding codes, we also have functions to summarise their use. Here, for example, we summarise the use of the asthma codes found by the first search above.

library(flextable)
asthma_code_use <- summariseCodeUse(
  newCodelist(list("asthma" = asthma_codes1$concept_id)),
  cdm = cdm
)
tableCodeUse(asthma_code_use, type = "flextable", style = "darwin")
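
To make concrete what summarising code use involves, the sketch below counts records and distinct persons per concept in a toy condition table using dplyr. The table and its values are hypothetical and made up for illustration; this is not the implementation used by summariseCodeUse(), only the general idea of counting by record and by person.

```r
library(dplyr)

# Hypothetical toy data standing in for a condition_occurrence table
condition_occurrence <- tibble(
  person_id = c(1, 1, 2, 2, 3, 3, 3),
  condition_concept_id = c(317009, 317009, 317009, 192671,
                           4051466, 4051466, 317009)
)

# Concept IDs from the asthma search shown earlier
asthma_ids <- c(317009, 4051466)

# Keep only asthma records, then count records and distinct persons per code
code_use <- condition_occurrence |>
  filter(condition_concept_id %in% asthma_ids) |>
  group_by(condition_concept_id) |>
  summarise(
    record_count = n(),
    person_count = n_distinct(person_id),
    .groups = "drop"
  )
code_use
```

In this toy table, concept 317009 appears in four records across three persons, while 4051466 appears in two records for a single person. The gap between the two counts is why record-level and person-level counts are usually reported separately.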

Monthly Downloads: 905
Version: 4.0.2
License: Apache License (>= 2)
Maintainer: Edward Burn
Last Published: January 19th, 2026

Functions in CodelistGenerator (4.0.2)

availableVocabularies: Get the vocabularies available in the cdm.
excludeConcepts: Exclude concepts from a codelist.
benchmarkCodelistGenerator: Run benchmark of CodelistGenerator analyses.
availableDrugIngredients: Get the names of all available drug ingredients.
byYearDoc: Helper for consistent documentation of byYear.
availableDoseUnits: Get available dose units.
availableDomains: Get the domains available in the cdm.
cdmDoc: Helper for consistent documentation of cdm.
availableDoseForms: Get the dose forms for drug concepts.
codesFromConceptSet
compareCodelists: Compare overlap between two sets of codes.
getDrugIngredientCodes: Get descendant codes of drug ingredients.
availableRouteCategories: Get available drug routes.
availableRelationshipIds: Get available relationships between concepts.
getMappings: Show mappings from non-standard vocabularies to standard.
headerDoc: Helper for consistent documentation of header.
hideDoc: Helper for consistent documentation of hide.
doseUnitDoc: Helper for consistent documentation of doseUnit.
.optionsDoc: Helper for consistent documentation of .options.
intersectCodelists: Generate a codelist from the intersection of different codelists. The generated codelist will come out in alphabetical order.
headerStrataDoc: Helper for consistent documentation of header.
reexports: Objects exported from other packages.
keepOriginalDoc: Helper for consistent documentation of keepOriginal.
minimumCountDoc: Helper for consistent documentation of minimumCount.
levelICD10Doc: Helper for consistent documentation of level.
hideStrataDoc: Helper for consistent documentation of hide.
associatedRelationshipIds: Get available relationships with concepts in a codelist.
associatedDrugIngredients: Get the names of drug ingredients associated with a codelist.
routeCategoryDoc: Helper for consistent documentation of routeCategory.
getATCCodes: Get the descendant codes of Anatomical Therapeutic Chemical (ATC) classification codes.
searchStrategy: Report the search strategy used to identify codes when using the getCandidateCodes() function.
standardConceptDoc: Helper for consistent documentation of standardConcept.
doseFormDoc: Helper for consistent documentation of doseForm.
stratifyByRouteCategory: Stratify a codelist by route category.
stratifyByDoseUnit: Stratify a codelist by dose unit.
subsetOnVocabulary: Subset a codelist to only those codes from a particular vocabulary.
stratifyByDoseForm: Stratify a codelist by dose form.
stratifyByDomain: Stratify a codelist by domain category.
doseFormToRoute: Table showing the route category associated with each dose form.
levelATCDoc: Helper for consistent documentation of level.
getCandidateCodes: Perform a systematic search to identify a candidate codelist using the OMOP CDM vocabulary tables.
keepOriginalDocSubset: Helper for consistent documentation of keepOriginal.
getDescendants: Get descendant codes for a given concept.
stratifyByBrand: Stratify a codelist by brand category.
subsetOnDoseForm: Subset a codelist to only those codes with a particular dose form.
stratifyByConcept: Stratify a codelist by the concepts included within it.
summariseAchillesCodeUse: Summarise code use from achilles counts.
summariseCodeUse: Summarise code use in patient-level data.
includeDescendantsDoc: Helper for consistent documentation of includeDescendants.
ingredientRangeDoc: Helper for consistent documentation of ingredientRange.
subsetOnDoseUnit: Subset a codelist to only those with a particular dose unit.
codelistNameDoc: Helper for consistent documentation of codelistNameDoc.
stratifyByVocabulary: Stratify a codelist by vocabulary.
subsetToCodesInUse: Filter a codelist to keep only the codes being used in patient records.
summariseCohortCodeUse: Summarise code use among a cohort in the cdm reference.
subsetOnDomain: Subset a codelist to only those codes from a particular domain.
tableAchillesCodeUse: Format the result of summariseAchillesCodeUse into a table.
tableCodeUse: Format the result of summariseCodeUse into a table.
tableOrphanCodes: Format the result of summariseOrphanCodes into a table.
tableStyleDoc: Helper for consistent documentation of style.
summariseOrphanCodes: Find orphan codes related to a codelist using achilles counts and, if available, PHOEBE concept recommendations.
unionCodelists: Generate a codelist from the union of different codelists. The generated codelist will come out in alphabetical order.
typeTableDoc: Helper for consistent documentation of type.
subsetOnIngredientRange: Subset a codelist to only those codes with a range of number of ingredients.
mockVocabRef: Generate example vocabulary database.
codesFromCohort: Get concept ids from JSON files containing cohort definitions.
subsetOnRouteCategory: Subset a codelist to only those with a particular route category.
typeNarrowDoc: Helper for consistent documentation of type.
countByDoc: Helper for consistent documentation of countBy.
typeBroadDoc: Helper for consistent documentation of type.
groupColumnDoc: Helper for consistent documentation of groupColumn.
nameStyleDoc: Helper for consistent documentation of nameStyle.
domainDoc: Helper for consistent documentation of domain.
groupColumnStrataDoc: Helper for consistent documentation of groupColumn.
tableCohortCodeUse: Format the result of summariseCohortCodeUse into a table.
vocabularyVersion: Get the available version of the vocabulary used in the cdm.
tableDoc: Helper for consistent documentation of table.
xDocCohort: Helper for consistent documentation of x where input can be codelist or cohort.
xDoc: Helper for consistent documentation of x.
associatedConceptClassIds: Get the concept classes associated with a codelist.
ageGroupDoc: Helper for consistent documentation of ageGroup.
asCodelistWithDetails: Coerce to a codelist with details.
asConceptSetExpression: Coerce to a concept set expression.
asCodelist: Coerce to a codelist.
associatedDoseUnits: Get available dose units.
associatedDoseForms: Get the dose forms associated with drug concepts in a codelist.
CodelistGenerator-package: CodelistGenerator: Identify Relevant Clinical Codes and Evaluate Their Use
associatedDomains: Get the domains associated with a codelist.
addConcepts: Add concepts to a codelist.
availableConceptClassIds: Get the available concept classes used in a given set of domains.
associatedVocabularies: Get the vocabularies associated with a codelist.
associatedRouteCategories: Get drug routes associated with a codelist.
availableATC: Get the names of all available Anatomical Therapeutic Chemical (ATC) classification codes.
byConceptDoc: Helper for consistent documentation of byConcept.
bySexDoc: Helper for consistent documentation of bySex.