CohortGenerator

CohortGenerator is part of HADES.

Introduction

This R package contains functions for generating cohorts and cohort subsets using data in the CDM.

Features

  • Create a cohort table and generate cohorts against an OMOP CDM.
  • Get the count of subjects and events in a cohort.
  • Define subsets of cohorts using different criteria or other cohorts.
  • Create cohorts using templated SQL.
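
As a rough sketch of the subsetting feature listed above, the following builds a subset definition that keeps only each person's first cohort entry with at least 365 days of prior observation, then attaches it to a cohort definition set. The function names come from the package's function index. The argument names (`priorTime`, `followUpTime`, `limitTo`, `definitionId`, `subsetOperators`) and the sample-cohort file locations passed to `getCohortDefinitionSet` are assumptions based on the package vignettes and should be checked against the installed version. The helper name `buildSubsetExample` is hypothetical.

```r
# Hypothetical helper: wraps the sketch so that nothing runs unless
# CohortGenerator is actually installed.
buildSubsetExample <- function() {
  # Load the sample cohorts shipped with the package.
  # File locations are assumed from the package vignettes.
  cohortDefinitionSet <- CohortGenerator::getCohortDefinitionSet(
    settingsFileName = "testdata/name/Cohorts.csv",
    jsonFolder = "testdata/name/cohorts",
    sqlFolder = "testdata/name/sql/sql_server",
    cohortFileNameFormat = "%s",
    cohortFileNameValue = c("cohortName"),
    packageName = "CohortGenerator",
    verbose = FALSE
  )

  # One operator: first-ever entry with at least 365 days of prior observation
  # (argument names are assumptions, see the note above).
  limitOperator <- CohortGenerator::createLimitSubset(
    name = "First entry, 365 days prior observation",
    priorTime = 365,
    followUpTime = 0,
    limitTo = "firstEver"
  )

  # A subset definition is a named, numbered list of operators.
  subsetDefinition <- CohortGenerator::createCohortSubsetDefinition(
    name = "First entry with 365 days prior observation",
    definitionId = 1,
    subsetOperators = list(limitOperator)
  )

  # Arguments are passed positionally: the definition set, then the subset definition.
  CohortGenerator::addCohortSubsetDefinition(cohortDefinitionSet, subsetDefinition)
}

if (requireNamespace("CohortGenerator", quietly = TRUE)) {
  cohortsWithSubsets <- buildSubsetExample()
  print(cohortsWithSubsets[, c("cohortId", "cohortName")])
}
```

If the calls behave as assumed, the returned definition set should contain the original cohorts plus one derived subset cohort per original cohort, and can be passed to `generateCohortSet` like any other cohort definition set.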

Example

# First construct a cohort definition set: an empty 
# data frame with the cohorts to generate
cohortsToCreate <- CohortGenerator::createEmptyCohortDefinitionSet()

# Fill the cohort set using cohorts included in this
# package as an example
cohortJsonFiles <- list.files(path = system.file("testdata/name/cohorts", package = "CohortGenerator"), full.names = TRUE)
for (i in seq_along(cohortJsonFiles)) {
  cohortJsonFileName <- cohortJsonFiles[i]
  cohortName <- tools::file_path_sans_ext(basename(cohortJsonFileName))
  # Here we read in the JSON in order to create the SQL
  # using [CirceR](https://ohdsi.github.io/CirceR/)
  # If you have your JSON and SQL stored differently, you can
  # modify this to read your JSON/SQL files however you require
  cohortJson <- readChar(cohortJsonFileName, file.info(cohortJsonFileName)$size)
  cohortExpression <- CirceR::cohortExpressionFromJson(cohortJson)
  cohortSql <- CirceR::buildCohortQuery(cohortExpression, options = CirceR::createGenerateOptions(generateStats = FALSE))
  cohortsToCreate <- rbind(cohortsToCreate, data.frame(cohortId = i,
                                                       cohortName = cohortName, 
                                                       sql = cohortSql,
                                                       stringsAsFactors = FALSE))
}

# Generate the cohort set against Eunomia. 
# cohortsGenerated contains a list of the cohortIds 
# successfully generated against the CDM
connectionDetails <- Eunomia::getEunomiaConnectionDetails()

# Create the cohort tables to hold the cohort generation results
cohortTableNames <- CohortGenerator::getCohortTableNames(cohortTable = "my_cohort_table")
CohortGenerator::createCohortTables(connectionDetails = connectionDetails,
                                    cohortDatabaseSchema = "main",
                                    cohortTableNames = cohortTableNames)
# Generate the cohorts
cohortsGenerated <- CohortGenerator::generateCohortSet(connectionDetails = connectionDetails,
                                                       cdmDatabaseSchema = "main",
                                                       cohortDatabaseSchema = "main",
                                                       cohortTableNames = cohortTableNames,
                                                       cohortDefinitionSet = cohortsToCreate)

# Get the cohort counts
cohortCounts <- CohortGenerator::getCohortCounts(connectionDetails = connectionDetails,
                                                 cohortDatabaseSchema = "main",
                                                 cohortTable = cohortTableNames$cohortTable)
print(cohortCounts)
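
The comments in the loop above note that JSON and SQL can be stored and read in other ways. As a minimal sketch of that idea, a cohort definition set row can also carry hand-written SQL instead of CirceR output. The `@target_database_schema`, `@target_cohort_table`, `@target_cohort_id`, and `@cdm_database_schema` placeholders follow the parameter names used in Circe-generated SQL. That `generateCohortSet` fills them in the same way for hand-written SQL is an assumption to verify, and the cohort ID, cohort name, and concept ID below are illustrative only.

```r
# Hand-written cohort SQL: one row per person, starting and ending on the
# date of their first drug exposure for a single concept.
# The concept ID is illustrative only.
customSql <- paste(
  "DELETE FROM @target_database_schema.@target_cohort_table",
  "WHERE cohort_definition_id = @target_cohort_id;",
  "INSERT INTO @target_database_schema.@target_cohort_table",
  "  (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date)",
  "SELECT @target_cohort_id, person_id,",
  "  MIN(drug_exposure_start_date), MIN(drug_exposure_start_date)",
  "FROM @cdm_database_schema.drug_exposure",
  "WHERE drug_concept_id = 1118084",
  "GROUP BY person_id;",
  sep = "\n"
)

# Same three columns as the rows built in the loop above
customCohorts <- data.frame(
  cohortId = 1001,
  cohortName = "First exposure (hand-written SQL)",
  sql = customSql,
  stringsAsFactors = FALSE
)

str(customCohorts)
```

Under these assumptions, `customCohorts` could be appended to `cohortsToCreate` with `rbind` before calling `generateCohortSet`, since it has the same columns as the rows created in the loop.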

Technology

CohortGenerator is an R package.

System requirements

Requires R (version 4.1.0 or higher).

Getting Started

  1. Make sure your R environment is properly configured. This means that Java must be installed. See these instructions for how to configure your R environment.

  2. In R, use the following commands to download and install CohortGenerator:

    remotes::install_github("OHDSI/CohortGenerator")
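
Before installing, it can help to confirm the prerequisites listed above. The sketch below uses only base R. The helper name `checkEnvironment` is hypothetical, and looking for a `java` executable on the PATH is just one heuristic for "Java is installed", not the check the package itself performs.

```r
# Hypothetical helper: reports whether the stated prerequisites appear to be met.
checkEnvironment <- function(minimumR = "4.1.0") {
  list(
    # R version requirement from the "System requirements" section
    rVersionOk = getRversion() >= minimumR,
    # Heuristic only: is a java executable discoverable on the PATH?
    javaOnPath = unname(nzchar(Sys.which("java"))),
    # remotes is needed for the install_github() command
    remotesInstalled = requireNamespace("remotes", quietly = TRUE)
  )
}

status <- checkEnvironment()
print(status)
```

A `FALSE` for `javaOnPath` does not necessarily mean Java is missing, since it may be installed but not on the PATH, so treat the result as a hint rather than a verdict.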

User Documentation

Documentation can be found on the package website.

PDF versions of the documentation are also available.

Support

  • Developer questions/comments/feedback: OHDSI Forum
  • We use the GitHub issue tracker for all bugs/issues/enhancements

Contributing

Read here how you can contribute to this package.

License

CohortGenerator is licensed under Apache License 2.0.

Development

CohortGenerator is being developed in RStudio.

Development status

CohortGenerator is actively being used in several studies and is ready for use.

Install

install.packages('CohortGenerator')

Package details

  • Version: 1.0.2
  • License: Apache License
  • Monthly Downloads: 431
  • Maintainer: Anthony Sena
  • Last Published: February 10th, 2026

Functions in CohortGenerator (1.0.2)

  • addSqlCohortDefinition: Add an sql cohort definition
  • createCohortSubsetOperator: A definition of subset functions to be applied to a set of cohorts
  • createLimitSubsetOperator: Create Limit Subset Operator
  • createResultsDataModel: Create the results data model tables on a database server.
  • createDemographicSubset: Create Demographic Subset Operator
  • createSubsetCohortWindow: Create a relative time window for cohort subset operations
  • createUnionCohortTemplate: Create cohort template to union multiple cohorts
  • getResultsDataModelSpecifications: Get specifications for CohortGenerator results data model
  • getRestrictionSubsetDefinitionIds: Get Restriction Subset Definition Ids
  • generateCohortSet: Generate a set of cohorts
  • generateNegativeControlOutcomeCohorts: Generate a set of negative control outcome cohorts
  • createCohortTemplateDefintion: Create Cohort Template Definition
  • getCohortTableNames: Used to get a list of cohort table names to use when creating the cohort tables
  • getCohortValidationCounts: Validate cohort
  • createRxNormCohortTemplateDefinition: Create Rx Norm Cohort Template Definition
  • createCohortTables: Create cohort tables
  • getCohortCounts: Count the cohort(s)
  • isSnakeCase: Used to check if a string is in snake case
  • createSnomedCohortTemplateDefinition: Create SNOMED Cohort Template Definition
  • migrateDataModel: Migrate Data model
  • createDemographicSubsetOperator: Create createDemographicSubset Subset operator
  • getCohortDefinitionSet: Get a cohort definition set
  • getSubsetDefinitions: Get cohort subset definitions from a cohort definition set
  • createEmptyCohortDefinitionSet: Create an empty cohort definition set
  • getCohortInclusionRules: Get Cohort Inclusion Rules from a cohort definition set
  • isCamelCase: Used to check if a string is in lower camel case
  • getCohortStats: Get Cohort Inclusion Stats Table Data
  • insertInclusionRuleNames: Used to insert the inclusion rule names from a cohort definition set when generating cohorts that include cohort statistics
  • getIndicationSubsetDefinitionIds: Get Indication Subset Definition Ids
  • getTemplateDefinitions: Extract template definitions from a cohort definition set
  • createEmptyNegativeControlOutcomeCohortSet: Create an empty negative control outcome cohort set
  • dropCohortStatsTables: Drop cohort statistics tables
  • createLimitSubset: Create Limit Subset Operator
  • omopCdmPerson: OMOP CDM Person Sample Data
  • getExcludeOnIndexSubsetDefinitionIds: Get Exclude On Index Subset Definition Ids
  • readCsv: Used to read a .csv file
  • exportCohortStatsTables: Export the cohort statistics tables to the file system
  • runCohortGeneration: Run a cohort generation and export results
  • getDataMigrator: Get database migrations instance
  • omopCdmDrugExposure: OMOP CDM Drug Exposure Sample Data
  • sampleCohortDefinitionSet: Sample Cohort Definition Set
  • getLastGeneratedCohortChecksums: Get last generated cohort checksums
  • saveCohortDefinitionSet: Save the cohort definition set to the file system
  • writeCsv: Used to write a .csv file
  • uploadResults: Upload results to the database server.
  • saveCohortSubsetDefinition: Save cohort subset definitions to json
  • isCohortDefinitionSet: Is the data.frame a cohort definition set?
  • isFormattedForDatabaseUpload: Is the data.frame formatted for uploading to a database?
  • addCohortSubsetDefinition: Add cohort subset definition to a cohort definition set
  • addCohortTemplateDefintion: Add Cohort template definition to cohort set
  • CohortSubsetDefinition: Cohort Subset Definition
  • CohortGenerator-package: CohortGenerator: Cohort Generation for the OMOP Common Data Model
  • SubsetOperator: Abstract base class for subsets.
  • CohortTemplateDefinition: Class for automating the creation of bulk cohorts
  • DemographicSubsetOperator: Demographic Subset Operator
  • CohortSubsetOperator: Cohort Subset Operator
  • LimitSubsetOperator: Limit Subset Operator
  • SubsetCohortWindow: Time Window For Cohort Subset Operator
  • createCohortSubset: Create Cohort Subset Operator
  • addUnionCohortDefinition: Add union cohort definition to cohort definition set
  • checkAndFixCohortDefinitionSetDataTypes: Check if a cohort definition set is using the proper data types
  • addRestrictionSubsetDefinition: Add Restriction Subset Definition
  • createAtcCohortTemplateDefinition: Create ATC Cohort Template Definition
  • addIndicationSubsetDefinition: Add Indication Subset Definition
  • computeChecksum: Computes the checksum for a value
  • addExcludeOnIndexSubsetDefinition: Add exclude on index subset definition
  • createCohortSubsetDefinition: Create Subset Definition