CohortGenerator

CohortGenerator is part of HADES.

Introduction

This R package contains functions for generating cohorts and cohort subsets using data in the OMOP Common Data Model (CDM).

Features

  • Create a cohort table and generate cohorts against an OMOP CDM.
  • Get the count of subjects and events in a cohort.
  • Define subsets of cohorts using different criteria or other cohorts.
  • Create cohorts using templated SQL.
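To make the distinction between subject and event counts concrete, here is a minimal base R sketch over a toy cohort table. The data are hypothetical, the output column names (`events`, `subjects`) are chosen for illustration only, and this is not how the package computes its counts; it only shows that one subject can contribute several cohort entries.

```r
# Toy cohort table (hypothetical data) using the standard OMOP cohort columns
cohort <- data.frame(
  cohort_definition_id = c(1, 1, 1, 2),
  subject_id = c(101, 101, 102, 103),
  cohort_start_date = as.Date(c("2020-01-01", "2021-06-01", "2020-03-15", "2019-11-30")),
  cohort_end_date = as.Date(c("2020-02-01", "2021-07-01", "2020-04-15", "2019-12-31"))
)

# Events: number of rows (cohort entries) per cohort
events <- table(cohort$cohort_definition_id)

# Subjects: number of distinct persons per cohort
subjects <- tapply(
  cohort$subject_id,
  cohort$cohort_definition_id,
  function(x) length(unique(x))
)

toyCounts <- data.frame(
  cohortId = as.integer(names(events)),
  events = as.integer(events),
  subjects = as.integer(subjects)
)
print(toyCounts)
```

In this toy table, subject 101 enters cohort 1 twice, so cohort 1 has three events but only two subjects.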

Example

# First construct a cohort definition set: an empty
# data frame that will hold the cohorts to generate
cohortsToCreate <- CohortGenerator::createEmptyCohortDefinitionSet()

# Fill the cohort set using cohorts included in this
# package as an example
cohortJsonFiles <- list.files(path = system.file("testdata/name/cohorts", package = "CohortGenerator"), full.names = TRUE)
# seq_along() avoids iterating over c(1, 0) if no files are found
for (i in seq_along(cohortJsonFiles)) {
  cohortJsonFileName <- cohortJsonFiles[i]
  cohortName <- tools::file_path_sans_ext(basename(cohortJsonFileName))
  # Here we read in the JSON in order to create the SQL
  # using [CirceR](https://ohdsi.github.io/CirceR/)
  # If you have your JSON and SQL stored differently, you can
  # modify this to read your JSON/SQL files however you require
  cohortJson <- readChar(cohortJsonFileName, file.info(cohortJsonFileName)$size)
  cohortExpression <- CirceR::cohortExpressionFromJson(cohortJson)
  cohortSql <- CirceR::buildCohortQuery(cohortExpression, options = CirceR::createGenerateOptions(generateStats = FALSE))
  cohortsToCreate <- rbind(cohortsToCreate, data.frame(cohortId = i,
                                                       cohortName = cohortName,
                                                       sql = cohortSql,
                                                       stringsAsFactors = FALSE))
}

# Generate the cohort set against Eunomia.
connectionDetails <- Eunomia::getEunomiaConnectionDetails()

# Create the cohort tables to hold the cohort generation results
cohortTableNames <- CohortGenerator::getCohortTableNames(cohortTable = "my_cohort_table")
CohortGenerator::createCohortTables(connectionDetails = connectionDetails,
                                    cohortDatabaseSchema = "main",
                                    cohortTableNames = cohortTableNames)

# Generate the cohorts.
# cohortsGenerated contains a list of the cohortIds
# successfully generated against the CDM
cohortsGenerated <- CohortGenerator::generateCohortSet(connectionDetails = connectionDetails,
                                                       cdmDatabaseSchema = "main",
                                                       cohortDatabaseSchema = "main",
                                                       cohortTableNames = cohortTableNames,
                                                       cohortDefinitionSet = cohortsToCreate)

# Get the cohort counts
cohortCounts <- CohortGenerator::getCohortCounts(connectionDetails = connectionDetails,
                                                 cohortDatabaseSchema = "main",
                                                 cohortTable = cohortTableNames$cohortTable)
print(cohortCounts)
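
The example above fills the cohort definition set with three columns: cohortId, cohortName and sql. For readers who want to see that shape without CirceR or a database, the following base R sketch builds a data frame with the same three columns. The cohort names and SQL strings are hypothetical placeholders, not real cohort logic, and the result is a plain data frame rather than something validated by the package.

```r
# Hypothetical inputs: placeholder names and SQL strings (not real cohort logic)
cohortNames <- c("exampleCohortA", "exampleCohortB")
cohortSqls <- c("SELECT 1;", "SELECT 2;")

# Build one single-row data frame per cohort, then bind them in one step
rows <- lapply(seq_along(cohortNames), function(i) {
  data.frame(
    cohortId = i,
    cohortName = cohortNames[i],
    sql = cohortSqls[i],
    stringsAsFactors = FALSE
  )
})
toyCohortSet <- do.call(rbind, rows)

str(toyCohortSet)
```

Binding all rows at once avoids growing the data frame inside a loop, which matters little for a handful of cohorts but keeps the pattern tidy for larger sets.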

Technology

CohortGenerator is an R package.

System requirements

Requires R (version 4.1.0 or higher).
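
A quick way to check this requirement from an R session is sketched below. It uses only base R; the variable names are illustrative.

```r
# Minimum R version stated in the system requirements
minimumVersion <- "4.1.0"

# getRversion() returns a version object that can be compared to a string
versionOk <- getRversion() >= minimumVersion

if (versionOk) {
  message("R ", getRversion(), " meets the minimum version ", minimumVersion)
} else {
  warning("R ", getRversion(), " is older than the required ", minimumVersion)
}
```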

Getting Started

  1. Make sure your R environment is properly configured. This means that Java must be installed. See these instructions for how to configure your R environment.

  2. In R, use the following command to download and install CohortGenerator:

    remotes::install_github("OHDSI/CohortGenerator")
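
Before running the install command, it can help to confirm that the prerequisites from the steps above are in place. The sketch below is a rough heuristic under stated assumptions: `checkPrerequisites` is a hypothetical helper, not part of CohortGenerator, and finding a `java` executable on the PATH does not guarantee that R's Java bindings are configured correctly.

```r
# Hypothetical helper: reports whether a java executable is on the PATH
# and whether the remotes package (used by install_github) is installed
checkPrerequisites <- function() {
  c(
    java = unname(nzchar(Sys.which("java"))),
    remotes = requireNamespace("remotes", quietly = TRUE)
  )
}

status <- checkPrerequisites()
print(status)

if (!all(status)) {
  message("Missing prerequisites: ", paste(names(status)[!status], collapse = ", "))
}
```

If `remotes` is reported as missing, it can be installed from CRAN with `install.packages("remotes")` before retrying.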

User Documentation

Documentation can be found on the package website.

PDF versions of the documentation are also available:

Support

  • Developer questions/comments/feedback: OHDSI Forum
  • We use the GitHub issue tracker for all bugs/issues/enhancements

Contributing

Read here how you can contribute to this package.

License

CohortGenerator is licensed under Apache License 2.0.

Development

CohortGenerator is being developed in RStudio.

Development status

CohortGenerator is actively being used in several studies and is ready for use.

Install

install.packages('CohortGenerator')

Monthly Downloads

634

Version

1.1.0

License

Apache License

Maintainer

Anthony Sena

Last Published

March 3rd, 2026

Functions in CohortGenerator (1.1.0)

computeCohortAttrition

Compute cohort attrition from inclusion rule statistics
addExcludeOnIndexSubsetDefinition

Add exclude on index subset definition
createCohortSubset

Create Cohort Subset Operator
addIndicationSubsetDefinition

Add Indication Subset Definition
createAtcCohortTemplateDefinition

Create ATC Cohort Template Definition
addRestrictionSubsetDefinition

Add Restriction Subset Definition
computeChecksum

Computes the checksum for a value
addUnionCohortDefinition

Add union cohort definition to cohort definition set
checkAndFixCohortDefinitionSetDataTypes

Check if a cohort definition set is using the proper data types
createLimitSubsetOperator

Create Limit Subset Operator
createEmptyNegativeControlOutcomeCohortSet

Create an empty negative control outcome cohort set
createEmptyCohortDefinitionSet

Create an empty cohort definition set
addSqlCohortDefinition

Add an SQL cohort definition
createCohortSubsetDefinition

Create Subset Definition
createCohortSubsetOperator

A definition of subset functions to be applied to a set of cohorts
createLimitSubset

Create Limit Subset Operator
createCohortTables

Create cohort tables
createCohortTemplateDefintion

Create Cohort Template Definition
dropCohortStatsTables

Drop cohort statistics tables
createDemographicSubsetOperator

Create Demographic Subset Operator
createUnionCohortTemplate

Create cohort template to union multiple cohorts
createResultsDataModel

Create the results data model tables on a database server.
createDemographicSubset

Create Demographic Subset Operator
createSnomedCohortTemplateDefinition

Create SNOMED Cohort Template Definition
createRxNormCohortTemplateDefinition

Create RxNorm Cohort Template Definition
createSubsetCohortWindow

Create a relative time window for cohort subset operations
getDataMigrator

Get database migrations instance
exportCohortSubsetStatsTables

Export cohort subset statistics tables to the file system
exportCohortStatsTables

Export the cohort statistics tables to the file system
generateCohortSet

Generate a set of cohorts
generateNegativeControlOutcomeCohorts

Generate a set of negative control outcome cohorts
getIndicationSubsetDefinitionIds

Get Indication Subset Definition Ids
getLastGeneratedCohortChecksums

Get last generated cohort checksums
getCohortTableNames

Used to get a list of cohort table names to use when creating the cohort tables
getCohortValidationCounts

Validate cohort
getCohortInclusionRules

Get Cohort Inclusion Rules from a cohort definition set
getCohortStats

Get Cohort Inclusion Stats Table Data
insertInclusionRuleNames

Used to insert the inclusion rule names from a cohort definition set when generating cohorts that include cohort statistics
getCohortCounts

Count the cohort(s)
getCohortDefinitionSet

Get a cohort definition set
uploadResults

Upload results to the database server.
migrateDataModel

Migrate Data model
isSnakeCase

Used to check if a string is in snake case
saveCohortSubsetDefinition

Save cohort subset definitions to json
sampleCohortDefinitionSet

Sample Cohort Definition Set
saveCohortDefinitionSet

Save the cohort definition set to the file system
omopCdmDrugExposure

OMOP CDM Drug Exposure Sample Data
isCamelCase

Used to check if a string is in lower camel case
getExcludeOnIndexSubsetDefinitionIds

Get Exclude On Index Subset Definition Ids
omopCdmPerson

OMOP CDM Person Sample Data
isCohortDefinitionSet

Is the data.frame a cohort definition set?
getResultsDataModelSpecifications

Get specifications for CohortGenerator results data model
getRestrictionSubsetDefinitionIds

Get Restriction Subset Definition Ids
readCsv

Used to read a .csv file
runCohortGeneration

Run a cohort generation and export results
getSubsetDefinitions

Get cohort subset definitions from a cohort definition set
getTemplateDefinitions

Extract template definitions from a cohort definition set
writeCsv

Used to write a .csv file
isFormattedForDatabaseUpload

Is the data.frame formatted for uploading to a database?
LimitSubsetOperator

Limit Subset Operator
CohortGenerator-package

CohortGenerator: Cohort Generation for the OMOP Common Data Model
CohortSubsetOperator

Cohort Subset Operator
SubsetCohortWindow

Time Window For Cohort Subset Operator
DemographicSubsetOperator

Demographic Subset Operator
addCohortSubsetDefinition

Add cohort subset definition to a cohort definition set
addCohortTemplateDefintion

Add Cohort template definition to cohort set
CohortSubsetDefinition

Cohort Subset Definition
CohortTemplateDefinition

Class for automating the creation of bulk cohorts
SubsetOperator

Abstract base class for subsets.