CohortGenerator

CohortGenerator is part of HADES.

Introduction

This R package contains functions for generating cohorts and cohort subsets using data in the CDM.

Features

  • Create a cohort table and generate cohorts against an OMOP CDM.
  • Get the count of subjects and events in a cohort.
  • Define subsets of cohorts using different criteria or other cohorts.
  • Create cohorts using templated SQL.

Example

# First construct a cohort definition set: an empty 
# data frame with the cohorts to generate
cohortsToCreate <- CohortGenerator::createEmptyCohortDefinitionSet()

# Fill the cohort set using cohorts included in this
# package as an example
cohortJsonFiles <- list.files(path = system.file("testdata/name/cohorts", package = "CohortGenerator"), full.names = TRUE)
for (i in seq_along(cohortJsonFiles)) {
  cohortJsonFileName <- cohortJsonFiles[i]
  cohortName <- tools::file_path_sans_ext(basename(cohortJsonFileName))
  # Here we read in the JSON in order to create the SQL
  # using [CirceR](https://ohdsi.github.io/CirceR/)
  # If you have your JSON and SQL stored differently, you can
  # modify this to read your JSON/SQL files however you require
  cohortJson <- readChar(cohortJsonFileName, file.info(cohortJsonFileName)$size)
  cohortExpression <- CirceR::cohortExpressionFromJson(cohortJson)
  cohortSql <- CirceR::buildCohortQuery(cohortExpression, options = CirceR::createGenerateOptions(generateStats = FALSE))
  cohortsToCreate <- rbind(cohortsToCreate, data.frame(cohortId = i,
                                                       cohortName = cohortName, 
                                                       sql = cohortSql,
                                                       stringsAsFactors = FALSE))
}

# Generate the cohort set against Eunomia. 
# cohortsGenerated contains a list of the cohortIds 
# successfully generated against the CDM
connectionDetails <- Eunomia::getEunomiaConnectionDetails()

# Create the cohort tables to hold the cohort generation results
cohortTableNames <- CohortGenerator::getCohortTableNames(cohortTable = "my_cohort_table")
CohortGenerator::createCohortTables(connectionDetails = connectionDetails,
                                    cohortDatabaseSchema = "main",
                                    cohortTableNames = cohortTableNames)
# Generate the cohorts
cohortsGenerated <- CohortGenerator::generateCohortSet(connectionDetails = connectionDetails,
                                                       cdmDatabaseSchema = "main",
                                                       cohortDatabaseSchema = "main",
                                                       cohortTableNames = cohortTableNames,
                                                       cohortDefinitionSet = cohortsToCreate)

# Get the cohort counts
cohortCounts <- CohortGenerator::getCohortCounts(connectionDetails = connectionDetails,
                                                 cohortDatabaseSchema = "main",
                                                 cohortTable = cohortTableNames$cohortTable)
print(cohortCounts)
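
The subset feature listed under Features can be sketched in the same style. This is a minimal illustration rather than a definitive recipe: it assumes the sample cohort set bundled under testdata/name in the package, and the definition name, operator name, definitionId, and 365-day window are arbitrary values chosen for the example.

```r
library(CohortGenerator)

# Load the sample cohort definition set that ships with the package
# (assumption: the testdata/name layout used in the example above).
cohortDefinitionSet <- getCohortDefinitionSet(
  settingsFileName = "testdata/name/Cohorts.csv",
  jsonFolder = "testdata/name/cohorts",
  sqlFolder = "testdata/name/sql/sql_server",
  cohortFileNameFormat = "%s",
  cohortFileNameValue = c("cohortName"),
  packageName = "CohortGenerator",
  verbose = FALSE
)

# Describe a subset: keep each subject's first-ever cohort entry, and only
# if it has at least 365 days of prior observation (illustrative values).
subsetDefinition <- createCohortSubsetDefinition(
  name = "First ever entry with 365 days of prior observation",
  definitionId = 1,
  subsetOperators = list(
    createLimitSubset(
      name = "Prior observation of at least 365 days",
      priorTime = 365,
      followUpTime = 0,
      limitTo = "firstEver"
    )
  )
)

# Apply the subset definition to every cohort in the set. The result holds
# the original cohorts plus the subset cohorts derived from them.
cohortsWithSubsets <- addCohortSubsetDefinition(cohortDefinitionSet, subsetDefinition)
print(cohortsWithSubsets[, c("cohortId", "cohortName")])
```

The resulting set should be usable as the cohortDefinitionSet argument of generateCohortSet(), in place of cohortsToCreate, so that the subset cohorts are generated alongside their parents.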

Technology

CohortGenerator is an R package.

System requirements

Requires R (version 4.1.0 or higher).

Getting Started

  1. Make sure your R environment is properly configured. This means that Java must be installed. See these instructions for how to configure your R environment.

  2. In R, use the following commands to download and install CohortGenerator:

    remotes::install_github("OHDSI/CohortGenerator")
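
Before running the install command, it can help to confirm the prerequisites named above. The sketch below uses base R only; checkPrerequisites is a hypothetical helper written for this illustration, not part of CohortGenerator. Finding a java executable on the PATH is only a rough proxy for a correctly configured Java setup.

```r
# Hypothetical helper (not part of CohortGenerator): report whether the
# prerequisites mentioned in this section appear to be met on this machine.
checkPrerequisites <- function(minimumRVersion = "4.1.0") {
  c(
    # R itself must be at least the documented minimum version
    rVersionOk = getRversion() >= minimumRVersion,
    # Rough proxy for "Java is installed": a java executable is on the PATH
    javaOnPath = Sys.which("java")[[1]] != "",
    # install_github() comes from the remotes package
    remotesInstalled = requireNamespace("remotes", quietly = TRUE)
  )
}

prerequisites <- checkPrerequisites()
print(prerequisites)

if (all(prerequisites)) {
  message("Prerequisites look fine; the install command above can be run.")
} else {
  message("Not yet satisfied: ",
          paste(names(prerequisites)[!prerequisites], collapse = ", "))
}
```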

User Documentation

Documentation can be found on the package website.

PDF versions of the documentation are also available:

Support

  • Developer questions/comments/feedback: OHDSI Forum
  • We use the GitHub issue tracker for all bugs/issues/enhancements

Contributing

Read here how you can contribute to this package.

License

CohortGenerator is licensed under Apache License 2.0

Development

CohortGenerator is being developed in RStudio.

Development status

CohortGenerator is actively being used in several studies and is ready for use.

Install

install.packages('CohortGenerator')

Monthly Downloads

431

Version

1.0.1

License

Apache License

Maintainer

Anthony Sena

Last Published

November 17th, 2025

Functions in CohortGenerator (1.0.1)

  • CohortSubsetOperator: Cohort Subset Operator
  • SubsetOperator: Abstract base class for subsets.
  • SubsetCohortWindow: Time Window For Cohort Subset Operator
  • DemographicSubsetOperator: Demographic Subset Operator
  • addCohortSubsetDefinition: Add cohort subset definition to a cohort definition set
  • CohortGenerator-package: CohortGenerator: Cohort Generation for the OMOP Common Data Model
  • LimitSubsetOperator: Limit Subset Operator
  • CohortSubsetDefinition: Cohort Subset Definition
  • addCohortTemplateDefintion: Add Cohort template definition to cohort set
  • CohortTemplateDefinition: Class for automating the creation of bulk cohorts
  • addUnionCohortDefinition: Add union cohort definition to cohort definition set
  • addExcludeOnIndexSubsetDefinition: Add exclude on index subset definition
  • addRestrictionSubsetDefinition: Add Restriction Subset Definition
  • addIndicationSubsetDefinition: Add Indication Subset Definition
  • checkAndFixCohortDefinitionSetDataTypes: Check if a cohort definition set is using the proper data types
  • createCohortSubset: Create Cohort Subset Operator
  • computeChecksum: Computes the checksum for a value
  • createCohortSubsetDefinition: Create Subset Definition
  • createAtcCohortTemplateDefinition: Create ATC Cohort Template Definition
  • createCohortTemplateDefintion: Create Cohort Template Definition
  • addSqlCohortDefinition: Add an sql cohort definition
  • createLimitSubset: Create Limit Subset Operator
  • createEmptyNegativeControlOutcomeCohortSet: Create an empty negative control outcome cohort set
  • createLimitSubsetOperator: Create Limit Subset Operator
  • createResultsDataModel: Create the results data model tables on a database server.
  • createDemographicSubsetOperator: Create createDemographicSubset Subset operator
  • createEmptyCohortDefinitionSet: Create an empty cohort definition set
  • createDemographicSubset: Create Demographic Subset Operator
  • dropCohortStatsTables: Drop cohort statistics tables
  • getCohortStats: Get Cohort Inclusion Stats Table Data
  • getCohortInclusionRules: Get Cohort Inclusion Rules from a cohort definition set
  • exportCohortStatsTables: Export the cohort statistics tables to the file system
  • createSnomedCohortTemplateDefinition: Create SNOMED Cohort Template Definition
  • createRxNormCohortTemplateDefinition: Create Rx Norm Cohort Template Definition
  • generateCohortSet: Generate a set of cohorts
  • createCohortSubsetOperator: A definition of subset functions to be applied to a set of cohorts
  • createCohortTables: Create cohort tables
  • getIndicationSubsetDefinitionIds: Get Indication Subset Definition Ids
  • getCohortDefinitionSet: Get a cohort definition set
  • getLastGeneratedCohortChecksums: Get last generated cohort checksums
  • getCohortCounts: Count the cohort(s)
  • isCohortDefinitionSet: Is the data.frame a cohort definition set?
  • isFormattedForDatabaseUpload: Is the data.frame formatted for uploading to a database?
  • getCohortValidationCounts: Validate cohort
  • getSubsetDefinitions: Get cohort subset definitions from a cohort definition set
  • getCohortTableNames: Used to get a list of cohort table names to use when creating the cohort tables
  • getTemplateDefinitions: Extract template definitions from a cohort definition set
  • readCsv: Used to read a .csv file
  • omopCdmPerson: OMOP CDM Person Sample Data
  • omopCdmDrugExposure: OMOP CDM Drug Exposure Sample Data
  • createSubsetCohortWindow: Create a relative time window for cohort subset operations
  • getRestrictionSubsetDefinitionIds: Get Restriction Subset Definition Ids
  • createUnionCohortTemplate: Create cohort template to union multiple cohorts
  • insertInclusionRuleNames: Used to insert the inclusion rule names from a cohort definition set when generating cohorts that include cohort statistics
  • runCohortGeneration: Run a cohort generation and export results
  • isSnakeCase: Used to check if a string is in snake case
  • getResultsDataModelSpecifications: Get specifications for CohortGenerator results data model
  • migrateDataModel: Migrate Data model
  • saveCohortSubsetDefinition: Save cohort subset definitions to json
  • writeCsv: Used to write a .csv file
  • generateNegativeControlOutcomeCohorts: Generate a set of negative control outcome cohorts
  • getExcludeOnIndexSubsetDefinitionIds: Get Exclude On Index Subset Definition Ids
  • getDataMigrator: Get database migrations instance
  • saveCohortDefinitionSet: Save the cohort definition set to the file system
  • sampleCohortDefinitionSet: Sample Cohort Definition Set
  • uploadResults: Upload results to the database server.
  • isCamelCase: Used to check if a string is in lower camel case