sdtm.oak

An EDC (Electronic Data Capture) and data-standard agnostic solution that enables the pharmaceutical programming community to develop CDISC (Clinical Data Interchange Standards Consortium) SDTM (Study Data Tabulation Model) datasets in R. The reusable algorithms concept in 'sdtm.oak' provides a framework for modular programming and can also automate SDTM creation based on a standard SDTM specification.

Installation

The package is available from CRAN and can be installed with:

install.packages("sdtm.oak")

You can install the development version of 'sdtm.oak' from GitHub with:

# install.packages("remotes")
remotes::install_github("pharmaverse/sdtm.oak")

Challenges with SDTM at the Industry Level

  • Raw Data Structure: Data from different EDC systems come in varying structures, with different variable names, dataset names, etc.

  • Varying Data Collection Standards: Despite the availability of CDASH (Clinical Data Acquisition Standards Harmonization), pharmaceutical companies still create different eCRFs using CDASH standards.

Due to the differences in raw data structures and data collection standards, it may seem impossible to develop a common approach for programming SDTM datasets.

Goal

'sdtm.oak' aims to address this issue by providing an EDC-agnostic, standards-agnostic solution. It is an open-source R package that offers a framework for the modular programming of SDTM in R. In future releases, it will also strive to automate the creation of SDTM datasets using a metadata-driven approach based on standard SDTM specifications.

Scope

Our goal is to use 'sdtm.oak' to program most of the domains specified in SDTMIG (Study Data Tabulation Model Implementation Guide: Human Clinical Trials) and SDTMIG-AP (Study Data Tabulation Model Implementation Guide: Associated Persons). This R package is based on the core concept of algorithms, implemented as functions capable of carrying out the SDTM mappings for any domain listed in the CDISC SDTMIG and across different versions of the SDTM IGs. The design of these functions allows users to specify a raw dataset and one or more variable names as parameters, making the package EDC (Electronic Data Capture) agnostic. As long as the raw dataset and the variable names exist, 'sdtm.oak' will execute the SDTM mapping using the selected function. It is important to note that 'sdtm.oak' may not handle sponsor-specific details related to managing metadata for LAB tests, unit conversions, and coding information, as many companies have unique business processes. In subsequent releases, we will strive to automate SDTM creation using a metadata-driven approach based on a standard SDTM specification format.
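
The idea that a mapping function needs only a raw dataset and a variable name can be illustrated with a minimal base R sketch. The helper map_raw_var() and the two toy raw datasets below are hypothetical and are not part of 'sdtm.oak'; the package's own mapping functions (for example assign_no_ct, listed in the function index) have their own signatures, which are documented in the package reference.

```r
# Hypothetical helper, NOT part of sdtm.oak: copy one raw variable into a
# target SDTM variable, whatever the raw dataset or raw variable is called.
map_raw_var <- function(raw_dat, raw_var, tgt_var) {
  # Fail early if the raw dataset or the raw variable does not exist.
  stopifnot(is.data.frame(raw_dat), raw_var %in% names(raw_dat))
  out <- data.frame(raw_dat[[raw_var]], stringsAsFactors = FALSE)
  names(out) <- tgt_var
  out
}

# Two toy raw extracts imitating different EDC naming conventions
# (dataset and variable names are made up for illustration).
edc_a <- data.frame(MDRAW = c("ASPIRIN", "IBUPROFEN"), stringsAsFactors = FALSE)
edc_b <- data.frame(med_name = "PARACETAMOL", stringsAsFactors = FALSE)

# The same function maps both sources to the CMTRT target variable.
cm_a <- map_raw_var(edc_a, raw_var = "MDRAW", tgt_var = "CMTRT")
cm_b <- map_raw_var(edc_b, raw_var = "med_name", tgt_var = "CMTRT")
cm <- rbind(cm_a, cm_b)
print(cm)
```

Because the raw dataset and variable name are passed as arguments rather than hard-coded, the same function body serves both sources; only the call changes from one EDC system to the next.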

Road Map

This Release: With the V0.1.0 release of 'sdtm.oak', users can create the majority of the SDTM domains. Domains that are NOT in scope for the V0.1.0 release are DM (Demographics), Trial Design Domains, SV (Subject Visits), SE (Subject Elements), RELREC (Related Records), Associated Person domains, and the EPOCH variable across all domains.

Subsequent Releases: We plan to develop the following features in subsequent releases.

  • Functions required to derive reference date variables in the DM domain.
  • Metadata-driven automation based on the standardized SDTM specification.
  • Functions required to program the EPOCH Variable.
  • Functions to derive standard units and results based on metadata.

References and Documentation

  • Please go to the Algorithms article to learn about algorithms.
  • Please go to Create Events Domain to learn about the step-by-step process to create an Events domain.
  • Please go to Create Findings Domain to learn about the step-by-step process to create a Findings domain.
  • Please go to Path to Automation to learn how the foundational release sets the stage for automation.

Feedback

We ask users to follow the approach described above and try 'sdtm.oak' to map any of the SDTM domains supported in this release. Users can also use the test data in the package to become familiar with the concepts before working with their own data. Please get in touch with us using one of the recommended approaches listed below:

Acknowledgments

We thank the contributors and authors of the package. We also thank CDISC COSA for sponsoring 'sdtm.oak'. Additionally, we would like to sincerely thank the volunteers from Roche, Pfizer, GSK, Vertex, and Merck for their valuable input as integral members of the CDISC COSA - OAK leadership team.

Monthly Downloads: 616
Version: 0.1.0
License: Apache License (>= 2)
Maintainer: Rammprasad Ganapathy
Last Published: September 3rd, 2024

Functions in sdtm.oak (0.1.0)

  • dataset_oak_vignette: Output a Dataset in a Vignette in the sdtm.oak Format
  • condition_add: Add filtering tags to a data set
  • derive_study_day: derive_study_day performs study day calculation
  • derive_blfl: Derive Baseline Flag or Last Observation Before Exposure Flag
  • derive_seq: Derive the sequence number (--SEQ) variable
  • harcode: Derive an SDTM variable with a hardcoded value
  • dttm_fmt_to_regex: Convert a parsed date/time format to regex
  • eval_conditions: Evaluate conditions
  • fmt_rg: Regexps for date/time components
  • dtc_formats: Date/time collection formats
  • is_cnd_df: Check if a data frame is a conditioned data frame
  • index_for_recode: Determine Indices for Recoding
  • format_iso8601: Convert date/time components into ISO8601 format
  • dtc_timepart: Extract time part from ISO 8601 date/time variable
  • iso8601_na: Convert NA to "-"
  • is_seq_name: Is it a --SEQ variable name
  • iso8601_sec: Format as ISO8601 seconds
  • iso8601_mon: Format as a ISO8601 month
  • get_cnd_df_cnd_sum: Get the summary of the conditioning vector from a conditioned data frame
  • %.>%: Explicit Dot Pipe
  • dtc_datepart: Extract date part from ISO8601 date/time variable
  • iso8601_year: Format as a ISO8601 four-digit year
  • months_abb_regex: Regex for months' abbreviations
  • generate_oak_id_vars: A function to generate oak_id_vars
  • get_cnd_df_cnd: Get the conditioning vector from a conditioned data frame
  • domain_example: Find the path to an example SDTM domain file
  • read_ct_spec_example: Read an example controlled terminology specification
  • read_domain_example: Read an example SDTM domain
  • oak_id_vars: Raw dataset keys
  • iso8601_truncate: Truncate a partial ISO8601 date-time
  • parse_dttm_: Parse a date, time, or date-time
  • zero_pad_whole_number: Convert an integer to a zero-padded character vector
  • rm_cnd_df: Remove the cnd_df class from a data frame
  • regex_or: Utility function to assemble a regex of alternative patterns
  • tbl_sum.cnd_df: Conditioned tibble header print method
  • find_int_gap: Find gap intervals in integer sequences
  • yy_to_yyyy: Convert two-digit to four-digit years
  • mutate.cnd_df: Mutate method for conditioned data frames
  • reg_matches: regmatches() with NA
  • pseq: Parallel sequence generation
  • new_cnd_df: Create a data frame with filtering tags
  • read_ct_spec: Read in a controlled terminology
  • fmt_cmp: Regexps for date/time format components
  • iso8601_two_digits: Format as a ISO8601 two-digit number
  • str_to_anycase: Generate case insensitive regexps
  • sdtm_assign: Derive an SDTM variable
  • recode: Recode values
  • problems: Retrieve date/time parsing problems
  • sdtm_join: SDTM join
  • parse_dttm_fmt_: Parse a date/time format
  • sdtm_hardcode: Derive an SDTM variable with a hardcoded value
  • sbj_vars: Subject-level key variables
  • sdtm.oak-package: sdtm.oak: SDTM Data Transformation Engine
  • add_problems: Add ISO 8601 parsing problems
  • assign_datetime: Derive an ISO8601 date-time variable
  • assert_ct_spec: Assert a controlled terminology specification
  • any_problems: Detect problems with the parsing of date/times
  • assert_dtc_fmt: Assert date time character formats
  • complete_capture_matrix: Complete a capture matrix
  • assert_dtc_format: Assert dtc format
  • coalesce_capture_matrices: Coalesce capture matrices
  • assert_capture_matrix: Assert capture matrix
  • ctl_new_rowid_pillar.cnd_df: Conditioned tibble pillar print method
  • ct_spec_example: Find the path to an example controlled terminology file
  • assign_no_ct: Derive an SDTM variable
  • contains_oak_id_vars: Does a vector contain the raw dataset key variables?
  • assert_ct_clst: Assert a codelist code
  • ct_spec_vars: Controlled terminology variables
  • create_iso8601: Convert date or time collected values to ISO 8601
  • ct_map: Recode according to controlled terminology
  • ct_mappings: Controlled terminology mappings