simaerep

Simulate adverse event reporting in clinical trials with the goal of detecting under-reporting sites.

Monitoring of Adverse Event (AE) reporting in clinical trials is important for patient safety. We use bootstrap-based simulation to assign an AE under-reporting probability to each site in a clinical trial. The method is inspired by the ‘infer’ R package and Allen Downey’s blog article: “There is only one test!”.
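As a rough illustration of the bootstrap idea, the base R sketch below compares the mean AE count of one site against repeated random draws of equally many patients from a study pool. It is a conceptual sketch, not the simaerep implementation: the data, the helper name `prob_lower()`, and the number of replications are all assumptions made for this example.

```r
# Conceptual sketch only; not the simaerep implementation.
set.seed(42)

# Hypothetical AE counts per patient at a fixed evaluation visit
ae_study <- rpois(200, lambda = 5)  # pool of study patients
ae_site <- c(1, 2, 0, 3, 2)         # patients at the site under review

# Share of bootstrap draws whose mean AE count is as low as
# or lower than the observed site mean
prob_lower <- function(ae_site, ae_study, r = 1000) {
  site_mean <- mean(ae_site)
  boot_means <- replicate(
    r,
    mean(sample(ae_study, size = length(ae_site), replace = TRUE))
  )
  mean(boot_means <= site_mean)
}

p <- prob_lower(ae_site, ae_study)
p
```

A small value means that randomly drawn patients rarely report as few AEs as the site does, which is the signal an under-reporting check looks for.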

Installation

CRAN

install.packages("simaerep")

Development Version

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("openpharma/simaerep")

IMPALA

simaerep has been published as a work product of the Inter-Company Quality Analytics (IMPALA) consortium. IMPALA aims to engage with Health Authority inspectors on defining guiding principles for the use of advanced analytics to complement, enhance and accelerate current QA practices. simaerep was initially developed at Roche and is currently being evaluated by other companies across the industry to complement their quality assurance activities (see testimonials).

Publications

Koneswarakantha, B., Adyanthaya, R., Emerson, J. et al. An Open-Source R Package for Detection of Adverse Events Under-Reporting in Clinical Trials: Implementation and Validation by the IMPALA (Inter coMPany quALity Analytics) Consortium. Ther Innov Regul Sci 58, 591–599 (2024). https://doi.org/10.1007/s43441-024-00631-8

Koneswarakantha, B., Barmaz, Y., Ménard, T. et al. Follow-up on the Use of Advanced Analytics for Clinical Quality Assurance: Bootstrap Resampling to Enhance Detection of Adverse Event Under-Reporting. Drug Saf (2020). https://doi.org/10.1007/s40264-020-01011-5

Resources

Validation Report

Download as PDF from the release section. The report is generated using thevalidatoR.

{gsm.simaerep}

We have created an extension, gsm.simaerep, so that simaerep event reporting probabilities can be added to good statistical monitoring (gsm.core) reports.

Application

Recommended Threshold: aerep$dfeval$prob_low_prob_ur: 0.95


suppressPackageStartupMessages(library(simaerep))
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(knitr))

set.seed(1)

df_visit <- sim_test_data_study(
  n_pat = 1000, # number of patients in study
  n_sites = 100, # number of sites in study
  frac_site_with_ur = 0.05, # fraction of sites under-reporting
  ur_rate = 0.4, # rate of under-reporting
  ae_per_visit_mean = 0.5 # mean AE per patient visit
)

df_visit$study_id <- "A"

df_visit %>%
  select(study_id, site_number, patnum, visit, n_ae) %>%
  head(25) %>%
  knitr::kable()
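| study_id | site_number | patnum  | visit | n_ae |
|:---------|:------------|:--------|------:|-----:|
| A        | S0001       | P000001 |     1 |    0 |
| A        | S0001       | P000001 |     2 |    1 |
| A        | S0001       | P000001 |     3 |    1 |
| A        | S0001       | P000001 |     4 |    2 |
| A        | S0001       | P000001 |     5 |    3 |
| A        | S0001       | P000001 |     6 |    3 |
| A        | S0001       | P000001 |     7 |    3 |
| A        | S0001       | P000001 |     8 |    3 |
| A        | S0001       | P000001 |     9 |    3 |
| A        | S0001       | P000001 |    10 |    3 |
| A        | S0001       | P000001 |    11 |    3 |
| A        | S0001       | P000001 |    12 |    3 |
| A        | S0001       | P000001 |    13 |    4 |
| A        | S0001       | P000001 |    14 |    4 |
| A        | S0001       | P000001 |    15 |    4 |
| A        | S0001       | P000001 |    16 |    6 |
| A        | S0001       | P000001 |    17 |    6 |
| A        | S0001       | P000002 |     1 |    0 |
| A        | S0001       | P000002 |     2 |    0 |
| A        | S0001       | P000002 |     3 |    0 |
| A        | S0001       | P000002 |     4 |    0 |
| A        | S0001       | P000002 |     5 |    0 |
| A        | S0001       | P000002 |     6 |    0 |
| A        | S0001       | P000002 |     7 |    0 |
| A        | S0001       | P000002 |     8 |    1 |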

aerep <- simaerep(df_visit)

plot(aerep, study = "A")

The left panel shows mean AE reporting per site (light blue and dark blue lines) against mean AE reporting of the entire study (golden line). Dark blue lines mark sites whose AE under-reporting probability crossed the 95% threshold. Grey dots in the left panel indicate sites that were picked for the single-site plots in the right panel, where sites are shown in descending order of AE under-reporting probability and grey lines denote the cumulative AE count of individual patients. Numbers in the upper left corner give the ratio of patients used for the analysis to the total number of patients. Patients who have not been on the study long enough to reach the evaluation point (visit_med75, see introduction) are ignored.
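Applying the recommended threshold amounts to filtering the site-level results on prob_low_prob_ur. The sketch below uses a small hand-made data frame in place of the results stored in the simaerep object; the values, the object name `df_eval`, and the use of `>=` for "crossing" the threshold are assumptions for illustration.

```r
# Hypothetical site-level results; in practice these come from the simaerep object
df_eval <- data.frame(
  site_number = c("S0001", "S0002", "S0003", "S0004"),
  prob_low_prob_ur = c(0.99, 0.42, 0.95, 0.10),
  stringsAsFactors = FALSE
)

threshold <- 0.95

# Keep sites at or above the threshold, highest probability first
flagged <- df_eval[df_eval$prob_low_prob_ur >= threshold, ]
flagged <- flagged[order(-flagged$prob_low_prob_ur), ]

flagged$site_number
```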

Optimized Statistical Performance

Following the recommendation of our latest performance benchmark, statistical performance can be increased by using the inframe algorithm without multiplicity correction.

Note that the plot is noisier because no patients are excluded and only a few patients contribute to the event count at later visits.

Recommended Threshold: aerep$dfeval$prob_low_prob_ur: 0.99

aerep <- simaerep(
  df_visit,
  inframe = TRUE,
  visit_med75 = FALSE,
  mult_corr = FALSE
)

plot(aerep, study = "A")
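To make the effect of the multiplicity correction more concrete, the base R sketch below adjusts a handful of made-up site probabilities with the Benjamini-Hochberg method via `p.adjust()`. That the correction is Benjamini-Hochberg is suggested by the package's p_adjust_bh_inframe function; treating the under-reporting probability as one minus the (adjusted) probability is an assumption of this sketch, and the input values are hypothetical.

```r
# Hypothetical bootstrapped probabilities of a lower site mean, one per site
prob_low <- c(0.001, 0.008, 0.04, 0.30, 0.75)

# Benjamini-Hochberg adjustment across sites (base R)
prob_low_adj <- p.adjust(prob_low, method = "BH")

# Under-reporting probability with and without the correction
# (assumed here to be one minus the respective probability)
prob_ur_corrected <- 1 - prob_low_adj
prob_ur_uncorrected <- 1 - prob_low

data.frame(prob_low, prob_low_adj, prob_ur_corrected, prob_ur_uncorrected)
```

The adjusted values are never smaller than the raw ones, so skipping the correction yields higher under-reporting probabilities for the same data, which is consistent with pairing it with a stricter threshold.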

In Database Calculation

The inframe algorithm uses only dbplyr-compatible table operations and can be executed within a database backend, as we demonstrate here using duckdb.

However, instead of providing an integer for the r parameter, we need to provide an in-database table that has as many rows as the desired number of replications in the simulation.

con <- DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:")
df_r <- tibble(rep = seq(1, 1000))

dplyr::copy_to(con, df_visit, "visit")
dplyr::copy_to(con, df_r, "r")

tbl_visit <- tbl(con, "visit")
tbl_r <- tbl(con, "r")


aerep <- simaerep(
  tbl_visit,
  r = tbl_r,
  inframe = TRUE,
  visit_med75 = FALSE,
  mult_corr = FALSE
)

plot(aerep, df_visit = tbl_visit)
#> study = NULL, defaulting to study:A

DBI::dbDisconnect(con)
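One way to see why a table of replications is useful: a database cannot loop r times, but it can cross join a table with r rows against the data and then draw random rows by joining on a random index. The base R sketch below mimics that pattern with `merge()`. It only illustrates the general technique and makes no claim about how simaerep implements it; all table names, column names, and sizes are hypothetical.

```r
set.seed(1)

# Hypothetical patient pool with an explicit row index
df_pat <- data.frame(
  idx = 1:6,
  n_ae = c(0, 2, 1, 4, 3, 1)
)

# One row per replication, playing the role of the "r" table
df_r <- data.frame(rep = seq_len(3))

# One row per patient slot at the (hypothetical) site under review
df_slots <- data.frame(slot = seq_len(2))

# Cross join: merge() with by = NULL returns the Cartesian product
df_draw <- merge(df_r, df_slots, by = NULL)

# Random pool index per row, the table-style analogue of
# sampling with replacement
df_draw$idx <- floor(runif(nrow(df_draw)) * nrow(df_pat)) + 1

# Join the draws to the pool and aggregate per replication
df_sim <- merge(df_draw, df_pat, by = "idx")
rep_means <- aggregate(n_ae ~ rep, data = df_sim, FUN = mean)
rep_means
```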

Monthly Downloads: 224

Version: 0.7.0

License: MIT + file LICENSE

Maintainer: Bjoern Koneswarakantha

Last Published: April 9th, 2025

Functions in simaerep (0.7.0)

plot_visit_med75: Plot patient visits against visit_med75.
poiss_test_site_ae_vs_study_ae: Poisson test for vector with site AEs vs vector with study AEs.
plot_dots: Plots AE per site as dots.
purrr_bar: Execute a purrr or furrr function with a progress bar.
plot_study: Plot AE development of study and sites, highlighting at-risk sites.
plot_sim_example: Plot simulation example.
max_rank: Calculate max rank.
is_simaerep: Is simaerep class.
prune_to_visit_med75_inframe: Prune visits to visit_med75 using table operations.
plot_sim_examples: Plot multiple simulation examples.
orivisit: Create orivisit object.
get_visit_med75: Get visit_med75.
prob_lower_site_ae_vs_study_ae: Calculate bootstrapped probability for obtaining a lower site mean AE number.
prep_for_sim: Prepare data for simulation.
is_orivisit: Is orivisit class.
p_adjust_bh_inframe: Benjamini-Hochberg p-value correction using table operations.
sim_test_data_patient: Simulate patient AE reporting test data.
sim_studies: Simulate studies.
sim_test_data_events: Simulate test data events.
sim_inframe: Calculate prob_lower for study sites using table operations.
sim_scenario: Simulate single scenario.
sim_sites: Calculate prob_lower and poisson.test p-value for study sites.
sim_after_prep: Start simulation after preparation.
sim_ur: Simulate under-reporting.
sim_test_data_study: Simulate study test data.
sim_test_data_portfolio: Simulate portfolio test data.
sim_ur_scenarios: Simulate under-reporting scenarios.
simaerep: Create simaerep object.
with_progress_cnd
site_aggr: Aggregate from visit to site level.
simaerep_inframe: Simulate in dataframe.
eval_sites: Evaluate sites.
get_portf_perf: Get portfolio performance.
check_df_visit: Integrity check for df_visit.
get_config: Get portfolio configuration.
get_ecd_values: Get empirical cumulative distribution values of pval or prob_lower.
exp_implicit_missing_visits: Expose implicitly missing visits.
get_pat_pool_config: Configure study patient pool by site parameters.
get_site_mean_ae_dev: Get site mean AE development.
aggr_duplicated_visits: Aggregate duplicated visits.
get_legend: Replace cowplot::get_legend to silence the warning "Multiple components found; returning the first one. To return all, use `return_all = TRUE`".
plot.simaerep: Plot AE under-reporting simulation results.
pat_pool: Create a study-specific patient pool for sampling.
pat_aggr: Aggregate visit to patient level.
%>%: Pipe operator.