Modified Detecting Deviating Cells (MDDC) algorithm for adverse event signal identification. The Monte Carlo (MC) method is used for cutoff selection in the second step of the algorithm.
mddc_mc(
  contin_table,
  quantile = 0.95,
  rep = 10000,
  exclude_same_drug_class = TRUE,
  col_specific_cutoff = TRUE,
  separate = TRUE,
  if_col_cor = FALSE,
  cor_lim = 0.8,
  num_cores = 2,
  seed = NULL
)
A list with the following components:
mc_pval
returns the p values for each cell in the second step,
computed using the Monte Carlo method (Algorithm 3 of Liu et al. (2024)).
fisher_pval
returns the p values for each cell in step 2 of
the algorithm, calculated using the Monte Carlo method for cells with a count
greater than five, and Fisher's exact test for cells with a count less than or
equal to five.
mc_signal
returns the signals for cells with a count greater than five,
identified in the second step by the MC method. 1 indicates a signal, 0 a
non-signal.
fisher_signal
returns the signals for cells with a count
less than or equal to five, identified in the second step by
Fisher's exact test. 1 indicates a signal, 0 a non-signal.
corr_signal_pval
returns the p values for each cell in the
contingency table in the fifth step, when the \(r_{ij}\) values are mapped
back to the standard normal distribution.
corr_signal_adj_pval
returns the Benjamini-Hochberg adjusted p
values for each cell in the fifth step. The user can decide whether to use
corr_signal_pval or corr_signal_adj_pval, and which p-value threshold to apply
(for example, 0.05). Please see the example below.
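The choice between raw and Benjamini-Hochberg adjusted p values can be illustrated with base R alone. The sketch below uses a small made-up p-value matrix (`pval`, a hypothetical stand-in for `corr_signal_pval`) and assumes the adjustment is applied across all cells of the table; whether the package adjusts over the whole table or in some other grouping is not stated here.

```r
# Hypothetical 2 x 3 matrix of step-5 p values (illustrative numbers only).
pval <- matrix(c(0.001, 0.20, 0.03, 0.04, 0.50, 0.012), nrow = 2,
               dimnames = list(c("AE1", "AE2"), c("DrugA", "DrugB", "DrugC")))

# p.adjust() returns a plain vector, so the matrix shape is restored by hand.
adj_pval <- matrix(p.adjust(pval, method = "BH"),
                   nrow = nrow(pval), dimnames = dimnames(pval))

# Threshold at 0.05; multiplying by 1 turns TRUE/FALSE into 1/0.
signal_raw <- (pval < 0.05) * 1
signal_adj <- (adj_pval < 0.05) * 1
```

The adjusted p values are never smaller than the raw ones, so thresholding them at the same level flags the same cells or fewer.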
contin_table
A data matrix of an \(I\) x \(J\) contingency table
with row (adverse event) and column (drug or vaccine) names.
Please first check the input contingency table using the function
check_and_fix_contin_table().
quantile
In the second step of the algorithm, the quantile of the null distribution obtained via the MC method, used as the threshold for identifying cells with high values of the standardized Pearson residuals. Default is 0.95.
rep
In the second step, the number of Monte Carlo replications in the MC method. Default is 10000.
exclude_same_drug_class
Logical. In the second step, when applying Fisher's
exact test to cells with a count less than six, a 2 by 2 contingency table
needs to be constructed. This argument controls whether that construction
excludes other drugs or vaccines in the same class as the drug or vaccine
of interest. Default is TRUE.
col_specific_cutoff
Logical. In the second step of the algorithm,
whether to apply the MC method to the standardized Pearson residuals
of the entire table, or within each drug or vaccine column.
Default is TRUE, that is, within each drug or vaccine
column (column-specific cutoff). FALSE indicates applying the MC method
to the residuals of the entire table.
separate
Logical. In the second step of the algorithm, whether to
separate the standardized Pearson residuals for the zero cells and non-zero
cells and apply the MC method to each group separately, or to all cells
together. Default is TRUE.
if_col_cor
Logical. In the third step of the algorithm, whether to use
column (drug or vaccine) correlation or row (adverse event) correlation.
Default is FALSE, that is, using the adverse event correlation.
TRUE indicates using drug or vaccine correlation.
cor_lim
A numeric value in (0, 1). In the third step, the correlation threshold used to select "connected" adverse events. Default is 0.8.
num_cores
Number of cores used to parallelize the MDDC MC algorithm. Default is 2.
seed
An optional integer to set the seed for reproducibility. If NULL, no seed is set.
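To make the roles of the quantile, the number of replications, and the column-specific cutoff concrete, here is a minimal base-R sketch of a Monte Carlo cutoff on standardized Pearson residuals. It is not the package's implementation: the helper `std_pearson_res()`, the toy table, and the use of `r2dtable()` (independence with fixed margins) as the null sampler are all assumptions made for illustration, and Algorithm 3 of Liu et al. (2024) may sample and summarize differently. The zero/non-zero split and the Fisher branch for small counts are not shown.

```r
# Hypothetical helper: standardized Pearson residuals of an I x J count matrix.
std_pearson_res <- function(tab) {
  n <- sum(tab)
  p_row <- rowSums(tab) / n
  p_col <- colSums(tab) / n
  expected <- n * outer(p_row, p_col)
  (tab - expected) / sqrt(expected * outer(1 - p_row, 1 - p_col))
}

set.seed(1)
# Toy contingency table (illustrative numbers only).
tab <- matrix(c(20, 5, 3,
                8, 30, 4,
                2, 6, 25), nrow = 3, byrow = TRUE,
              dimnames = list(paste0("AE", 1:3), paste0("Drug", 1:3)))

n_rep <- 200            # plays the role of the replication count
quantile_level <- 0.95  # plays the role of the quantile threshold

# Assumed null model: tables with the same margins under independence.
sim_tables <- r2dtable(n_rep, rowSums(tab), colSums(tab))

# For each simulated table, keep the largest residual in every column.
# Result: one row per column of `tab`, one column per replicate.
max_by_col <- vapply(sim_tables,
                     function(s) apply(std_pearson_res(s), 2, max),
                     numeric(ncol(tab)))

# Column-specific cutoffs: one quantile per drug column.
cutoff_col <- apply(max_by_col, 1, quantile, probs = quantile_level)

# Single table-wide cutoff: quantile of the overall maximum.
cutoff_table <- quantile(apply(max_by_col, 2, max), probs = quantile_level)

# Flag cells whose observed residual exceeds its column's cutoff.
signal <- sweep(std_pearson_res(tab), 2, cutoff_col, ">") * 1
```

In this sketch the table-wide cutoff is never below the largest column-specific cutoff, so the column-specific version flags at least as many cells.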
Liu, A., Mukhopadhyay, R., and Markatou, M. (2024). MDDC: An R and Python package for adverse event identification in pharmacovigilance data. arXiv preprint. arXiv:2410.01168
library(MDDC)
# using statin49 data set as an example
data(statin49)
# apply the mddc_mc
mc_res <- mddc_mc(statin49)
# signals identified in step 2 using MC method
signal_step2 <- mc_res$mc_signal
# signals identified in step 5 by considering AE correlations
# In this example, cells with p values less than 0.05 are
# identified as signals
signal_step5 <- (mc_res$corr_signal_pval < 0.05) * 1
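A 0/1 signal matrix such as `signal_step2` or `signal_step5` is often easier to read as a list of flagged adverse event and drug pairs. The sketch below shows one way to do that in base R on a toy matrix (`sig`, made up for illustration); the NA entry is included only to show that missing cells, if any occur in the real output, are skipped.

```r
# Toy 0/1 signal matrix standing in for signal_step2 or signal_step5.
sig <- matrix(c(1, 0, NA, 0, 0, 1), nrow = 3,
              dimnames = list(c("AE1", "AE2", "AE3"), c("DrugA", "DrugB")))

# Row and column positions of the flagged cells; NA cells are ignored.
idx <- which(sig == 1, arr.ind = TRUE)

# One row per flagged (adverse event, drug) pair.
signal_pairs <- data.frame(
  adverse_event = rownames(sig)[idx[, "row"]],
  drug = colnames(sig)[idx[, "col"]],
  stringsAsFactors = FALSE
)
```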