Overview
Most sequential analysis is based on asymptotic results. This package contains functions for the exact calculation of critical values, statistical power,
expected time to signal when the null is rejected and the maximum sample size needed when the null is not rejected. This is done for Poisson type data
with a Wald type upper boundary, which is flat with respect to the likelihood ratio function, and a predetermined upper limit on the sample size. For a
desired statistical power, it is also possible to calculate the latter. The motivation for this package is post-market near real-time drug and vaccine
safety surveillance, where the goal is to detect rare but serious safety problems as early as possible, in many cases after only a handful of
adverse events. There are also other application areas.
The basis for this package is the MaxSPRT statistic (Kulldorff et al., 2011), which is a variant of Wald's Sequential Probability Ratio Test (SPRT)
(Wald, 1945, 1947). MaxSPRT uses a composite alternative hypothesis, an upper boundary to reject the null hypothesis when there are more events than expected,
no lower boundary, and an upper limit on the sample size at which time the sequential analyses end without rejecting the null.
MaxSPRT was developed for post-market vaccine safety surveillance as part of the Vaccine Safety Datalink project run by the Centers for Disease
Control and Prevention.
Let $C_t$ be the random variable that counts the number of events up to time t. Suppose that, under
the null hypothesis, $C_t$ has a Poisson distribution with mean $\mu_t$, where $\mu_t$ is a known function
reflecting the population at risk. Under the alternative hypothesis, suppose that $C_t$ has a Poisson
distribution with mean $RR \mu_t$, where "RR" is the unknown increased relative risk due to the vaccine. With $c_t$ denoting the observed number of events up to time t, the MaxSPRT statistic defined
in terms of the log likelihood ratio is given by:
$$LLR_t=(\mu_t-c_t)+c_t \log(c_t/\mu_t),$$
when $c_t \geq \mu_t$, and $LLR_t=0$ otherwise. For continuous sequential analysis, the test statistic $LLR_t$ is monitored at all times t > 0.
The sequential analysis ends, and $H_0$ is rejected,
if and when $LLR_t \geq CV$, where CV is the critical value. If $\mu_t$ reaches the maximum sample size, SS, before that happens, the sequential analysis ends without rejecting the null hypothesis. "SS" is defined a
priori by the user in order to achieve the desired statistical power.
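As a minimal sketch of the statistic above, and not the package's own code, the following Python function evaluates $LLR_t$ for an observed count and a cumulative expected count. The name `max_sprt_llr` and its arguments are hypothetical, chosen only for illustration.

```python
import math


def max_sprt_llr(c_t: int, mu_t: float) -> float:
    """Poisson MaxSPRT log likelihood ratio.

    c_t  : observed number of events up to time t
    mu_t : expected number of events up to time t under the null (must be > 0)
    """
    if mu_t <= 0:
        raise ValueError("mu_t must be positive")
    # The statistic is zero unless more events were observed than expected.
    if c_t <= mu_t:
        return 0.0
    return (mu_t - c_t) + c_t * math.log(c_t / mu_t)
```

The null would be rejected as soon as `max_sprt_llr(c_t, mu_t)` is at least the critical value CV, provided `mu_t` has not yet reached SS.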
If the first event occurs sufficiently early, the sequential analysis may end with the null hypothesis rejected after a single event.
There is an option to require a minimum number of observed events, $c_t=M$, before the null can be rejected. Setting M in the range [3,6]
is often a good choice (Kulldorff and Silva, 2012). If there is a delay until the sequential analysis starts, but it continues continuously
thereafter, there is an option for that as well, requiring a minimum number $\mu_t=D$ of expected events before the null can be rejected.
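The effect of M and D on the stopping rule can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the package's implementation: the function names, the argument names, and the critical value used in the usage note are all hypothetical. It relies on the fact that, for a fixed count above the expected count, $LLR_t$ decreases as $\mu_t$ grows, so the boundary can only be crossed when an event arrives or at the moment the delay D expires.

```python
import bisect
import math


def llr(c_t, mu_t):
    """Poisson MaxSPRT log likelihood ratio (mu_t assumed > 0)."""
    if c_t <= mu_t:
        return 0.0
    return (mu_t - c_t) + c_t * math.log(c_t / mu_t)


def first_signal(event_mus, cv, min_events=1, min_expected=0.0,
                 sample_size=math.inf):
    """Return (mu_t, c_t) at the first rejection, or None if there is none.

    event_mus    : cumulative expected counts mu_t at which each event occurred
    cv           : critical value (hypothetical input, not computed here)
    min_events   : M, minimum observed events before the null can be rejected
    min_expected : D, minimum expected events before the null can be rejected
    sample_size  : SS, upper limit on mu_t
    """
    event_mus = sorted(event_mus)
    for mu_k in event_mus:
        # Events arriving before the delay D are first evaluated at mu_t = D.
        mu_eval = max(mu_k, min_expected)
        if mu_eval > sample_size:
            break
        # Number of events observed up to and including mu_eval.
        c = bisect.bisect_right(event_mus, mu_eval)
        if c >= min_events and llr(c, mu_eval) >= cv:
            return (mu_eval, c)
    return None
```

For example, with events at expected counts `[0.01, 0.5, 0.6, 0.7]` and an arbitrary boundary `cv=3.0`, the default settings signal on the very first event, while `min_events=3` postpones the signal until the fourth event.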
With continuous sequential analysis, investigators can repeatedly analyze the data as often as they want, ensuring
that the overall probability of falsely rejecting the null hypothesis at any time during the analysis
is controlled at the desired nominal significance level (Wald, 1945, 1947). Continuous sequential methods are suitable for real-time or near real-time
monitoring. When data is only analyzed intermittently, group sequential methods are used instead (Jennison and Turnbull, 1999).
The data is then analyzed at regular or irregular discrete time intervals after a certain amount of data is accessible.
Group sequential statistical methods are commonly used in clinical trials, where a trial may be
stopped early due to either efficacy or unexpected adverse events (Jennison and Turnbull, 1999).
The same test statistic, $LLR_t$, is used for group sequential analyses (Silva and Kulldorff, 2012).
The times when $LLR_t$ is evaluated can be defined in several ways,
using regular or irregular time intervals that are referenced by calendar period, sample size or some scale involving the distribution of the data.
In this first version of the package, the group sequential analysis must be conducted with a constant expected number of adverse events between
looks at the accumulated data. In other words, $LLR_t$ is compared against CV whenever $\mu_t$ is a multiple of $SS/L$, where L is
the total number of looks at the data.
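To illustrate how such an equally spaced group sequential design can be evaluated without simulation, the sketch below computes the probability of rejecting the null under $H_0$ for a given CV, SS and L. It is a hedged illustration only: the function names are hypothetical, the recursion is one straightforward way to carry out the calculation and is not claimed to be the algorithm used by the package, and the result is exact only up to floating-point rounding.

```python
import math


def llr(c, mu):
    """Poisson MaxSPRT log likelihood ratio (mu assumed > 0)."""
    if c <= mu:
        return 0.0
    return (mu - c) + c * math.log(c / mu)


def rejection_threshold(mu, cv):
    """Smallest count c with llr(c, mu) >= cv (cv assumed > 0)."""
    c = max(1, math.ceil(mu))
    while llr(c, mu) < cv:
        c += 1
    return c


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), lam > 0."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def type_one_error(cv, sample_size, looks):
    """P(reject H0 at some look | H0) with looks at multiples of SS/L."""
    step = sample_size / looks
    # alive[c] = P(cumulative count equals c and no rejection so far)
    alive = [1.0]
    for i in range(1, looks + 1):
        thr = rejection_threshold(i * step, cv)
        new = [0.0] * thr  # only counts below the threshold survive this look
        for c_prev, p in enumerate(alive):
            for c in range(c_prev, thr):
                new[c] += p * poisson_pmf(c - c_prev, step)
        alive = new
    return 1.0 - sum(alive)
```

Because only counts below the rejection threshold survive each look, the state space is finite and no truncation of the Poisson distribution is needed. A critical value for a target significance level could then be searched for numerically, for instance by bisection on `cv`.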
In this package, all critical values, statistical power, expected time to signal and required sample size to achieve a
certain power are obtained exactly to whatever decimal precision is desired, using iterative numerical calculations. None of the results are based on
asymptotic theory or computer simulations.
Acknowledgements
This package was developed during a one-year post-doctoral visit by Dr. Silva at the Department of Population Medicine,
Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA 02215, USA. The funds supporting this work came from the
United States Food and Drug Administration, Center for Biologics Evaluation and Research, through
the Mini-Sentinel Post-Licensure Rapid Immunization Safety Monitoring program; from the National Council of Scientific and
Technological Development (CNPq), Minas Gerais, Brazil; and from the Bank for the Development of the Minas Gerais State (BDMG), Minas Gerais, Brazil.
We are grateful to Claudia Coronel-Moreno for editorial support.