The context is that n i.i.d. observations are assumed to be drawn from some distribution on the unit interval with c.d.f. F(x), and it is desired to test the null hypothesis that F(x) = x for all x in (0,1), referred to as the "global null hypothesis," against a two-sided alternative. An "equal local levels" test is used, in which each of the n order statistics is tested for significant deviation from its null distribution by a 2-sided test with significance level \(\eta\). The global null hypothesis is rejected if at least one of the order statistic tests is rejected at level \(\eta\), where \(\eta\) is chosen so that the significance level of the global test is alpha. Given the size of the dataset n and the desired global significance level alpha, this function calculates the local level \(\eta\) and the acceptance/rejection regions for the test. There are a set of n intervals, one for each order statistic. If at least one order statistic falls outside the corresponding interval, the global test is rejected.
get_bounds_two_sided(
alpha,
n,
tol = 1e-08,
max_it = 100,
method = c("best_available", "approximate", "search")
)
A list with components
lower_bound - Numeric vector of length n
containing the lower bounds
for the acceptance regions of the test of each order statistic.
upper_bound - Numeric vector of length n
containing the upper bounds
for the acceptance regions of the test of each order statistic.
x - Numeric vector of length n
containing the expectation of each order statistic.
These are the x-coordinates for the bounds if used in a pp-plot.
The value is c(1:n) / (n + 1)
.
local_level - Significance level \(\eta\) of the local test on each individual order statistic. It is equal for all order
statistics and will be less than alpha
for all n
> 1.
Desired global significance level of the test.
Size of the dataset.
(optional) Relative tolerance of the alpha
level of the
simultaneous test. Defaults to 1e-8. Used only if method
is set to
"search" or if method is set to "best_available" and the best available
method is a search.
(optional) Maximum number of iterations of Binary Search Algorithm
used to find the bounds. Defaults to 100 which should be much larger than necessary
for a reasonable tolerance. Used only if method
is set to
"search" or if method is set to "best_available" and the best available
method is a search.
(optional) Argument indicating if the calculation should be done using a highly
accurate approximation, "approximate", or if the calculations should be done using an exact
binary search calculation, "search". The default is "best_available" (recommended), which uses the exact search
when either (i) the approximation isn't available or (ii) the approximation is available but isn't highly accurate and the search method
isn't prohibitively slow (occurs for small to moderate n
with alpha
= .1).
Of note, the approximate method is only available for alpha values of .1, .05, and .01. In the case of alpha = .05 or .01, the
approximation is highly accurate for all values of n
up to at least 10^6
.
get_bounds_two_sided(alpha = .05, n = 100)
Run the code above in your browser using DataLab