ScaleTests: Two- and \(K\)-Sample Scale Tests

Description

Testing the equality of the distributions of a numeric response variable in two or more independent groups against scale alternatives.

Usage

# S3 method for formula
taha_test(formula, data, subset = NULL, weights = NULL, …)
# S3 method for IndependenceProblem
taha_test(object, conf.int = FALSE, conf.level = 0.95, …)
# S3 method for formula
klotz_test(formula, data, subset = NULL, weights = NULL, …)
# S3 method for IndependenceProblem
klotz_test(object, ties.method = c("mid-ranks", "average-scores"),
           conf.int = FALSE, conf.level = 0.95, …)
# S3 method for formula
mood_test(formula, data, subset = NULL, weights = NULL, …)
# S3 method for IndependenceProblem
mood_test(object, ties.method = c("mid-ranks", "average-scores"),
          conf.int = FALSE, conf.level = 0.95, …)
# S3 method for formula
ansari_test(formula, data, subset = NULL, weights = NULL, …)
# S3 method for IndependenceProblem
ansari_test(object, ties.method = c("mid-ranks", "average-scores"),
            conf.int = FALSE, conf.level = 0.95, …)
# S3 method for formula
fligner_test(formula, data, subset = NULL, weights = NULL, …)
# S3 method for IndependenceProblem
fligner_test(object, ties.method = c("mid-ranks", "average-scores"),
             conf.int = FALSE, conf.level = 0.95, …)
# S3 method for formula
conover_test(formula, data, subset = NULL, weights = NULL, …)
# S3 method for IndependenceProblem
conover_test(object, conf.int = FALSE, conf.level = 0.95, …)

Arguments

formula

a formula of the form y ~ x | block where y is a numeric variable, x is a factor and block is an optional factor for stratification.

data

an optional data frame containing the variables in the model formula.

subset

an optional vector specifying a subset of observations to be used. Defaults to NULL.

weights

an optional formula of the form ~ w defining integer valued case weights for each observation. Defaults to NULL, implying equal weight for all observations.

object

an object inheriting from class "IndependenceProblem".

conf.int

a logical indicating whether a confidence interval for the ratio of scales should be computed. Defaults to FALSE.

conf.level

a numeric, confidence level of the interval. Defaults to 0.95.

ties.method

a character, the method used to handle ties: the score generating function either uses mid-ranks ("mid-ranks", default) or averages the scores of randomly broken ties ("average-scores").

…

further arguments to be passed to independence_test.
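The two ties.method options can be illustrated with a small base R sketch. It assumes the Mood-type score function \(a(r) = (r - (n + 1)/2)^2\) purely for illustration; the helper mood_score and the toy vector y are hypothetical and not part of the package, whose internal code may differ.

```r
## Toy data with one tie (illustrative values only)
y <- c(1, 2, 2, 3)
n <- length(y)

## Hypothetical helper: Mood-type score function a(r) = (r - (n + 1) / 2)^2
mood_score <- function(r) (r - (n + 1) / 2)^2

## "mid-ranks": apply the score function to the average rank of tied values
mid <- mood_score(rank(y, ties.method = "average"))

## "average-scores": break ties, score each rank, then average the scores
## within each group of tied values (any tie-breaking order gives the same mean)
r_broken <- rank(y, ties.method = "first")
avg <- ave(mood_score(r_broken), y, FUN = mean)

rbind(mid_ranks = mid, average_scores = avg)
```

For the tied pair, the mid-rank 2.5 receives the score 0, whereas averaging the scores of ranks 2 and 3 gives 0.25, so the two options can lead to different test statistics when ties are present.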

Value

An object inheriting from class "IndependenceTest". Confidence intervals can be extracted by confint.

Details

taha_test, klotz_test, mood_test, ansari_test, fligner_test and conover_test provide the Taha test, the Klotz test, the Mood test, the Ansari-Bradley test, the Fligner-Killeen test and the Conover-Iman test. A general description of these methods is given by Hollander and Wolfe (1999). For the adjustment of scores for tied values see Hájek, Šidák and Sen (1999, pp. 133--135).
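As a rough orientation, the following base R sketch computes the scores commonly associated with four of these tests from the ranks of a small sample. The formulas are the usual textbook definitions and are an assumption here; the package's internal transformations may differ in detail, and the vector y is made up for illustration.

```r
## Illustrative values without ties
y <- c(2.1, 3.4, 1.7, 5.0, 4.2)
n <- length(y)
r <- rank(y)

## Assumed textbook score functions of the rank r
taha   <- r^2                      # Taha: squared ranks
klotz  <- qnorm(r / (n + 1))^2     # Klotz: squared normal quantiles
mood   <- (r - (n + 1) / 2)^2      # Mood: squared deviation from the mean rank
ansari <- pmin(r, n + 1 - r)       # Ansari-Bradley: distance from the nearer end

data.frame(y, rank = r, taha, klotz, mood, ansari)
```

Each test statistic is then a function of the sum of these scores within each group.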

The null hypothesis of equality, or conditional equality given block, of the distribution of y in the groups defined by x is tested against scale alternatives. In the two-sample case, the two-sided null hypothesis is \(H_0\!: V(Y_1) / V(Y_2) = 1\), where \(V(Y_s)\) is the variance of the responses in the \(s\)th sample. When alternative = "less", the null hypothesis is \(H_0\!: V(Y_1) / V(Y_2) \ge 1\). When alternative = "greater", the null hypothesis is \(H_0\!: V(Y_1) / V(Y_2) \le 1\). Confidence intervals for the ratio of scales are available and are computed according to Bauer (1972).
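A minimal sketch of a one-sided test and of a confidence interval request follows. It assumes the coin package is installed and uses simulated data (the data frame d is hypothetical); alternative is not a named argument in the usage above and is assumed to be passed through to independence_test.

```r
library(coin)  # assumes the coin package is installed

## Simulated data for illustration only: group B has the larger spread
set.seed(42)
d <- data.frame(
    y = c(rnorm(15, sd = 1), rnorm(15, sd = 2)),
    x = gl(2, 15, labels = c("A", "B"))
)

## One-sided Ansari-Bradley test; alternative is passed on via '...'
at_less <- ansari_test(y ~ x, data = d, alternative = "less")
p_less <- pvalue(at_less)
p_less

## Two-sided test with a 90% confidence interval for the ratio of scales
at_ci <- ansari_test(y ~ x, data = d, conf.int = TRUE, conf.level = 0.9)
confint(at_ci)
```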

The Fligner-Killeen test uses median centering in each of the samples, as suggested by Conover, Johnson and Johnson (1981), whereas the Conover-Iman test, following Conover and Iman (1978), uses mean centering in each of the samples.
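The difference between the two centering schemes can be sketched in base R. The score formulas below (half-normal quantiles of the ranked absolute deviations for Fligner-Killeen, squared ranks of the absolute deviations for Conover-Iman) are the usual textbook forms and are assumptions here, not taken from the package source; the data are simulated.

```r
## Simulated data for illustration only: group B has the larger spread
set.seed(1)
y <- c(rnorm(10, sd = 1), rnorm(10, sd = 3))
x <- gl(2, 10, labels = c("A", "B"))
n <- length(y)

## Fligner-Killeen: centre each sample at its median
y_med <- y - ave(y, x, FUN = median)
## Conover-Iman: centre each sample at its mean
y_mean <- y - ave(y, x, FUN = mean)

## Assumed score functions applied to the absolute deviations
fk_scores <- qnorm((1 + rank(abs(y_med)) / (n + 1)) / 2)
ci_scores <- rank(abs(y_mean))^2

## Group-wise score sums; larger sums point to larger spread
tapply(fk_scores, x, sum)
tapply(ci_scores, x, sum)
```

Median centering is less sensitive to outliers within a sample than mean centering, which is the usual motivation for preferring it.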

The conditional null distribution of the test statistic is used to obtain \(p\)-values and an asymptotic approximation of the exact distribution is used by default (distribution = "asymptotic"). Alternatively, the distribution can be approximated via Monte Carlo resampling or computed exactly for univariate two-sample problems by setting distribution to "approximate" or "exact" respectively. See asymptotic, approximate and exact for details.
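The three ways of obtaining the reference distribution can be compared side by side. This is a sketch assuming the coin package is installed and a version in which approximate() accepts nresample, as in the examples of this page; the data frame d is simulated.

```r
library(coin)  # assumes the coin package is installed

## Simulated two-sample data for illustration only
set.seed(1)
d <- data.frame(
    y = c(rnorm(8, sd = 1), rnorm(8, sd = 3)),
    x = gl(2, 8, labels = c("A", "B"))
)

## Default: asymptotic approximation
p_asy <- pvalue(ansari_test(y ~ x, data = d))

## Monte Carlo approximation of the conditional null distribution
p_mc <- pvalue(ansari_test(y ~ x, data = d,
                           distribution = approximate(nresample = 10000)))

## Exact conditional null distribution (univariate two-sample problem)
p_exact <- pvalue(ansari_test(y ~ x, data = d, distribution = "exact"))

c(asymptotic = p_asy, approximate = p_mc, exact = p_exact)
```

For small samples such as this one, the exact or Monte Carlo p-value is usually preferable to the asymptotic one.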

References

Bauer, D. F. (1972). Constructing confidence sets using rank statistics. Journal of the American Statistical Association 67(339), 687--690. 10.1080/01621459.1972.10481279

Conover, W. J. and Iman, R. L. (1978). Some exact tables for the squared ranks test. Communications in Statistics -- Simulation and Computation 7(5), 491--513. 10.1080/03610917808812093

Conover, W. J., Johnson, M. E. and Johnson, M. M. (1981). A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data. Technometrics 23(4), 351--361. 10.1080/00401706.1981.10487680

Hájek, J., Šidák, Z. and Sen, P. K. (1999). Theory of Rank Tests, Second Edition. San Diego: Academic Press.

Hollander, M. and Wolfe, D. A. (1999). Nonparametric Statistical Methods, Second Edition. New York: John Wiley & Sons.

Examples

# NOT RUN {
## Serum Iron Determination Using Hyland Control Sera
## Hollander and Wolfe (1999, p. 147, Tab. 5.1)
sid <- data.frame(
    serum = c(111, 107, 100, 99, 102, 106, 109, 108, 104, 99,
              101, 96, 97, 102, 107, 113, 116, 113, 110, 98,
              107, 108, 106, 98, 105, 103, 110, 105, 104,
              100, 96, 108, 103, 104, 114, 114, 113, 108, 106, 99),
    method = gl(2, 20, labels = c("Ramsay", "Jung-Parekh"))
)

## Asymptotic Ansari-Bradley test
ansari_test(serum ~ method, data = sid)

## Exact Ansari-Bradley test
pvalue(ansari_test(serum ~ method, data = sid,
                   distribution = "exact"))


## Platelet Counts of Newborn Infants
## Hollander and Wolfe (1999, p. 171, Tab. 5.4)
platelet <- data.frame(
    counts = c(120, 124, 215, 90, 67, 95, 190, 180, 135, 399,
               12, 20, 112, 32, 60, 40),
    treatment = factor(rep(c("Prednisone", "Control"), c(10, 6)))
)

## Approximative (Monte Carlo) Lepage test
## Hollander and Wolfe (1999, p. 172)
lepage_trafo <- function(y)
    cbind("Location" = rank_trafo(y), "Scale" = ansari_trafo(y))

independence_test(counts ~ treatment, data = platelet,
                  distribution = approximate(nresample = 10000),
                  ytrafo = function(data)
                      trafo(data, numeric_trafo = lepage_trafo),
                  teststat = "quadratic")

## Why was the null hypothesis rejected?
## Note: maximum statistic instead of quadratic form
ltm <- independence_test(counts ~ treatment, data = platelet,
                         distribution = approximate(nresample = 10000),
                         ytrafo = function(data)
                             trafo(data, numeric_trafo = lepage_trafo))

## Step-down adjustment suggests a difference in location
pvalue(ltm, method = "step-down")

## The same results are obtained from the simple Sidak-Holm procedure since the
## correlation between Wilcoxon and Ansari-Bradley test statistics is zero
cov2cor(covariance(ltm))
pvalue(ltm, method = "step-down", distribution = "marginal", type = "Sidak")
# }