ScaleTests: Independent Two- and K-Sample Scale Tests

Description

Testing the equality of the distributions of a numeric response in two or more independent groups against scale alternatives.

Usage

## S3 method for class 'formula':
ansari_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem':
ansari_test(object, 
    alternative = c("two.sided", "less", "greater"),
    ties.method = c("mid-ranks", "average-scores"),
    conf.int = FALSE, conf.level = 0.95, ...)
## S3 method for class 'formula':
fligner_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem':
fligner_test(object, 
    ties.method = c("mid-ranks", "average-scores"),
    distribution = c("asymptotic", "approximate"), 
    ...)

Arguments

Value

An object inheriting from class IndependenceTest, with methods show, statistic, expectation, covariance and pvalue. The null distribution can be inspected with the pperm, dperm, qperm and support methods. Confidence intervals can be extracted with confint.

Details

The null hypothesis of the equality of the distribution of y in the groups given by x is tested. In particular, the methods documented here are designed to detect scale alternatives. For a general description of the test procedures documented here we refer to Hollander & Wolfe (1999).

The asymptotic null distribution is computed by default for both procedures. Exact p-values may be computed for the Ansari-Bradley test, and p-values can be approximated via Monte Carlo resampling for the Fligner-Killeen procedure. Exact p-values are computed either by the shift algorithm (Streitberg & Röhmel, 1986, 1987) or by the split-up algorithm (van de Wiel, 2001).
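To illustrate what a Monte Carlo approximation of a permutation p-value involves, here is a minimal base R sketch. It is not the package's implementation: it uses Ansari-Bradley scores for simplicity, and the data, the variable names and the number of resamples B are assumptions made for this example only.

```r
## Illustrative data (not from the source): two groups with different scale
set.seed(1)
y <- c(rnorm(10, sd = 1), rnorm(10, sd = 3))
g <- gl(2, 10)
n <- length(y)

## Ansari-Bradley scores computed from mid-ranks
r <- rank(y)
a <- pmin(r, n + 1 - r)

## Observed statistic: sum of the scores in the first group
in_first <- g == levels(g)[1]
t_obs <- sum(a[in_first])

## Expectation of the statistic under random relabelling
mu <- sum(in_first) * mean(a)

## Resample the group labels B times and recompute the statistic
B <- 9999
t_perm <- replicate(B, sum(a[sample(g) == levels(g)[1]]))

## Two-sided Monte Carlo p-value
p_mc <- mean(abs(t_perm - mu) >= abs(t_obs - mu))
p_mc
```

The resampled p-value varies from run to run unless the seed is fixed; increasing B reduces this variation.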

The Ansari-Bradley test can be used to test the two-sided hypothesis $var(Y_1) / var(Y_2) = 1$, where $var(Y_i)$ is the variance of the responses in the i-th group. Confidence intervals for the ratio of scales are available for the Ansari-Bradley test and are computed according to Bauer (1972). With alternative = "less", the null hypothesis $var(Y_1) / var(Y_2) \ge 1$ is tested; alternative = "greater" corresponds to the null hypothesis $var(Y_1) / var(Y_2) \le 1$.
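As a rough illustration of a confidence interval for the ratio of scales, the following sketch uses ansari.test from base R's stats package rather than the ansari_test function documented here. The simulated samples y1 and y2 are assumptions made for this example, and the two implementations may differ in details.

```r
## Simulated samples (illustrative only): the second has twice the scale
set.seed(42)
y1 <- rnorm(20, sd = 1)
y2 <- rnorm(20, sd = 2)

## Two-sided Ansari-Bradley test with a 95% confidence interval
res <- stats::ansari.test(y1, y2, alternative = "two.sided",
                          conf.int = TRUE, conf.level = 0.95)

res$p.value    # p-value of the test
res$conf.int   # confidence interval for the ratio of scales
res$estimate   # point estimate of the ratio of scales
```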

For the adjustment of scores for tied values see Hájek, Šidák & Sen (1999), pp. 131 ff.
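The two values of ties.method can give different scores when ties straddle the centre of the sample. This base R sketch shows one plausible reading of the two adjustments for Ansari-Bradley scores. The helper ab_score and the toy vector x are hypothetical and not part of the package.

```r
## Toy sample with a three-way tie around the median (illustrative only)
x <- c(1, 2, 2, 2, 3)
n <- length(x)

## Hypothetical helper: Ansari-Bradley score of a rank r in a sample of size n
ab_score <- function(r, n) pmin(r, n + 1 - r)

## "mid-ranks": average the ranks of tied values, then compute the scores
mid_rank_scores <- ab_score(rank(x, ties.method = "average"), n)

## "average-scores": compute scores from untied ranks, then average the
## scores within each group of tied values
raw_scores <- ab_score(rank(x, ties.method = "first"), n)
average_scores <- ave(raw_scores, x)

mid_rank_scores   # 1 3 3 3 1
average_scores    # 1 2.333333 2.333333 2.333333 1
```

For ties that lie entirely on one side of the centre the two adjustments coincide, because the score function is linear there.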

References

Myles Hollander & Douglas A. Wolfe (1999). Nonparametric Statistical Methods, 2nd Edition. New York: John Wiley & Sons.

Bernd Streitberg & Joachim Röhmel (1986). Exact distributions for permutations and rank tests: An introduction to some recently published algorithms. Statistical Software Newsletter 12(1), 10--17.

Bernd Streitberg & Joachim Röhmel (1987). Exakte Verteilungen für Rang- und Randomisierungstests im allgemeinen $c$-Stichprobenfall. EDV in Medizin und Biologie 18(1), 12--19.

Mark A. van de Wiel (2001). The split-up algorithm: a fast symbolic method for computing p-values of rank statistics. Computational Statistics 16, 519--538.

David F. Bauer (1972). Constructing confidence sets using rank statistics. Journal of the American Statistical Association 67, 687--690.

Jaroslav Hájek, Zbyněk Šidák & Pranab K. Sen (1999). Theory of Rank Tests. San Diego, London: Academic Press.

Examples

### Serum Iron Determination Using Hyland Control Sera
  ### Hollander & Wolfe (1999), page 147
  sid <- data.frame(
      serum = c(111, 107, 100, 99, 102, 106, 109, 108, 104, 99,
                101, 96, 97, 102, 107, 113, 116, 113, 110, 98,
                107, 108, 106, 98, 105, 103, 110, 105, 104,
                100, 96, 108, 103, 104, 114, 114, 113, 108, 106, 99),
      method = factor(gl(2, 20), labels = c("Ramsay", "Jung-Parekh")))

  ### Ansari-Bradley test, asymptotic p-value
  ansari_test(serum ~ method, data = sid)

  ### exact p-value
  ansari_test(serum ~ method, data = sid, distribution = "exact")


  ### Platelet Counts of Newborn Infants
  ### Hollander & Wolfe (1999), Table 5.4, page 171
  platalet_counts <- data.frame(
      counts = c(120, 124, 215, 90, 67, 95, 190, 180, 135, 399, 
                 12, 20, 112, 32, 60, 40),
      treatment = factor(c(rep("Prednisone", 10), rep("Control", 6))))

  ### Lepage test, Hollander & Wolfe (1999), page 172 
  lt <- independence_test(counts ~ treatment, data = platalet_counts,
      ytrafo = function(data) trafo(data, numeric_trafo = function(x)       
          cbind(rank(x), ansari_trafo(x))),
      teststat = "quad", distribution = approximate(B = 9999))

  lt

  ### where did the rejection come from? Use maximum statistic
  ### instead of a quadratic form
  ltmax <- independence_test(counts ~ treatment, data = platalet_counts,
      ytrafo = function(data) trafo(data, numeric_trafo = function(x) 
          matrix(c(rank(x), ansari_trafo(x)), ncol = 2,
                 dimnames = list(1:length(x), c("Location", "Scale")))),
      teststat = "max")

  ### points to a difference in location
  pvalue(ltmax, method = "single-step")

  ### Funny: We could have used a simple Bonferroni procedure
  ### since the correlation between the Wilcoxon and Ansari-Bradley 
  ### test statistics is zero
  covariance(ltmax)
