wgtRanktt: Adaptive Inference Using Two Test Statistics in a Block Design

Description

Tests twice, using the better of two test statistics; see Rosenbaum (2012; 2024, section 5; 2025a, section 9.5).

Usage

wgtRanktt(y, phi1 = "u868", phi2 = "u878", phifunc1 = NULL, phifunc2 = NULL, gamma = 1)

Value

jointP: Upper bound on the one-sided joint P-value obtained from two test statistics in the presence of a bias of at most gamma.
cor12: Correlation of the two test statistics at the treatment assignment distribution that provides the joint upper bound. Often, this correlation is high, so the joint distribution that is used here is much less conservative than use of the Bonferroni inequality when testing twice.
detail: Details about the two statistics separately. Equivalent to the result from wgtRank() run twice with different test statistics.

Arguments

y: A matrix or data frame with I rows and J columns. Column 1 contains the response of the treated individuals and columns 2 throught J contain the responses of controls in the same block. An error will result if y contains NAs.
phi1: The weight function to be applied to the ranks of the within block ranges. The options are: (i) "wilc" for the stratified Wilcoxon test, which gives every block the same weight, (ii) "quade" which ranks the within block ranges from 1 to I, and is closely related to Quade's (1979) statistic; see also Tardif (1987), (iii) "u868" based on Rosenbaum (2011), (iv) "u878"" based on Rosenbaum (2011) and (v) "mixed" based on Rosenbaum (2025b). Note carefully that phi is ignored if phifunc is not NULL.
phi2: See phi1.
phifunc1: If not NULL, a user specified weight function for the ranks of the within block ranges. The function should map [0,1] into [0,1]. The function is applied to the ranks divided by the sample size. See the example.
phifunc2: See phifunc1.
gamma: A single number greater than or equal to 1. gamma is the sensitivity parameter. Two individuals with the same observed covariates may differ in their odds of treatment by at most a factor of gamma; see Rosenbaum (1987; 2017, Chapter 9).

Author

Paul R. Rosenbaum

References

Berk, R. H. and Jones, D. H. (1978) <https://www.jstor.org/stable/4615706> Relatively optimal combinations of test statistics. Scandinavian Journal of Statistics, 5, 158-162.

Brown, B. M. (1981) <doi:10.1093/biomet/68.1.235> Symmetric quantile averages and related estimators. Biometrika, 68(1), 235-242.

Gastwirth, J. L., Krieger, A. M., and Rosenbaum, P. R. (2000) <doi:10.1111/1467-9868.00249> Asymptotic separability in sensitivity analysis. Journal of the Royal Statistical Society B 2000, 62, 545-556.

Lehmann, E. L. (1975) Nonparametrics: Statistical Methods Based on Ranks. San Francisco: Holden-Day.

Quade, D. (1979) <doi:10.2307/2286991> Using weighted rankings in the analysis of complete blocks with additive block effects. Journal of the American Statistical Association, 74, 680-683.

Rosenbaum, P. R. (1987) <doi:10.2307/2336017> Sensitivity analysis for certain permutation inferences in matched observational studies. Biometrika, 74(1), 13-26.

Rosenbaum, P. R. (2011) <doi:10.1111/j.1541-0420.2010.01535.x> A new U‐Statistic with superior design sensitivity in matched observational studies. Biometrics, 67(3), 1017-1027.

Rosenbaum, P. R. (2012) <doi:10.1093/biomet/ass032> Testing one hypothesis twice in observational studies. Biometrika, 99(4), 763-774.

Rosenbaum, P. R. (2014) <doi:10.1080/01621459.2013.879261> Weighted M-statistics with superior design sensitivity in matched observational studies with multiple controls. Journal of the American Statistical Association, 109(507), 1145-1158.

Rosenbaum, P. R. (2015) <doi:10.1080/01621459.2014.960968> Bahadur efficiency of sensitivity analyses in observational studies. Journal of the American Statistical Association, 110(509), 205-217.

Rosenbaum, P. (2017) <doi:10.4159/9780674982697> Observation and Experiment: An Introduction to Causal Inference. Cambridge, MA: Harvard University Press.

Rosenbaum, P. R. (2018) <doi:10.1214/18-AOAS1153> Sensitivity analysis for stratified comparisons in an observational study of the effect of smoking on homocysteine levels. The Annals of Applied Statistics, 12(4), 2312-2334.

Rosenbaum, P. R. (2024) <doi:10.1080/01621459.2023.2221402> Bahadur efficiency of observational block designs. Journal of the American Statistical Association, 119, 1871-1881.

Rosenbaum, P. R. (2025a) <doi:10.1007/978-3-031-90494-3> An Introduction to the Theory of Observational Studies. Switzerland: Springer.

Rosenbaum, P. R. (2025b) A jackknife estimate of the power of a sensitivity analysis in an observational study. Manuscript.

Tardif, S. (1987) <doi:10.2307/2289476> Efficiency and optimality results for tests based on weighted rankings. Journal of the American Statistical Association, 82(398), 637-644.

Examples

Run this code

data(aHDL)
y<-t(matrix(aHDL$hdl,4,406))

# This is the simplest example of a general property.  The
# example simply illustrates, but does not fully exploit
# the property.  In this case, use of the stratified
# Wilcoxon statistic is a mistake, because Quade's
# statistic correctly reports insensitivity to a bias
# of gamma=4.5, but the stratified Wilcoxon statistic
# is sensitive at gamma=3.5.  The adaptive procedure
# that does both tests and corrects for multiple testing
# is insensitive to gamma=4.4; so, it is almost as good
# as knowing what you cannot know, namely that Quade's
# statistic is the better choice in this one example.
# The price paid for testing twice is very small;
# see Berk and Jones (1978) and Rosenbaum (2012, 2022).
wgtRank(y,phi="wilc",gamma=3.5)
wgtRank(y,phi="quade",gamma=3.5)
wgtRank(y,phi="wilc",gamma=4.5)
wgtRank(y,phi="quade",gamma=4.5)
wgtRanktt(y,phi1="wilc",phi2="quade",gamma=4.4)

# Sensitivity to gamma=3.5 is very different from
# sensitivity to gamma=4.4; see documentation for amplify.
amplify(3.5,8)
amplify(4.4,8)


# In this example, u878 exhibits greater insensitivity to bias
# than u868.  However, adaptive inference using both is almost
# as good as the better statistic, yet it strongly controls the
# family-wise error rate despite testing twice;
# see Rosenbaum (2012,2022).
wgtRank(y,phi="u868",gamma=6) # New U-statistic weights (8,6,8)
wgtRank(y,phi="u878",gamma=6) # New U-statistic weights (8,7,8)
wgtRanktt(y,phi1="u868",phi2="u878",gamma=5.9)

# A user defined weight function, brown, analogous to Brown (1981).
brown<-function(v){((v>=.333)+(v>=.667))/2}
# In this example, the joint test rejects based on u878
wgtRanktt(y,phi1="u878",phifunc2=brown,gamma=5.8)

# \donttest{
# The following example reproduces Table 1 in Rosenbaum (2025b).
hdl0<-y
gammas<-c(1:6,6.25,6.5,6.75,6.875,7)
o<-matrix(NA,length(gammas),7)
rownames(o)<-gammas
colnames(o)<-c("Wilcoxon","Quade","U868","U878","U888","Mixed","U878aMixed")
for (i in 1:length(gammas)){
  o[i,1]<-wgtRank(hdl0,phi="wilc",gamma=gammas[i])$pval
  o[i,2]<-wgtRank(hdl0,phi="quade",gamma=gammas[i])$pval
  o[i,3]<-wgtRank(hdl0,phi="u868",gamma=gammas[i])$pval
  o[i,4]<-wgtRank(hdl0,phi="u878",gamma=gammas[i])$pval
  o[i,5]<-wgtRank(hdl0,phi="u888",gamma=gammas[i])$pval
  o[i,6]<-wgtRank(hdl0,phi="mixed",gamma=gammas[i])$pval
  o[i,7]<-wgtRanktt(hdl0,phi1="u878",phi2="mixed",gamma=gammas[i])$jointP
  }
round(o,3)
# }

Run the code above in your browser using DataLab