catL2BB: Generators for efpFunctionals along Categorical Variables

Description

Generators for efpFunctional objects suitable for aggregating empirical fluctuation processes to test statistics along (ordinal) categorical variables.

Usage

catL2BB(freq)
ordL2BB(freq, nproc = NULL, nrep = 1e5, probs = c(0:84/100, 850:1000/1000), ...)
ordwmax(freq, algorithm = mvtnorm::GenzBretz(), ...)

Value

An object of class efpFunctional.

Arguments

freq: object specifying the category frequencies for the categorical variable to be used for aggregation: either a gefp object, a factor, or a numeric vector with either absolute or relative category frequencies.
nproc: numeric. Number of processes used for simulating from the asymptotic distribution (passed to efpFunctional). If freq is a gefp object, then its number of processes is used by default.
nrep: numeric. Number of replications used for simulating from the asymptotic distribution (passed to efpFunctional).
probs: numeric vector specifying for which probabilities critical values should be tabulated.
...: further arguments passed to efpFunctional.
algorithm: algorithm specification passed to pmvnorm for computing the asymptotic distribution.

Details

Merkle, Fan, and Zeileis (2014) discuss three functionals that are suitable for aggregating empirical fluctuation processes along categorical variables, especially ordinal variables. The functions catL2BB, ordL2BB, and ordwmax all require a specification of the relative frequencies within each category (which can be computed from various specifications, see arguments). All of them employ efpFunctional (Zeileis 2006) internally to set up an object that can be employed with gefp fluctuation processes.
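The different ways of supplying freq can be sketched as follows. This is a minimal illustration, assuming a hypothetical ordering variable z with four equally sized categories; the object names f_factor, f_abs, and f_rel are made up for this example.

```r
library("strucchange")

## hypothetical ordering variable with four equally sized categories
z <- factor(rep(1:4, each = 50))

## category frequencies supplied in three different forms
f_factor <- catL2BB(z)                  # factor
f_abs    <- catL2BB(c(50, 50, 50, 50))  # absolute frequencies
f_rel    <- catL2BB(rep(0.25, 4))       # relative frequencies

## each call returns an efpFunctional object
class(f_factor)
```

A gefp object can be passed in the same way, in which case the frequencies are taken from its ordering variable.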

catL2BB results in a chi-squared test. This is essentially the LM test counterpart to the likelihood ratio test that assesses a split into unordered categories.
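To make the relationship to the likelihood ratio test concrete, the following sketch computes both tests on artificial data. The LR test is set up by hand here, comparing a pooled linear model with one that lets intercept and slope vary across the categories of z; the names m0, m1, lr_stat, lr_df, and lr_pval are illustrative and not part of the package.

```r
library("strucchange")

## artificial data: the slope changes in the last category of z
set.seed(1)
d <- data.frame(
  x = runif(200, -1, 1),
  z = factor(rep(1:4, each = 50)),
  err = rnorm(200)
)
d$y <- rep(c(0.5, -0.5), c(150, 50)) * d$x + d$err

## LM-type test: only the pooled model is estimated
scus <- gefp(y ~ x, data = d, fit = lm, order.by = ~ z)
lm_test <- sctest(scus, functional = catL2BB(scus))

## LR test: pooled model vs. separate coefficients per category
m0 <- lm(y ~ x, data = d)
m1 <- lm(y ~ x * z, data = d)
lr_stat <- as.numeric(2 * (logLik(m1) - logLik(m0)))
lr_df <- attr(logLik(m1), "df") - attr(logLik(m0), "df")
lr_pval <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

## compare the two p-values
c(LM = unname(lm_test$p.value), LR = lr_pval)
```

The two statistics are not identical in finite samples, but both address the same alternative of coefficients that differ across unordered categories.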

ordL2BB is the ordinal counterpart to supLM where aggregation is done along the ordered categories (rather than continuously). The asymptotic distribution is non-standard and needs to be simulated for every combination of frequencies and number of processes. Hence, this is somewhat more time-consuming than the closed-form solution employed in catL2BB. The result of ordL2BB can also be stored and reused if it needs to be applied to several gefp fluctuation processes that share the same category frequencies and number of processes.
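Reusing a stored ordL2BB functional might look like the sketch below. It assumes two hypothetical responses y1 and y2 that are modeled with the same regressor and ordered by the same factor z, so that both fluctuation processes share category frequencies and number of processes; nrep is kept small only to limit run time.

```r
library("strucchange")

## artificial data with two responses: y1 has a slope change, y2 does not
set.seed(1)
d <- data.frame(
  x = runif(200, -1, 1),
  z = factor(rep(1:4, each = 50))
)
d$y1 <- rep(c(0.5, -0.5), c(150, 50)) * d$x + rnorm(200)
d$y2 <- 0.5 * d$x + rnorm(200)

## two fluctuation processes along the same ordinal variable
scus1 <- gefp(y1 ~ x, data = d, fit = lm, order.by = ~ z)
scus2 <- gefp(y2 ~ x, data = d, fit = lm, order.by = ~ z)

## simulate the asymptotic distribution once and store the functional
maxLMo <- ordL2BB(scus1, nrep = 5000)

## apply the stored functional to both processes without re-simulating
p1 <- sctest(scus1, functional = maxLMo)$p.value
p2 <- sctest(scus2, functional = maxLMo)$p.value
c(y1 = p1, y2 = p2)
```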

ordwmax is a weighted double maximum test based on ideas previously suggested by Hothorn and Zeileis (2008) in the context of maximally selected statistics. The asymptotic distribution is (multivariate) normal and computed by means of pmvnorm.
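The algorithm argument can be used to control the numerical integration in pmvnorm. The following sketch compares the default with a tighter setting; the particular maxpts and abseps values are arbitrary choices for illustration, not recommendations.

```r
library("strucchange")

## artificial data: the slope changes in the last category of z
set.seed(1)
d <- data.frame(
  x = runif(200, -1, 1),
  z = factor(rep(1:4, each = 50)),
  err = rnorm(200)
)
d$y <- rep(c(0.5, -0.5), c(150, 50)) * d$x + d$err
scus <- gefp(y ~ x, data = d, fit = lm, order.by = ~ z)

## default integration settings
WDM_default <- ordwmax(scus)

## tighter error tolerance for pmvnorm (illustrative values)
WDM_tight <- ordwmax(scus,
  algorithm = mvtnorm::GenzBretz(maxpts = 50000, abseps = 1e-5))

p_default <- sctest(scus, functional = WDM_default)$p.value
p_tight <- sctest(scus, functional = WDM_tight)$p.value
c(default = p_default, tight = p_tight)
```

Because the Genz-Bretz algorithm is randomized, repeated calls can give slightly different p-values; a tighter tolerance reduces that variation at the cost of more computation.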

References

Hothorn T., Zeileis A. (2008), Generalized Maximally Selected Statistics. Biometrics, 64, 1263--1269.

Merkle E.C., Fan J., Zeileis A. (2014), Testing for Measurement Invariance with Respect to an Ordinal Variable. Psychometrika, 79(4), 569--584. doi:10.1007/S11336-013-9376-7.

Zeileis A. (2006), Implementing a Class of Structural Change Tests: An Econometric Computing Approach. Computational Statistics & Data Analysis, 50, 2987--3008. doi:10.1016/j.csda.2005.07.001.

Examples

## load the package providing gefp, sctest, and the functionals
library("strucchange")

## artificial data
set.seed(1)
d <- data.frame(
  x = runif(200, -1, 1),
  z = factor(rep(1:4, each = 50)),
  err = rnorm(200)
)
d$y <- rep(c(0.5, -0.5), c(150, 50)) * d$x + d$err

## empirical fluctuation process
scus <- gefp(y ~ x, data = d, fit = lm, order.by = ~ z)

## chi-squared-type test (unordered LM-type test)
LMuo <- catL2BB(scus)
plot(scus, functional = LMuo)
sctest(scus, functional = LMuo)

## ordinal maxLM test (with few replications only to save time)
maxLMo <- ordL2BB(scus, nrep = 10000)
plot(scus, functional = maxLMo)
sctest(scus, functional = maxLMo)

## ordinal weighted double maximum test
WDM <- ordwmax(scus)
plot(scus, functional = WDM)
sctest(scus, functional = WDM)
