NPtest: function to perform nonparametric Rasch model tests

Description

A variety of nonparametric tests as proposed by Ponocny (2001) and Koller and Hatzinger (2012), as well as an 'exact' version of the Martin-Loef test, are implemented. The function operates on random binary matrices that have been generated using an MCMC algorithm (Verhelst, 2008) from the RaschSampler package (Hatzinger, Mair, and Verhelst, 2009).

Usage

NPtest(obj, n = NULL, method = "T1", ...)

Arguments

obj

A binary data matrix (or data frame) or an object containing the output from the RaschSampler package.

n

If obj is a matrix or a data frame, n is the number of sampled matrices (default is 500).

method

One of the test statistics. See details below.

...

Further arguments according to method. See details below. Additionally, the sampling routine can be controlled by specifying burn_in =, step =, and seed = (for details see below and . A summa

Value

Depends on the method used. For each method a list is returned. The returned objects are of class T1obj, T1mobj, T1lobj, Tmdobj, T2obj, T2mobj, T4obj, T10obj, T11obj, and Tpbisobj, corresponding to the method used. The main output element is prop, giving the one-sided p-value, i.e., the proportion of statistics from the sampled matrices that equal or exceed the statistic based on the observed data. For T1, T1m, and T1l, prop is a vector. For the Martin-Loef test the returned object is of class MLobj. Besides other elements, it contains a prop vector and MLres, the output object from the asymptotic Martin-Loef test on the input data.
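As a rough illustration of how a one-sided p-value of this kind can be formed, the sketch below uses made-up numbers in base R. The names t_obs and t_sim are hypothetical and are not elements of the returned objects; the computation inside NPtest may differ in detail.

```r
# Hypothetical values (not produced by NPtest):
# t_obs is the statistic for the observed data,
# t_sim holds the same statistic for each sampled matrix.
t_obs <- 12
t_sim <- c(8, 12, 15, 9, 11, 13, 7, 10, 12, 14)

# One-sided p-value: share of sampled statistics that equal or exceed
# the observed statistic.
prop <- mean(t_sim >= t_obs)
prop
```

A small value of prop indicates that the observed statistic is unusually large compared with the matrices sampled under the Rasch model.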

Details

The function uses the RaschSampler package, which has to be installed to use NPtest. On input the user has to supply either a binary data matrix or a RaschSampler output object. If the input is a data matrix, the RaschSampler is called with default values (i.e., rsctrl(burn_in = 256, n_eff = n, step = 32), see rsctrl), where n corresponds to n_eff (the default number of sampled matrices is 500). By default, the starting values for the random number generators (seed) are chosen randomly using system time. Methods other than those listed below can easily be implemented using the RaschSampler package directly.

The currently implemented methods (following Ponocny's notation of T-statistics) and their options are:

T1

Checks for local dependence via increased inter-item correlations. For all item pairs, cases are counted with equal responses on both items.

T1m

Checks for multidimensionality via decreased inter-item correlations. For all item pairs, cases are counted with equal responses on both items.

T1l

Checks for learning. For all item pairs, cases are counted with response pattern (1,1).

Tmd

idx1, idx2: vectors of indices specifying items which define two subscales, e.g., idx1 = c(1, 5, 7) and idx2 = c(3, 4, 6)

Checks for multidimensionality based on correlations of person raw scores for the subscales.

T2

idx: vector of indices specifying items which define a subscale, e.g., idx = c(1, 5, 7)
stat: one of "var" (variance), "mad1" (mean absolute deviation), "mad2" (median absolute deviation), "range" (range)

Checks for local dependence within model-deviating subscales via increased dispersion of subscale person raw scores.

T2m

idx: vector of indices specifying items which define a subscale, e.g., idx = c(1, 5, 7)
stat: one of "var" (variance), "mad1" (mean absolute deviation), "mad2" (median absolute deviation), "range" (range)

Checks for multidimensionality within model-deviating subscales via decreased dispersion of subscale person raw scores.

T4

idx: vector of indices specifying items which define a subscale, e.g., idx = c(1, 5, 7)
group: logical vector defining a subject group, e.g., group = (age >= 15 & age < 30)
alternative: one of "high" or "low". Specifies the alternative hypothesis.

Checks for group anomalies (DIF) via too high (low) raw scores on item(s) for the specified group.

T10

splitcr: split criterion for subject raw score splitting. "median" uses the median as split criterion, "mean" performs a mean split. Optionally, splitcr can also be a vector which assigns each person to one of two subgroups (e.g., following an external criterion). This vector can be numeric, character, logical, or a factor.

Global test for subgroup invariance. Checks for different item difficulties in two subgroups (for details see Ponocny, 2001).

T11

Global test for local dependence. The statistic calculates the sum of absolute deviations between the observed inter-item correlations and the expected correlations.

Tpbis

Test for discrimination. The statistic calculates a point-biserial correlation for a test item (specified via idxt) with the person raw scores for a subscale of the test sum (specified via idxs). If the correlation is too low, the test item shows different discrimination compared to the items of the subscale.

The 'exact' version of the Martin-Loef statistic is specified via method = "MLoef" and optionally splitcr (see MLoef).
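The pairwise counting described above for the T1-type statistics ("for all item pairs, cases are counted with equal responses on both items") can be sketched in base R as follows. This is only an illustration on a made-up toy matrix; count_equal is a hypothetical helper and not part of eRm or RaschSampler, and it is not claimed to be how NPtest computes the statistic internally.

```r
# Toy binary matrix: 5 persons (rows) by 3 items (columns); values are made up.
X <- matrix(c(1, 1, 0,
              0, 0, 1,
              1, 1, 1,
              0, 1, 0,
              1, 1, 0),
            ncol = 3, byrow = TRUE)

# Hypothetical helper: for one pair of items, count the persons who gave
# the same response to both items.
count_equal <- function(X, i, j) sum(X[, i] == X[, j])

# All item pairs, one pair per column.
pairs <- combn(ncol(X), 2)

# One count per item pair, labelled "i-j".
t1_counts <- apply(pairs, 2, function(p) count_equal(X, p[1], p[2]))
names(t1_counts) <- apply(pairs, 2, paste, collapse = "-")
t1_counts
```

In a nonparametric test, the same counts would be computed for every sampled matrix and compared with the counts from the observed data, pair by pair.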

References

Ponocny, I. (2001) Nonparametric goodness-of-fit tests for the Rasch model. Psychometrika, Volume 66, Number 3.

Verhelst, N. D. (2008) An Efficient MCMC Algorithm to Sample Binary Matrices with Fixed Marginals. Psychometrika, Volume 73, Number 4.

Verhelst, N. D., Hatzinger, R., and Mair, P. (2007) The Rasch Sampler. Journal of Statistical Software, Volume 20, Issue 4.

Examples

### Preparation:

# load eRm, which provides NPtest() and the raschdat1 data set
library(eRm)

# data for examples below
X <- raschdat1

# generate 100 random matrices based on original data matrix
rmat <- rsampler(X, rsctrl(burn_in = 100, n_eff = 100, seed = 123))

## the following examples can also directly be used by setting
## rmat <- raschdat1
## without calling rsampler() first, e.g.,
t1 <- NPtest(raschdat1, n = 100, method = "T1")


### Examples ###################################################################

###--- T1 ----------------------------------------------------------------------
t1 <- NPtest(rmat, method = "T1")
# choose a different alpha for selecting displayed values
print(t1, alpha = 0.01)


###--- T2 ----------------------------------------------------------------------
t21 <- NPtest(rmat, method = "T2", idx = 1:5, burn_in = 100, step = 20,
              seed = 7654321, RSinfo = TRUE)
# default stat is variance
t21

t22 <- NPtest(rmat, method = "T2", stat = "mad1",
              idx = c(1, 22, 5, 27, 6, 9, 11))
t22


###--- T4 ----------------------------------------------------------------------
age <- sample(20:90, 100, replace = TRUE)
# group MUST be a logical vector
# (value of TRUE is used for group selection)
age <- age < 30
t41 <- NPtest(rmat, method = "T4", idx = 1:3, group = age)
t41

sex <- gl(2, 50)
# group can also be a logical expression (generating a vector)
t42 <- NPtest(rmat, method = "T4", idx = c(1, 4, 5, 6), group = sex == 1)
t42


###--- T10 ---------------------------------------------------------------------
t101 <- NPtest(rmat, method = "T10")       # default split criterion is "median"
t101

split <- runif(100)
t102 <- NPtest(rmat, method = "T10", splitcr = split > 0.5)
t102

t103 <- NPtest(rmat, method = "T10", splitcr = sex)
t103


###--- T11 ---------------------------------------------------------------------
t11 <- NPtest(rmat, method = "T11")
t11


###--- Tpbis -------------------------------------------------------------------
tpb <- NPtest(X[, 1:5], method = "Tpbis", idxt = 1, idxs = 2:5)
tpb


###--- Martin-Loef -------------------------------------------------------------
# takes a while ...
split <- rep(1:3, each = 10)
NPtest(raschdat1, n = 100, method = "MLoef", splitcr = split)
