The GMATtest data set is a generated data set based on parameters from the
Graduate Management Admission Test (GMAT) data set (Kingston et al., 1985). The first two items
were generated to function differently between groups, the first in a uniform and the second in a non-uniform way.
The data set represents responses of 2,000 subjects to a multiple-choice test of 20 items.
Each item has four possible answers, coded A, B, C and D. The column group
represents group membership, where 0 represents the reference group and 1 represents
the focal group. The groups are of equal size (i.e., 1,000 subjects per group). The distributions of total scores
(sum of correct answers) are the same for both the reference and the focal group (Martinkova et al., 2017).
The column criterion represents a generated continuous variable that is intended to be predicted
by the test.
data(GMATtest)
The GMATtest data frame consists of 2,000 observations on the following 22 variables.
The first 20 columns represent the subjects' answers to the items of the test. The 21st column is
a vector of group membership; values 0 and 1 refer to the reference and the focal group, respectively. The 22nd column is a vector
representing the variable that is intended to be predicted by the test.
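The column layout described above can be sketched as a small helper that splits the data frame by position. This is only an illustration: split_gmat_columns is a hypothetical name, not a function from any package, and the commented usage assumes that a package providing GMATtest (for example difNLR) is installed.

```r
# Hypothetical helper (not part of any package): split a GMATtest-like
# data frame by column position, following the layout described above
# (item columns first, then group membership, then the criterion).
split_gmat_columns <- function(df, n_items = 20) {
  stopifnot(is.data.frame(df), ncol(df) >= n_items + 2)
  list(
    items     = df[, seq_len(n_items), drop = FALSE],  # nominal answers A-D
    group     = df[[n_items + 1]],                     # 0 = reference, 1 = focal
    criterion = df[[n_items + 2]]                      # continuous criterion
  )
}

# Usage sketch, assuming GMATtest is available (e.g. via library(difNLR)):
# data(GMATtest)
# parts <- split_gmat_columns(GMATtest)
# table(parts$group)        # group sizes
# summary(parts$criterion)  # distribution of the criterion variable
```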
The correct answers are provided in the GMATkey data set.
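As a rough sketch of how the nominal responses might be scored against the key, the helper below converts answers to 0/1 correctness indicators. The name score_items is hypothetical, and the code assumes that GMATkey holds one correct answer per item in the same order as the item columns.

```r
# Hypothetical helper (not part of any package): score nominal responses
# against an answer key. `responses` has one column per item; `key` holds
# one correct answer per item, in the same order (an assumption here).
score_items <- function(responses, key) {
  key <- as.character(unlist(key))
  stopifnot(ncol(responses) == length(key))
  scored <- vapply(
    seq_along(key),
    function(j) as.integer(as.character(responses[[j]]) == key[j]),
    integer(nrow(responses))
  )
  # Always return a subjects-by-items matrix of 0/1 values
  matrix(scored, nrow = nrow(responses))
}

# Usage sketch, assuming GMATtest and GMATkey are available:
# data(GMATtest); data(GMATkey)
# scored <- score_items(GMATtest[, 1:20], GMATkey)
# total  <- rowSums(scored)               # total score per subject
# tapply(total, GMATtest$group, summary)  # compare reference and focal group
```

Comparing the per-group summaries of the total score is one simple way to check the statement that both groups share the same total score distribution.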
Kingston, N., Leary, L., & Wightman, L. (1985). An Exploratory Study of the Applicability of Item Response Theory Methods to the Graduate Management Admission Test. ETS Research Report Series, 1985(2), 1-64.
Martinkova, P., Drabinova, A., Liaw, Y. L., Sanders, E. A., McFarland, J. L., & Price, R. M. (2017). Checking equity: Why Differential Item Functioning Analysis Should Be a Routine Part of Developing Conceptual Assessments. CBE-Life Sciences Education, 16(2), https://doi.org/10.1187/cbe.16-10-0307.