The GMAT data set is a generated data set based on parameters from the Graduate
Management Admission Test (GMAT) data set (Kingston et al., 1985). The first two items were
considered to function differently, in a uniform and a non-uniform way, respectively. The data set
represents the responses of 2,000 subjects to a multiple-choice test of 20 items. A correct answer
is coded as 1 and an incorrect answer as 0. The column group represents group membership,
where 0 denotes the reference group and 1 denotes the focal group. The groups are the same
size (i.e., 1,000 per group). The distributions of total scores (sums of correct answers) are the
same for the reference and focal groups (Martinkova et al., 2017). The column criterion
represents a generated continuous variable that the test is intended to predict.
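To make the distinction between uniform and non-uniform differential functioning concrete, the sketch below compares nested logistic regression models for a single item, matching subjects on their total score. It is a generic check, not necessarily the procedure used to generate or analyse this data set. The helper dif_logistic is hypothetical, and loading GMAT from the difNLR package is an assumption, since the text above does not name the package.

```r
# Hypothetical helper (not from any package): logistic-regression DIF check
# for a single dichotomous item, matching subjects on their total score.
dif_logistic <- function(item, total, group) {
  m0 <- glm(item ~ total, family = binomial)          # no group effect
  m1 <- glm(item ~ total + group, family = binomial)  # adds a group shift (uniform DIF)
  m2 <- glm(item ~ total * group, family = binomial)  # adds an interaction (non-uniform DIF)
  lrt <- anova(m0, m1, m2, test = "Chisq")            # sequential likelihood-ratio tests
  c(uniform = lrt[2, "Pr(>Chi)"], nonuniform = lrt[3, "Pr(>Chi)"])
}

# Assumption: GMAT is provided by the difNLR package; adjust if it comes from elsewhere.
if (requireNamespace("difNLR", quietly = TRUE)) {
  data(GMAT, package = "difNLR")
  total <- rowSums(GMAT[, 1:20])                      # total score over the 20 items
  print(dif_logistic(GMAT[, 1], total, GMAT$group))   # first item
  print(dif_logistic(GMAT[, 2], total, GMAT$group))   # second item
}
```

A small p-value in the uniform slot suggests a constant shift between groups at equal total score; a small p-value in the nonuniform slot suggests that the group difference varies with the total score.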
data(GMAT)
GMAT is a data frame with 2,000 observations on 22 variables.
The first 20 columns represent dichotomously scored items of the test. The 21st column is a vector
of group membership; values 0 and 1 refer to the reference and focal group, respectively. The 22nd column is a vector
representing the variable that the test is intended to predict.
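The layout described above can be checked with a few lines of base R. This is a minimal sketch: summarize_by_group is a hypothetical helper, it assumes the item columns come first and that the group column is named group, and it assumes the data set is loaded from the difNLR package.

```r
# Hypothetical helper (not from any package): group sizes and mean total score
# for a data frame whose first `n_items` columns are 0/1 item scores.
summarize_by_group <- function(df, n_items = 20) {
  items <- df[, seq_len(n_items), drop = FALSE]
  total <- rowSums(items)                       # total score = number of correct answers
  list(
    group_sizes = table(df$group),              # subjects per group
    mean_total  = tapply(total, df$group, mean) # mean total score per group
  )
}

# Assumption: GMAT is provided by the difNLR package; adjust if it comes from elsewhere.
if (requireNamespace("difNLR", quietly = TRUE)) {
  data(GMAT, package = "difNLR")
  print(dim(GMAT))                 # number of observations and variables
  print(summarize_by_group(GMAT))  # sizes and mean totals for both groups
}
```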
Kingston, N., Leary, L., & Wightman, L. (1985). An Exploratory Study of the Applicability of Item Response Theory Methods to the Graduate Management Admission Test. ETS Research Report Series, 1985(2), 1-64.
Martinkova, P., Drabinova, A., Liaw, Y. L., Sanders, E. A., McFarland, J. L., & Price, R. M. (2017). Checking equity: Why Differential Item Functioning Analysis Should Be a Routine Part of Developing Conceptual Assessments. CBE-Life Sciences Education, 16(2), https://doi.org/10.1187/cbe.16-10-0307.