genNLR: Generates a data set based on Non-Linear Regression DIF and DDF models.

Description

Generates a dichotomous, nominal, or ordinal data set based on non-linear regression models for DIF and DDF detection.

Usage

genNLR(N = 1000, ratio = 1, itemtype = "dich", a, b, c, d, mu = 0, sigma = 1)

Arguments

N

numeric: number of rows representing respondents.

ratio

numeric: ratio of the number of respondents in the reference group to the number in the focal group.

itemtype

character: type of items to be generated. Options are "dich" for dichotomous items (default), "nominal" for nominal items, and "ordinal" for ordinal items. See Details.

a

numeric: matrix representing discriminations with m rows (where m is the number of items). Must be provided. See Details.

b

numeric: matrix representing difficulties with m rows (where m is the number of items). Must be provided. See Details.

c

numeric: matrix representing guessing parameters (lower asymptotes) with m rows (where m is the number of items). Default is NULL. See Details.

d

numeric: matrix representing inattention parameters (upper asymptotes) with m rows (where m is the number of items). Default is NULL. See Details.

mu

numeric: a mean vector of the underlying distribution. The first value corresponds to the reference group, the second to the focal group. Default is 0 for both groups. See Details.

sigma

numeric: a standard deviation vector of the underlying distribution. The first value corresponds to the reference group, the second to the focal group. Default is 1 for both groups. See Details.

Value

A data.frame containing N rows representing respondents and m + 1 columns: m columns for the items and a final column for the group membership variable, coded 0 for the reference group and 1 for the focal group.

Details

The itemtype argument specifies what type of items should be generated. With itemtype = "dich", dichotomous items are generated with the non-linear regression model (generalized logistic regression model) for DIF detection specified in difNLR. With itemtype = "nominal", nominal items are generated with the multinomial model specified in ddfMLR. With itemtype = "ordinal", ordinal items are generated with the adjacent logit model specified in ddfORD with argument model = "adjacent".
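As a rough illustration of the dichotomous case, the sketch below draws responses to a single item from a four-parameter generalized logistic curve, P(Y = 1 | x) = c + (d - c) / (1 + exp(-a (x - b))). This functional form is an assumption based on the model described in the reference cited on this page, the parameter values are arbitrary, and the helper p_correct is hypothetical; it is not part of difNLR and need not match how genNLR is implemented.

```r
# Hypothetical helper (not part of difNLR): probability of a correct answer
# under an assumed four-parameter generalized logistic model.
p_correct <- function(x, a, b, c = 0, d = 1) {
  c + (d - c) / (1 + exp(-a * (x - b)))
}

set.seed(123)
x <- rnorm(1000)                                  # standardized latent trait
p <- p_correct(x, a = 1.2, b = 0.3, c = 0.2, d = 0.95)
y <- rbinom(length(x), size = 1, prob = p)        # simulated 0/1 responses

range(p)  # probabilities stay between the lower (c) and upper (d) asymptotes
```

DIF arises in this picture when the reference and focal groups use different values of a, b, c, or d for the same item.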

The arguments a, b, c, and d are numeric matrices with m rows (where m is the number of items) representing parameters of the regression models for DIF and DDF detection.

For itemtype = "dich", the matrices should have two columns. The first column holds the parameters of the reference group and the second those of the focal group. If only one column is provided, the parameters are set to be the same for both groups.
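The two-column convention can be sketched in base R as follows. The helper expand_groups is hypothetical and only mimics the described behaviour (a single column reused for both groups); it is not a difNLR function.

```r
# Hypothetical helper (not part of difNLR): make sure a parameter matrix
# has one column per group, reusing a single column for both groups.
expand_groups <- function(par) {
  par <- as.matrix(par)
  stopifnot(ncol(par) %in% 1:2)
  if (ncol(par) == 1) {
    par <- cbind(par, par)
  }
  colnames(par) <- c("reference", "focal")
  par
}

# Three items, same discrimination in both groups (no DIF in a).
a_same <- expand_groups(c(1.0, 1.5, 0.8))

# Three items, difficulty of item 1 differs between groups (DIF in b).
b_dif <- expand_groups(cbind(c(0, 0.5, -1), c(0.4, 0.5, -1)))
```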

For itemtype = "nominal" and itemtype = "ordinal", matrices c and d are ignored. Matrices a and b contain parameters for distractors. For example, to generate an item with 4 different choices, the user provides matrices with 6 columns. The first 3 columns correspond to the distractor parameters for the reference group and the last 3 columns to those for the focal group. The number of choices can differ between items. For each group, matrices a and b need as many columns as the maximum number of distractors across items. Items with fewer choices can contain NAs.
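A minimal sketch of this column layout, using base R only. It assumes, following the pattern in the Examples section, that unused distractor slots are the trailing columns of each group block; the item sizes and parameter ranges are arbitrary.

```r
# Items with 4, 3, and 2 response choices, i.e. 3, 2, and 1 distractors.
n_distractors <- c(3, 2, 1)
m <- length(n_distractors)   # number of items
k <- max(n_distractors)      # columns needed per group

set.seed(42)
# Columns 1..k: reference group; columns (k + 1)..(2 * k): focal group.
a <- matrix(runif(m * 2 * k, 0.8, 2), nrow = m)
b <- matrix(runif(m * 2 * k, -2, 2), nrow = m)

# Blank out the unused distractor slots in both group blocks.
for (i in seq_len(m)) {
  if (n_distractors[i] < k) {
    unused <- (n_distractors[i] + 1):k
    a[i, c(unused, k + unused)] <- NA
    b[i, c(unused, k + unused)] <- NA
  }
}

a
```

Matrices built this way have the shape expected for itemtype = "nominal": 2 * k columns, with NAs marking the slots an item does not use.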

A single value for mu means that the reference and focal groups have underlying distributions with the same mean. A single value for sigma means that the reference and focal groups have underlying distributions with the same standard deviation. If mu or sigma is a vector of length greater than two, only the first two values are used.
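The handling of ratio, mu, and sigma can be illustrated with the base R sketch below. The rounding rule for the group sizes and the use of a normal underlying distribution are assumptions made for illustration; they agree with the 250/50 split quoted in the Examples for N = 300 and ratio = 5, but are not taken from the genNLR source.

```r
N <- 300
ratio <- 5
mu <- c(1, 0, 7)   # length greater than two: only the first two values are kept
sigma <- 2         # single value: shared by both groups

# Assumed split: ratio = (reference group size) / (focal group size).
n_ref <- round(N * ratio / (ratio + 1))
n_foc <- N - n_ref

# Recycle a single value, or truncate a longer vector, to length two.
mu <- rep(mu, length.out = 2)
sigma <- rep(sigma, length.out = 2)

set.seed(1)
# Assumed normal underlying distribution, drawn separately per group.
theta <- c(rnorm(n_ref, mean = mu[1], sd = sigma[1]),
           rnorm(n_foc, mean = mu[2], sd = sigma[2]))
group <- rep(c(0, 1), times = c(n_ref, n_foc))  # 0 = reference, 1 = focal

table(group)
```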

References

Drabinova, A. & Martinkova, P. (2017). Detection of Differential Item Functioning with NonLinear Regression: Non-IRT Approach Accounting for Guessing. Journal of Educational Measurement, 54(4), 498-517, https://doi.org/10.1111/jedm.12158.

Examples

# NOT RUN {
# seed
set.seed(123)
# generating parameters for dichotomous data with DIF, 5 items
a <- matrix(runif(10, 0.8, 2), ncol = 2)
b <- matrix(runif(10, -2, 2), ncol = 2)
c <- matrix(runif(10, 0, 0.25), ncol = 2)
d <- matrix(runif(10, 0.8, 1), ncol = 2)
# generating dichotomous data set with 300 observations (150 each group)
genNLR(N = 300, a = a, b = b, c = c, d = d)
# generating dichotomous data set with 300 observations (150 each group)
# and different mean and standard deviation for underlying distribution
genNLR(N = 300, a = a, b = b, c = c, d = d, mu = c(1, 0), sigma = c(1, 2))
# generating dichotomous data set with 300 observations (250 reference group, 50 focal)
genNLR(N = 300, ratio = 5, a = a, b = b, c = c, d = d)

# generating parameters for nominal data with DDF, 5 items,
# each item 3 choices
a <- matrix(runif(20, 0.8, 2), ncol = 4)
b <- matrix(runif(20, -2, 2), ncol = 4)
# generating nominal data set with 300 observations (150 each group)
genNLR(N = 300, itemtype = "nominal", a = a, b = b)
# generating nominal data set with 300 observations (250 reference group, 50 focal)
genNLR(N = 300, itemtype = "nominal", ratio = 5, a = a, b = b)

# generating parameters for nominal data with DDF, 5 items,
# items 1 and 2 have 2 choices, items 3, 4 and 5 have 3 choices
a <- matrix(runif(20, 0.8, 2), ncol = 4)
a[1:2, c(2, 4)] <- NA
b <- matrix(runif(20, -2, 2), ncol = 4)
b[1:2, c(2, 4)] <- NA
# generating nominal data set with 300 observations (150 each group)
genNLR(N = 300, itemtype = "nominal", a = a, b = b)
# generating nominal data set with 300 observations (250 reference group, 50 focal)
genNLR(N = 300, itemtype = "nominal", ratio = 5, a = a, b = b)

# }
