The DIF detection procedure based on non-linear regression is an extension of the logistic regression
procedure (Swaminathan and Rogers, 1990).
The Data argument is a matrix whose rows represent scored examinee answers ("1" correct, "0" incorrect)
and whose columns correspond to the items. In addition, Data can hold the vector of group membership.
If so, group is a column identifier of Data. Otherwise, group must be a dichotomous vector of the same
length as nrow(Data).
The unconstrained form of the 4PL generalized logistic regression model for the probability of a correct
answer (i.e., y = 1) is
P(y = 1) = (c + cDif*g) + (d + dDif*g - c - cDif*g)/(1 + exp(-(a + aDif*g)*(x - b - bDif*g))),
where x is by default the standardized total score (also called Z-score) and g is group membership.
Parameters a, b, c and d are discrimination, difficulty, guessing and inattention, respectively.
Terms aDif, bDif, cDif and dDif then represent the differences between the two groups in the
corresponding parameters.
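The formula above can be written directly as a small R function. This is a minimal sketch for illustration only, not the package's internal code: the function name p_correct, the 0/1 coding of g (0 for reference, 1 for focal) and all parameter values are assumptions.

```r
# Probability of a correct answer under the unconstrained 4PL model.
# x: matching criterion, g: group membership (assumed coding: 0 = reference, 1 = focal).
p_correct <- function(x, g, a, b, c, d, aDif = 0, bDif = 0, cDif = 0, dDif = 0) {
  lower <- c + cDif * g  # lower asymptote (guessing)
  upper <- d + dDif * g  # upper asymptote (inattention)
  lower + (upper - lower) / (1 + exp(-(a + aDif * g) * (x - b - bDif * g)))
}

# Hypothetical parameter values, chosen only for illustration.
# The item is harder for the focal group (bDif > 0), i.e., uniform DIF.
p_ref <- p_correct(x = 0, g = 0, a = 1.2, b = 0, c = 0.2, d = 1, bDif = 0.5)
p_foc <- p_correct(x = 0, g = 1, a = 1.2, b = 0, c = 0.2, d = 1, bDif = 0.5)
```

With these values, two examinees with the same matching criterion have different probabilities of answering correctly, which is what the difference terms are meant to capture.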
This 4PL model can be further constrained by the model and constraints arguments.
The arguments model and constraints can also be combined.
The model argument offers several predefined models. The options are as follows:
Rasch for the 1PL model with the discrimination parameter fixed at 1 for both groups,
1PL for the 1PL model with the discrimination parameter fixed for both groups,
2PL for the logistic regression model,
3PLcg for the 3PL model with fixed guessing for both groups,
3PLdg for the 3PL model with fixed inattention for both groups,
3PLc (alternatively also 3PL) for the 3PL regression model with the guessing parameter,
3PLd for the 3PL model with the inattention parameter,
4PLcgdg for the 4PL model with fixed guessing and inattention parameters for both groups,
4PLcgd (alternatively also 4PLd) for the 4PL model with fixed guessing for both groups,
4PLcdg (alternatively also 4PLc) for the 4PL model with fixed inattention for both groups,
or 4PL for the 4PL model.
The model can be specified in more detail with the constraints argument, which specifies which
parameters should be fixed (i.e., the same) for both groups. For example, the choice "ad" means that
discrimination (a) and inattention (d) are fixed for both groups and the other parameters (b and c)
are not. The arguments model and constraints can also be item specific if they take the form of a
vector in which each element corresponds to one item. The NA value for constraints means no constraints.
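To make the meaning of a constraints string concrete, the sketch below lists which difference terms of the full 4PL model stay free once the named parameters are fixed for both groups. The helper free_dif_terms is hypothetical and not part of the package. It also ignores the model argument, which can remove further parameters.

```r
# Hypothetical helper: list the difference terms that remain free
# once the parameters named in `constraints` are fixed for both groups.
free_dif_terms <- function(constraints = NA) {
  params <- c("a", "b", "c", "d")
  # NA means no constraints, so nothing is fixed.
  fixed <- if (is.na(constraints)) character(0) else strsplit(constraints, "")[[1]]
  paste0(setdiff(params, fixed), "Dif")
}

free_dif_terms("ad")  # discrimination and inattention fixed for both groups
free_dif_terms(NA)    # no constraints

# Item-specific settings are vectors with one element per item (three items here).
constraints <- c("ad", NA, "c")
free_by_item <- lapply(constraints, free_dif_terms)
```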
The type argument corresponds to the type of DIF to be tested. Possible values are
"both" to detect any DIF caused by a difference in difficulty or discrimination (i.e., uniform and/or non-uniform),
"udif" to detect only uniform DIF (i.e., a difference in difficulty b),
"nudif" to detect only non-uniform DIF (i.e., a difference in discrimination a), or
"all" to detect DIF caused by a difference in any parameter that can differ between groups.
The type of DIF can also be specified in more detail by using a combination of parameters a, b, c and d.
For example, with the option "c" for the 4PL model, only the difference in parameter c is tested.
The type argument is also item specific.
Argument match represents the matching criterion. It can be either the standardized test score
(default, "zscore"), the total test score ("score"), or any other continuous or discrete variable
of the same length as the number of observations in Data.
A set of anchor items (DIF free) can be specified through the anchor argument. It needs to be a vector
of either item names (as specified in the column names of Data) or item identifiers (integers specifying
the column number). If anchor items are provided, only these items are used to compute the matching
criterion match. If the match argument is neither "zscore" nor "score", the anchor argument is ignored.
When anchor items are provided, purification is not applied.
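As an illustration of the two built-in matching criteria, the base R sketch below computes the total score and its standardized version from anchor items only. The data and the anchor set are made up, and standardizing with scale() (which uses the sample standard deviation) is an assumption about how the Z-score is obtained rather than a statement about the package internals.

```r
# Hypothetical scored data: 6 examinees (rows), 4 items (columns).
Data <- matrix(
  c(1, 0, 1, 1,
    0, 0, 1, 0,
    1, 1, 1, 1,
    0, 1, 0, 0,
    1, 1, 0, 1,
    0, 0, 0, 1),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, paste0("Item", 1:4))
)
anchor <- c("Item1", "Item3")  # items assumed to be DIF free

score <- rowSums(Data[, anchor, drop = FALSE])  # total score on anchor items
zscore <- as.vector(scale(score))               # standardized total score
```

Column indices (here c(1, 3)) would select the same anchor items as the column names.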
The p.adjust.method argument is a character string passed to the p.adjust function from the stats
package. Possible values are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
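The effect of the correction can be seen by calling p.adjust directly. The p-values below are invented for illustration and do not come from any fitted model.

```r
# Hypothetical item-level p-values from a DIF test.
p_values <- c(Item1 = 0.010, Item2 = 0.040, Item3 = 0.030, Item4 = 0.200)

p_bonf <- p.adjust(p_values, method = "bonferroni")
p_bh   <- p.adjust(p_values, method = "BH")

# Items flagged at the 0.05 level after each correction.
flag_bonf <- names(p_values)[p_bonf < 0.05]
flag_bh   <- names(p_values)[p_bh < 0.05]
```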
The start argument is a list with as many elements as there are items. Each element is a named numeric
vector of length 8 representing initial values for parameter estimation. Specifically, parameters
a, b, c, and d are initial values for discrimination, difficulty, guessing and inattention
for the reference group. Parameters aDif, bDif, cDif and dDif are then the differences in these
parameters between the reference and focal groups. If not specified, starting values are calculated
with the startNLR function.
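A list of this shape can be built in base R as sketched below. The number of items and all numeric values are placeholders, and the ordering of the names within each vector is an assumption.

```r
n_items <- 3  # hypothetical number of items

# One named numeric vector of length 8 per item; the values are placeholders.
start_one <- c(a = 1, b = 0, c = 0.1, d = 0.95,
               aDif = 0, bDif = 0, cDif = 0, dDif = 0)
start <- replicate(n_items, start_one, simplify = FALSE)

# Adjust the initial difficulty of the second item only.
start[[2]]["b"] <- 0.5
```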
Missing values are allowed but are discarded for item estimation. They must be coded as NA
in both the Data and group arguments.
In case of convergence issues, with the option initboot = TRUE, the starting values are
re-calculated based on bootstrapped samples. Newly calculated initial values are applied only to
items/models with convergence issues.
If the model considers a difference in the guessing or inattention parameter, a different parameterization
is used, and the parameters with their standard errors are recalculated by the delta method. However,
the covariance matrices remain in the alternative parameterization.
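The delta method step can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the alternative parameterization estimates group-specific guessing parameters cR and cF and that the reported terms are c = cR and cDif = cF - cR. The estimates and the covariance matrix are invented.

```r
# Hypothetical estimates in an assumed alternative parameterization:
# group-specific guessing parameters for reference (cR) and focal (cF) groups.
est <- c(cR = 0.20, cF = 0.26)
V <- matrix(c(0.0016, 0.0004,
              0.0004, 0.0025),
            nrow = 2, dimnames = list(names(est), names(est)))

# Reported parameters: c = cR and cDif = cF - cR. The transformation is
# linear, so its Jacobian is a constant matrix.
J <- rbind(c = c(1, 0), cDif = c(-1, 1))

est_new <- drop(J %*% est)   # transformed estimates
V_new <- J %*% V %*% t(J)    # delta-method covariance matrix
se_new <- sqrt(diag(V_new))  # standard errors of c and cDif
```

Because this transformation is linear, the delta method is exact here. For a non-linear transformation the Jacobian would be evaluated at the estimates and the result would be a first-order approximation.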