Calculates the "logistic regression" likelihood-ratio statistics and effect sizes for DIF detection.
Logistik(
data,
member,
member.type = "group",
match = "score",
anchor = 1:ncol(data),
type = "both",
criterion = "LRT",
all.cov = FALSE
)
A list with thirteen components:
stat: the values of the logistic regression DIF statistics.
R2M0: the values of Nagelkerke's \(R^2\) coefficients for the "full" model.
R2M1: the values of Nagelkerke's \(R^2\) coefficients for the "simpler" model.
deltaR2: the differences between Nagelkerke's \(R^2\) coefficients of the tested models. See Details.
parM0: a matrix with one row per item and four columns, holding successively the fitted parameters \(\hat{\alpha}\), \(\hat{\beta}\), \(\hat{\gamma}_1\) and \(\hat{\delta}_1\) of the "full" model (\(M_0\) if type = "both" or type = "nudif", \(M_1\) if type = "udif").
parM1: the same matrix as parM0 but with fitted parameters for the "simpler" model (\(M_1\) if type = "nudif", \(M_2\) if type = "both" or type = "udif").
seM0: a matrix with the standard error values of the parameter estimates in matrix parM0.
seM1: a matrix with the standard error values of the parameter estimates in matrix parM1.
cov.M0: either NULL (if all.cov argument is FALSE) or a list of covariance matrices of parameter estimates of the "full" model (\(M_0\)) for each item (if all.cov argument is TRUE).
cov.M1: either NULL (if all.cov argument is FALSE) or a list of covariance matrices of parameter estimates of the "simpler" model (\(M_1\)) for each item (if all.cov argument is TRUE).
criterion: the value of the criterion argument.
member.type: the value of the member.type argument.
match: a character string, either "score" or "matching variable" depending on the match argument.
data: numeric; the data matrix (one row per subject, one column per item).
member: numeric or factor; the vector of group membership. Can either take two distinct values (zero for the reference group and one for the focal group) or be a continuous vector. See Details.
member.type: character; either "group" (default) to specify that group membership is made of two groups, or "cont" to indicate that group membership is based on a continuous criterion. See Details.
match: specifies the type of matching criterion. Can be either "score" (default) to compute the total test score based on the anchor items, or "restscore" to compute the matching score while excluding the item currently being tested, which prevents contamination of the matching variable by the item itself. Alternatively, any numeric vector with the same length as the number of rows in data can be supplied as an external matching variable.
anchor: a vector of integer values specifying which items (all by default) are currently considered as anchor (DIF free) items. Ignored if match is not "score". See Details.
type: a character string specifying which DIF effects must be tested. Possible values are "both" (default), "udif" and "nudif". See Details.
criterion: a character string specifying which DIF statistic is computed. Possible values are "LRT" (default) or "Wald". See Details.
all.cov: logical; should all covariance matrices of model parameter estimates be returned (as lists) for both nested models and all items? (default is FALSE).
David Magis
Data science consultant at IQVIA Belux
Brussels, Belgium
Sebastien Beland
Faculte des sciences de l'education
Universite de Montreal (Canada)
sebastien.beland@umontreal.ca
Gilles Raiche
Universite du Quebec a Montreal
raiche.gilles@uqam.ca
Adela Hladka (nee Drabinova)
Institute of Computer Science of the Czech Academy of Sciences
hladka@cs.cas.cz
This command computes the logistic regression statistic (Swaminathan &
Rogers, 1990) in the specific framework of differential item functioning.
It is the underlying routine called by difLogistic and is designed
specifically for that purpose.
If the member.type argument is set to "group", the
member argument must be a vector with two distinct (numeric or
factor) values, say 0 and 1 (for the reference and focal groups
respectively). Those values are internally converted to factors to
denote group membership. The three possible models to be fitted are then:
$$M_0: \text{logit} (\pi_g) = \alpha + \beta X + \gamma_g + \delta_g X,$$
$$M_1: \text{logit} (\pi_g) = \alpha + \beta X + \gamma_g,$$
$$M_2: \text{logit} (\pi_g) = \alpha + \beta X,$$
where \(\pi_g\) is the probability of answering the item correctly in
group \(g\) and \(X\) is the matching variable. Parameters
\(\alpha\) and \(\beta\) are the intercept and the slope of the
logistic curves (common to all groups), while \(\gamma_g\) and
\(\delta_g\) are group-specific parameters. For identification reasons
the parameters \(\gamma_0\) and \(\delta_0\) for the reference group
(\(g = 0\)) are set to zero. The parameter \(\gamma_1\) of the focal
group (\(g = 1\)) represents the uniform DIF effect, and the parameter
\(\delta_1\) models the nonuniform DIF effect. The models are
fitted with the glm function.
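As a rough illustration of the three nested models, the sketch below fits them with glm on simulated data. The simulated responses, the object names (item, score, g) and the uniform DIF size of 0.8 logits are assumptions made for this example only; it is not the internal code of Logistik.

```r
# Simulated data (hypothetical): 500 examinees, 10 DIF-free items, 1 studied item
set.seed(1)
n      <- 500
group  <- rep(c(0, 1), each = n / 2)              # 0 = reference, 1 = focal
theta  <- rnorm(n)                                # latent ability (simulation only)
others <- sapply(1:10, function(j) rbinom(n, 1, plogis(theta)))
item   <- rbinom(n, 1, plogis(theta - 0.8 * group))  # studied item with uniform DIF
score  <- rowSums(others) + item                  # matching variable X (raw score)
g      <- factor(group)                           # group membership as a factor

# The three nested logistic models
m0 <- glm(item ~ score * g, family = binomial)    # M0: alpha + beta X + gamma_g + delta_g X
m1 <- glm(item ~ score + g, family = binomial)    # M1: alpha + beta X + gamma_g
m2 <- glm(item ~ score,     family = binomial)    # M2: alpha + beta X

# Coefficients of M0 in the order alpha, beta, gamma_1, delta_1
coef(m0)
```

With the reference group as the first factor level, the coefficients labelled g1 and score:g1 correspond to \(\gamma_1\) and \(\delta_1\).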
If member.type is set to "cont", then "group membership" is
replaced by a continuous or discrete variable, given by the member
argument, and the models above are written as
$$M_0: \text{logit} (\pi_g) = \alpha + \beta X + \gamma Y + \delta X Y,$$
$$M_1: \text{logit} (\pi_g) = \alpha + \beta X + \gamma Y,$$
$$M_2: \text{logit} (\pi_g) = \alpha + \beta X,$$
where \(Y\) is the group variable. Parameters \(\gamma\) and
\(\delta\) now play the role of the \(\gamma_1\) and \(\delta_1\) DIF
parameters.
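The continuous case can be sketched in the same way. In the example below, y is a simulated continuous covariate standing in for the member argument; the data and all object names are assumptions chosen for illustration.

```r
# Simulated data (hypothetical): the studied item depends on a continuous covariate y
set.seed(2)
n      <- 500
y      <- rnorm(n)                                # continuous "group" variable Y
theta  <- rnorm(n)                                # latent ability (simulation only)
others <- sapply(1:10, function(j) rbinom(n, 1, plogis(theta)))
item   <- rbinom(n, 1, plogis(theta + 0.5 * y))
score  <- rowSums(others) + item                  # matching variable X

m0 <- glm(item ~ score * y, family = binomial)    # M0: alpha + beta X + gamma Y + delta X Y
m1 <- glm(item ~ score + y, family = binomial)    # M1: alpha + beta X + gamma Y
m2 <- glm(item ~ score,     family = binomial)    # M2: alpha + beta X

# Coefficients of M0 in the order alpha, beta, gamma, delta
coef(m0)
```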
The matching criterion can be either the test score or any other continuous
or discrete variable passed to the Logistik function. This is
specified by the match argument. By default, it takes the value
"score" and the test score (i.e. raw score) is computed. Setting match to
"restscore" computes the score without the item currently being tested. The
remaining option is to assign to match a vector of continuous or discrete
numeric values, which acts as the matching criterion. Note that for
consistency this vector should not belong to the data matrix.
Two types of DIF statistics can be computed: the likelihood ratio test
statistics, obtained by comparing the fit of two nested models, and the
Wald statistics, obtained with an appropriate contrast matrix for testing
the model parameters (Johnson & Wichern, 1998). These are specified by
the argument criterion, with respective values "LRT" and
"Wald". By default, the LRT statistics are computed.
If criterion is "LRT", the argument type determines the
models to be compared by means of the LRT statistics. The three possible
values of type are: type = "both" (default) which
tests the hypothesis \(H_0: \gamma_1 = \delta_1 = 0\) (or \(H_0: \gamma
= \delta = 0\)) by comparing models \(M_0\) and \(M_2\);
type = "nudif" which tests the hypothesis \(H_0: \delta_1 = 0\) (or
\(H_0: \delta = 0\)) by comparing models \(M_0\) and \(M_1\); and
type = "udif" which tests the hypothesis \(H_0: \gamma_1 = 0\) (or
\(H_0: \gamma = 0\)) by comparing models \(M_1\) and \(M_2\) (assuming
that \(\delta_1 = 0\) or \(\delta = 0\)). In other words,
type = "both" tests for DIF (without distinction between uniform and
nonuniform effects), while type = "udif" and type = "nudif"
test for uniform and nonuniform DIF, respectively.
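The three comparisons can be sketched as differences in residual deviance between nested glm fits. The simulated data, the helper name lrt and the use of a chi-square reference distribution for the p-values are assumptions of this example, not a description of the internal code.

```r
# Simulated data (hypothetical) with a uniform DIF effect on the studied item
set.seed(3)
n      <- 600
g      <- factor(rep(c(0, 1), each = n / 2))      # 0 = reference, 1 = focal
theta  <- rnorm(n)
others <- sapply(1:10, function(j) rbinom(n, 1, plogis(theta)))
item   <- rbinom(n, 1, plogis(theta - 0.8 * (g == "1")))
score  <- rowSums(others) + item

m0 <- glm(item ~ score * g, family = binomial)    # M0
m1 <- glm(item ~ score + g, family = binomial)    # M1
m2 <- glm(item ~ score,     family = binomial)    # M2

# Hypothetical helper: likelihood-ratio statistic for two nested fits
lrt <- function(simple, full) {
  stat <- deviance(simple) - deviance(full)
  df   <- df.residual(simple) - df.residual(full)
  c(stat = stat, df = df, p.value = pchisq(stat, df, lower.tail = FALSE))
}

res <- rbind(
  both  = lrt(m2, m0),   # H0: gamma_1 = delta_1 = 0
  nudif = lrt(m1, m0),   # H0: delta_1 = 0
  udif  = lrt(m2, m1)    # H0: gamma_1 = 0, assuming delta_1 = 0
)
res
```

Because the models are nested, the statistic for "both" equals the sum of the "nudif" and "udif" statistics.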
If criterion is "Wald", the argument type determines
the logistic model to be considered and the appropriate contrast matrix.
If type = "both", the considered model is model \(M_0\) and the
contrast matrix has two rows, (0, 0, 1, 0) and (0, 0, 0, 1). If
type = "nudif", the considered model is also model \(M_0\) but the
contrast matrix has only one row, (0, 0, 0, 1). Finally, if
type = "udif", the considered model is model \(M_1\) and the
contrast matrix has one row, (0, 0, 1).
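A minimal sketch of these Wald statistics, using the standard quadratic form \((Cb)' (C V C')^{-1} (Cb)\) with \(b\) the estimated coefficients and \(V\) their covariance matrix. The simulated data and the helper name wald are assumptions for illustration; the actual internal computation may differ in detail.

```r
# Simulated data (hypothetical) with a uniform DIF effect on the studied item
set.seed(4)
n      <- 600
g      <- factor(rep(c(0, 1), each = n / 2))
theta  <- rnorm(n)
others <- sapply(1:10, function(j) rbinom(n, 1, plogis(theta)))
item   <- rbinom(n, 1, plogis(theta - 0.8 * (g == "1")))
score  <- rowSums(others) + item

m0 <- glm(item ~ score * g, family = binomial)    # M0: four parameters
m1 <- glm(item ~ score + g, family = binomial)    # M1: three parameters

# Hypothetical helper: Wald statistic for the contrast matrix C
wald <- function(fit, C) {
  Cb <- C %*% coef(fit)
  as.numeric(t(Cb) %*% solve(C %*% vcov(fit) %*% t(C)) %*% Cb)
}

C_both  <- rbind(c(0, 0, 1, 0), c(0, 0, 0, 1))    # type = "both",  model M0
C_nudif <- matrix(c(0, 0, 0, 1), nrow = 1)        # type = "nudif", model M0
C_udif  <- matrix(c(0, 0, 1),    nrow = 1)        # type = "udif",  model M1

W <- c(both  = wald(m0, C_both),
       nudif = wald(m0, C_nudif),
       udif  = wald(m1, C_udif))
W
```

For a one-row contrast, the statistic reduces to the squared z value reported by summary() for the corresponding coefficient.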
The data are passed through the data argument, with one row per
subject and one column per item. Missing values are allowed but must be
coded as NA values. They are discarded from the fitting of the
logistic models (see glm for further details).
When member.type is "group", the vector of group membership, specified
with the member argument, must hold only zeros and ones, a value of zero
corresponding to the reference group and a value of one to the focal group.
Option anchor sets the items which are considered as anchor items
for computing the test scores and related logistic regression DIF
statistics. Items other than the anchor items and the tested item are
discarded. anchor must hold integer values specifying the column
numbers of the corresponding anchor items. It is mainly designed to perform
item purification. Note that this option is ignored when match is
not "score".
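The role of the anchor set can be sketched as follows: the test score for a given item is the sum over the anchor items plus the tested item itself. The helper name match_score, the toy data and the handling of missing values through na.rm = TRUE are assumptions of this sketch.

```r
# Toy response matrix (hypothetical): 3 subjects (rows), 4 items (columns)
resp <- matrix(c(1, 0, 1,
                 0, 1, 1,
                 1, 1, 0,
                 0, 0, 1), nrow = 3)

# Hypothetical helper: test score from the anchor items plus the tested item
match_score <- function(data, anchor, item) {
  cols <- union(anchor, item)     # the tested item is counted once, anchor or not
  rowSums(data[, cols, drop = FALSE], na.rm = TRUE)
}

s4 <- match_score(resp, anchor = c(1, 2), item = 4)  # item 4 is not an anchor item
s1 <- match_score(resp, anchor = c(1, 2), item = 1)  # item 1 is already an anchor item
s4
s1
```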
The output contains: the selected DIF statistics (either the LRT or the
Wald statistic) computed for each item, two matrices with the parameter
estimates of both models (for each item) and two matrices of related
standard error values. In addition, Nagelkerke's \(R^2\) coefficients
(Nagelkerke, 1991) are computed for each model, and the output returns both
the vectors of \(R^2\) coefficients for each model and the differences in
these coefficients. Such differences are used as measures of effect size by
the difLogistic command; see Gomez-Benito, Dolores Hidalgo
and Padilla (2009), Jodoin and Gierl (2001), and Zumbo and Thomas (1997).
The criterion and member.type arguments are also returned, as
well as a character argument named match that specifies the type of
matching criterion that was used.
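For a binary-response glm fit, Nagelkerke's \(R^2\) can be written in terms of the null and residual deviances. The sketch below computes it for the "full" and "simpler" models under type = "both" and takes their difference; the simulated data and the helper name nagelkerke are assumptions, and the example is illustrative rather than a copy of the internal code.

```r
# Simulated data (hypothetical) with a uniform DIF effect on the studied item
set.seed(5)
n      <- 600
g      <- factor(rep(c(0, 1), each = n / 2))
theta  <- rnorm(n)
others <- sapply(1:10, function(j) rbinom(n, 1, plogis(theta)))
item   <- rbinom(n, 1, plogis(theta - 0.8 * (g == "1")))
score  <- rowSums(others) + item

m0 <- glm(item ~ score * g, family = binomial)    # "full" model for type = "both"
m2 <- glm(item ~ score,     family = binomial)    # "simpler" model for type = "both"

# Hypothetical helper: Nagelkerke's R^2 from the deviances of a binary glm fit
nagelkerke <- function(fit) {
  n_obs   <- nobs(fit)
  cox     <- 1 - exp((fit$deviance - fit$null.deviance) / n_obs)  # Cox and Snell R^2
  cox_max <- 1 - exp(-fit$null.deviance / n_obs)                  # its upper bound
  cox / cox_max
}

r2_full   <- nagelkerke(m0)
r2_simple <- nagelkerke(m2)
delta_r2  <- r2_full - r2_simple                  # effect size measure
c(full = r2_full, simpler = r2_simple, delta = delta_r2)
```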
Gomez-Benito, J., Dolores Hidalgo, M. and Padilla, J.-L. (2009). Efficacy of effect size measures in logistic regression: an application for detecting DIF. Methodology, 5, 18-25, doi:10.1027/1614-2241.5.1.18
Jodoin, M. G. and Gierl, M. J. (2001). Evaluating Type I error and power rates using an effect size measure with logistic regression procedure for DIF detection. Applied Measurement in Education, 14, 329-349, doi:10.1207/S15324818AME1404_2
Johnson, R. A. and Wichern, D. W. (1998). Applied multivariate statistical analysis (fourth edition). Upper Saddle River, NJ: Prentice-Hall.
Magis, D., Beland, S., Tuerlinckx, F. and De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42, 847-862, doi:10.3758/BRM.42.3.847
Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78, 691-692, doi:10.1093/biomet/78.3.691
Swaminathan, H. and Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361-370, doi:10.1111/j.1745-3984.1990.tb00754.x
Zumbo, B. D. and Thomas, D. R. (1997). A measure of effect size for a model-based approach for studying DIF. Prince George, Canada: University of Northern British Columbia, Edgeworth Laboratory for Quantitative Behavioral Science.
difLogistic, dichoDif
if (FALSE) {
# Loading the difR package and the verbal data
library(difR)
data(verbal)
# Testing both types of DIF simultaneously
# With all items, test score as matching criterion
Logistik(verbal[, 1:24], verbal[, 26])
# Returning all covariance matrices of model parameters
Logistik(verbal[, 1:24], verbal[, 26], all.cov = TRUE)
# Testing both types of DIF simultaneously
# With all items and Wald test
Logistik(verbal[, 1:24], verbal[, 26], criterion = "Wald")
# Removing item 6 from the set of anchor items
Logistik(verbal[, 1:24], verbal[, 26], anchor = c(1:5, 7:24))
# Testing for nonuniform DIF
Logistik(verbal[, 1:24], verbal[, 26], type = "nudif")
# Testing for uniform DIF
Logistik(verbal[, 1:24], verbal[, 26], type = "udif")
# Using the "anger" trait variable as matching criterion
Logistik(verbal[, 1:24], verbal[, 26], match = verbal[, 25])
# Using the "anger" trait variable as group membership
Logistik(verbal[, 1:24], verbal[, 25], member.type = "cont")
}