difGenLogistic: Generalized logistic regression DIF method

Description

Performs DIF detection among multiple groups using the generalized logistic regression method.

Usage

difGenLogistic(Data, group, focal.names, anchor = NULL, match = "score", 
 	type = "both", criterion = "LRT", alpha = 0.05, purify = FALSE, nrIter = 10,
 	p.adjust.method = NULL, save.output = FALSE, output = c("out", "default"))
# S3 method for genLogistic
print(x, ...)
# S3 method for genLogistic
plot(x, plot = "lrStat", item = 1, itemFit = "best", pch = 8, number = TRUE,
  	col = "red", colIC = rep("black", length(x$focal.names)+1),
  	ltyIC = 1:(length(x$focal.names)+1), title = NULL, save.plot = FALSE, 
  	save.options = c("plot", "default", "pdf"), ref.name = NULL, ...)

Arguments

Data

numeric: either the data matrix only, or the data matrix plus the vector of group membership. See Details.

group

numeric or character: either the vector of group membership or the column indicator (within Data) of group membership. See Details.

focal.names

numeric or character vector indicating the levels of group which correspond to the focal groups.

anchor

either NULL (default) or a vector of item names (or identifiers) to specify the anchor items. Ignored if match is not "score". See Details.

match

specifies the type of matching criterion. Can be either "score" (default) to compute the test score, or any continuous or discrete variable with the same length as the number of rows of Data. See Details.

type

a character string specifying which DIF effects must be tested. Possible values are "both" (default), "udif" and "nudif". See Details.

criterion

character: the type of test statistic used to detect DIF items. Possible values are "LRT" (default) and "Wald". See Details.

alpha

numeric: significance level (default is 0.05).

purify

logical: should the method be used iteratively to purify the set of anchor items? (default is FALSE).

nrIter

numeric: the maximal number of iterations in the item purification process (default is 10).

p.adjust.method

either NULL (default) or the acronym of the method for p-value adjustment for multiple comparisons. See Details.

save.output

logical: should the output be saved into a text file? (Default is FALSE).

output

character: a vector of two components. The first component is the name of the output file, the second component is either the file path or "default" (default value). See Details.

x

the result from a genLogistic class object.

plot

character: the type of plot, either "lrStat" or "itemCurve". See Details.

item

numeric or character: either the number or the name of the item for which logistic curves are plotted. Use only when plot="itemCurve".

itemFit

character: the model to be selected for drawing the item curves. Possible values are "best" (default) for drawing from the best of the two models, and "null" for using fitted parameters of the null model \(M_0\). Not used if "plot" is "lrStat". See Details.

pch, col

the usual pch and col graphical parameters.

number

logical: should the item number identification be printed (default is TRUE).

colIC, ltyIC

vectors of elements of the usual col and lty arguments for logistic curves. Used only when plot="itemCurve".

title

either a character string with the title of the plot, or NULL (default), for which a specific title is automatically displayed.

save.plot

logical: should the plot be saved into a separate file? (default is FALSE).

save.options

character: a vector of three components. The first component is the name of the output file, the second component is either the file path or "default" (default value), and the third component is the file extension, either "pdf" (default) or "jpeg". See Details.

ref.name

either NULL (default) or a character string for the name of the reference group (to be used instead of "Reference" in the legend). Ignored if plot is "lrStat".

...

other generic parameters for the plot or the print functions.

Value

A list of class "genLogistic" with the following components:

genLogistik

the values of the generalized logistic regression statistics.

p.value

the vector of p-values for the generalized logistic regression statistics.

logitPar

a matrix with one row per item and \(2+J*2\) columns, holding the fitted parameters of the best model (among the two tested models) for each item.

parM0

the matrix of fitted parameters of the null model \(M_0\), as returned by the genLogistik command.

covMat

a 3-dimensional matrix of size p x p x K, where p is the number of estimated parameters and K is the number of items, holding the p x p covariance matrices of the estimated parameters (one matrix for each tested item).

deltaR2

the differences in Nagelkerke's \(R^2\) coefficients. See Details.

alpha

the value of alpha argument.

thr

the threshold (cut-score) for DIF detection.

DIFitems

either the column indicators for the items which were detected as DIF items, or "No DIF item detected".

type

the value of type argument.

p.adjust.method

the value of the p.adjust.method argument.

adjusted.p

either NULL or the vector of adjusted p-values for multiple comparisons.

purification

the value of purify option.

nrPur

the number of iterations in the item purification process. Returned only if purify is TRUE.

difPur

a binary matrix with one row per iteration in the item purification process and one column per item. Zeros and ones in the i-th row refer to items which were classified respectively as non-DIF and DIF items at the (i-1)-th step. The first row corresponds to the initial classification of the items. Returned only if purify is TRUE.

convergence

logical indicating whether the iterative item purification process stopped before the maximal number of nrIter allowed iterations. Returned only if purify is TRUE.

names

the names of the items.

anchor.names

the value of the anchor argument.

focal.names

the value of focal.names argument.

criterion

the value of the criterion argument.

save.output

the value of the save.output argument.

output

the value of the output argument.

Details

The generalized logistic regression method (Magis, Raiche, Beland and Gerard, 2011) detects both uniform and nonuniform differential item functioning among multiple groups without requiring an item response model. It consists of fitting a logistic model with the matching criterion, the group membership and the interaction between the two as covariates. The statistical significance of the parameters related to group membership and to the group-score interaction is then evaluated by a likelihood ratio test or a Wald test. The type argument selects which effects are tested: both uniform and nonuniform effects simultaneously (type="both"), the uniform DIF effect only (type="udif") or the nonuniform DIF effect only (type="nudif"). The test is chosen with the criterion argument, set to "LRT" (default) or "Wald". See genLogistik for further details.
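As a minimal sketch of the model comparison described above (not the package's internal code), the two nested models can be fitted for a single item with base R's glm on simulated data. The variable names (score, grp, y) and the simulated effect sizes are assumptions made purely for illustration.

```r
# Simulated data for one item: 1 reference group + J = 2 focal groups
set.seed(1)
n     <- 600
grp   <- factor(sample(c("Ref", "F1", "F2"), n, replace = TRUE),
                levels = c("Ref", "F1", "F2"))
score <- rnorm(n)                                  # stand-in matching criterion
eta   <- -0.2 + 1.0 * score + 0.8 * (grp == "F1")  # uniform DIF in group F1
y     <- rbinom(n, 1, plogis(eta))

# Null model M0 (no DIF) and full model M1 (group + group-by-score interaction)
m0 <- glm(y ~ score, family = binomial)
m1 <- glm(y ~ score * grp, family = binomial)

# Likelihood ratio statistic; testing both effects uses 2J degrees of freedom
J      <- nlevels(grp) - 1
lrStat <- deviance(m0) - deviance(m1)
pValue <- pchisq(lrStat, df = 2 * J, lower.tail = FALSE)
c(lrStat = lrStat, df = 2 * J, p.value = pValue)
```

Testing only the uniform or only the nonuniform effect would compare a different pair of nested models, each differing by J parameters.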

The matching criterion can be either the test score or any other continuous or discrete variable to be passed in the genLogistik function. This is specified by the match argument. By default, it takes the value "score" and the test score (i.e. raw score) is computed. The second option is to assign to match a vector of continuous or discrete numeric values, which acts as the matching criterion. Note that for consistency this vector should not belong to the Data matrix.

The Data is a matrix whose rows correspond to the subjects and columns to the items. In addition, Data can hold the vector of group membership. If so, group indicates the column of Data which corresponds to the group membership, either by specifying its name or by giving the column number. Otherwise, group must be a vector of the same length as nrow(Data).

Missing values are allowed for item responses (not for group membership) but must be coded as NA values. They are discarded from the fitting of the logistic models (see glm for further details).

The vector of group membership must hold at least three values, either as numeric or character. The focal groups are defined by the values of the argument focal.names. If there is a unique focal group, then difGenLogistic returns the output of difLogistic.

The threshold (or cut-score) for classifying items as DIF is computed as the quantile of the chi-squared distribution with lower-tail probability of one minus alpha and with J (if type="udif" or type="nudif") or 2J (if type="both") degrees of freedom (J is the number of focal groups).
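The threshold rule above can be reproduced with base R's qchisq. This is a small sketch; the value J = 3 is an assumption corresponding to a design with one reference group and three focal groups.

```r
alpha <- 0.05
J     <- 3   # number of focal groups (assumed for illustration)

# Quantile with lower-tail probability 1 - alpha
thr_single <- qchisq(1 - alpha, df = J)       # type = "udif" or type = "nudif"
thr_both   <- qchisq(1 - alpha, df = 2 * J)   # type = "both"

round(c(udif_or_nudif = thr_single, both = thr_both), 3)
```

An item is flagged as DIF when its statistic exceeds the threshold matching the chosen type.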

Item purification can be performed by setting purify to TRUE. Purification works as follows: if at least one item is detected as functioning differently at the first step of the process, then the data set of the next step consists of all items that are currently anchor (DIF free) items, plus the tested item (if necessary). The process stops either when two successive applications of the method yield the same classification of the items (Clauser and Mazor, 1998), or when nrIter iterations have been run without obtaining two successive identical classifications. In the latter case a warning message is printed.
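The stopping rule of the purification loop can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: detect is a hypothetical user-supplied function that takes the indices of the current anchor items and returns the indices of the items flagged as DIF, and the way iterations are counted here is also an assumption.

```r
# 'detect' is hypothetical: anchor item indices in, flagged item indices out
purify_sketch <- function(detect, nItems, nrIter = 10) {
  allItems <- seq_len(nItems)
  flagged  <- sort(detect(allItems))        # first step: all items act as anchors
  if (length(flagged) == 0)
    return(list(DIFitems = flagged, nrPur = 0, convergence = TRUE))
  for (iter in seq_len(nrIter)) {
    anchor     <- setdiff(allItems, flagged)   # items currently DIF free
    newFlagged <- sort(detect(anchor))
    if (identical(newFlagged, flagged))        # two identical classifications
      return(list(DIFitems = flagged, nrPur = iter, convergence = TRUE))
    flagged <- newFlagged
  }
  warning("no stable classification after nrIter iterations")
  list(DIFitems = flagged, nrPur = nrIter, convergence = FALSE)
}

# Toy detector (illustrative only): item 3 is flagged once item 2 has left
# the anchor set
toyDetect <- function(anchor) if (2 %in% anchor) 2L else c(2L, 3L)
res <- purify_sketch(toyDetect, nItems = 5)
res$DIFitems
```

In a real analysis the detector would recompute the matching score from the anchor items (plus the tested item) before refitting the models.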

Adjustment for multiple comparisons is possible with the argument p.adjust.method. The latter must be an acronym of one of the available adjustment methods of the p.adjust function. According to Kim and Oshima (2013), Holm and Benjamini-Hochberg adjustments (set respectively by "Holm" and "BH") perform best for DIF purposes. See p.adjust function for further details. Note that item purification is performed on original statistics and p-values; in case of adjustment for multiple comparisons this is performed after item purification.
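The effect of such an adjustment can be previewed directly with base R's p.adjust. The p-values below are hypothetical and serve only to illustrate the call; they are not results from any data set.

```r
# Hypothetical raw p-values for five items (illustrative only)
pval  <- c(item1 = 0.001, item2 = 0.010, item3 = 0.040,
           item4 = 0.300, item5 = 0.700)
alpha <- 0.05

adj_bh   <- p.adjust(pval, method = "BH")
# Note: base R's p.adjust spells this method in lowercase ("holm")
adj_holm <- p.adjust(pval, method = "holm")

# Items that remain significant after Benjamini-Hochberg adjustment
flagged_bh <- names(pval)[adj_bh < alpha]
rbind(raw = pval, BH = adj_bh, holm = adj_holm)
```

Adjusted p-values are never smaller than the raw ones, so adjustment can only reduce the set of flagged items.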

A pre-specified set of anchor items can be provided through the anchor argument. It must be a vector of either item names (which must match exactly the column names of Data argument) or integer values (specifying the column numbers for item identification). In case anchor items are provided, they are used to compute the test score (matching criterion), including also the tested item. None of the anchor items are tested for DIF: the output separates anchor items and tested items and DIF results are returned only for the latter. By default it is NULL so that no anchor item is specified. Note also that item purification is not activated when anchor items are provided (even if purify is set to TRUE). Moreover, if the match argument is not set to "score", anchor items will not be taken into account even if anchor is not NULL.

The measures of effect size are provided by the difference \(\Delta R^2\) between the \(R^2\) coefficients of the two nested models (Nagelkerke, 1991; Gomez-Benito, Dolores Hidalgo and Padilla, 2009). The effect sizes are classified as "negligible", "moderate" or "large". Two scales are available, one from Zumbo and Thomas (1997) and one from Jodoin and Gierl (2001). The output displays the \(\Delta R^2\) measures, together with the two classifications.
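A rough sketch of the \(\Delta R^2\) computation for one item is given below, using base R only and simulated data. The helper nagelkerke and all variable names are introduced here for illustration. The cut-offs (0.13 and 0.26 for Zumbo and Thomas, 0.035 and 0.070 for Jodoin and Gierl) are the values commonly associated with these scales and are an assumption to be checked against the cited references.

```r
# Nagelkerke's R^2 for a glm fit to 0/1 responses (for such data the
# deviance equals -2 * log-likelihood)
nagelkerke <- function(fit) {
  n <- length(fit$y)
  (1 - exp((fit$deviance - fit$null.deviance) / n)) /
    (1 - exp(-fit$null.deviance / n))
}

set.seed(2)
n     <- 500
grp   <- factor(sample(c("Ref", "F1", "F2"), n, replace = TRUE),
                levels = c("Ref", "F1", "F2"))
score <- rnorm(n)                                   # stand-in matching criterion
y     <- rbinom(n, 1, plogis(0.9 * score + 0.7 * (grp == "F2")))

m0 <- glm(y ~ score, family = binomial)             # null model
m1 <- glm(y ~ score * grp, family = binomial)       # model with DIF effects
deltaR2 <- nagelkerke(m1) - nagelkerke(m0)

# Assumed cut-offs for the two classification scales
labs <- c("negligible", "moderate", "large")
zt <- cut(deltaR2, c(0, 0.13, 0.26, 1), labels = labs, include.lowest = TRUE)
jg <- cut(deltaR2, c(0, 0.035, 0.070, 1), labels = labs, include.lowest = TRUE)
data.frame(deltaR2 = deltaR2, ZT = zt, JG = jg)
```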

The output of the difGenLogistic, as displayed by the print.genLogistic function, can be stored in a text file provided that save.output is set to TRUE (the default value FALSE does not execute the storage). In this case, the name of the text file must be given as a character string into the first component of the output argument (default name is "out"), and the path for saving the text file can be given through the second component of output. The default value is "default", meaning that the file will be saved in the current working directory. Any other path can be specified as a character string: see the Examples section for an illustration.

Two types of plots are available. The first one is obtained by setting plot="lrStat" and it is the default option. The likelihood ratio statistics are displayed on the Y axis, for each item. The detection threshold is displayed by a horizontal line, and items flagged as DIF are printed with the color defined by argument col. By default, items are spotted with their number identification (number=TRUE); otherwise they are simply drawn as dots whose form is given by the option pch.

The other type of plot is obtained by setting plot="itemCurve". In this case, the fitted logistic curves are displayed for one specific item set by the argument item. The latter argument can hold either the name of the item or its number identification. If the argument itemFit takes the value "best", the curves are drawn according to the output of the best model among \(M_0\) and \(M_1\). That is, two curves are drawn if the item is flagged as DIF, and only one if the item is flagged as non-DIF. If itemFit takes the value "null", then the two curves are drawn from the fitted parameters of the null model \(M_0\). See genLogistik for further details on the models. The colors and line types of these curves are defined by the arguments colIC and ltyIC respectively. These are set as vectors of length \(J+1\), the first element for the reference group and the others for the focal groups. Finally, the ref.name argument allows displaying the name of the reference group (instead of "Reference") in the legend.

Both types of plots can be stored in a figure file, in either PDF or JPEG format, by setting save.plot to TRUE. The figure is defined through the components of save.options. The first two components work in the same way as those of the output argument. The third component is the figure format, with allowed values "pdf" (default) for a PDF file and "jpeg" for a JPEG file.

References

Clauser, B.E. and Mazor, K.M. (1998). Using statistical procedures to identify differential item functioning test items. Educational Measurement: Issues and Practice, 17, 31-44.

Gomez-Benito, J., Dolores Hidalgo, M. and Padilla, J.-L. (2009). Efficacy of effect size measures in logistic regression: an application for detecting DIF. Methodology, 5, 18-25. 10.1027/1614-2241.5.1.18

Hidalgo, M. D. and Lopez-Pina, J.A. (2004). Differential item functioning detection and effect size: a comparison between logistic regression and Mantel-Haenszel procedures. Educational and Psychological Measurement, 64, 903-915. 10.1177/0013164403261769

Jodoin, M. G. and Gierl, M. J. (2001). Evaluating Type I error and power rates using an effect size measure with logistic regression procedure for DIF detection. Applied Measurement in Education, 14, 329-349. 10.1207/S15324818AME1404_2

Kim, J. and Oshima, T. C. (2013). Effect of multiple testing adjustment in differential item functioning detection. Educational and Psychological Measurement, 73, 458-470. 10.1177/0013164412467033

Magis, D., Beland, S., Tuerlinckx, F. and De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42, 847-862. 10.3758/BRM.42.3.847

Magis, D., Raiche, G., Beland, S. and Gerard, P. (2011). A logistic regression procedure to detect differential item functioning among multiple groups. International Journal of Testing, 11, 365-386. 10.1080/15305058.2011.602810

Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78, 691-692. 10.1093/biomet/78.3.691

Zumbo, B. D. and Thomas, D. R. (1997). A measure of effect size for a model-based approach for studying DIF. Prince George, Canada: University of Northern British Columbia, Edgeworth Laboratory for Quantitative Behavioral Science.

Examples

# NOT RUN {
 # Loading of the verbal data
 data(verbal)
 attach(verbal)

 # Creating four groups according to gender ("Man" or "Woman") and
 # trait anger score ("Low" or "High")
 group <- rep("WomanLow", nrow(verbal))
 group[Anger>20 & Gender==0] <- "WomanHigh"
 group[Anger<=20 & Gender==1] <- "ManLow"
 group[Anger>20 & Gender==1] <- "ManHigh"

 # New data set
 Verbal <- cbind(verbal[,1:24], group)

 # Reference group: "WomanLow"
 names <- c("WomanHigh", "ManLow", "ManHigh")

 # Testing both types of DIF effects
 # Three equivalent settings of the data matrix and the group membership
 r <- difGenLogistic(Verbal, group = 25, focal.names = names)
 difGenLogistic(Verbal, group = "group", focal.names = names)
 difGenLogistic(Verbal[,1:24], group = Verbal[,25], focal.names = names)

 # Using the Wald test
 difGenLogistic(Verbal, group = 25, focal.names = names, criterion = "Wald")

 # Multiple comparisons adjustment using Benjamini-Hochberg method
 difGenLogistic(Verbal, group = 25, focal.names = names, p.adjust.method = "BH")

 # With item purification
 difGenLogistic(Verbal, group = 25, focal.names = names, purify = TRUE)
 difGenLogistic(Verbal, group = 25, focal.names = names, purify = TRUE,
   nrIter = 5)

 # With items 1 to 5 set as anchor items
 difGenLogistic(Verbal, group = 25, focal.names = names, anchor = 1:5)

 # Testing for nonuniform DIF effect
 difGenLogistic(Verbal, group = 25, focal.names = names, type = "nudif")

 # Testing for uniform DIF effect
 difGenLogistic(Verbal, group = 25, focal.names = names, type = "udif")

 # Using the anger trait score as matching criterion
 anger <- verbal[,25]
 difGenLogistic(Verbal, group = 25, focal.names = names, match = anger)

 # Saving the output into the "GLresults.txt" file (and default path)
 r <- difGenLogistic(Verbal, group = 25, focal.names = names, 
                save.output = TRUE, output = c("GLresults","default"))

 # Graphical devices
 plot(r)
 plot(r, plot = "itemCurve", item = 1)
 plot(r, plot = "itemCurve", item = 1, itemFit = "best")
 plot(r, plot = "itemCurve", item = 6)
 plot(r, plot = "itemCurve", item = 6, itemFit = "best")

 # Plotting results and saving it in a PDF figure
 plot(r, save.plot = TRUE, save.options = c("plot", "default", "pdf"))

 # Changing the path, JPEG figure
 path <- "c:/Program Files/"
 plot(r, save.plot = TRUE, save.options = c("plot", path, "jpeg"))
# }