difTID: Transformed Item Difficulties (TID) DIF method

Description

Performs DIF detection using the Transformed Item Difficulties (TID) method.

Usage

difTID(Data, group, focal.name, thrTID = 1.5, purify = FALSE, purType = "IPP1",
  nrIter = 10, alpha = 0.05, extreme = "constraint",
  const.range = c(0.001, 0.999), nrAdd = 1, save.output = FALSE,
  output = c("out", "default"))
# S3 method for TID
print(x, only.final = TRUE, ...)
# S3 method for TID
plot(x, plot = "dist", pch = 2, pch.mult = 17, axis.draw = TRUE,
  thr.draw = FALSE, dif.draw = c(1, 3), print.corr = FALSE, xlim = NULL,
  ylim = NULL, xlab = NULL, ylab = NULL, main = NULL, col = "red",
  number = TRUE, save.plot = FALSE, save.options = c("plot",
  "default", "pdf"), ...)

Arguments

Data

numeric: either the data matrix only, or the data matrix plus the vector of group membership. See Details.

group

numeric or character: either the vector of group membership or the column indicator (within Data) of group membership. See Details.

focal.name

numeric or character indicating the level of group which corresponds to the focal group.

thrTID

either the numeric threshold for detecting DIF items (default is 1.5) or "norm" to derive the threshold by normal approximation. See Details.

purify

logical: should the method be used iteratively to purify the set of anchor items? (default is FALSE).

purType

character: the type of purification process to be run. Possible values are "IPP1" (default), "IPP2" and "IPP3". Ignored if purify is FALSE. See Details.

nrIter

numeric: the maximal number of iterations in the item purification process (default is 10).

alpha

numeric: the significance level for calculating the detection threshold (default is 0.05). Ignored if thrTID is numeric.

extreme

character: the method used to modify the extreme proportions. Possible values are "constraint" (default) or "add". See Details.

const.range

numeric: a vector of two constraining proportions. Default values are 0.001 and 0.999. Ignored if extreme is "add".

nrAdd

integer: the number of successes and the number of failures to add to the data in order to adjust the proportions. Default value is 1. Ignored if extreme is "constraint".

save.output

logical: should the output be saved into a text file? (Default is FALSE).

output

character: a vector of two components. The first component is the name of the output file, the second component is either the file path or "default" (default value). See Details.

x

the result from a TID class object.

only.final

logical: should only the first and last steps of the purification process be printed? (default is TRUE). If FALSE, all perpendicular distances, parameters of the major axis, and detection thresholds are printed as well. Ignored if purify is FALSE.

plot

character: either "dist" (default) to display the perpendicular distances, or "delta" for the Delta plot. See Details.

pch

integer: the usual point character type for point display. Default value is 2, that is, Delta points are drawn as empty triangles.

pch.mult

integer: the type of point to be superposed onto Delta points that correspond to several items. Default value is 17, that is, full black triangles are drawn onto existing Delta points where multiple items are located.

axis.draw

logical: should the major axis be drawn? (default is TRUE). If so, it will be drawn as a solid line.

thr.draw

logical: should the upper and lower bounds for DIF detection be drawn? (default is FALSE). If TRUE, they will be drawn as dashed lines.

dif.draw

numeric: a vector of two integer values to specify how the DIF items should be displayed. The first component of dif.draw is the type of point (i.e. the usual pch argument) and the second component determines the point size (i.e. the usual cex argument). Default values are 1 and 3, meaning that empty circles of three times the usual size are drawn around the Delta points of items flagged as DIF.

print.corr

logical: should the sample correlation of Delta scores be printed? (default is FALSE). If TRUE, it is printed in the upper-left corner of the plot.

xlim, ylim, xlab, ylab, main

either the usual plot arguments xlim, ylim, xlab, ylab and main, or NULL (default value for all arguments). If NULL, the X and Y axis limits are computed from the range of Delta scores, the X and Y axis labels are "Reference group" and "Focal group" respectively, and no main title is produced.

col

character: the color type for the items. Used only when plot is "dist".

number

logical: should the item number identification be printed? (default is TRUE).

save.plot

logical: should the plot be saved into a separate file? (default is FALSE).

save.options

character: a vector of three components. The first component is the name of the output file, the second component is either the file path or "default" (default value), and the third component is the file extension, either "pdf" (default) or "jpeg". See Details.

...

other generic parameters for the plot or the print functions.

Value

A list of class "TID" with the following elements:

Props

the matrix of proportions of correct responses, or NA if type is "delta".

adjProps

the restricted proportions, in the same format as the output Props matrix, or NA if type is "delta".

Deltas

the matrix of Delta scores.

Dist

a matrix with perpendicular distances, one row per item and one column per run of the Delta plot. If purify is FALSE, only a single column is returned.

axis.par

a matrix with two columns, holding respectively the intercepts and the slope parameters of the major axis. Each row refers to one step of the purification process. If purify is FALSE, only a single row is returned.

nrIter

the number of iterations involved in the purification process. Returned only if purify is TRUE.

maxIter

the maximal number of iterations allowed, as set by the nrIter argument. Returned only if purify is TRUE.

convergence

a logical value indicating whether convergence was reached in the purification process. Returned only if purify is TRUE.

difPur

a matrix with one column per item and one row per iteration in the purification process, holding zeros and ones to indicate which items were flagged as DIF or not at each step of the process. Returned only if purify is TRUE.

thr

a vector of successive threshold values used during the purification process. If purify is FALSE, a single value is returned.

rule

a character value indicating whether the threshold was "fixed" by the user (i.e. by setting thrTID to a numeric value) or whether it was computed by normal approximation (i.e. by setting thrTID to "norm").

purType

the value of the purType argument. Returned only if purify is TRUE.

DIFitems

either "No DIF item detected" or an integer vector with the items that were flagged as DIF.

adjust.extreme

the value of the extreme argument.

const.range

the value of the const.range argument.

nrAdd

the value of the nrAdd argument.

purify

the value of the purify argument.

alpha

the value of the alpha argument.

save.output

the value of the save.output argument.

output

the value of the output argument.

names

either the names of the items (defined by the column names of the Data matrix) or the series of integers from one to the number of items.

number

a boolean value, being TRUE if the item names are simply their number in the Data matrix, or FALSE if real item names are available in the names element.

Details

The Transformed Item Difficulties (TID) method, also known as Angoff's Delta method (Angoff, 1982; Angoff and Ford, 1973), allows for detecting uniform differential item functioning without requiring an item response model approach. The present implementation relies on the deltaPlot and diagPlot functions from the package deltaPlotR (Magis and Facon, 2014).
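As a rough base-R illustration of the quantities the method works with (a sketch, not the package's internal code), the example below computes Delta scores, the major axis and the signed perpendicular distances. It assumes the usual Delta transformation $\Delta = 4\,\Phi^{-1}(1-p) + 13$ and the principal-axis formulas described in Magis and Facon (2014). The proportions and the helper names delta_scores, major_axis and perp_dist are made up for the example.

```r
# Hypothetical proportions of correct responses for five items
p_ref <- c(0.80, 0.65, 0.50, 0.35, 0.20)   # reference group
p_foc <- c(0.75, 0.60, 0.30, 0.30, 0.15)   # focal group

# Delta transformation (assumed form): harder items get larger Delta scores
delta_scores <- function(p) 4 * qnorm(1 - p) + 13

# Major (principal) axis of the scatter of Delta points
major_axis <- function(d0, d1) {
  v0  <- var(d0)        # sample variance, reference group
  v1  <- var(d1)        # sample variance, focal group
  s01 <- cov(d0, d1)    # sample covariance
  b <- (v1 - v0 + sqrt((v1 - v0)^2 + 4 * s01^2)) / (2 * s01)
  a <- mean(d1) - b * mean(d0)
  c(a = a, b = b)
}

# Signed perpendicular distance of each Delta point to the major axis
perp_dist <- function(d0, d1, axis) {
  (axis[["b"]] * d0 + axis[["a"]] - d1) / sqrt(axis[["b"]]^2 + 1)
}

d0 <- delta_scores(p_ref)
d1 <- delta_scores(p_foc)
ax <- major_axis(d0, d1)
D  <- perp_dist(d0, d1, ax)
round(D, 3)
```

Items whose absolute distance exceeds the detection threshold would be flagged as DIF. In this toy data set the third item lies farthest from the axis.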

The Data is a matrix whose rows correspond to the subjects and columns to the items. In addition, Data can hold the vector of group membership. If so, group indicates the column of Data which corresponds to the group membership, either by specifying its name or by giving the column number. Otherwise, group must be a vector of same length as nrow(Data).

Missing values are allowed for item responses (not for group membership) but must be coded as NA values. They are discarded from the computation of proportions of success.
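A minimal base-R sketch of what discarding missing responses means for the proportions of success, assuming that NA values are dropped item by item rather than by removing whole respondents (this is an assumption about the behavior, and the small response matrix is invented):

```r
# Hypothetical 4 x 3 response matrix; missing answers are coded as NA
resp <- matrix(c( 1, 0, NA,
                  1, 1,  0,
                 NA, 1,  0,
                  0, 1,  1),
               nrow = 4, byrow = TRUE,
               dimnames = list(NULL, c("I1", "I2", "I3")))

# Proportion of success per item, ignoring the NA entries of that item only
props <- colMeans(resp, na.rm = TRUE)
props
```

Each item's proportion is then based on the respondents who actually answered it, so the denominators can differ from item to item.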

The vector of group membership must hold only two different values, either as numeric or character. The focal group is defined by the value of the argument focal.name.

The threshold for flagging items as DIF can be of two types and is specified by the thrTID argument.

It can be fixed to some arbitrary positive value by the user, for instance 1.5 (Angoff and Ford, 1973), which is the default value. In this case, thrTID takes the required numeric threshold value.
Alternatively, it can be derived from the bivariate normal approximation of the Delta points (Magis and Facon, 2012). In this case, thrTID must be given the character value "norm". This threshold equals $$\Phi^{-1}(1-\alpha/2) \; \sqrt{\frac{b^2\,{s_0}^2-2\,b\,s_{01}+{s_1}^2}{b^2+1}}$$ where $\Phi$ is the cumulative distribution function of the standard normal distribution (so that $\Phi^{-1}$ is its quantile function), $\alpha$ is the significance level (set by the argument alpha with default value 0.05), $b$ is the slope parameter of the major axis, $s_0$ and $s_1$ are the sample standard deviations of the Delta scores in the reference group and the focal group, respectively, and $s_{01}$ is the sample covariance of the Delta scores (see Magis and Facon, 2012, for further details).
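The normal-approximation threshold can be evaluated directly from the formula above. The following base-R sketch does so for made-up summary statistics; tid_threshold is a hypothetical helper name, not a function of the package.

```r
# Normal-approximation threshold from the formula above
# b: slope of the major axis; s0, s1: sample standard deviations of the
# Delta scores (reference, focal); s01: sample covariance of the Delta scores
tid_threshold <- function(b, s0, s1, s01, alpha = 0.05) {
  qnorm(1 - alpha / 2) *
    sqrt((b^2 * s0^2 - 2 * b * s01 + s1^2) / (b^2 + 1))
}

# Made-up values for illustration
thr_05 <- tid_threshold(b = 1, s0 = 2, s1 = 2, s01 = 3.5)
thr_01 <- tid_threshold(b = 1, s0 = 2, s1 = 2, s01 = 3.5, alpha = 0.01)
c(alpha_0.05 = thr_05, alpha_0.01 = thr_01)
```

A smaller significance level gives a larger threshold, so fewer items are flagged.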

Item purification can be performed by setting the argument purify to TRUE (by default it is FALSE so no purification is performed). The item purification process (IPP) starts when at least one item was flagged as DIF after the first run of the Delta plot, and proceeds as follows.

1. The intercept and slope parameters of the major axis are re-calculated after removing all items that are currently flagged as DIF. This yields updated values $a^*$, $b^*$, $s_0^*$, $s_1^*$ and $s_{01}^*$ of the intercept and slope parameters, sample standard deviations and sample covariance of the Delta scores.
2. Perpendicular distances (for all items) are updated with respect to the updated major axis.
3. The detection threshold is also updated. Three updates are possible: see below.
4. All items are now tested for the presence of DIF, given the updated perpendicular distances and major axis.
5. If the set of items flagged as DIF is the same as the one from the previous loop, stop the process. Otherwise go back to step 1.
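The loop described above can be sketched in base R as follows. This is only an illustration under simplifying assumptions: the threshold is fixed (so the threshold update reduces to keeping it unchanged), the Delta scores are invented, and purify_fixed and major_axis are hypothetical helper names rather than package functions. The sketch also assumes that at least two items always remain unflagged.

```r
# Major axis (intercept a, slope b) from the Delta points of the anchor items
major_axis <- function(d0, d1) {
  v0 <- var(d0); v1 <- var(d1); s01 <- cov(d0, d1)
  b <- (v1 - v0 + sqrt((v1 - v0)^2 + 4 * s01^2)) / (2 * s01)
  c(a = mean(d1) - b * mean(d0), b = b)
}

purify_fixed <- function(d0, d1, thr = 1.5, max_iter = 10) {
  # Fit the axis on the anchor items, then test ALL items against it
  flag_with <- function(anchor) {
    ax <- major_axis(d0[anchor], d1[anchor])
    D  <- (ax[["b"]] * d0 + ax[["a"]] - d1) / sqrt(ax[["b"]]^2 + 1)
    abs(D) > thr
  }
  flagged   <- flag_with(rep(TRUE, length(d0)))  # first run, all items
  converged <- !any(flagged)                     # nothing to purify
  iter <- 0
  while (!converged && iter < max_iter) {
    iter        <- iter + 1
    new_flagged <- flag_with(!flagged)           # re-fit without flagged items
    converged   <- identical(new_flagged, flagged)
    flagged     <- new_flagged
  }
  list(DIFitems = which(flagged), nrIter = iter, convergence = converged)
}

# Invented Delta scores: item 4 is much harder in the focal group
d0 <- c(9, 10, 11, 12, 13, 14, 15, 16)
d1 <- c(9.2, 9.9, 11.1, 15.0, 13.1, 13.9, 15.2, 15.8)
res <- purify_fixed(d0, d1, thr = 1.5)
res
```

The max_iter cap plays the role of the maximal number of iterations: if the set of flagged items keeps changing, the loop stops anyway and convergence is reported as FALSE.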

Unlike traditional DIF methods, the detection threshold may also be updated since it depends on the sample estimates (when the normal approximation is considered). Three approaches are currently implemented and are specified by the purType argument.

Method 1 (purType=="IPP1"): the same threshold is used throughout the purification process; it is not iteratively updated. The threshold is the one obtained after the first run of the Delta plot.
Method 2 (purType=="IPP2"): only the slope parameter is updated in the threshold formula. In this way, one keeps the full data structure (i.e. neither the sample variances nor the sample covariance of the Delta scores are modified) and only the slope parameter is adjusted to lessen the impact of DIF items.
Method 3 (purType=="IPP3"): all adjusted parameters are plugged in the threshold formula. This approach completely discards the effect of items flagged as DIF from the computation of the threshold.
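To make the difference between the three methods concrete, the sketch below evaluates the normal-approximation threshold under each of them. It assumes two sets of summary statistics are available: `full` (computed on all items at the first run) and `pur` (re-computed without the currently flagged items). All numbers and the helper names thr_formula and ipp_threshold are hypothetical.

```r
# Normal-approximation threshold for given slope and sample statistics
thr_formula <- function(b, s0, s1, s01, alpha = 0.05) {
  qnorm(1 - alpha / 2) *
    sqrt((b^2 * s0^2 - 2 * b * s01 + s1^2) / (b^2 + 1))
}

# Threshold used at a purification step, depending on the purification type
ipp_threshold <- function(type = c("IPP1", "IPP2", "IPP3"),
                          full, pur, alpha = 0.05) {
  type <- match.arg(type)
  switch(type,
    # IPP1: threshold of the first run, never updated
    IPP1 = thr_formula(full$b, full$s0, full$s1, full$s01, alpha),
    # IPP2: updated slope, original standard deviations and covariance
    IPP2 = thr_formula(pur$b, full$s0, full$s1, full$s01, alpha),
    # IPP3: all quantities re-computed without the flagged items
    IPP3 = thr_formula(pur$b, pur$s0, pur$s1, pur$s01, alpha))
}

# Made-up statistics before and after removing the flagged items
full <- list(b = 1.04, s0 = 2.45, s1 = 2.54, s01 = 5.66)
pur  <- list(b = 0.98, s0 = 2.64, s1 = 2.58, s01 = 6.50)

thr_all <- sapply(c("IPP1", "IPP2", "IPP3"), ipp_threshold,
                  full = full, pur = pur)
thr_all
```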

See Magis and Facon (2013) for further details. Note that purification can also be performed with a fixed threshold (i.e. specified by the user), but then only the IPP1 process is performed.

In order to avoid possible infinite loops in the purification process, a maximal number of iterations must be specified through the argument nrIter. The default maximal number of iterations is 10.

The output contains all input information, the Delta scores and perpendicular distances, the parameters of the major axis and the items flagged as DIF (if none, a character string is returned instead). In addition, the detection threshold and the type of threshold (fixed or normal approximation) are provided.

If item purification was run, several additional elements are returned: the number of iterations, a logical indicator whether the convergence was reached (or not, meaning that the process stopped because of reaching the maximal number of allowed iterations), a matrix with indicators of which items were flagged as DIF at each iteration, and the type of item purification process. Moreover, perpendicular distances are returned in a matrix format (one column per iteration), as well as successive major axis parameters (one row per iteration) and successive thresholds (as a vector).

The output is managed and printed in a more user-friendly way. When item purification is performed, only the first and last steps are displayed. Specifying the argument only.final to FALSE prints in addition all intermediate steps of the process (successive perpendicular distances, parameters of the major axis, and detection thresholds).

The output of difTID, as displayed by the print.TID function, can be stored in a text file provided that save.output is set to TRUE (the default value FALSE does not execute the storage). In this case, the name of the text file must be given as a character string in the first component of the output argument (default name is "out"), and the path for saving the text file can be given through the second component of output. The default value is "default", meaning that the file will be saved in the current working directory. Any other path can be specified as a character string: see the Examples section for an illustration.

Two types of plots are available through the plot.TID function. If the argument plot is set to "dist" (the default value), then the perpendicular distances are represented on the Y axis of a scatter plot, with each item on the X axis. If plot is set to "delta", the Delta plot is returned. For the latter, the specific options are documented in the diagPlot function. The plot can also be stored in a figure file, either in PDF or JPEG format, by setting save.plot to TRUE. The figure is defined through the components of save.options. The first two components work in the same way as those of the output argument. The third component is the figure format, with allowed values "pdf" (default) for a PDF file and "jpeg" for a JPEG file.

References

Angoff, W. H. (1982). Use of difficulty and discrimination indices for detecting item bias. In R. A. Berk (Ed.), Handbook of methods for detecting test bias (pp. 96-116). Baltimore, MD: Johns Hopkins University Press.

Angoff, W. H., and Ford, S. F. (1973). Item-race interaction on a test of scholastic aptitude. Journal of Educational Measurement, 10(2), 95-106. 10.1111/j.1745-3984.1973.tb00787.x

Magis, D., and Facon, B. (2012). Angoff's Delta method revisited: improving the DIF detection under small samples. British Journal of Mathematical and Statistical Psychology, 65, 302-321. 10.1111/j.2044-8317.2011.02025.x

Magis, D., and Facon, B. (2013). Item purification does not always improve DIF detection: a counter-example with Angoff's Delta plot. Educational and Psychological Measurement, 73, 293-311. 10.1177/0013164412451903

Magis, D., and Facon, B. (2014). deltaPlotR: An R Package for Differential Item Functioning Analysis with Angoff's Delta Plot. Journal of Statistical Software, Code Snippets, 59(1), 1-19. 10.18637/jss.v059.c01

Examples

# NOT RUN {
 # Loading of the verbal data
 data(verbal)

 # Excluding the "Anger" variable
 verbal <- verbal[colnames(verbal) != "Anger"]

 # Three equivalent settings of the data matrix and the group membership
 r <- difTID(verbal, group = 25, focal.name = 1)
 difTID(verbal, group = "Gender", focal.name = 1)
 difTID(verbal[,1:24], group = verbal[,25], focal.name = 1)

 # With item purification and threshold 1
 r2 <- difTID(verbal, group = "Gender", focal.name = 1, purify = TRUE, thrTID = 1)

 # Saving the output into the "TIDresults.txt" file (and default path)
 difTID(verbal, group = 25, focal.name = 1, save.output = TRUE, 
   output = c("TIDresults", "default"))

 # Graphical devices
 plot(r2)
 plot(r2, plot = "delta")

 # Plotting results and saving it in a PDF figure
 plot(r2, save.plot = TRUE, save.options = c("plot", "default", "pdf"))

 # Changing the path, JPEG figure
 path <- "c:/Program Files/"
 plot(r2, save.plot = TRUE, save.options = c("plot", path, "jpeg"))
# }