
difR (version 6.1.0)

lassoDIF.CV: Detection of Differential Item Functioning Using the Lasso Approach: Selection of Optimal \(\lambda\) via Cross-Validation

Description

Performs DIF detection using a lasso-penalized logistic regression model for dichotomous items and selects the optimal penalty parameter \(\lambda\) via cross-validation.

Usage

lassoDIF.CV(Data, group, nfold = 5, lambda = NULL, ...)

Value

A list with the following components:

DIFitems

Indices of items flagged as exhibiting DIF.

DIFpars

Matrix of estimated DIF parameters for each item.

crit.value

Cross-validation criterion values (deviance) across the \(\lambda\) path.

crit.type

The type of criterion used, here "cv".

lambda

Vector of \(\lambda\) values considered.

opt.lambda

The optimal \(\lambda\) value selected via cross-validation.

glmnet.fit

Fitted glmnet model object.
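As a rough sketch of how these components relate to one another, the helper below takes a list shaped like the return value and reports the flagged items together with the \(\lambda\) at which the criterion is smallest. The helper name summarise_lasso_cv and the mock object are hypothetical, and the code assumes that crit.value and lambda have the same length and ordering, which this page does not state explicitly.

```r
# Hypothetical helper: not part of difR.
summarise_lasso_cv <- function(res) {
  # Assumption: crit.value[k] is the CV criterion obtained with lambda[k].
  stopifnot(length(res$crit.value) == length(res$lambda))
  k_min <- which.min(res$crit.value)
  # Assumption: DIFitems holds numeric item indices; anything else
  # (e.g. a text message when nothing is flagged) is treated as no flags.
  flagged <- if (is.numeric(res$DIFitems)) res$DIFitems else integer(0)
  list(
    n_flagged     = length(flagged),
    flagged       = flagged,
    lambda_at_min = res$lambda[k_min],
    matches_opt   = isTRUE(all.equal(res$lambda[k_min], res$opt.lambda))
  )
}

# Mock object with made-up numbers, only to show the expected shape.
mock <- list(
  DIFitems   = c(1L, 3L),
  crit.value = c(1.30, 1.21, 1.18, 1.25),
  lambda     = c(0.10, 0.05, 0.02, 0.01),
  opt.lambda = 0.02,
  crit.type  = "cv"
)
summ <- summarise_lasso_cv(mock)
```

With a real fit, the same helper could be called as summarise_lasso_cv(cv.res); if matches_opt is FALSE, the assumption about ordering probably does not hold for that object.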

Arguments

...

Additional arguments passed to internal methods.

Data

A numeric data frame or matrix: either only the item responses or the item responses with a group membership column.

group

A numeric or character vector: either a vector of group membership or a column index/name indicating group membership in Data.

nfold

Integer: the number of folds used in cross-validation. Default is 5.

lambda

Optional numeric vector of \(\lambda\) values to be used in the penalization path. If NULL, a default sequence is used.
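A custom grid for the lambda argument could be built as in the sketch below. The bounds 0.5 and 0.001 and the grid size are hypothetical and would need tuning for real data. The grid is log-spaced and decreasing because the glmnet documentation recommends supplying \(\lambda\) values as a decreasing sequence; whether lassoDIF.CV reorders the values itself is not stated on this page.

```r
# Hypothetical bounds and grid size; suitable values depend on the data.
lambda_max  <- 0.5
lambda_min  <- 0.001
lambda_grid <- exp(seq(log(lambda_max), log(lambda_min), length.out = 50))

# The grid would then be passed as:
#   lassoDIF.CV(Dat, Member, nfold = 5, lambda = lambda_grid)
```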

Author

David Magis
Data science consultant at IQVIA Belux
Brussels, Belgium
Carl F. Falk
Department of Psychology
McGill University (Canada)
carl.falk@mcgill.ca, https://www.mcgill.ca/psychology/carl-f-falk
Sebastien Beland
Faculte des sciences de l'education
Universite de Montreal (Canada)
sebastien.beland@umontreal.ca

Details

This function detects uniform differential item functioning (DIF) using a lasso-penalized logistic regression model and selects the penalty parameter \(\lambda^*\) that minimizes cross-validation error. For this selected value, the function returns the estimated DIF parameters for all items and flags those with non-zero DIF effects.
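For orientation, one common way to write a lasso-penalized logistic model for uniform DIF is sketched below. The notation is assumed for illustration and is not taken from this page: \(Y_{ij}\) is the response of person \(i\) to item \(j\), \(\eta_i\) is a person-level ability or matching term, \(\beta_j\) is the item effect, \(g_i\) is the group indicator, \(\gamma_j\) is the DIF effect of item \(j\), and \(\ell\) is the log-likelihood. See the reference for the exact formulation used by the method.

\[
\operatorname{logit} P(Y_{ij} = 1) = \eta_i + \beta_j + \gamma_j \, g_i,
\qquad
(\hat{\eta}, \hat{\beta}, \hat{\gamma}) = \arg\min_{\eta, \beta, \gamma} \Big\{ -\ell(\eta, \beta, \gamma) + \lambda \sum_{j} |\gamma_j| \Big\}
\]

Under this form, a larger \(\lambda\) shrinks more of the \(\gamma_j\) to exactly zero, and the items with \(\hat{\gamma}_j \neq 0\) at \(\lambda^*\) are the ones flagged.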

Note: The performance of the method depends on choices such as the number of folds and the grid of \(\lambda\) values. We strongly recommend testing different configurations to assess the robustness of the results before interpretation.
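One way to carry out such a robustness check is sketched below: the detection is repeated over several fold counts and seeds, and the proportion of runs in which each item is flagged is tabulated. The helper flag_stability and the stand-in detector stub_fit are hypothetical and not part of difR. The sketch assumes that Data contains only item responses and that DIFitems holds numeric item indices.

```r
# Hypothetical helper: not part of difR. `fit_fun` is any function callable as
# fit_fun(Data, group, nfold = ...) that returns a list with a DIFitems
# component, for example lassoDIF.CV.
flag_stability <- function(fit_fun, Data, group,
                           nfolds = c(3, 5, 10), seeds = 1:5) {
  n_items <- ncol(Data)  # assumes Data holds item responses only
  counts  <- integer(n_items)
  runs    <- 0L
  for (k in nfolds) {
    for (s in seeds) {
      set.seed(s)  # fold assignment is random, so fix the seed per run
      res <- fit_fun(Data, group, nfold = k)
      # Assumption: DIFitems holds numeric item indices; anything else
      # (e.g. a text message when nothing is flagged) counts as no flags.
      idx <- if (is.numeric(res$DIFitems)) res$DIFitems else integer(0)
      counts[idx] <- counts[idx] + 1L
      runs <- runs + 1L
    }
  }
  data.frame(item = seq_len(n_items), prop_flagged = counts / runs)
}

# Stand-in detector so the sketch runs without difR: it always flags item 1
# and flags item 3 only when nfold is 5.
stub_fit <- function(Data, group, nfold = 5) {
  list(DIFitems = if (nfold == 5) c(1, 3) else 1)
}
toy  <- matrix(0, nrow = 10, ncol = 4)
stab <- flag_stability(stub_fit, toy, group = rep(0:1, 5))
```

With difR loaded, the call would be flag_stability(lassoDIF.CV, Dat, Member). Items flagged in nearly every run are the more trustworthy detections, while items flagged only for some seeds or fold counts deserve caution.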

References

Magis, D., Tuerlinckx, F., & De Boeck, P. (2015). Detection of Differential Item Functioning Using the Lasso Approach. Journal of Educational and Behavioral Statistics, 40(2), 111–135. https://doi.org/10.3102/1076998614559747

Examples

if (FALSE) {

# With the Verbal data set

data(verbal)

Dat    <- verbal[, 1:20]
Member <- verbal[, 26]

# Using cross-validation
set.seed(1234) 

cv.res <- lassoDIF.CV(Dat, Member, nfold=5)
cv.res

# With simulated data

set.seed(1234)               # make the simulated data reproducible

It     <- 15                 # number of items
ItDIFa <- NULL               # no items with non-uniform DIF
ItDIFb <- c(1, 3)            # items with uniform DIF
NR     <- 100                # number of respondents in group 1 (reference)
NF     <- 100                # number of respondents in group 2 (focal)
a      <- rep(1, It)         # for tests: runif(It, 0.2, 0.5)
b      <- rnorm(It, 1, 0.5)
Gb     <- rep(2, 2)          # group value for U-DIF, one per DIF item
Ga     <- 0                  # group value for NU-DIF: must be fixed at 0 for U-DIF
Out1   <- SimDichoDif(It, ItDIFa, ItDIFb, NR, NF, a, b, Ga, Gb)
Dat    <- Out1$data[, 1:15]  # item responses
Member <- Out1$data[, 16]    # group membership

set.seed(1234) # results appear to be sensitive to the random number seed

cv.res <- lassoDIF.CV(Dat, Member, nfold=5)
cv.res

 }
 
