regDIF: Regularized Differential Item Functioning

This R package performs regularization of differential item functioning (DIF) parameters in item response theory (IRT) models using a penalized expectation-maximization algorithm.

Version 1.1.1 Features

regDIF can:

  • Handle multiple continuous and categorical DIF covariates;
  • Support binary, ordinal, and continuous item responses;
  • Use LASSO, ridge, MCP, elastic net, and group penalty functions for regularization;
  • Use proxy data in place of estimated latent variable scores, which makes estimation much faster.

Installation

To get the current released version from CRAN:

install.packages("regDIF")

To get the current development version from GitHub:

# install.packages("devtools")
devtools::install_github("wbelzak/regDIF")

Getting Started

A simulated data set, ida, with 6 binary item responses and 3 background variables (age, gender, study) is available in the regDIF package:

library(regDIF)
head(ida)
#>   item1 item2 item3 item4 item5 item6 age gender study
#> 1     0     0     0     0     0     0  -2     -1    -1
#> 2     0     0     0     0     0     0   0     -1    -1
#> 3     0     0     0     0     0     0   3     -1    -1
#> 4     0     1     1     1     1     1   1     -1    -1
#> 5     0     0     0     0     0     0  -2     -1    -1
#> 6     1     0     0     0     0     0   1     -1    -1

First, the item responses and predictor values are separately specified:

item.data <- ida[, 1:6]
pred.data <- ida[, 7:9]
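
Selecting columns by position works for ida, but it silently breaks if the column order changes. A minimal sketch of selecting by name instead, using a small hypothetical data frame (toy) that mimics the column layout shown by head(ida); the object names toy, toy.items, and toy.preds are illustrative only:

```r
# Hypothetical toy data with the same column layout as head(ida) above.
toy <- data.frame(
  item1 = c(0, 0, 1), item2 = c(0, 1, 0), item3 = c(0, 1, 0),
  item4 = c(0, 1, 0), item5 = c(0, 1, 0), item6 = c(0, 1, 0),
  age = c(-2, 1, 1), gender = c(-1, -1, -1), study = c(-1, -1, -1)
)

# Item columns are those whose names start with "item"; the rest are predictors.
is_item <- grepl("^item", names(toy))
toy.items <- toy[, is_item, drop = FALSE]
toy.preds <- toy[, !is_item, drop = FALSE]

# Sanity check before fitting: both pieces must describe the same rows.
stopifnot(nrow(toy.items) == nrow(toy.preds))
names(toy.items)
names(toy.preds)
```

The same pattern should apply to ida itself, assuming its item columns keep the item prefix.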

Second, the regDIF() function fits the model over a sequence of 10 tuning parameter values (num.tau = 10) using a penalized EM algorithm, which assumes that a normally distributed latent variable affects all item responses:

fit <- regDIF(item.data, pred.data, num.tau = 10)
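
To see what a sequence of tuning parameter values does, it can help to look at a LASSO-style update in isolation. The sketch below uses the soft-thresholding operator, a standard closed-form LASSO update for a single coefficient. It is not taken from regDIF's internals, and both the effect values and the grid of tau values are hypothetical:

```r
# Soft-thresholding: shrink z toward zero by tau, and set it to exactly zero
# when |z| <= tau.
soft_threshold <- function(z, tau) sign(z) * pmax(abs(z) - tau, 0)

# Hypothetical unpenalized DIF effects (not estimates from regDIF).
raw_effects <- c(0.60, -0.55, 0.20, -0.10, 0.03)

# Hypothetical grid of 10 tuning parameter values, largest first.
tau_grid <- seq(0.7, 0.07, length.out = 10)

# Number of effects left non-zero at each value of tau.
n_nonzero <- sapply(tau_grid, function(tau) {
  sum(soft_threshold(raw_effects, tau) != 0)
})
data.frame(tau = tau_grid, n_nonzero = n_nonzero)
```

Larger values of tau leave fewer non-zero effects, which is why fitting a whole sequence and then comparing the fits is useful.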

The DIF results are shown below:

summary(fit)
#> Call:
#> regDIF(item.data = item.data, pred.data = pred.data, num.tau = 10)
#> 
#> Optimal model (out of 10):
#>          tau          bic 
#>    0.1753246 4081.6941000 
#> 
#> Non-zero DIF effects:
#>    item4.int.age    item5.int.age item5.int.gender  item5.int.study 
#>           0.2153          -0.0897          -0.5717           0.6018 
#>  item4.slp.study item5.slp.gender 
#>          -0.0936          -0.1764
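
Labels such as item5.int.gender and item5.slp.gender can be read as covariate effects on an item's intercept and slope. The sketch below assumes a two-parameter logistic model in which a covariate shifts both parameters linearly, and that gender is coded -1/1 as the head(ida) output suggests. The baseline intercept and slope (int0, slp0) are hypothetical, since the summary does not report them; only the two DIF values come from the output above. The function dif_traceline is illustrative and not part of regDIF:

```r
# Probability of endorsing a binary item under a 2PL model in which a
# covariate x shifts both the intercept and the slope.
dif_traceline <- function(theta, x, int0, slp0, int_dif, slp_dif) {
  intercept <- int0 + int_dif * x
  slope <- slp0 + slp_dif * x
  plogis(intercept + slope * theta)
}

theta <- seq(-3, 3, by = 1)

# Baseline parameters are hypothetical; the DIF effects are the item5 gender
# values from the summary above.
p_neg <- dif_traceline(theta, x = -1, int0 = 0, slp0 = 1,
                       int_dif = -0.5717, slp_dif = -0.1764)
p_pos <- dif_traceline(theta, x = 1, int0 = 0, slp0 = 1,
                       int_dif = -0.5717, slp_dif = -0.1764)

round(data.frame(theta, gender_neg1 = p_neg, gender_pos1 = p_pos), 3)
```

Under these assumptions, the two groups have different endorsement probabilities at the same value of theta, which is what a non-zero DIF effect means.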

When estimation is slow, proxy data (here, the observed sum score of the items) may be supplied through prox.data in place of latent score estimation:

fit_proxy <- regDIF(item.data, pred.data, prox.data = rowSums(item.data))
summary(fit_proxy)
#> Call:
#> regDIF(item.data = item.data, pred.data = pred.data, prox.data = rowSums(item.data))
#> 
#> Optimal model (out of 100):
#>          tau          bic 
#>    0.2766486 3540.8070000 
#> 
#> Non-zero DIF effects:
#> item3.int.gender    item4.int.age item5.int.gender  item5.int.study 
#>           0.0955           0.2200          -0.5118           0.7040 
#> item2.slp.gender  item4.slp.study item5.slp.gender 
#>           0.1102          -0.1413          -0.1384
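
The idea behind a proxy score can be illustrated without regDIF: once an observed score stands in for the latent variable, each item can be modeled as a regression on observed quantities. The sketch below is an unpenalized logistic regression on simulated data, not regDIF's algorithm. All data and object names are hypothetical, and standardizing the sum score is a choice made for this sketch:

```r
set.seed(1)
n <- 500

# Simulated (hypothetical) data: a latent trait, a -1/1 group covariate,
# and six binary items; item 5 has intercept DIF by group.
theta <- rnorm(n)
group <- sample(c(-1, 1), n, replace = TRUE)
items <- sapply(1:6, function(j) {
  dif <- if (j == 5) -0.6 * group else 0
  rbinom(n, 1, plogis(theta + dif))
})
colnames(items) <- paste0("item", 1:6)

# Observed sum score, standardized, used as a stand-in for the latent trait.
proxy <- as.numeric(scale(rowSums(items)))

# Unpenalized logistic regression for item 5: the group main effect plays the
# role of intercept DIF, the proxy-by-group interaction the role of slope DIF.
fit5 <- glm(items[, "item5"] ~ proxy * group, family = binomial)
round(coef(fit5), 3)
```

In this setup no latent scores are estimated, which is one plausible reading of why the proxy route is faster.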

Other penalty functions (besides LASSO) may also be used. For instance, the elastic net penalty uses a second tuning parameter, alpha, to vary the ratio of LASSO to ridge penalties:

fit_proxy_net <- regDIF(item.data, pred.data, prox.data = rowSums(item.data), alpha = .5)
summary(fit_proxy_net)
#> Call:
#> regDIF(item.data = item.data, pred.data = pred.data, prox.data = rowSums(item.data), 
#>     alpha = 0.5)
#> 
#> Optimal model (out of 100):
#>          tau          bic 
#>    0.5685967 3563.7495000 
#> 
#> Non-zero DIF effects:
#> item3.int.gender    item4.int.age    item5.int.age item5.int.gender 
#>           0.0681           0.1672          -0.0939          -0.3463 
#>  item5.int.study item2.slp.gender  item4.slp.study item5.slp.gender 
#>           0.4346           0.0778          -0.1172          -0.1379
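
How alpha trades off the two penalties can be sketched with a small function. The form below is the common glmnet-style elastic net, in which alpha = 1 gives the LASSO penalty and alpha = 0 gives the ridge penalty. Whether regDIF scales the ridge term in exactly this way is an assumption, and elastic_net_penalty is an illustrative name, not a regDIF function:

```r
# Elastic net penalty for a vector of DIF effects b, written in the
# glmnet-style form: alpha = 1 gives LASSO, alpha = 0 gives ridge.
# regDIF's internal scaling may differ; this form is an assumption.
elastic_net_penalty <- function(b, tau, alpha) {
  tau * (alpha * sum(abs(b)) + (1 - alpha) / 2 * sum(b^2))
}

# Item 5 intercept DIF effects (age, gender, study) from the output above.
b <- c(-0.0939, -0.3463, 0.4346)

# Penalty value at the selected tau for three choices of alpha.
sapply(c(lasso = 1, mixed = 0.5, ridge = 0), function(a) {
  elastic_net_penalty(b, tau = 0.5685967, alpha = a)
})
```

Intermediate values of alpha blend the two: the absolute-value term can still set effects exactly to zero, while the squared term adds ridge-style shrinkage.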

Questions

Please send any questions to wbelzak@gmail.com.

Version: 1.1.1
License: MIT + file LICENSE
Maintainer: William Belzak
Last Published: February 23rd, 2024

Functions in regDIF (1.1.1)

  • d_categorical: Partial derivatives for ordinal items.
  • cumulative_traceline_pts_proxy: Ordinal tracelines using proxy data.
  • d_phi: Partial derivatives for mean impact equation.
  • d_sigma_gaussian: Partial derivatives for variance parameter of continuous items.
  • gaussian_traceline_pts_proxy: Continuous tracelines using proxy data.
  • d_mu_gaussian_proxy: Partial derivatives for mean parameter of continuous items with proxy data.
  • d_impact_block_proxy: Partial derivatives for mean and variance impact equation using observed score proxy.
  • d_sigma_gaussian_proxy: Partial derivatives for variance parameter of continuous items with proxy data.
  • ida: Simulated data example with multiple DIF covariates.
  • plot.regDIF: Plot function for regDIF function.
  • preprocess: Pre-process data.
  • d_phi_proxy: Partial derivatives for mean impact equation using proxy data.
  • d_impact_block: Partial derivatives for mean and variance impact equation.
  • d_mu_gaussian: Partial derivatives for mean parameter of continuous items.
  • information_criteria: Maximization step.
  • d_gaussian_itemblock_proxy: Partial derivatives for continuous items using proxy data.
  • print.regDIF: Print function for regDIF function.
  • d_categorical_proxy: Partial derivatives for ordinal items using proxy data.
  • em_estimation: Penalized expectation-maximization algorithm.
  • d_categorical_itemblock: Partial derivatives for ordinal items.
  • gaussian_traceline_pts: Continuous tracelines.
  • d_alpha: Partial derivatives for mean impact equation.
  • postprocess: Maximization step.
  • regDIF: Regularized Differential Item Functioning.
  • regDIF-package: Regularized differential item functioning for IRT and CFA models.
  • se.regDIF: Standard Errors for regDIF Model(s).
  • summary.regDIF: Summary function for regDIF function.
  • Mstep: Maximization step.
  • Mstep_simple: Maximization step.
  • d_bernoulli_itemblock_proxy: Partial derivatives for binary items by item-blocks using observed score proxy.
  • coef.regDIF: Coefficient function for regDIF function.
  • bernoulli_traceline_pts_proxy: Binary item tracelines for proxy scores.
  • Estep: Expectation step.
  • Mstep_cd: Maximization step using coordinate descent optimization.
  • Mstep_cd2: Maximization step using coordinate descent optimization.
  • d_bernoulli_itemblock: Partial derivatives for binary items by item-blocks.
  • bernoulli_traceline_pts2: Binary item tracelines.
  • d_alpha_proxy: Partial derivatives for mean impact equation using proxy data.
  • Estep_proxy: Expectation step with proxy data.
  • Mstep_block: Maximization step using latent variable and item response blocks.
  • bernoulli_traceline_pts: Binary item tracelines.
  • d_bernoulli: Partial derivatives for binary items.
  • d_gaussian_itemblock: Partial derivatives for continuous items.
  • cumulative_traceline_pts: Ordinal tracelines.
  • d_bernoulli_proxy: Partial derivatives for binary items with proxy data.