grmtree (version 0.1.0)

grmforest: Fit a Forest of Graded Response Model Trees for Ensemble-Based DIF Detection

Description

This function implements a forest of graded response model trees (GRM Forest) using bootstrap aggregation (bagging) or random subsampling to enhance the detection and analysis of differential item functioning (DIF) in polytomous items. The GRM Forest approach combines the strengths of multiple GRMTrees to provide more robust and stable DIF detection, particularly for complex datasets with high-dimensional covariates or subtle DIF patterns.

Usage

grmforest(formula, data, control = grmforest.control(), ...)

Value

An object of class grmforest containing:

trees

List of fitted GRM trees

oob_samples

List of out-of-bag samples for each tree

formula

The model formula

data

The original dataset

call

The function call

Arguments

formula

A formula specifying the model structure with the response matrix on the left and partitioning variables on the right (e.g., response_matrix ~ age + gender + education + clinical_variables).

data

A data frame containing the response matrix and partitioning variables. The response matrix should contain polytomous items coded as ordered factors.

control

A control object created by grmforest.control().

...

Additional arguments passed to the underlying grmtree() function.

Details

The algorithm fits multiple GRMTrees, each on a random sample of the original data drawn by bootstrap sampling or subsampling. Observations not drawn for a given tree form that tree's out-of-bag (OOB) sample, which is used for internal validation and variable importance calculation. Under bootstrap sampling, roughly one-third of the observations (about 37% on average) are OOB for each tree; under subsampling, the OOB share depends on the subsample size. The ensemble approach reduces variance, limits overfitting, and provides more reliable identification of covariates associated with DIF.

Key advantages of the GRM Forest approach include:

  • Enhanced stability in DIF detection across different sampling variations

  • Robust variable importance measures that quantify the relative contribution of each covariate to DIF patterns

  • Reduced false positive rates through consensus-based detection

  • Ability to handle high-dimensional covariate spaces effectively

  • Internal validation through out-of-bag error estimation

The forest implementation supports both bootstrap aggregation (where samples are drawn with replacement) and subsampling (without replacement), allowing flexibility for different data characteristics and research objectives.
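
The two sampling schemes and the out-of-bag sets they produce can be sketched in base R. This is only an illustration of the resampling idea, not the package's internal code: the helper draw_sample(), the sample size n, and the subsample fraction of 0.632 are all assumptions made for the example.

```r
# Illustrative sketch only: draw_sample(), n, and subsample_frac are
# hypothetical names and values, not part of the grmtree API.
set.seed(1)
n <- 1000                # assumed number of respondents (rows of the data)
subsample_frac <- 0.632  # assumed subsample fraction, chosen for illustration

draw_sample <- function(n, sampling = c("bootstrap", "subsample"), frac = 0.632) {
  sampling <- match.arg(sampling)
  in_bag <- if (sampling == "bootstrap") {
    # Bootstrap: n row indices drawn with replacement
    sample.int(n, size = n, replace = TRUE)
  } else {
    # Subsample: a fraction of the row indices drawn without replacement
    sample.int(n, size = round(frac * n), replace = FALSE)
  }
  # Out-of-bag rows are those never drawn for this tree
  list(in_bag = in_bag, oob = setdiff(seq_len(n), in_bag))
}

boot <- draw_sample(n, "bootstrap")
sub  <- draw_sample(n, "subsample", frac = subsample_frac)

# Share of rows left out of each sample
oob_share_boot <- length(boot$oob) / n  # close to exp(-1) on average
oob_share_sub  <- length(sub$oob) / n   # 1 - frac, up to rounding
```

With bootstrap sampling the OOB share varies slightly from tree to tree around exp(-1); with subsampling it is fixed by the chosen fraction, so a larger subsample leaves fewer observations for out-of-bag validation.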

See Also

grmtree fits a Graded Response Model Tree; grmtree.control creates a control object for grmtree; grmforest.control creates a control object for grmforest; varimp calculates variable importance for a GRM Forest; plot.varimp creates a bar plot of variable importance scores.

Examples

# \donttest{
  library(grmtree)
  library(hlt)
  data("asti", package = "hlt")
  asti$resp <- data.matrix(asti[, 1:4])

  # Fit forest with default parameters
  forest <- grmforest(resp ~ gender + group, data = asti)

  # Fit with custom control
  ctrl <- grmforest.control(n_tree = 50, sampling = "subsample")
  forest <- grmforest(resp ~ gender + group, data = asti, control = ctrl)
# }
