lsirmgrm: Fit an ordinal LSIRM with the graded response model

Description

lsirmgrm fits an ordinal latent space item response model for Likert-scale (ordered categorical) responses using the graded response model (GRM). This approach extends the traditional GRM by incorporating a latent space representation of respondent-item interactions, providing a spatial interpretation of response patterns.

The model captures interactions between respondents and items through the distance between latent respondent positions $z_j$ and item positions $w_i$ in a shared latent space, allowing for the visualization and interpretation of complex response patterns in Likert-scale assessments.

Usage

lsirmgrm(
  data,
  ncat = NULL,
  missing_data = NA,
  missing.val = 99,
  chains = 1,
  multicore = 1,
  seed = NA,
  ndim = 2,
  niter = 15000,
  nburn = 2500,
  nthin = 5,
  nprint = 500,
  jump_beta = 0.4,
  jump_theta = 1,
  jump_gamma = 0.2,
  jump_z = 0.5,
  jump_w = 0.5,
  pr_mean_beta = 0,
  pr_sd_beta = 1,
  pr_mean_theta = 0,
  pr_sd_theta = 1,
  pr_mean_gamma = 0.5,
  pr_sd_gamma = 1,
  pr_a_theta = 0.001,
  pr_b_theta = 0.001,
  fixed_gamma = FALSE,
  spikenslab = FALSE,
  pr_spike_mean = -3,
  pr_spike_sd = 1,
  pr_slab_mean = 0.5,
  pr_slab_sd = 1,
  pr_xi_a = 1,
  pr_xi_b = 1,
  adapt = NULL,
  verbose = FALSE,
  fix_theta_sd = FALSE
)

Value

An object of class lsirm. For multi-chain fits, it returns a list where each element (chain1, chain2, etc.) is a single-chain fit of class lsirm.

If missing_data = "mar", the returned object additionally contains imp (MCMC draws of imputed responses for each missing cell) and imp_estimate (posterior mean imputation for each missing cell).

Arguments

data

Matrix; an ordinal (ordered categorical) item response matrix. Each row represents a respondent, and each column represents an item. Values can be either 0:(K-1) or 1:K. Missing values can be NA.

ncat

Integer; number of categories $K$. If NULL, it is inferred from the observed data.

missing_data

Character; the type of missing data assumed. Options are NA, "mar", or "mcar". If NA and data contains missing values, it is set to "mcar" internally.

missing.val

Numeric; numeric code used to represent missing values in the C++ sampler. Default is 99.

chains

Integer; number of MCMC chains. Default is 1.

multicore

Integer; number of cores for parallel execution when chains > 1. Default is 1.

seed

Integer; RNG seed. Default is NA.

ndim

Integer; latent space dimension. Default is 2.

niter

Integer; total MCMC iterations. Default is 15000.

nburn

Integer; burn-in iterations. Default is 2500.

nthin

Integer; thinning interval. Default is 5.

nprint

Integer; print interval if verbose=TRUE. Default is 500.

jump_beta

Numeric; proposal SD for GRM thresholds. Default is 0.4. During MCMC sampling, threshold proposals are constrained to maintain the ordering $\beta_{i,1} > \beta_{i,2} > \cdots > \beta_{i,K-1}$ for each item.
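As a rough illustration of an ordering-constrained proposal, the base R sketch below perturbs one threshold at a time with a normal random walk and discards any proposal that breaks the decreasing order. The helper `propose_thresholds` is hypothetical. Whether the package's C++ sampler enforces the constraint by rejection or by another mechanism is an assumption here, and the Metropolis-Hastings accept step is omitted for brevity.

```r
set.seed(42)

# Hypothetical helper: random-walk proposal for threshold k of one item.
# Proposals that violate beta_1 > beta_2 > ... > beta_{K-1} are discarded.
propose_thresholds <- function(beta, k, jump_beta = 0.4) {
  proposal <- beta
  proposal[k] <- rnorm(1, mean = beta[k], sd = jump_beta)
  # diff() must be strictly negative everywhere for a decreasing sequence
  if (any(diff(proposal) >= 0)) {
    return(beta)  # keep the current thresholds
  }
  proposal
}

beta <- c(2, 0.5, -0.5, -2)  # illustrative starting thresholds, K = 5
for (iter in 1:200) {
  k <- sample(length(beta), 1)
  beta <- propose_thresholds(beta, k)
}
print(beta)
```

Because invalid proposals never replace the current value, the thresholds stay strictly decreasing at every iteration.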

jump_theta

Numeric; proposal SD for theta. Default is 1.

jump_gamma

Numeric; proposal SD on log-scale for gamma. Default is 0.2.

jump_z

Numeric; proposal SD for z. Default is 0.5.

jump_w

Numeric; proposal SD for w. Default is 0.5.

pr_mean_beta

Numeric; prior mean for thresholds. Default is 0.

pr_sd_beta

Numeric; prior SD for thresholds. Default is 1.

pr_mean_theta

Numeric; prior mean for theta. Default is 0.

pr_sd_theta

Numeric; prior SD for theta. Default is 1.

pr_mean_gamma

Numeric; log-normal prior mean for gamma. Default is 0.5.

pr_sd_gamma

Numeric; log-normal prior SD for gamma. Default is 1.

pr_a_theta

Numeric; shape for inverse-gamma prior on var(theta). Default is 0.001.

pr_b_theta

Numeric; scale for inverse-gamma prior on var(theta). Default is 0.001.

fixed_gamma

Logical; if TRUE, fixes $\gamma = 1$ (no sampling). Default is FALSE.

spikenslab

Logical; if TRUE, uses spike-and-slab priors for $\gamma$. Default is FALSE.

pr_spike_mean

Numeric; prior mean for the spike component (on log-scale). Default is -3.

pr_spike_sd

Numeric; prior SD for the spike component (on log-scale). Default is 1.

pr_slab_mean

Numeric; prior mean for the slab component (on log-scale). Default is 0.5.

pr_slab_sd

Numeric; prior SD for the slab component (on log-scale). Default is 1.

pr_xi_a

Numeric; Beta prior shape a for mixing weight $\xi$. Default is 1.

pr_xi_b

Numeric; Beta prior shape b for mixing weight $\xi$. Default is 1.

adapt

List; optional adaptive MCMC control. If not NULL, proposal standard deviations are adapted during the burn-in period to reach a target acceptance rate and are held fixed during the main MCMC sampling. When adaptation is enabled, the reported acceptance ratios in the output (accept_beta, accept_theta, etc.) are computed only from iterations after burn-in, reflecting the performance of the adapted proposal distributions. Elements of the list can include:

use_adapt: Logical; if TRUE, adaptive MCMC is used. Default is FALSE.
adapt_interval: Integer; the number of iterations between each update of the proposal SDs. Default is 100.
adapt_rate: Numeric; Robbins-Monro scaling constant (c) in step size formula: adapt_rate / iteration^decay_rate. Default is 1.0. Valid range: any positive value. Recommended: 0.5-2.0.
decay_rate: Numeric; Robbins-Monro decay exponent (alpha) in step size formula. Default is 0.5. Valid range: [0.5, 1]. Recommended: 0.5-0.8.
target_accept: Numeric; target acceptance rate for scalar parameters (beta, theta, gamma). Default is 0.44.
target_accept_zw: Numeric; target acceptance rate for latent positions z and w. Default is 0.234.
target_accept_beta, target_accept_theta, target_accept_gamma: Numeric; (optional) parameter-specific target acceptance rates that override target_accept.
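
To make the step size formula concrete, here is a minimal base R sketch of a Robbins-Monro style update of one proposal SD. The helper `adapt_jump` is hypothetical, and applying the update on the log scale of the proposal SD is an assumption, not something the package documentation confirms; only the step size `adapt_rate / iteration^decay_rate` is taken from the description above.

```r
# Hypothetical helper: one adaptation step for a proposal SD.
# The log-scale update rule is an assumption made for illustration.
adapt_jump <- function(jump, accept_rate, iteration,
                       target_accept = 0.44,
                       adapt_rate = 1.0, decay_rate = 0.5) {
  step <- adapt_rate / iteration^decay_rate  # documented step size formula
  exp(log(jump) + step * (accept_rate - target_accept))
}

# Acceptance above the target widens the proposal; below the target narrows it
adapt_jump(jump = 0.4, accept_rate = 0.80, iteration = 100)
adapt_jump(jump = 0.4, accept_rate = 0.10, iteration = 100)
```

Under this sketch the step shrinks as the iteration count grows, so later updates change the proposal SD less than early ones.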

verbose

Logical; if TRUE, MCMC progress and parameter samples are printed to the console during execution. Default is FALSE.

fix_theta_sd

Logical; if TRUE, the standard deviation of the respondent parameters $\theta$ is fixed at 1 instead of being sampled. Default is FALSE.

Details

lsirmgrm implements the Graded Response Model (GRM) in a latent space framework. Let $Y_{j,i} \in \{0,\ldots,K-1\}$ be the ordered categorical response of respondent $j$ to item $i$. The model is defined via cumulative logits:

$$\Pr(Y_{j,i} \ge k \mid \theta_j, \beta_{i,k}, \gamma, z_j, w_i) = \text{logit}^{-1}(\theta_j + \beta_{i,k} - \gamma\,\|z_j-w_i\|)$$

for $k=1,\ldots,K-1$. Here $\theta_j$ is the respondent parameter, $\gamma > 0$ weights the distance between the latent positions $z_j$ and $w_i$, and $\beta_{i,k}$ are item-specific thresholds (difficulty levels) that satisfy the ordering constraint $\beta_{i,1} > \beta_{i,2} > \cdots > \beta_{i,K-1}$ for identifiability. The ordering also ensures that the cumulative probabilities decrease in $k$, so every category has positive probability.
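
The cumulative-logit definition implies category probabilities by differencing adjacent cumulative probabilities. The base R sketch below shows this for a single respondent-item pair. The helper `grm_category_probs` and all numeric values are illustrative, and the norm is assumed to be Euclidean.

```r
# Hypothetical helper: Pr(Y = k), k = 0, ..., K-1, for one respondent-item pair
grm_category_probs <- function(theta, beta, gamma, z, w) {
  d <- sqrt(sum((z - w)^2))                # distance ||z_j - w_i|| (Euclidean assumed)
  cum <- plogis(theta + beta - gamma * d)  # Pr(Y >= k) for k = 1, ..., K-1
  upper <- c(1, cum)                       # Pr(Y >= 0) = 1
  lower <- c(cum, 0)                       # Pr(Y >= K) = 0
  probs <- upper - lower                   # Pr(Y = k) = Pr(Y >= k) - Pr(Y >= k + 1)
  names(probs) <- seq_along(probs) - 1
  probs
}

beta <- c(2, 0.5, -0.5, -2)  # decreasing thresholds, so K = 5
p <- grm_category_probs(theta = 0.3, beta = beta, gamma = 1,
                        z = c(0.5, -0.2), w = c(-0.1, 0.4))
print(round(p, 3))
```

Moving $z_j$ farther from $w_i$ lowers every cumulative probability, which shifts mass toward the lower categories.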

Missing data can be handled in two ways:

"mcar": Missing responses are assumed to be Missing Completely At Random. They are ignored in the likelihood calculation.
"mar": Missing responses are assumed to be Missing At Random. The model uses data augmentation to impute missing values at each MCMC iteration based on the current parameter estimates.
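
A single data augmentation draw under "mar" can be pictured as sampling a category from the model's category probabilities at the current parameter values. The base R sketch below uses made-up parameter values and is not the package's C++ sampler; in a real fit the parameters change at every iteration, whereas they are held fixed here.

```r
set.seed(1)

# Illustrative "current" parameter values for one missing cell (assumptions)
theta <- 0.3
beta <- c(2, 0.5, -0.5, -2)                       # decreasing thresholds, K = 5
gamma <- 1
d <- sqrt(sum((c(0.5, -0.2) - c(-0.1, 0.4))^2))   # distance between z_j and w_i

cum <- plogis(theta + beta - gamma * d)           # Pr(Y >= k), k = 1, ..., K-1
probs <- c(1, cum) - c(cum, 0)                    # Pr(Y = k), k = 0, ..., K-1

# One imputation draw for the missing response
y_imputed <- sample(0:(length(probs) - 1), size = 1, prob = probs)

# Averaging many draws approximates the expected response at these values
draws <- sample(0:(length(probs) - 1), size = 2000, replace = TRUE, prob = probs)
mean(draws)
```

The stored draws and their average play roles analogous to the imp and imp_estimate elements described under Value.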

For models with spikenslab = TRUE, a spike-and-slab prior is placed on $\log(\gamma)$ to perform model selection between a standard Rasch-type model ($\gamma \approx 0$) and a latent space model ($\gamma > 0$).
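
The prior can be sketched as a two-component normal mixture on $\log(\gamma)$. In the base R example below, the helper names are hypothetical and treating $\xi$ as the weight of the slab component is an assumption. The component means and SDs reuse the defaults pr_spike_mean, pr_spike_sd, pr_slab_mean, and pr_slab_sd.

```r
# Hypothetical helper: log density of log(gamma) under the mixture prior.
# xi is treated as the slab weight, which is an assumed convention.
ss_log_prior <- function(gamma, xi,
                         spike_mean = -3, spike_sd = 1,
                         slab_mean = 0.5, slab_sd = 1) {
  lg <- log(gamma)
  log(xi * dnorm(lg, slab_mean, slab_sd) +
        (1 - xi) * dnorm(lg, spike_mean, spike_sd))
}

# Hypothetical helper: probability that a given gamma belongs to the slab
slab_prob <- function(gamma, xi,
                      spike_mean = -3, spike_sd = 1,
                      slab_mean = 0.5, slab_sd = 1) {
  lg <- log(gamma)
  slab <- xi * dnorm(lg, slab_mean, slab_sd)
  spike <- (1 - xi) * dnorm(lg, spike_mean, spike_sd)
  slab / (slab + spike)
}

slab_prob(gamma = exp(0.5), xi = 0.5)  # gamma at the slab mean
slab_prob(gamma = exp(-3), xi = 0.5)   # gamma at the spike mean
```

Values of $\gamma$ near $e^{-3}$ are attributed mostly to the spike, which corresponds to a negligible distance effect.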

References

De Carolis, L., Kang, I., & Jeon, M. (2025). A Latent Space Graded Response Model for Likert-Scale Psychological Assessments. Multivariate Behavioral Research. doi:10.1080/00273171.2025.2605678

Examples

# \donttest{
# generate example ordinal item response matrix
set.seed(123)
nsample <- 50
nitem <- 10
data <- matrix(sample(1:5, nsample * nitem, replace = TRUE), nrow = nsample)

# Fit GRM LSIRM using direct function call
fit <- lsirmgrm(data, niter = 1000, nburn = 500, nthin = 2)
summary(fit)

# Realistic example with BFPT data
data(BFPT)
dat <- BFPT
# Handle outliers or special codes
dat[(dat == 0) | (dat == 6)] <- NA
# Reverse code specific items
reverse <- c(2, 4, 6, 8, 10, 11, 13, 15, 16, 17, 18, 19, 20, 21, 23, 25, 27, 32, 34, 36, 42, 44, 46)
dat[, reverse] <- 6 - dat[, reverse]
# Remove missing cases for simple demonstration
dat <- dat[complete.cases(dat), ]
# Fit model (subset for speed)
fit_bfpt <- lsirm(dat[1:50, 1:10] ~ lsirmgrm(niter = 1000, nburn = 500))
summary(fit_bfpt)

# Fit with missing data (MAR)
fit_mar <- lsirm(data ~ lsirmgrm(missing_data = "mar", niter = 1000, nburn = 500))

# Fit with Spike-and-Slab prior for model selection
fit_ss <- lsirm(data ~ lsirmgrm(spikenslab = TRUE, niter = 1000, nburn = 500))

# Fit with adaptive MCMC for automatic tuning
fit_adapt <- lsirmgrm(data, niter = 2000, nburn = 1000, 
                      adapt = list(use_adapt = TRUE, adapt_interval = 50))
# Check adapted jump sizes and acceptance rates
cat("Final jump_beta:", fit_adapt$jump_beta, "\n")
cat("Acceptance rate (post-burnin):", fit_adapt$accept_beta, "\n")
# }
