Learn R Programming

aberrance (version 0.2.1)

sim: Simulate data

Description

Simulate data using item response theory (IRT) models.

Usage

sim(psi, xi)

Value

A list is returned. Possible elements include:

x

A matrix of item scores.

d

A matrix of item distractors.

r

A matrix of item responses.

y

A matrix of item log response times.

Arguments

psi

A matrix of item parameters.

xi

A matrix of person parameters.

Models for Item Scores

The Rasch, 2PL, and 3PL models (Birnbaum, 1968; Rasch, 1960) are given by $$P(X_{vi} = 1 | \theta_v, a_i, b_i, c_i) = c_i + \frac{1 - c_i}{1 + \exp\{-a_i(\theta_v - b_i)\}}.$$

  • psi must contain columns named "a", "b", and "c" for the item discrimination, difficulty, and pseudo-guessing parameters, respectively.

  • xi must contain a column named "theta" for the person ability parameters.

The partial credit model (PCM; Masters, 1982) and the generalized partial credit model (GPCM; Muraki, 1992) are given by $$P(X_{vi} = j | \theta_v, a_i, \boldsymbol{c_i}) = \frac{\exp\{\sum_{k=0}^j a_i(\theta_v - c_{ik})\}} {\sum_{l=0}^{m_i} \exp\{\sum_{k=0}^l a_i(\theta_v - c_{ik})\}}.$$

  • psi must contain columns named "a" for the item discrimination parameter and "c0", "c1", ..., for the item category parameters.

  • xi must contain a column named "theta" for the person ability parameters.

The graded response model (GRM; Samejima, 1969) is given by $$P(X_{vi} = j | \theta_v, a_i, \boldsymbol{b_i}) = P(X_{vi} \ge j | \theta_v, a_i, \boldsymbol{b_i}) - P(X_{vi} \ge j + 1 | \theta_v, a_i, \boldsymbol{b_i}),$$ where $$P(X_{vi} \ge j | \theta_v, a_i, \boldsymbol{b_i}) = \begin{cases} 1 &\text{if } j = 0, \\ \frac{1}{1 + \exp\{-a_i(\theta_v - b_{ij})\}} &\text{if } 1 \le j \le m_i, \\ 0 &\text{if } j = m_i + 1. \end{cases}$$

  • psi must contain columns named "a" for the item discrimination parameter and "b1", "b2", ..., for the item location parameters listed in increasing order.

  • xi must contain a column named "theta" for the person ability parameters.

Models for Item Distractors

The nested logit model (NLM; Bolt et al., 2012) is given by $$P(D_{vi} = j | \theta_v, \eta_v, a_i, b_i, c_i, \boldsymbol{\lambda_i}, \boldsymbol{\zeta_i}) = [1 - P(X_{vi} = 1 | \theta_v, a_i, b_i, c_i)] \times P(D_{vi} = j | X_{vi} = 0, \eta_v, \boldsymbol{\lambda_i}, \boldsymbol{\zeta_i}),$$ where $$P(D_{vi} = j | X_{vi} = 0, \eta_v, \boldsymbol{\lambda_i}, \boldsymbol{\zeta_i}) = \frac{\exp(\lambda_{ij} \eta_v + \zeta_{ij})} {\sum_{k=1}^{m_i-1} \exp(\lambda_{ik} \eta_v + \zeta_{ik})}.$$

  • psi must contain columns named "a", "b", and "c" for the item discrimination, difficulty, and pseudo-guessing parameters, respectively, "lambda1", "lambda2", ..., for the item slope parameters, and "zeta1", "zeta2", ..., for the item intercept parameters.

  • xi must contain columns named "theta" and "eta" for the person parameters that govern response correctness and distractor selection, respectively.

Models for Item Responses

The nominal response model (NRM; Bock, 1972) is given by $$P(R_{vi} = j | \eta_v, \boldsymbol{\lambda_i}, \boldsymbol{\zeta_i}) = \frac{\exp(\lambda_{ij} \eta_v + \zeta_{ij})} {\sum_{k=1}^{m_i} \exp(\lambda_{ik} \eta_v + \zeta_{ik})}.$$

  • psi must contain columns named "lambda1", "lambda2", ..., for the item slope parameters and "zeta1", "zeta2", ..., for the item intercept parameters. If there is a correct response category, its parameters should be listed last.

  • xi must contain a column named "eta" for the person parameters that govern response selection.

Models for Item Log Response Times

The lognormal model (van der Linden, 2006) is given by $$f(Y_{vi} | \tau_v, \alpha_i, \beta_i) = \frac{\alpha_i}{\sqrt{2 \pi}} \exp\{-\frac{1}{2}[\alpha_i(Y_{vi} - (\beta_i - \tau_v))]^2\}.$$

  • psi must contain columns named "alpha" and "beta" for the item time discrimination and time intensity parameters, respectively.

  • xi must contain a column named "tau" for the person speed parameters.

References

Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397--479). Addison-Wesley.

Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37(1), 29--51.

Bolt, D. M., Wollack, J. A., & Suh, Y. (2012). Application of a multidimensional nested logit model to multiple-choice test items. Psychometrika, 77(2), 339--357.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149--174.

Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16(2), 159--176.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Danish Institute for Educational Research.

Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika, 34(S1), 1--97.

van der Linden, W. J. (2006). A lognormal model for response times on test items. Journal of Educational and Behavioral Statistics, 31(2), 181--204.

Examples

Run this code
# Setup for Examples 1 to 5 -------------------------------------------------

# Settings
set.seed(0)     # seed for reproducibility
N <- 500        # number of persons
n <- 40         # number of items

# Example 1: 3PL Model and Lognormal Model ----------------------------------

# Generate person parameters
xi <- MASS::mvrnorm(
  N,
  mu = c(theta = 0.00, tau = 0.00),
  Sigma = matrix(c(1.00, 0.25, 0.25, 0.25), ncol = 2)
)

# Generate item parameters
psi <- cbind(
  a = rlnorm(n, meanlog = 0.00, sdlog = 0.25),
  b = NA,
  c = runif(n, min = 0.05, max = 0.30),
  alpha = runif(n, min = 1.50, max = 2.50),
  beta = NA
)

# Generate positively correlated difficulty and time intensity parameters
psi[, c("b", "beta")] <- MASS::mvrnorm(
  n,
  mu = c(b = 0.00, beta = 3.50),
  Sigma = matrix(c(1.00, 0.20, 0.20, 0.15), ncol = 2)
)

# Simulate item scores and log response times
dat <- sim(psi, xi)
x <- dat$x
y <- dat$y

# Example 2: Generalized Partial Credit Model -------------------------------

# Generate person parameters
xi <- cbind(theta = rnorm(N, mean = 0.00, sd = 1.00))

# Generate item parameters
psi <- cbind(
  a = rlnorm(n, meanlog = 0.00, sdlog = 0.25),
  c0 = 0,
  c1 = rnorm(n, mean = -1.00, sd = 0.50),
  c2 = rnorm(n, mean = 0.00, sd = 0.50),
  c3 = rnorm(n, mean = 1.00, sd = 0.50)
)

# Simulate item scores
x <- sim(psi, xi)$x

# Example 3: Graded Response Model ------------------------------------------

# Generate person parameters
xi <- cbind(theta = rnorm(N, mean = 0.00, sd = 1.00))

# Generate item parameters
psi <- cbind(
  a = rlnorm(n, meanlog = 0.00, sdlog = 0.25),
  b1 = rnorm(n, mean = -1.00, sd = 0.50),
  b2 = rnorm(n, mean = 0.00, sd = 0.50),
  b3 = rnorm(n, mean = 1.00, sd = 0.50)
)

# Sort item location parameters in increasing order
psi[, paste0("b", 1:3)] <- t(apply(psi[, paste0("b", 1:3)], 1, sort))

# Simulate item scores
x <- sim(psi, xi)$x

# Example 4: Nested Logit Model ---------------------------------------------

# Generate person parameters
xi <- MASS::mvrnorm(
  N,
  mu = c(theta = 0.00, eta = 0.00),
  Sigma = matrix(c(1.00, 0.80, 0.80, 1.00), ncol = 2)
)

# Generate item parameters
psi <- cbind(
  a = rlnorm(n, meanlog = 0.00, sdlog = 0.25),
  b = rnorm(n, mean = 0.00, sd = 1.00),
  c = runif(n, min = 0.05, max = 0.30),
  lambda1 = rnorm(n, mean = 0.00, sd = 1.00),
  lambda2 = rnorm(n, mean = 0.00, sd = 1.00),
  lambda3 = rnorm(n, mean = 0.00, sd = 1.00),
  zeta1 = rnorm(n, mean = 0.00, sd = 1.00),
  zeta2 = rnorm(n, mean = 0.00, sd = 1.00),
  zeta3 = rnorm(n, mean = 0.00, sd = 1.00)
)

# Simulate item scores and distractors
dat <- sim(psi, xi)
x <- dat$x
d <- dat$d

# Example 5: Nominal Response Model -----------------------------------------

# Generate person parameters
xi <- cbind(eta = rnorm(N, mean = 0.00, sd = 1.00))

# Generate item parameters
psi <- cbind(
  lambda1 = rnorm(n, mean = -0.50, sd = 0.50),
  lambda2 = rnorm(n, mean = -0.50, sd = 0.50),
  lambda3 = rnorm(n, mean = -0.50, sd = 0.50),
  lambda4 = rnorm(n, mean = 1.50, sd = 0.50),
  zeta1 = rnorm(n, mean = -0.50, sd = 0.50),
  zeta2 = rnorm(n, mean = -0.50, sd = 0.50),
  zeta3 = rnorm(n, mean = -0.50, sd = 0.50),
  zeta4 = rnorm(n, mean = 1.50, sd = 0.50)
)

# Simulate item responses
r <- sim(psi, xi)$r

Run the code above in your browser using DataLab