sim.data: generate response data

Description

Randomly generates a response data matrix according to specified conditions, including the attribute distribution, item quality, sample size, Q-matrix, and cognitive diagnosis model (CDM).

Usage

sim.data(
  Q = NULL,
  N = NULL,
  IQ = list(P0 = NULL, P1 = NULL),
  model = "GDINA",
  distribute = "uniform",
  control = NULL,
  verbose = TRUE
)

Value

An object of class simGDINA, as returned by the simGDINA function from the GDINA package. Elements that can be extracted with the extract method include:

dat: An N × I simulated item response matrix.
Q: The Q-matrix.
attribute: An N × K matrix of individuals' attribute patterns.
catprob.parm: A list of non-zero category success probabilities for each latent group.
delta.parm: A list of delta parameters.
higher.order.parm: Higher-order parameters.
mvnorm.parm: Multivariate normal distribution parameters.
LCprob.parm: A matrix of item/category success probabilities for each latent class.

Arguments

Q

The Q-matrix. A random 30 × 5 Q-matrix (sim.Q) will be used if NULL.

N

Sample size. Default = 500.

IQ

A list containing two I-length vectors: P0 and P1.
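As a rough illustration of the expected shape of IQ, the sketch below builds the two vectors for I items and checks them. The source does not define P0 and P1 further. Reading P0 as the success probability of examinees who have mastered none of the required attributes and P1 as that of examinees who have mastered all of them is an assumption, as is the guessing/slip view at the end.

```r
# Assumption: P0 = success probability with no required attributes mastered,
#             P1 = success probability with all required attributes mastered.
set.seed(1)
I <- 10  # number of items

IQ <- list(
  P0 = runif(I, 0.0, 0.2),
  P1 = runif(I, 0.8, 1.0)
)

# Both vectors need one entry per item, and P1 should exceed P0 item by item.
stopifnot(length(IQ$P0) == I, length(IQ$P1) == I, all(IQ$P1 > IQ$P0))

# The same item quality expressed as guessing and slip (illustrative only).
gs <- cbind(guess = IQ$P0, slip = 1 - IQ$P1)
head(gs, 3)
```

Under this reading, a narrower gap between P0 and P1 corresponds to lower item quality.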

model

Type of model used to generate the data; can be "GDINA", "LCDM", "DINA", "DINO", "ACDM", "LLM", or "rRUM".

distribute

Attribute distributions; can be "uniform" for the uniform distribution, "mvnorm" for the multivariate normal distribution (Chiu, Douglas, & Li, 2009) and "horder" for the higher-order distribution (Tu et al., 2022).
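To make the "uniform" option concrete, here is a minimal base-R sketch that assumes it means every one of the 2^K attribute patterns is equally likely. The object names (patterns, attribute) are illustrative and are not part of the package.

```r
set.seed(1)
K <- 3   # number of attributes
N <- 8   # number of examinees

# All 2^K attribute patterns, one per row.
patterns <- as.matrix(expand.grid(rep(list(0:1), K)))
colnames(patterns) <- paste0("A", seq_len(K))

# Uniform distribution: every pattern has probability 1 / 2^K.
rows <- sample(nrow(patterns), size = N, replace = TRUE)
attribute <- patterns[rows, , drop = FALSE]
attribute
```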

control

A list of control parameters with elements:

sigma: A positive-definite symmetric matrix specifying the variance-covariance matrix when distribute = "mvnorm". Default = 0.5 (Chiu, Douglas, & Li, 2009).
cutoffs: A vector giving the cutoff for each attribute when distribute = "mvnorm". Default = \(k/(1+K)\) (Chiu, Douglas, & Li, 2009).
theta: A vector of length N representing the higher-order ability of each examinee. By default, values are drawn at random from the normal distribution (Tu et al., 2022).
a: The slopes for the higher-order model when distribute = "horder". Default = 1.5 (Tu et al., 2022).
b: The intercepts when distribute = "horder". By default, equally spaced values between -1.5 and 1.5 are selected according to the number of attributes (Tu et al., 2022).
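The following base-R sketch shows one way the sigma, cutoffs, theta, a, and b elements could drive attribute generation. It is not the package's implementation. It assumes that sigma = 0.5 denotes unit variances with a common covariance of 0.5, that an attribute is mastered when its latent normal variable reaches the cutoff, and that the higher-order model has the logistic form a * theta + b.

```r
set.seed(1)
N <- 1000  # examinees
K <- 5     # attributes

## "mvnorm": threshold correlated standard normal variables.
# Assumption: sigma = 0.5 means unit variances and a common covariance of 0.5.
Sigma <- matrix(0.5, K, K)
diag(Sigma) <- 1
z <- matrix(rnorm(N * K), N, K) %*% chol(Sigma)  # rows ~ N(0, Sigma)

# Cutoffs taken as standard normal quantiles at k / (K + 1), as in Example 2.
cutoffs <- qnorm(seq_len(K) / (K + 1))
# Assumption: attribute k is mastered when z reaches its cutoff.
alpha_mvnorm <- 1 * sweep(z, 2, cutoffs, ">=")

## "horder": attributes depend on one higher-order ability theta.
theta <- rnorm(N)
a <- rep(1.5, K)
b <- seq(-1.5, 1.5, length.out = K)
# Assumption: logit P(alpha_k = 1 | theta) = a_k * theta + b_k.
p <- plogis(outer(theta, a) + matrix(b, N, K, byrow = TRUE))
alpha_horder <- matrix(rbinom(N * K, 1, p), N, K)

colMeans(alpha_mvnorm)  # mastery proportions fall as the cutoff rises
colMeans(alpha_horder)  # mastery proportions rise with the intercept b
```

Under these assumptions, the cutoffs control how difficult each attribute is to master, while sigma controls how strongly mastery of one attribute goes together with mastery of another.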

verbose

Logical indicating whether to print information. Default is TRUE.

Author

Haijiang Qin <Haijiang133@outlook.com>

References

Chiu, C.-Y., Douglas, J. A., & Li, X. (2009). Cluster Analysis for Cognitive Diagnosis: Theory and Applications. Psychometrika, 74(4), 633-665. DOI: 10.1007/s11336-009-9125-0.

Tu, D., Chiu, J., Ma, W., Wang, D., Cai, Y., & Ouyang, X. (2022). A multiple logistic regression-based (MLR-B) Q-matrix validation method for cognitive diagnosis models: A confirmatory approach. Behavior Research Methods. DOI: 10.3758/s13428-022-01880-x.

Examples

################################################################
#                           Example 1                          #
#       generate data following the uniform distribution       #
################################################################
library(Qval)

set.seed(123)

K <- 5
I <- 10
Q <- sim.Q(K, I)

IQ <- list(
  P0 = runif(I, 0.0, 0.2),
  P1 = runif(I, 0.8, 1.0)
)

data <- sim.data(Q = Q, N = 10, IQ=IQ, model = "GDINA", distribute = "uniform")

print(data$dat)

################################################################
#                           Example 2                          #
#       generate data following the mvnorm distribution        #
################################################################
set.seed(123)
K <- 5
I <- 10
Q <- sim.Q(K, I)

IQ <- list(
  P0 = runif(I, 0.0, 0.2),
  P1 = runif(I, 0.8, 1.0)
)

example_cutoffs <- sample(qnorm(c(1:K)/(K+1)), ncol(Q))
data <- sim.data(Q = Q, N = 10, IQ=IQ, model = "GDINA", distribute = "mvnorm",
                 control = list(sigma = 0.5, cutoffs = example_cutoffs))

print(data$dat)

#################################################################
#                            Example 3                          #
#        generate data following the horder distribution        #
#################################################################
set.seed(123)
K <- 5
I <- 10
Q <- sim.Q(K, I)

IQ <- list(
  P0 = runif(I, 0.0, 0.2),
  P1 = runif(I, 0.8, 1.0)
)

example_theta <- rnorm(10, 0, 1)
example_b <- seq(-1.5,1.5,length.out=K)
data <- sim.data(Q = Q, N = 10, IQ=IQ, model = "GDINA", distribute = "horder",
                 control = list(theta = example_theta, a = 1.5, b = example_b))

print(data$dat)