data.sirt: Some Example Datasets for the `sirt` Package

Description

Some example datasets for the sirt package.

Usage

data(data.si01)
data(data.si02)
data(data.si03)
data(data.si04)
data(data.si05)
data(data.si06)
data(data.si07)
data(data.si08)

Format

The format of the dataset data.si01 is:
'data.frame': 1857 obs. of 3 variables:
 $ idgroup: int 1 1 1 1 1 1 1 1 1 1 ...
 $ item1  : int NA NA NA NA NA NA NA NA NA NA ...
 $ item2  : int 4 4 4 4 4 4 4 2 4 4 ...
The dataset data.si02 is the Stouffer-Toby dataset published in Lindsay, Clogg and Grego (1991; Table 1, p. 97, Cross-classification A):
List of 2
 $ data   : num [1:16, 1:4] 1 0 1 0 1 0 1 0 1 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:4] "I1" "I2" "I3" "I4"
 $ weights: num [1:16] 42 1 6 2 6 1 7 2 23 4 ...
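Because data.si02 stores response patterns together with frequency weights, some analyses may need one row per person instead. The following is a minimal sketch of that expansion in base R. The object `toy` and its values are hypothetical and only mimic the list structure shown above; they are not taken from the dataset.

```r
# Hypothetical toy list with the same structure as data.si02
# (made-up values for illustration only)
toy <- list(
  data = matrix(c(1, 0, 1, 0,
                  1, 1, 0, 0),
                ncol = 2,
                dimnames = list(NULL, c("I1", "I2"))),
  weights = c(3, 1, 2, 4)
)

# Repeat each pattern row as often as its weight indicates
idx <- rep(seq_len(nrow(toy$data)), times = toy$weights)
expanded <- toy$data[idx, , drop = FALSE]

dim(expanded)       # one row per person
colSums(expanded)   # equals the weighted column sums of the patterns
```

Functions that accept a weights argument can usually work on the pattern form directly, so the expansion is only needed when such an argument is not available.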
The format of the dataset data.si03 (containing item parameters of two studies) is:
'data.frame': 27 obs. of 3 variables:
 $ item    : Factor w/ 27 levels "M1","M10","M11",..: 1 12 21 22 ...
 $ b_study1: num 0.297 1.163 0.151 -0.855 -1.653 ...
 $ b_study2: num 0.72 1.118 0.351 -0.861 -1.593 ...
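Item parameters from two studies are typically compared after placing them on a common scale. As a rough sketch, and not necessarily the procedure intended for this dataset, a mean-mean shift can be computed in base R. Only the first five difficulties displayed in the str() output are used here, so the numbers do not describe the full set of 27 items.

```r
# First five item difficulties as displayed in the str() output
b_study1 <- c(0.297, 1.163, 0.151, -0.855, -1.653)
b_study2 <- c(0.72, 1.118, 0.351, -0.861, -1.593)

# Mean-mean shift that moves study 1 difficulties onto the study 2 scale
shift <- mean(b_study2) - mean(b_study1)
b_study1_linked <- b_study1 + shift

# Item-wise differences that remain after the shift
resid <- b_study2 - b_study1_linked
round(shift, 3)
round(resid, 3)
```

Large remaining differences for single items would hint at items that function differently across the two studies.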
The dataset data.si04 is adapted from Bartolucci, Montanari and Pandolfi (2012; Table 4, Table 7). The data contain responses of 4999 persons to 79 items on 5 dimensions. See rasch.mirtlc for an analysis that uses these data.
List of 3
 $ data        : num [1:4999, 1:79] 0 1 1 0 1 1 0 0 1 1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:79] "A01" "A02" "A03" "A04" ...
 $ itempars    :'data.frame': 79 obs. of 4 variables:
  ..$ item      : Factor w/ 79 levels "A01","A02","A03",..: 1 2 3 4 5 6 7 8 9 10 ...
  ..$ dim       : num [1:79] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ gamma     : num [1:79] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ gamma.beta: num [1:79] -0.189 0.25 0.758 1.695 1.022 ...
 $ distribution: num [1:9, 1:7] 1 2 3 4 5 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:7] "class" "A" "B" "C" ...
The dataset data.si05 contains double ratings of two exchangeable raters for three items which are in Ex1, Ex2 and Ex3, respectively.
List of 3
 $ Ex1:'data.frame': 199 obs. of 2 variables:
  ..$ C7040: num [1:199] NA 1 0 1 1 0 0 0 1 0 ...
  ..$ C7041: num [1:199] 1 1 0 0 0 0 0 0 1 0 ...
 $ Ex2:'data.frame': 2000 obs. of 2 variables:
  ..$ rater1: num [1:2000] 2 0 3 1 2 2 0 0 0 0 ...
  ..$ rater2: num [1:2000] 4 1 3 2 1 0 0 0 0 2 ...
 $ Ex3:'data.frame': 2000 obs. of 2 variables:
  ..$ rater1: num [1:2000] 5 1 6 2 3 3 0 0 0 0 ...
  ..$ rater2: num [1:2000] 7 2 6 3 2 1 0 1 0 3 ...
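A first look at double ratings usually starts with a cross-tabulation of the two raters. The sketch below uses only the first ten rating pairs of Ex2 as displayed in the str() output, and it assumes for illustration that the rating categories are 0 to 4; the result therefore says nothing about the full 2000 observations.

```r
# First ten rating pairs of Ex2 as displayed in the str() output
rater1 <- c(2, 0, 3, 1, 2, 2, 0, 0, 0, 0)
rater2 <- c(4, 1, 3, 2, 1, 0, 0, 0, 0, 2)

# Assumed category range (illustrative); fixing the levels keeps the table square
lev <- 0:4
agree_tab <- table(factor(rater1, levels = lev),
                   factor(rater2, levels = lev),
                   dnn = c("rater1", "rater2"))
agree_tab

# Proportion of pairs with identical ratings
exact_agreement <- mean(rater1 == rater2)
exact_agreement
```

Because the raters are exchangeable, the off-diagonal cells (i, j) and (j, i) carry the same information and may be pooled before modeling.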
The dataset data.si06 contains multiple choice item responses. The correct alternative is denoted as 0, distractors are indicated by the codes 1, 2 or 3.
'data.frame': 4441 obs. of 14 variables:
 $ WV01: num 0 0 0 0 0 0 0 0 0 3 ...
 $ WV02: num 0 0 0 3 0 0 0 0 0 1 ...
 $ WV03: num 0 1 0 0 0 0 0 0 0 0 ...
 $ WV04: num 0 0 0 0 0 0 0 0 0 1 ...
 $ WV05: num 3 1 1 1 0 0 1 1 0 2 ...
 $ WV06: num 0 1 3 0 0 0 2 0 0 1 ...
 $ WV07: num 0 0 0 0 0 0 0 0 0 0 ...
 $ WV08: num 0 1 1 0 0 0 0 0 0 0 ...
 $ WV09: num 0 0 0 0 0 0 0 0 0 2 ...
 $ WV10: num 1 1 3 0 0 2 0 0 0 0 ...
 $ WV11: num 0 0 0 0 0 0 0 0 0 0 ...
 $ WV12: num 0 0 0 2 0 0 2 0 0 0 ...
 $ WV13: num 3 1 1 3 0 0 3 0 0 0 ...
 $ WV14: num 3 1 2 3 0 3 1 3 3 0 ...
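Since the correct alternative is coded 0, dichotomous item scores can be derived by comparing each response with 0. The sketch below does this for a small excerpt (the first ten responses to WV01 to WV03 as displayed in the str() output); the object name `toy` is a placeholder.

```r
# First ten responses to three items as displayed in the str() output
toy <- data.frame(
  WV01 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 3),
  WV02 = c(0, 0, 0, 3, 0, 0, 0, 0, 0, 1),
  WV03 = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0)
)

# 1 = correct alternative chosen (code 0), 0 = any distractor;
# missing responses would stay NA
scored <- as.data.frame(1 * (toy == 0))

# Proportion correct per item in this excerpt
colMeans(scored)
```

The scored data discard the distractor information, which is exactly what nested logit models for multiple choice items are meant to retain.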
The dataset data.si07 contains the parameters from the empirical illustration in DeCarlo (XXXX). The simulation function sim_fun can be used to simulate data from the IRSDT model (see DeCarlo, XXXX).
List of 3
 $ pars   :'data.frame': 16 obs. of 3 variables:
  ..$ item: Factor w/ 16 levels "I01","I02","I03",..: 1 2 3 4 5 6 7 8 9 10 ...
  ..$ b   : num [1:16] -1.1 -0.18 1.44 1.78 -1.19 0.45 -1.12 0.33 0.82 -0.43 ...
  ..$ d   : num [1:16] 2.69 4.6 6.1 3.11 3.2 ...
 $ trait  :'data.frame': 20 obs. of 2 variables:
  ..$ x   : num [1:20] 0.025 0.075 0.125 0.175 0.225 0.275 0.325 0.375 0.425 0.475 ...
  ..$ prob: num [1:20] 0.0238 0.1267 0.105 0.0594 0.0548 ...
 $ sim_fun:function (lambda, b, d, items)
The dataset data.si08 contains 5 items on knowledge about lung cancer and the kind of information acquisition (Goodman, 1970; see also Rasch, Kubinger & Yanagida, 2011). L1: reading newspapers, L2: listening to the radio, L3: reading books and magazines, L4: attending talks, L5: knowledge about lung cancer. The variable wgt holds the frequency of each of the 32 response patterns.
'data.frame': 32 obs. of 6 variables:
 $ L1 : num 1 1 1 1 1 1 1 1 1 1 ...
 $ L2 : num 1 1 1 1 1 1 1 1 0 0 ...
 $ L3 : num 1 1 1 1 0 0 0 0 1 1 ...
 $ L4 : num 1 1 0 0 1 1 0 0 1 1 ...
 $ L5 : num 1 0 1 0 1 0 1 0 1 0 ...
 $ wgt: num 23 8 102 67 8 4 35 59 27 18 ...
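With pattern data of this kind, descriptive statistics have to take the weights into account. The sketch below shows a weighted cross-tabulation and a weighted proportion in base R, assuming that wgt is a frequency weight. It uses only the first ten rows of L4, L5 and wgt as displayed in the str() output, so the results do not describe the full table.

```r
# First ten rows of three columns as displayed in the str() output
toy <- data.frame(
  L4  = c(1, 1, 0, 0, 1, 1, 0, 0, 1, 1),
  L5  = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0),
  wgt = c(23, 8, 102, 67, 8, 4, 35, 59, 27, 18)
)

# Weighted cross-tabulation: cell entries are sums of wgt
tab <- xtabs(wgt ~ L4 + L5, data = toy)
tab

# Weighted proportion of L5 = 1 in this excerpt
p_L5 <- weighted.mean(toy$L5, w = toy$wgt)
p_L5
```

An unweighted `table()` or `mean()` would count each response pattern once and ignore how many persons showed it.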

References

Bartolucci, F., Montanari, G. E., & Pandolfi, S. (2012). Dimensionality of the latent structure and item selection via latent class multidimensional IRT models. Psychometrika, 77, 782-802. 10.1007/s11336-012-9278-0

DeCarlo, L. T. (XXXX). An item response model for true-false exams based on signal detection theory. Applied Psychological Measurement, xx(xx). 10.1177/0146621619843823

Goodman, L. A. (1970). The multivariate analysis of qualitative data: Interactions among multiple classifications. Journal of the American Statistical Association, 65(329), 226-256. 10.1080/01621459.1970.10481076

Lindsay, B., Clogg, C. C., & Grego, J. (1991). Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. Journal of the American Statistical Association, 86, 96-107. 10.1080/01621459.1991.10475008

Rasch, D., Kubinger, K. D., & Yanagida, T. (2011). Statistics in psychology using R and SPSS. New York: Wiley. 10.1002/9781119979630

Examples

# NOT RUN {
#############################################################################
# EXAMPLE 1: Nested logit model for the multiple choice dataset data.si06
#############################################################################

data(data.si06, package="sirt")
dat <- data.si06

#** estimate 2PL nested logit model
library(mirt)
mod1 <- mirt::mirt( dat, model=1, itemtype="2PLNRM", key=rep(0,ncol(dat) ),
            verbose=TRUE  )
summary(mod1)
cmod1 <- sirt::mirt.wrapper.coef(mod1)$coef
cmod1[,-1] <- round( cmod1[,-1], 3)

#** normalize item parameters according to Suh and Bolt (2010)
cmod2 <- cmod1

# slope parameters
ind <-  grep("ak",colnames(cmod2))
h1 <- cmod2[,ind ]
cmod2[,ind] <- t( apply( h1, 1, FUN=function(ll){ ll - mean(ll) } ) )
# item intercepts
ind <-  paste0( "d", 0:9 )
ind <- which( colnames(cmod2) %in% ind )
h1 <- cmod2[,ind ]
cmod2[,ind] <- t( apply( h1, 1, FUN=function(ll){ ll - mean(ll) } ) )
cmod2[,-1] <- round( cmod2[,-1], 3)

#############################################################################
# EXAMPLE 2: Item response model based on signal detection theory (IRSDT model)
#############################################################################

data(data.si07, package="sirt")
data <- data.si07

#-- simulate data
set.seed(98)
N <- 2000 # define sample size
# generate membership scores
lambda <- sample(size=N, x=data$trait$x, prob=data$trait$prob, replace=TRUE)
b <- data$pars$b
d <- data$pars$d
items <- data$pars$item
dat <- data$sim_fun(lambda=lambda, b=b, d=d, items=items)

#- estimate IRSDT model as a grade of membership model with two classes
problevels <- seq( 0.025, 0.975, length.out=20 )
mod1 <- sirt::gom.em( dat, K=2, problevels=problevels )
summary(mod1)
# }