smirt: Multidimensional Noncompensatory, Compensatory and Partially Compensatory Item Response Model

Description

This function estimates the noncompensatory and compensatory multidimensional item response model (Bolt & Lall, 2003; Reckase, 2009) as well as the partially compensatory item response model (Spray et al., 1990) for dichotomous data.

Usage

smirt(dat, Qmatrix, irtmodel="noncomp" , est.b = NULL, est.a = NULL, 
     est.c = NULL, est.d = NULL, est.mu.i=NULL , b.init = NULL, a.init = NULL, 
     c.init = NULL, d.init=NULL, mu.i.init=NULL , Sigma.init=NULL , 
     theta.k=seq(-6,6,len=20), theta.kDES=NULL, 
     qmcnodes=0 , mu.fixed = NULL, variance.fixed = NULL,  est.corr = FALSE, 
     max.increment = 1, increment.factor=1, numdiff.parm = 0.0001, 
     maxdevchange = 0.1, globconv = 0.001, maxiter = 1000, msteps = 4, 
     mstepconv = 0.001)

## S3 method for class 'smirt':
summary(object,...)

## S3 method for class 'smirt':
anova(object,...)

## S3 method for class 'smirt':
logLik(object,...)

## S3 method for class 'smirt':
IRT.irfprob(object,...)

## S3 method for class 'smirt':
IRT.likelihood(object,...)

## S3 method for class 'smirt':
IRT.posterior(object,...)

## S3 method for class 'smirt':
IRT.modelfit(object,...)

## S3 method for class 'IRT.modelfit.smirt':
summary(object,...)

Arguments

dat

Data frame with dichotomous item responses

Qmatrix

The Q-matrix which specifies the loadings to be estimated

irtmodel

The item response model. Options are the noncompensatory model ("noncomp"), the compensatory model ("comp") and the partially compensatory model ("partcomp"). See Details for more explanations.

est.b

An integer matrix (if irtmodel="noncomp") or integer vector (if irtmodel="comp") for $b$ parameters to be estimated

est.a

An integer matrix for $a$ parameters to be estimated. If est.a="2PL", then all item loadings will be estimated and the variances are set to one (and therefore est.corr=TRUE).

est.c

An integer vector for $c$ parameters to be estimated

est.d

An integer vector for $d$ parameters to be estimated

est.mu.i

An integer vector for $\mu_i$ parameters to be estimated

b.init

Initial $b$ coefficients. For irtmodel="noncomp" it must be a matrix, for irtmodel="comp" it is a vector.

a.init

Initial $a$ coefficients arranged in a matrix

c.init

Initial $c$ coefficients

d.init

Initial $d$ coefficients

mu.i.init

Initial $d$ coefficients

Sigma.init

Initial covariance matrix $\Sigma$

theta.k

Vector of discretized trait distribution. This vector is expanded in all dimensions by using the expand.grid function. If a user specifies a design matrix theta.kDES of transformed $\bold{\theta}_p$ values (see Details and Exa

theta.kDES

An optional design matrix. This matrix will differ from the ordinary theta grid in case of nonlinear item response models.

qmcnodes

Number of integration nodes for quasi Monte Carlo integration (see Pan & Thompson, 2007). Integration points are obtained by using the function qmc.nodes. Note that when using quasi Monte Carlo nodes,

mu.fixed

Matrix with fixed entries in the mean vector. By default, all means are set to zero.

variance.fixed

Matrix (with rows and three columns) with fixed entries in the covariance matrix (see Examples). The entry $c_{kd}$ of the covariance between dimensions $k$ and $d$ is set to $c_0$ iff variance.fixed has a row with a $k$ in the first

est.corr

Should only a correlation matrix instead of a covariance matrix be estimated?

max.increment

Maximum increment

increment.factor

A value (larger than one) which defines the extent of the decrease of the maximum increment of item parameters in every iteration. The maximum increment in iteration iter is defined as max.increment*increment.factor^(-iter) wh

numdiff.parm

Numerical differentiation parameter

maxdevchange

Convergence criterion for change in relative deviance

globconv

Global convergence criterion for parameter change

maxiter

Maximum number of iterations

msteps

Number of iterations within a M step

mstepconv

Convergence criterion within a M step

object

Object of class smirt

...

Further arguments to be passed

Value

A list with following entries:
devianceDeviance
icInformation criteria
itemData frame with item parameters
personData frame with person parameters. It includes the person mean of all item responses (M; percentage correct of all non-missing items), the EAP estimate and its corresponding standard error for all dimensions (EAP and SE.EAP) and the maximum likelihood estimate as well as the mode of the posterior distribution (MLE and MAP).
EAP.relEAP reliability
mean.traitMeans of trait
sd.traitStandard deviations of trait
SigmaTrait covariance matrix
cor.traitTrait correlation matrix
bMatrix (vector) of $b$ parameters
se.bMatrix (vector) of standard errors $b$ parameters
aMatrix of $a$ parameters
se.aMatrix of standard errors of $a$ parameters
cVector of $c$ parameters
se.cVector of standard errors of $c$ parameters
dVector of $d$ parameters
se.dVector of standard errors of $d$ parameters
mu.iVector of $\mu_i$ parameters
se.mu.iVector of standard errors of $\mu_i$ parameters
f.yi.qkIndividual likelihood
f.qk.yiIndividual posterior
probsProbabilities of item response functions evaluated at theta.k
n.ikExpected counts
iterNumber of iterations
dat2Processed data set
dat2.respData set of response indicators
INumber of items
DNumber of dimensions
KMaximum item response score
theta.kUsed theta integration grid
pi.kDistribution function evaluated at theta.k
irtmodelUsed IRT model
QmatrixUsed Q-matrix

Details

The noncompensatory item response model (irtmodel="noncomp"; e.g. Bolt & Lall, 2003) is defined as $$P(X_{pi}=1 | \bold{\theta}_p ) = c_i + (d_i - c_i ) \prod_l invlogit( a_{il} q_{il} \theta_{pl} - b_{il} )$$ where $i$, $p$, $l$ denote items, persons and dimensions respectively. The compensatory item response model (irtmodel="comp") is defined by $$P(X_{pi}=1 | \bold{\theta}_p ) = c_i + (d_i - c_i ) invlogit( \sum_l a_{il} q_{il} \theta_{pl} - b_{i} )$$ Using a design matrix theta.kDES the model can be made even more general in a model which is linear in item parameters $$P(X_{pi}=1 | \bold{\theta}_p ) = c_i + (d_i - c_i ) invlogit( \sum_l a_{il} q_{il} t_l ( \bold{ \theta_{p} } ) - b_{i} )$$ with known functions $t_l$ of the trait vector $\bold{\theta}_p$. Fixed values of the functions $t_l$ are specified in the $\bold{\theta}_p$ design matrix theta.kDES. The partially compensatory item response model (irtmodel="partcomp") is defined by $$P(X_{pi}=1 | \bold{\theta}_p ) = c_i + (d_i - c_i ) \frac{ \exp \left( \sum_l ( a_{il} q_{il} \theta_{pl} - b_{il} ) \right) } { \mu_i \prod_l ( 1 + \exp ( a_{il} q_{il} \theta_{pl} - b_{il} ) ) + ( 1- \mu_i) ( 1 + \exp \left( \sum_l ( a_{il} q_{il} \theta_{pl} - b_{il} ) \right) ) }$$ with item parameters $\mu_i$ indicating the degree of compensatory. $\mu_i=1$ indicates a noncompensatory model while $\mu_i=0$ indicates a (fully) compensatory model. The models are estimated by an EM algorithm employing marginal maximum likelihood.

References

Bolt, D. M., & Lall, V. F. (2003). Estimation of compensatory and noncompensatory multidimensional item response models using Markov chain Monte Carlo. Applied Psychological Measurement, 27, 395-414. Pan, J., & Thompson, R. (2007). Quasi-Monte Carlo estimation in generalized linear mixed models. Computational Statistics & Data Analysis, 51, 5765-5775. Reckase, M.D. (2009). Multidimensional Item Response Theory. New York: Springer. Spray, J. A., Davey, T. C., Reckase, M. D., Ackerman, T. A., & Carlson, J. E. (1990). Comparison of two logistic multidimensional item response theory models. ACT Research Report No. ACT-RR-ONR-90-8.

Examples

Run this code

#############################################################################
## SIMULATED EXAMPLE 1: Noncompensatory and compensatory IRT models
#############################################################################
set.seed(997)

# (1) simulate data from a two-dimensional noncompensatory
#     item response model
#   -> increase number of iterations in all models!

N <- 1000	# number of persons
I <- 10		# number of items
theta0 <- rnorm( N , sd= 1 )
theta1 <- theta0 + rnorm(N , sd = .7 )
theta2 <- theta0 + rnorm(N , sd = .7 )
Q <- matrix( 1 , nrow=I,ncol=2 )
Q[ 1:(I/2) , 2 ] <- 0
Q[ I,1] <- 0
b <- matrix( rnorm( I*2 ) , I , 2 )
a <- matrix( 1 , I  , 2 )

# simulate data                                     
prob <- dat <- matrix(0 , nrow=N , ncol=I )
for (ii in 1:I){
prob[,ii] <- ( plogis( theta1 - b[ii,1]  ) )^Q[ii,1]
prob[,ii] <- prob[,ii] * ( plogis( theta2 - b[ii,2]  ) )^Q[ii,2]
            }
dat[ prob > matrix(runif( N*I),N,I) ] <- 1
colnames(dat) <- paste0("I",1:I)

#***
# Model 1: Noncompensatory 1PL model
mod1 <- smirt(dat, Qmatrix=Q , maxiter=10 ) # change number of iterations
summary(mod1)

#***
# Model 2: Noncompensatory 2PL model
mod2 <- smirt(dat,Qmatrix=Q , est.a="2PL" , maxiter=15 )
summary(mod2)

# Model 2a: avoid convergence problems with increment.factor
mod2a <- smirt(dat,Qmatrix=Q , est.a="2PL" , maxiter=30 , increment.factor=1.03)
summary(mod2a)

#***
# Model 3: some fixed c and d parameters different from zero or one
c.init <- rep(0,I)
c.init[ c(3,7)] <- .2
d.init <- rep(1,I)
d.init[c(4,8)] <- .95
mod3 <- smirt( dat , Qmatrix=Q , c.init=c.init , d.init=d.init )
summary(mod3)

#***
# Model 4: some estimated c and d parameters (in parameter groups)
est.c <- c.init <- rep(0,I)
c.estpars <- c(3,6,7)
c.init[ c.estpars ] <- .2
est.c[c.estpars] <- 1
est.d <- rep(0,I)
d.init <- rep(1,I)
d.estpars <- c(6,9)
d.init[ d.estpars ] <- .95
est.d[ d.estpars ] <- d.estpars   # different d parameters
mod4 <- smirt(dat,Qmatrix=Q , est.c=est.c , c.init=c.init ,
            est.d=est.d , d.init=d.init  )
summary(mod4)

#***
# Model 5: Unidimensional 1PL model
Qmatrix <- matrix( 1 , nrow=I , ncol=1 )
mod5 <- smirt( dat , Qmatrix=Qmatrix )
summary(mod5)

#***
# Model 6: Unidimensional 2PL model
mod6 <- smirt( dat , Qmatrix=Qmatrix , est.a="2PL" )
summary(mod6)

#***
# Model 7: Compensatory model with between item dimensionality
# Note that the data is simulated under the noncompensatory condition
# Therefore Model 7 should have a worse model fit than Model 1
Q1 <- Q
Q1[ 6:10 , 1] <- 0
mod7 <- smirt(dat,Qmatrix=Q1 , irtmodel="comp" , maxiter=30)
summary(mod7)

#***
# Model 8: Compensatory model with within item dimensionality
# 		assuming zero correlation between dimensions
variance.fixed <- as.matrix( cbind( 1,2,0) )
# set the covariance between the first and second dimension to zero
mod8 <- smirt(dat,Qmatrix=Q , irtmodel="comp" , variance.fixed=variance.fixed ,
            maxiter=30)
summary(mod8)

#***
# Model 8b: 2PL model with starting values for a and b parameters
b.init <- rep(0,10)  # set all item difficulties initially to zero
# b.init <- NULL
a.init <- Q       # initialize a.init with Q-matrix
# provide starting values for slopes of first three items on Dimension 1
a.init[1:3,1] <- c( .55 , .32 , 1.3)   

mod8b <- smirt(dat,Qmatrix=Q , irtmodel="comp" , variance.fixed=variance.fixed ,
              b.init=b.init , a.init=a.init , maxiter=20 , est.a="2PL" )              
summary(mod8b)

#***
# Model 9: Unidimensional model with quadratic item response functions
# define theta
theta.k <- seq( - 6 , 6 , len=15 )
theta.k <- as.matrix( theta.k , ncol=1 )
# define design matrix
theta.kDES <- cbind( theta.k[,1] , theta.k[,1]^2 )
# define Q-matrix
Qmatrix <- matrix( 0 , I , 2 )
Qmatrix[,1] <- 1
Qmatrix[ c(3,6,7) , 2 ] <- 1
colnames(Qmatrix) <- c("F1" , "F1sq" )
# estimate model
mod9 <- smirt(dat,Qmatrix=Qmatrix , maxiter=50 , irtmodel="comp" , 
           theta.k=theta.k , theta.kDES=theta.kDES , est.a="2PL" )           
summary(mod9)

#***
# Model 10: Two-dimensional item response model with latent interaction
#           between dimensions
theta.k <- seq( - 6 , 6 , len=15 )
theta.k <- expand.grid( theta.k , theta.k )	# expand theta to 2 dimensions
# define design matrix
theta.kDES <- cbind( theta.k , theta.k[,1]*theta.k[,2] )
# define Q-matrix
Qmatrix <- matrix( 0 , I , 3 )
Qmatrix[,1] <- 1
Qmatrix[ 6:10 , c(2,3) ] <- 1
colnames(Qmatrix) <- c("F1" , "F2" , "F1iF2" )
# estimate model
mod10 <- smirt(dat,Qmatrix=Qmatrix ,irtmodel="comp" , theta.k=theta.k , 
            theta.kDES= theta.kDES , est.a="2PL" )           
summary(mod10)

#****
# Model 11: Example Quasi Monte Carlo integration
Qmatrix <- matrix( 1 , I , 1 )
mod11 <- smirt( dat , irtmodel="comp" , Qmatrix=Qmatrix , qmcnodes=1000 )
summary(mod11)

#############################################################################
## EXAMPLE 2: Dataset Reading data.read
##            Multidimensional models for dichotomous data
#############################################################################

data(data.read)
dat <- data.read
I <- ncol(dat)    # number of items

#***
# Model 1: 3-dimensional 2PL model

# define Q-matrix
Qmatrix <- matrix(0,nrow=I,ncol=3)
Qmatrix[1:4,1] <- 1
Qmatrix[5:8,2] <- 1
Qmatrix[9:12,3] <- 1

# estimate model
mod1 <- smirt( dat , Qmatrix=Qmatrix , irtmodel="comp" , est.a="2PL" ,
            qmcnodes=1000 , maxiter=20)
summary(mod1)

#***
# Model 2: 3-dimensional Rasch model
mod2 <- smirt( dat , Qmatrix=Qmatrix , irtmodel="comp" , 
              qmcnodes=1000 , maxiter=20)
summary(mod2)

#***
# Model 3: 3-dimensional 2PL model with uncorrelated dimensions
# fix entries in variance matrix
variance.fixed <- cbind( c(1,1,2) , c(2,3,3) , 0 )
# set the following covariances to zero: cov[1,2]=cov[1,3]=cov[2,3]=0

# estimate model
mod3 <- smirt( dat , Qmatrix=Qmatrix , irtmodel="comp" , est.a="2PL" , 
             variance.fixed=variance.fixed , qmcnodes=1000 , maxiter=20)
summary(mod3)

#***
# Model 4: Bifactor model with one general factor (g) and
#          uncorrelated specific factors

# define a new Q-matrix
Qmatrix1 <- cbind( 1 , Qmatrix )
# uncorrelated factors
variance.fixed <- cbind( c(1,1,1,2,2,3) , c(2,3,4,3,4,4) , 0 )
# The first dimension refers to the general factors while the other
# dimensions refer to the specific factors.
# The specification means that: 
# Cov[1,2]=Cov[1,3]=Cov[1,4]=Cov[2,3]=Cov[2,4]=Cov[3,4]=0 

# estimate model
mod4 <- smirt( dat , Qmatrix=Qmatrix1 , irtmodel="comp" , est.a="2PL" , 
             variance.fixed=variance.fixed , qmcnodes=1000 , maxiter=20)
summary(mod4)

#############################################################################
## SIMULATED EXAMPLE 3: Partially compensatory model
#############################################################################

#**** simulate data
set.seed(7656)
I <- 10		 # number of items
N <- 2000        # number of subjects
Q <- matrix( 0 , 3*I,2)  # Q-matrix
Q[1:I,1] <- 1
Q[1:I + I ,2] <- 1
Q[1:I + 2*I ,1:2] <- 1
b <- matrix( runif( 3*I *2, -2 , 2 ) , nrow=3*I , 2 )
b <- b*Q
b <- round( b , 2 )
mui <- rep(0,3*I)
mui[ seq(2*I+1 , 3*I) ] <- 0.65
# generate data
dat <- matrix( NA , N , 3*I )
colnames(dat) <- paste0("It" , 1:(3*I) )
# simulate item responses
library(MASS)
theta <- MASS::mvrnorm(N , mu=c(0,0) , Sigma = matrix( c( 1.2 , .6,.6,1.6)  ,2 , 2 ) )
for (ii in 1:(3*I)){
    # define probability
    tmp1 <- exp( theta[,1] * Q[ii,1] - b[ii,1] +  theta[,2] * Q[ii,2] - b[ii,2] )
    # non-compensatory model
    nco1 <- ( 1 + exp( theta[,1] * Q[ii,1] - b[ii,1] ) ) * 
                  ( 1 + exp( theta[,2] * Q[ii,2] - b[ii,2] ) )
    co1 <- ( 1 + tmp1 )  
    p1 <- tmp1 / ( mui[ii] * nco1 + ( 1 - mui[ii] )*co1 )
    dat[,ii] <- 1 * ( runif(N) < p1 )
            }

#*** Model 1: Joint mu.i parameter for all items
est.mu.i <- rep(0,3*I)
est.mu.i[ seq(2*I+1,3*I)] <- 1
mod1 <- smirt( dat , Qmatrix = Q , irtmodel = "partcomp" , est.mu.i=est.mu.i)
summary(mod1)

#*** Model 2: Separate mu.i parameter for all items
est.mu.i[ seq(2*I+1,3*I)] <- 1:I
mod2 <- smirt( dat , Qmatrix = Q , irtmodel = "partcomp" , est.mu.i=est.mu.i)
summary(mod2)

Run the code above in your browser using DataLab