rm.sdt: Hierarchical Rater Model Based on Signal Detection Theory (HRM-SDT)

Description

This function estimates a version of the hierarchical rater model (HRM) based on signal detection theory (HRM-SDT; DeCarlo, 2005; DeCarlo, Kim & Johnson, 2011).

Usage

rm.sdt(dat, pid, rater, Qmatrix = NULL, theta.k = seq(-9, 9, len = 30), 
    est.a.item = FALSE, est.c.rater = "n", est.d.rater = "n", est.mean=FALSE , 
    skillspace="normal" , tau.item.fixed = NULL , a.item.fixed = NULL , 
    d.min = 0.5, d.max = 100, d.start = 3, c.start=NULL, tau.start=NULL, sd.start=1, 
    d.prior = c(3,100), c.prior=c(3,100), tau.prior=c(0,1000), a.prior=c(1,100), 
    max.increment = 1, numdiff.parm = 0.00001, maxdevchange = 0.1, globconv = .001, 
    maxiter = 1000, msteps = 4, mstepconv = 0.001, fac_incr=.99, 
    PEM=FALSE, PEM_itermax=maxiter )
# S3 method for rm.sdt
summary(object, file=NULL, ...)    
# S3 method for rm.sdt
plot(x, ask=TRUE, ...)
# S3 method for rm.sdt
anova(object,...)
# S3 method for rm.sdt
logLik(object,...)
# S3 method for rm.sdt
IRT.factor.scores(object, type="EAP", ...)
# S3 method for rm.sdt
IRT.irfprob(object,...)
# S3 method for rm.sdt
IRT.likelihood(object,...)
# S3 method for rm.sdt
IRT.posterior(object,...)
# S3 method for rm.sdt
IRT.modelfit(object,...)
# S3 method for IRT.modelfit.rm.sdt
summary(object,...)

Arguments

dat

Original data frame. Ratings on variables must be in rows, i.e. every row corresponds to a person-rater combination.

pid

Person identifier.

rater

Rater identifier.

Qmatrix

An optional Q-matrix. If this matrix is not provided, then by default the ordinary scoring of categories (from 0 to the maximum score of $K$) is used.

theta.k

A grid of theta values for the ability distribution.

est.a.item

Should item parameters $a_i$ be estimated?

est.c.rater

Type of estimation for item-rater parameters $c_{ir}$ in the signal detection model. Options are 'n' (no estimation), 'e' (set all parameters equal to each other), 'i' (item-wise estimation), 'r' (rater-wise estimation) and 'a' (all parameters are estimated independently from each other).

est.d.rater

Type of estimation of $d$ parameters. Options are the same as in est.c.rater.

est.mean

Optional logical indicating whether the mean of the trait distribution should be estimated.

skillspace

Specified $\theta$ distribution type. It can be "normal" or "discrete". In the latter case, all probabilities of the distribution are separately estimated.

tau.item.fixed

Optional matrix with three columns specifying fixed $\tau$ parameters. The first two columns denote item and category indices, the third the fixed value. See Example 3.

a.item.fixed

Optional matrix with two columns specifying fixed $a$ parameters. First column: Item index. Second column: Fixed $a$ parameter.

d.min

Minimal $d$ parameter to be estimated

d.max

Maximal $d$ parameter to be estimated

d.start

Starting value(s) of $d$ parameters

c.start

Starting values of $c$ parameters

tau.start

Starting values of $\tau$ parameters

sd.start

Starting value for trait standard deviation

d.prior

Normal prior $N(M,S^2)$ for $d$ parameters

c.prior

Normal prior for $c$ parameters. The prior mean for parameter $c_{irk}$ is defined as $M \cdot ( k - 0.5) $ where $M$ is c.prior[1].

tau.prior

Normal prior for $\tau$ parameters

a.prior

Normal prior for $a$ parameters

max.increment

Maximum increment of item parameters during estimation

numdiff.parm

Numerical differentiation step width

maxdevchange

Maximum relative deviance change as a convergence criterion

globconv

Maximum parameter change

maxiter

Maximum number of iterations

msteps

Maximum number of iterations during an M step

mstepconv

Convergence criterion in an M step

fac_incr

Factor for decreasing the increments in every iteration. The factor should be smaller than 1; smaller values correspond to stronger decreases in parameter changes across iterations.

PEM

Logical indicating whether the P-EM acceleration should be applied (Berlinet & Roland, 2012).

PEM_itermax

Number of iterations in which the P-EM method should be applied.

object

Object of class rm.sdt

file

Optional file name in which summary should be written.

x

Object of class rm.sdt

ask

Optional logical indicating whether a new plot should be asked for.

type

Factor score estimation method. Currently, only type="EAP" is supported.

…

Further arguments to be passed
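
The layout expected for dat, pid and rater, and for the optional fixed-parameter matrices, can be illustrated with a small made-up data set. The following is only a sketch in base R: all values and the column names (idstud, k1, k2) are hypothetical, and no model is fitted.

```r
# Hypothetical toy data: 3 persons, each rated by 2 raters on 2 items (k1, k2).
# All values and names are made up for illustration.
ratings <- data.frame(
  idstud = rep(1:3, each = 2),      # person identifier -> argument 'pid'
  rater  = rep(c("R1", "R2"), 3),   # rater identifier  -> argument 'rater'
  k1     = c(0, 1, 2, 2, 1, 0),     # ratings on item 1
  k2     = c(1, 1, 3, 2, 0, 0)      # ratings on item 2
)

# 'dat' holds only the rating columns; every row is one person-rater combination
dat <- ratings[, c("k1", "k2")]
n_combinations <- nrow(unique(ratings[, c("idstud", "rater")]))

# Layout of the optional fixed-parameter matrices (values are arbitrary):
# tau.item.fixed: item index, category index, fixed value
tau.item.fixed <- cbind(1, 1:2, c(-0.5, 0.5))
# a.item.fixed: item index, fixed slope
a.item.fixed <- cbind(1, 1)
```

With such objects, the rating columns would be passed as dat, ratings$idstud as pid and ratings$rater as rater.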

Value

A list with the following entries:

deviance

Deviance

Information criteria and number of parameters

item

Data frame with item parameters. The columns N and M denote the number of observed ratings and the observed mean of all ratings, respectively. In addition to item parameters $\tau_{ik}$ and $a_i$, the mean for the latent response (latM) is computed as $E( \eta_i ) = \sum_p P( \theta_p ) \sum_k q_{ik} P( \eta_i = k | \theta_p ) $ which provides an item parameter at the original metric of ratings. The latent standard deviation (latSD) is computed in the same manner.

rater

Data frame with rater parameters. Transformed $c$ parameters (c_x.trans) are computed as $c_{irk} / ( d_{ir} )$.

person

Data frame with person parameters: EAP and corresponding standard errors

EAP.rel

EAP reliability

Mean of the trait distribution

sigma

Standard deviation of the trait distribution

tau.item

Item parameters $\tau_{ik}$

se.tau.item

Standard error of item parameters $\tau_{ik}$

a.item

Item slopes $a_i$

se.a.item

Standard error of item slopes $a_i$

c.rater

Rater parameters $c_{irk}$

se.c.rater

Standard error of rater severity parameter $c_{irk}$

d.rater

Rater slope parameter $d_{ir}$

se.d.rater

Standard error of rater slope parameter $d_{ir}$

f.yi.qk

Individual likelihood

f.qk.yi

Individual posterior distribution

probs

Item probabilities at grid theta.k. Note that these probabilities are calculated on the pseudo items $i \times r$, i.e. the interaction of item and rater.

prob.item

Probabilities $P( \eta_i = \eta | \theta )$ of latent item responses evaluated at theta grid $\theta_p$.

n.ik

Expected counts

pi.k

Estimated trait distribution $P(\theta_p)$.

maxK

Maximum number of categories

procdata

Processed data

iter

Number of iterations

…

Further values
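
The computation of the latent mean (latM) described for the item entry can be sketched in base R. The parameter values below are illustrative only, the grid and the normal weights are assumptions, and the variance-based formula used for latSD is one plausible reading of "in the same manner", not a confirmed description of the package internals.

```r
# Illustrative values only (not estimates from any data set)
theta.k <- seq(-6, 6, length.out = 21)      # theta grid
pi.k    <- dnorm(theta.k)
pi.k    <- pi.k / sum(pi.k)                 # discretized trait distribution P(theta_p)
q_ik    <- 0:3                              # ordinary scoring of categories 0..K
a_i     <- 1                                # item slope
tau_i   <- c(0, -0.5, 0.2, 1.0)             # tau_ik for k = 0..K (first one fixed to 0 here)

# P(eta_i = k | theta_p): rows are grid points, columns are categories
logits <- outer(theta.k, a_i * q_ik) -
  matrix(tau_i, nrow = length(theta.k), ncol = length(q_ik), byrow = TRUE)
prob.item <- exp(logits) / rowSums(exp(logits))

# latM = sum_p P(theta_p) * sum_k q_ik * P(eta_i = k | theta_p)
latM <- sum(pi.k * (prob.item %*% q_ik))

# Assumed analogue for the latent standard deviation
latSD <- sqrt(sum(pi.k * (prob.item %*% q_ik^2)) - latM^2)
```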

Details

The specification of the model follows DeCarlo et al. (2011). The second level models the ideal rating (latent response) $\eta = 0, \ldots, K$ of person $p$ on item $i$:

$$ P( \eta_{pi} = k | \theta_p ) \propto \exp( a_{i} q_{ik} \theta_p - \tau_{ik} ) $$

At the first level, the ratings $X_{pir}$ for person $p$ on item $i$ and rater $r$ are modelled as a signal detection model

$$ P( X_{pir} \le k | \eta_{pi} ) = G( c_{irk} - d_{ir} \eta_{pi} ) $$

where $G$ is the logistic distribution function and the categories are $k=1,\ldots , K+1$. Because $G(-x) = 1 - G(x)$, the item response model can be equivalently written as

$$ P( X_{pir} > k | \eta_{pi} ) = G( d_{ir} \eta_{pi} - c_{irk} ) $$
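
The two levels of the model can be traced numerically for a single person, item and rater. This is a minimal base R sketch with made-up parameter values; it assumes $K$ thresholds per item-rater pair placed at $d_{ir} (k - 0.5)$, which is an illustrative choice and not taken from the package.

```r
# Illustrative values only (not estimates from any data set)
K     <- 3                      # maximum score; latent categories eta = 0..K
theta <- 0.5                    # ability of one person
a_i   <- 1                      # item slope
q_ik  <- 0:K                    # ordinary scoring of categories
tau_i <- c(0, -0.5, 0.2, 1.0)   # tau_ik for k = 0..K (first one fixed to 0 here)

# Second level: P(eta = k | theta) proportional to exp(a_i * q_ik * theta - tau_ik)
p_eta <- exp(a_i * q_ik * theta - tau_i)
p_eta <- p_eta / sum(p_eta)

# First level: cumulative rating probabilities G(c_irk - d_ir * eta)
d_ir  <- 3
c_ir  <- d_ir * (1:K - 0.5)     # assumed thresholds lying between adjacent categories
cum_p <- sapply(0:K, function(eta) plogis(c_ir - d_ir * eta))  # K x (K+1) matrix

# Category probabilities of the observed rating (rows) given each eta (columns)
p_x_given_eta <- apply(cum_p, 2, function(cp) diff(c(0, cp, 1)))

# Marginal distribution of the observed rating for this person, item and rater
p_x <- as.vector(p_x_given_eta %*% p_eta)
```

A larger d_ir concentrates each column of p_x_given_eta on the category matching the latent response, which is why the examples use a very large d.start to approximate a model without rater effects.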

The thresholds $c_{irk}$ can be further restricted to $c_{irk} = c_{k}$ (est.c.rater='e'), $c_{irk} = c_{ik}$ (est.c.rater='i') or $c_{irk} = c_{rk}$ (est.c.rater='r'). The same holds for rater precision parameters $d_{ir}$.

References

Berlinet, A. F., & Roland, C. (2012). Acceleration of the EM algorithm: P-EM versus epsilon algorithm. Computational Statistics & Data Analysis, 56(12), 4122-4137.

DeCarlo, L. T. (2005). A model of rater behavior in essay grading based on signal detection theory. Journal of Educational Measurement, 42, 53-76.

DeCarlo, L. T. (2010). Studies of a latent-class signal-detection model for constructed response scoring II: Incomplete and hierarchical designs. ETS Research Report ETS RR-10-08. Princeton NJ: ETS.

DeCarlo, L. T., Kim, Y., & Johnson, M. S. (2011). A hierarchical rater model for constructed responses, with a signal detection rater model. Journal of Educational Measurement, 48, 333-356.

Examples

# NOT RUN {
#############################################################################
# EXAMPLE 1: Hierarchical rater model (HRM-SDT) data.ratings1
#############################################################################
data(data.ratings1)
dat <- data.ratings1

# }
# NOT RUN {
# Model 1: Partial Credit Model: no rater effects
mod1 <- sirt::rm.sdt( dat[ , paste0( "k",1:5) ] , rater=dat$rater , 
            pid=dat$idstud , est.c.rater="n", d.start=100,  est.d.rater="n" )
summary(mod1)
            
# Model 2: Generalized Partial Credit Model: no rater effects
mod2 <- sirt::rm.sdt( dat[ , paste0( "k",1:5) ] , rater=dat$rater , 
            pid=dat$idstud , est.c.rater="n" , est.d.rater="n" , 
            est.a.item =TRUE , d.start=100)
summary(mod2)
            
# Model 3: Equal effects in SDT
mod3 <- sirt::rm.sdt( dat[ , paste0( "k",1:5) ] , rater=dat$rater , 
            pid=dat$idstud , est.c.rater="e" , est.d.rater="e")
summary(mod3)

# Model 4: Rater effects in SDT
mod4 <- sirt::rm.sdt( dat[ , paste0( "k",1:5) ] , rater=dat$rater , 
            pid=dat$idstud , est.c.rater="r" , est.d.rater="r")
summary(mod4)

#############################################################################
# EXAMPLE 2: HRM-SDT data.ratings3
#############################################################################

data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814 , ]
psych::describe(dat)
            
# Model 1: item- and rater-specific effects
mod1 <- sirt::rm.sdt( dat[ , paste0( "crit",c(2:4)) ] , rater=dat$rater , 
            pid=dat$idstud , est.c.rater="a" , est.d.rater="a" )
summary(mod1)
plot(mod1)

# Model 2: Differing number of categories per variable
mod2 <- sirt::rm.sdt( dat[ , paste0( "crit",c(2:4,6)) ] , rater=dat$rater , 
            pid=dat$idstud , est.c.rater="a" , est.d.rater="a")
summary(mod2)
plot(mod2)

#############################################################################
# EXAMPLE 3: Hierarchical rater model with discrete skill spaces
#############################################################################

data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814 , ]
psych::describe(dat)

# Model 1: Discrete theta skill space with values of 0,1,2 and 3
mod1 <- sirt::rm.sdt( dat[ , paste0( "crit",c(2:4)) ] , theta.k = 0:3 , rater=dat$rater , 
            pid=dat$idstud , est.c.rater="a" , est.d.rater="a" , skillspace="discrete" )
summary(mod1)
plot(mod1)

# Model 2: Modelling of one item by using a discrete skill space and
#          fixed item parameters

# fixed tau and a parameters
tau.item.fixed <- cbind( 1, 1:3,  100*cumsum( c( 0.5, 1.5, 2.5)) )
a.item.fixed <- cbind( 1, 100 )
# fit HRM-SDT 
mod2 <- sirt::rm.sdt( dat[ , "crit2" , drop=FALSE] , theta.k = 0:3 , rater=dat$rater , 
            tau.item.fixed=tau.item.fixed ,a.item.fixed=a.item.fixed, pid=dat$idstud, 
            est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod2)            
plot(mod2)
# }