noharm.sirt: NOHARM Model in R

Description

The function is an R implementation of the normal ogive harmonic analysis robust method (the NOHARM model; McDonald, 1997). Exploratory and confirmatory multidimensional item response models for dichotomous data using the probit link function can be estimated. Lower asymptotes (guessing parameters) and upper asymptotes (one minus slipping parameters) can be provided as fixed values.

Usage

noharm.sirt(dat, pm=NULL, N=NULL, weights=NULL, Fval=NULL, Fpatt=NULL, Pval=NULL,
   Ppatt=NULL, Psival=NULL, Psipatt=NULL, dimensions=NULL, lower=0, upper=1, wgtm=NULL,
   pos.loading=FALSE, pos.variance=FALSE, pos.residcorr=FALSE, maxiter=1000, conv=1e-6,
   optimizer="nlminb", par_lower=NULL, reliability=FALSE, …)
# S3 method for noharm.sirt
summary(object, file=NULL, …)

Arguments

dat

Matrix of dichotomous item responses. This matrix may contain missing data (indicated by NA) but missingness is assumed to be missing completely at random (MCAR). Alternatively, a product-moment matrix pm can be used as input.

Optional product-moment matrix

Sample size if pm is provided

weights

Optional vector of student weights.

Fval

Initial or fixed values of the loading matrix $\bold{F}$.

Fpatt

Pattern matrix of the loading matrix $\bold{F}$. If elements should be estimated, then an entry of 1 must be included in the pattern matrix. Parameters which should be estimated with equality constraints must be indicated by same integers but values largers than 1.

Pval

Initial or fixed values for the covariance matrix $\bold{P}$.

Ppatt

Pattern matrix for the covariance matrix $\bold{P}$.

Psival

Initial or fixed values for the matrix of residual correlations $\bold{\Psi}$.

Psipatt

Pattern matrix for the matrix of residual correlations $\bold{\Psi}$.

dimensions

Number of dimensions if an exploratory factor analysis should be estimated.

lower

Fixed vector (or numeric) of lower asymptotes $c_i$.

upper

Fixed vector (or numeric) of upper asymptotes $d_i$.

wgtm

Matrix with positive entries which indicates by a positive entry which item pairs should be used for estimation.

pos.loading

An optional logical indicating whether all entries in the loading matrix $\bold{F}$ should be positive

pos.variance

An optional logical indicating whether all variances (i.e. diagonal entries in $\bold{P}$) should be positive

pos.residcorr

An optional logical indicating whether all entries in the matrix of residual correlations $\bold{\Psi}$ should be positive

par_lower

Optional vector of lower parameter bounds

maxiter

Maximum number of iterations

conv

Convergence criterion for parameters

optimizer

Optimization function to be used. Can be "nlminb" for stats::nlminb or "optim" for stats::optim.

reliability

Logical indicating whether reliability should be computed.

…

Further arguments to be passed.

object

Object of class noharm.sirt

file

String indicating a file name for summary.

Value

A list. The most important entries are

tanaka

Tanaka fit statistic

rmsr

RMSR fit statistic

N.itempair

Sample size per item pair

Product moment matrix

wgtm

Matrix of weights for each item pair

sumwgtm

Sum of lower triangle matrix wgtm

lower

Lower asymptotes

upper

Upper asymptotes

residuals

Residual matrix from approximation of the pm matrix

final.constants

Final constants

factor.cor

Covariance matrix

thresholds

Threshold parameters

uniquenesses

Uniquenesses

loadings

Matrix of standardized factor loadings (delta parametrization)

loadings.theta

Matrix of factor loadings $\bold{F}$ (theta parametrization)

residcorr

Matrix of residual correlations

Nobs

Number of observations

Nitems

Number of items

Fpatt

Pattern loading matrix for $\bold{F}$

Ppatt

Pattern loading matrix for $\bold{P}$

Psipatt

Pattern loading matrix for $\bold{\Psi}$

dat

Used dataset

dimensions

Number of dimensions

iter

Number of iterations

Nestpars

Number of estimated parameters

chisquare

Statistic $\chi^2$

Degrees of freedom

chisquare_df

Ratio $\chi^2 / df$

rmsea

RMSEA statistic

p.chisquare

Significance for $\chi^2$ statistic

omega.rel

Reliability of the sum score according to Green and Yang (2009)

Details

The NOHARM item response model follows the response equation $$P( X_{pi}=1 | \bold{\theta}_p )=c_i + ( d_i - c_i ) \Phi( f_{i0} + f_{i1} \theta_{p1} + ... + f_{iD} \theta_{pD} ) $$ for item responses $X_{pi}$ of person $p$ on item $i$, $\bold{F}=(f_{id})$ is a loading matrix and $\bold{P}$ the covariance matrix of $\bold{\theta}_p$. The lower asymptotes $c_i$ and upper asymptotes $d_i$ must be provided as fixed values. The response equation can be equivalently written by introducing a latent continuous item response $X_{pi}^\ast$ $$ X_{pi}^\ast=f_{i0} + f_{i1} \theta_{p1} + ... + f_{iD} \theta_{pD} + e_{pi} $$ with a standard normally distributed residual $e_{pi}$. These residuals have a correlation matrix $\bold{\Psi}$ with ones in the diagonal. In this R implementation of the NOHARM model, correlations between residuals are allowed.

The estimation relies on a Hermite series approximation of the normal ogive item response functions. In more detail, a series expansion $$\Phi(x)=b_0 + b_1 H_1(x) + b_2 H_2(x) + b_3 H_3(x)$$ is used (McDonald, 1982a). This enables to express cross products $p_{ij}=P(X_i=1, X_j=1)$ as a function of unknown model parameters $$\hat{p}_{ij}=b_{0i} b_{0j} + \sum_{m=1}^3 b_{mi} b_{mj} \left( \frac{\bold{f}_i \bold{P} \bold{f}_j }{\sqrt{ (1+\bold{f}_i \bold{P} \bold{f}_i) (1+\bold{f}_j \bold{P} \bold{f}_j)}} \right) ^m $$ where $b_{0i}=p_{i}=P(X_i=1)=c_i + (d_i - c_i) \Phi(\tau_i)$, $b_{1i}=(d_i-c_i)\phi(\tau_i)$, $b_{2i}=(d_i-c_i)\tau_i \phi(\tau_i) / \sqrt{2}$, and $b_{3i}=(d_i-c_i)(\tau_i^2 - 1)\phi(\tau_i) / \sqrt{6}$.

The least squares criterion $\sum_{i<j} ( p_{ij} - \hat{p}_{ij})^2$ is used for estimating unknown model parameters (McDonald, 1982a, 1982b, 1997).

For derivations of standard errors and fit statistics see Maydeu-Olivares (2001) and Swaminathan and Rogers (2016).

For the statistical properties of the NOHARM approach see Knol and Berger (1991), Finch (2011) or Svetina and Levy (2016).

References

Finch, H. (2011). Multidimensional item response theory parameter estimation with nonsimple structure items. Applied Psychological Measurement, 35(1), 67-82. 10.1177/0146621610367787

Fraser, C., & McDonald, R. P. (1988). NOHARM: Least squares item factor analysis. Multivariate Behavioral Research, 23, 267-269. 10.1207/s15327906mbr2302_9

Fraser, C., & McDonald, R. P. (2012). NOHARM 4 Manual. http://noharm.niagararesearch.ca/nh4man/nhman.html.

Knol, D. L., & Berger, M. P. (1991). Empirical comparison between factor analysis and multidimensional item response models. Multivariate Behavioral Research, 26(3), 457-477. 10.1207/s15327906mbr2603_5

Maydeu-Olivares, A. (2001). Multidimensional item response theory modeling of binary data: Large sample properties of NOHARM estimates. Journal of Educational and Behavioral Statistics, 26(1), 51-71. 10.3102/10769986026001051

McDonald, R. P. (1982a). Linear versus nonlinear models in item response theory. Applied Psychological Measurement, 6(4), 379-396. 10.1177/014662168200600402

McDonald, R. P. (1982b). Unidimensional and multidimensional models for item response theory. I.R.T., C.A.T. conference, Minneapolis, 1982, Proceedings.

McDonald, R. P. (1997). Normal-ogive multidimensional model. In W. van der Linden & R. K. Hambleton (1997): Handbook of modern item response theory (pp. 257-269). New York: Springer. 10.1007/978-1-4757-2691-6

Svetina, D., & Levy, R. (2016). Dimensionality in compensatory MIRT when complex structure exists: Evaluation of DETECT and NOHARM. The Journal of Experimental Education, 84(2), 398-420. 10.1080/00220973.2015.1048845

Swaminathan, H., & Rogers, H. J. (2016). Normal-ogive multidimensional models. In W. J. van der Linden (Ed.). Handbook of item response theory. Volume One: Models (pp. 167-187). Boca Raton: CRC Press. 10.1201/9781315374512

Examples

Run this code

# NOT RUN {
#############################################################################
# EXAMPLE 1: Two-dimensional IRT model with 10 items
#############################################################################

#**** data simulation
set.seed(9776)
N <- 3400 # sample size
# define difficulties
f0 <- c( .5, .25, -.25, -.5, 0, -.5, -.25, .25, .5, 0 )
I <- length(f0)
# define loadings
f1 <- matrix( 0, I, 2 )
f1[ 1:5,1] <- c(.8,.7,.6,.5, .5)
f1[ 6:10,2] <- c(.8,.7,.6,.5, .5 )
# covariance matrix
Pval <- matrix( c(1,.5,.5,1), 2, 2 )
# simulate theta
library(mvtnorm)
theta <- mvtnorm::rmvnorm(N, mean=c(0,0), sigma=Pval )
# simulate item responses
dat <- matrix( NA, N, I )
for (ii in 1:I){ # ii <- 1
    dat[,ii] <- 1*( stats::pnorm(f0[ii]+theta[,1]*f1[ii,1]+theta[,2]*f1[ii,2])>
                     stats::runif(N) )
        }
colnames(dat) <- paste0("I", 1:I)

#**** Model 1: Two-dimensional CFA with estimated item loadings
# define pattern matrices
Pval <- .3+0*Pval
Ppatt <- 1*(Pval>0)
diag(Ppatt) <- 0
diag(Pval) <- 1
Fval <- .7 * ( f1>0)
Fpatt <- 1 * ( Fval > 0 )
# estimate model
mod1 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod1)
# EAP ability estimates
pmod1 <- sirt::R2noharm.EAP(mod1, theta.k=seq(-4,4,len=10) )
# model fit
summary( sirt::modelfit.sirt(mod1) )

# }
# NOT RUN {
#*** compare results with NOHARM software
noharm.path <- "c:/NOHARM"   # specify path for noharm software
mod1a <- sirt::R2noharm( dat=dat, model.type="CFA",  F.pattern=Fpatt, F.init=Fval,
             P.pattern=Ppatt, P.init=Pval, writename="r2noharm_example",
             noharm.path=noharm.path, dec="," )
summary(mod1a)

#**** Model 1c: put some equality constraints
Fpatt[ c(1,4),1] <- 3
Fpatt[ cbind( c(3,7), c(1,2)) ] <- 4
mod1c <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval)
summary(mod1c)

#**** Model 2: Two-dimensional CFA with correlated residuals
# define pattern matrix for residual correlation
Psipatt <- 0*diag(I)
Psipatt[1,2] <- 1
Psival <- 0*Psipatt
# estimate model
mod2 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
            Psival=Psival, Psipatt=Psipatt )
summary(mod2)

#**** Model 3: Two-dimensional Rasch model
# pattern matrices
Fval <- matrix(0,10,2)
Fval[1:5,1] <- Fval[6:10,2] <- 1
Fpatt <- 0*Fval
Ppatt <- Pval <- matrix(1,2,2)
Pval[1,2] <- Pval[2,1] <- 0
# estimate model
mod3 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod3)
# model fit
summary( sirt::modelfit.sirt( mod3 ))

#** compare fit with NOHARM
noharm.path <- "c:/NOHARM"
P.pattern <- Ppatt ; P.init <- Pval
F.pattern <- Fpatt ; F.init <- Fval
mod3b <- sirt::R2noharm( dat=dat, model.type="CFA",
             F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
             P.init=P.init, writename="example_sim_2dim_rasch",
             noharm.path=noharm.path, dec="," )
summary(mod3b)

#############################################################################
# EXAMPLE 2: data.read
#############################################################################

data(data.read)
dat <- data.read
I <- ncol(dat)

#**** Model 1: Unidimensional Rasch model
Fpatt <- matrix( 0, I, 1 )
Fval <- 1 + 0*Fpatt
Ppatt <- Pval <- matrix(1,1,1)
# estimate model
mod1 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod1)
plot(mod1)    # semPaths plot

#**** Model 2: Rasch model in which item pairs within a testlet are excluded
wgtm <- matrix( 1, I, I )
wgtm[1:4,1:4] <- wgtm[5:8,5:8] <- wgtm[ 9:12, 9:12] <- 0
# estimation
mod2 <- sirt::noharm.sirt(dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval, wgtm=wgtm)
summary(mod2)

#**** Model 3: Rasch model with correlated residuals
Psipatt <- Psival <- 0*diag(I)
Psipatt[1:4,1:4] <- Psipatt[5:8,5:8] <- Psipatt[ 9:12, 9:12] <- 1
diag(Psipatt) <- 0
Psival <- .6*(Psipatt>0)
# estimation
mod3 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
            Psival=Psival, Psipatt=Psipatt )
summary(mod3)
# allow only positive residual correlations
mod3b <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval,
            Psival=Psival, Psipatt=Psipatt, pos.residcorr=TRUE)
summary(mod3b)
#* constrain residual correlations
Psipatt[1:4,1:4] <- 2
Psipatt[5:8,5:8] <- 3
Psipatt[ 9:12, 9:12] <- 4
mod3c <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval,
            Psival=Psival, Psipatt=Psipatt, pos.residcorr=TRUE)
summary(mod3c)

#**** Model 4: Rasch testlet model
Fval <- Fpatt <- matrix( 0, I, 4 )
Fval[,1] <- Fval[1:4,2] <- Fval[5:8,3] <- Fval[9:12,4 ] <- 1
Ppatt <- Pval <- diag(4)
colnames(Ppatt) <- c("g", "A", "B","C")
Pval <- .5*Pval
# estimation
mod4 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval  )
summary(mod4)
# allow only positive variance entries
mod4b <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
               pos.variance=TRUE )
summary(mod4b)

#**** Model 5: Bifactor model
Fval <- matrix( 0, I, 4 )
Fval[,1] <- Fval[1:4,2] <- Fval[5:8,3] <- Fval[9:12,4 ] <- .6
Fpatt <- 1 * ( Fval > 0 )
Pval <- diag(4)
Ppatt <- 0*Pval
colnames(Ppatt) <- c("g", "A", "B","C")
# estimation
mod5 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval  )
summary(mod5)
# allow only positive loadings
mod5b <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
              pos.loading=TRUE )
summary(mod5b)
summary( sirt::modelfit.sirt(mod5b))

#**** Model 6: 3-dimensional Rasch model
Fval <- matrix( 0, I, 3 )
Fval[1:4,1] <- Fval[5:8,2] <- Fval[9:12,3 ] <- 1
Fpatt <- 0*Fval
Pval <- .6*diag(3)
diag(Pval) <- 1
Ppatt <- 1+0*Pval
colnames(Ppatt) <- c("A", "B","C")
# estimation
mod6 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval  )
summary(mod6)
summary( sirt::modelfit.sirt(mod6) )  # model fit

#**** Model 7: 3-dimensional 2PL model
Fval <- matrix( 0, I, 3 )
Fval[1:4,1] <- Fval[5:8,2] <- Fval[9:12,3 ] <- 1
Fpatt <- Fval
Pval <- .6*diag(3)
diag(Pval) <- 1
Ppatt <- 1+0*Pval
diag(Ppatt) <- 0
colnames(Ppatt) <- c("A", "B","C")
# estimation
mod7 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval  )
summary(mod7)
summary( sirt::modelfit.sirt(mod7) )

#**** Model 8: Exploratory factor analysis with 3 dimensions
# estimation
mod8 <- sirt::noharm.sirt( dat=dat, dimensions=3  )
summary(mod8)

#############################################################################
# EXAMPLE 3: Product-moment matrix input, McDonald (1997)
#############################################################################

# data from Table 1 of McDonald (1997, p. 266)
pm0 <- "
0.828
0.567 0.658
0.664 0.560 0.772
0.532 0.428 0.501 0.606
0.718 0.567 0.672 0.526 0.843
"
pm <- miceadds::string_to_matrix(x=pm0, as_numeric=TRUE, extend=TRUE)
I <- nrow(pm)
rownames(pm) <- colnames(pm) <- paste0("I", 1:I)

#- Model 1: Unidimensional model
Fval <- matrix(.7, nrow=I, ncol=1)
Fpatt <- 1+0*Fval
Pval <- matrix(1, nrow=1,ncol=1)
Ppatt <- 0*Pval

mod1 <- sirt::noharm.sirt(pm=pm, N=1000, Fval=Fval, Fpatt=Fpatt, Pval=Pval, Ppatt=Ppatt)
summary(mod1)

#- Model 2: Twodimensional exploratory model
mod2 <- sirt::noharm.sirt(pm=pm, N=1000, dimensions=2)
summary(mod2)

#- Model 3: Unidimensional model with correlated residuals
Psival <- matrix(0, nrow=I, ncol=I)
Psipatt <- 0*Psival
Psipatt[5,1] <- 1

mod3 <- sirt::noharm.sirt(pm=pm, N=1000, Fval=Fval, Fpatt=Fpatt, Pval=Pval, Ppatt=Ppatt,
            Psival=Psival, Psipatt=Psipatt)
summary(mod3)
# }

Run the code above in your browser using DataLab