LSMfit: Fitting Latent Space Item Response Models using Joint Maximum Likelihood Estimation

Description

This function fits a Latent Space Item Response Model (LSIRM) with an R dimensional latent space using penalized Joint Maximum Likelihood (pJML) or constrained Joint Maxmum Likelihood (cJML) to observed binary or ordinal item scores.

Usage

LSMfit(X, ndim_z, penalty=NULL, C=NULL, starts=NULL,
       tol=.1e-2, silent=FALSE)

Value

An object of class LSMfit with values

theta: $\theta_p$ estimates
b: $b_i$ estimates
z: $z_{pr}$ estimates
w: $z_{ir}$ estimates
logL: value of the loglikelihood at convergence
starts: the starting values used
as_starts: a list containing the parameter estimates, suitable to be used as argument for starts in a new run
internal: various matrices used internally

Arguments

X

A matrix of size N by n containing the binary or ordinal item scores, where N is the number of subjects and n is the number of items. The number of item score categories can be different across items as long as the lowest score is coded 0 for all items. NA's are allowed.

ndim_z

Number of dimensions of the latent space, R.

penalty

The weight for the L2 penalty of pJML. If both penalty and C (see below) are NULL (the default), a pJML is used with a weight of 1 (i.e., standard normal prior on all parameters ).

C

The maximum size of the norm of the item parameter vectors. The resulting maximum norm for the person parameter vectors is 1/2*C.

starts

Either a list containing starting values for the model parameters (see details) or a character string inidcating the method of starting value calculation. The options are:

"wls": the starting values are determined from fitting a R+1 dimensional item factor model to the polychoric correlation matrix of X using weighted least squares

"ml"

the starting values are determined from fitting a R+1 dimensional linear factor model to the polychoric correlation matrix of X using normal theory maximum likelihood (default)

"random"

starting values are drawn from normal distributions.

tol

Convergence criterion: Iterations stop if the difference in loglikelihoods between two subsequent iterations is smaller than this number. Default is .001.

silent

Logical. If FALSE, iterations details are printed to the screen during estimation.

Author

Dylan Molenaar d.molenaar@uva.nl

Details

LSMfit optimizes the joint likelihood function of the LSIRM described in Molenaar and Jeon (in press) using a variant of the alternating optimization algorithm by Chen et al. (2019) and using either a L2 regularization penalty similar to Bergner et al. (2022) or a constraint on the norms of the parameter vectors similar to Chen et al. (2019). For binary $X_{pi}$, the LISRM by Molenaar and Jeon is given by: $$logit(E(X_{pi})) = \theta_p + b_i - (\Sigma_{r=1}^{R} (z_{pr}-w_{ir})^2)^{1/2}$$

where $\theta_p$ is a person intercept, $b_i$ is an item intercept, $R$ denotes the dimension of the latent space, and $z_{pr}$ and $w_{ir}$ are respectively the person and item coordinates in the latent space. The matrix $\bold{W}$ containing the $w_{ir}$ parameters is constrained to an echelon structure (i.e., all elements from the upper triangle of submatrix $\bold{W}_{1:(R-1),1:(R-1)}$ are fixed to 0. Next, cJML estimation involves constraining the Euclidean norm of person parameter vector $\tau_{1p}=[\theta_p,z_{p1},z_{p2},...,z_{pK}]$ to be equal to C, and the Euclidean norm of the item parameter vector $\tau_{2i}=[b_i,w_{i1},w_{i2},...,w_{iK}]$ to be equal to 1/2*C. On the contrary, pJML estimation involves adding an L2 regularization penalty for all parameters to the joint likelihood function in such a way that the penalty parameter can be interpreted as the precision of a 0-centered normal prior on the parameters.

Using the starts argument, starting values can be provided in a list containing entries:

z0: a N by R matrix with starting values for $z_{pr}$
w0: a n by R matrix with starting values for $w_{ir}$
b0: a n by 1 matrix with starting values for $b_i$
theta0: a N by 1 matrix with starting values for $\theta_p$

Alternatively, starting values can be automatically determined by LSMfit. To this end, the following R+1 factor model wil be fit (omitting the item intercept):

$$g(E(X_{pi}))=\eta_{p0}+\Sigma_{r=1}^{R} \lambda_{ir} \eta_{pr}$$

where the n by R matrix of $\lambda_{ir}$ parameters follows the echelon structure above, and $g(.)$ is either the identity link or the probit link (see below). The model above is fit to the polychoric correlation matrix of $\bold{X}$ using either weighted least squares (WLS) estimation with a probit link for $g(.)$ or normal theory maximum likelihood (ML) estimation with an identity link for $g(.)$ (i.e., the polychoric correlation matrix is treated as a pmcc matrix). In both cases, the thresholds of the polychoric correlation matrix are taken as the basis for the starting values of $b_i$, the factor score estimates of $\eta_{p0}$ are taken as starts for $\theta_p$, the estimates of $\eta_{pr}$ are taken as the starts for $z_{pr}$, and the estimates of $\lambda_{ir}$ are taken as a basis for the starting values of $w_{ir}$. The WLS approach is statistically the most rigorous approach but can be time consuming, while the ML approach is an ad-hoc approach but which is fast and turns out to work well in practice. The ML approach is the default approach to obtain starting values if starts=NULL. Especially for models with R>2 fitting the factor model above may fail. In that case, LSMfit automatically switches to random starts.

Ordinal items are internally accomodated by dummy coding the items with more than 2 score levels into C-1 binary variables using a cummulatrive binary coding scheme (see Molenaar & Jeon, in press). Next, the dummy coded variables are submitted to the binary LSIRM above with the $w_{ir}$ parameters equated for dummy coded variables that correspond to the same original items. In the resuling model, the estimates of $b_i$ correspond to the category parameters of a sequential IRT model (Tutz, 1990) which are generally close to those of a graded response IRT model. The number of score levels can be different across items as long as the lowest score is coded 0 for all items

References

Bergner, Y., Halpin, P., & Vie, J. J. (2022). Multidimensional Item Response Theory in the Style of Collaborative Filtering. Psychometrika, 87(1), 266-288. https://doi.org/10.1007/s11336-021-09788-9

Chen, Y., Li, X., & Zhang, S. (2019). Joint maximum likelihood estimation for high-dimensional exploratory item factor analysis. Psychometrika, 84(1), 124-146. https://doi.org/10.1007/s11336-018-9646-5

Molenaar, D., & Jeon, M.J. (in press). Joint maximum likelihood estimation of latent space item response models. Psychometrika.

Tutz, G. (1990). Sequential item response models with an ordered response. British Journal of Mathematical and Statistical Psychology, 43(1), 39-55. https://doi.org/10.1111/j.2044-8317.1990.tb00925.x

Examples

Run this code

 #
 # only binary items
 #

 # data sim with 1000 subjects and 20 binary items
 # according to 2 dimensional latent space model (R=2)
 set.seed(1111)
 N=1000
 nit=20
 ndim_z=2
 dat_obj=LSMsim(N,nit,ndim_z)
 X=dat_obj$X
 zt=dat_obj$par$zt      # rotated true z, see ?LSMsim and ?LSMrotate
 wt=dat_obj$par$wt      # rotated true w

 #fit model
 results=LSMfit(X,2)

 #plot the parameter recovery results
 oldpar=par(mfrow=c(2,2))

 s_p=sign(cor(results$z,zt))          # to correct for sign switches in the plots
 s_i=sign(cor(results$w,wt))

 plot(s_p[1,1]*zt[,1],results$z[,1]); abline(0,1)
 plot(s_p[2,2]*zt[,2],results$z[,2]); abline(0,1)
 plot(s_i[1,1]*wt[,1],results$w[,1]); abline(0,1)
 plot(s_i[2,2]*wt[,2],results$w[,2]); abline(0,1)

 par(oldpar)

# \donttest{
 #
 # mixed scale items
 #

 # data sim with 1000 subjects and 20 mixed scale items
 # according to 2 dimensional latent space model (R=2)
 set.seed(1111)
 N=1000
 nit=20
 ndim_z=2
 nc=rpois(nit,2)+2   # number of response categories
                     # (between 2 and 7 for this seed)
 dat_obj=LSMsim(N,nit,ndim_z,nc=nc)
 X=dat_obj$X
 zt=dat_obj$par$zt      # rotated true z, see ?LSMsim and ?LSMrotate
 wt=dat_obj$par$wt      # rotated true w

 #fit model
 results=LSMfit(X,2)

 #plot the parameter recovery results
 oldpar=par(mfrow=c(2,2))

 s_p=sign(cor(results$z,zt))          # to correct for sign switches in the plots
 s_i=sign(cor(results$w,wt))

 plot(s_p[1,1]*zt[,1],results$z[,1]); abline(0,1)
 plot(s_p[2,2]*zt[,2],results$z[,2]); abline(0,1)
 plot(s_i[1,1]*wt[,1],results$w[,1]); abline(0,1)
 plot(s_i[2,2]*wt[,2],results$w[,2]); abline(0,1)

 par(oldpar)
 # }

Run the code above in your browser using DataLab