mirt: Full-Information Item Factor Analysis (Multidimensional Item Response Theory)

Description

mirt fits an unconditional maximum likelihood factor analysis model to any mixture of dichotomous and polytomous data under the item response theory paradigm using either Cai's (2010) Metropolis-Hastings Robbins-Monro (MHRM) algorithm or with an EM algorithm approach outlined by Bock and Aiken (1981) using rectangular or quasi-Monte Carlo integration grids. Models containing 'explanatory' person or item level predictors can only be included by using the mixedmirt function, though latent regression models can be fit using the formula input below. Tests that form a two-tier or bi-factor structure should be estimated with the bfactor function, which uses a dimension reduction EM algorithm for modeling item parcels. Multiple group analyses (useful for DIF and DTF testing) are also available using the multipleGroup function.

Usage

mirt(data, model, itemtype = NULL, guess = 0, upper = 1, SE = FALSE,
  covdata = NULL, formula = NULL, SE.type = "crossprod", method = "EM",
  optimizer = NULL, pars = NULL, constrain = NULL, parprior = NULL,
  calcNull = TRUE, draws = 5000, survey.weights = NULL, quadpts = NULL,
  TOL = NULL, gpcm_mats = list(), grsm.block = NULL, rsm.block = NULL,
  key = NULL, large = FALSE, GenRandomPars = FALSE,
  accelerate = "Ramsay", empiricalhist = FALSE, verbose = TRUE,
  solnp_args = list(), alabama_args = list(), spline_args = list(),
  control = list(), technical = list(), ...)

Arguments

data

a matrix or data.frame that consists of numerically ordered data, with missing data coded as NA

model

a string to be passed (or an object returned from) mirt.model, declaring how the IRT model is to be estimated (loadings, constraints, priors, etc). For exploratory IRT models, a single numeric value indicating the number of factors to extract is also supported

itemtype

type of items to be modeled, declared as a vector for each item or a single value which will be repeated globally. The NULL default assumes that the items follow a graded or 2PL structure, however they may be changed to the following: 'Rasch', '2PL', '3PL', '3PLu', '4PL', 'graded', 'grsm', 'grsmIRT', 'gpcm', 'rsm', 'nominal', 'ideal', 'PC2PL', 'PC3PL', '2PLNRM', '3PLNRM', '3PLuNRM', '4PLNRM', and 'spline', for the Rasch/partial credit, 2 parameter logistic, 3 parameter logistic (lower or upper asymptote estimated), 4 parameter logistic, graded response model, rating scale graded response model (in slope-intercept and classical form), generalized partial credit model (with optional scoring matricies from gpcm_mats), Rasch rating scale model, nominal model, ideal-point model, 2-3PL partially compensatory model, 2-4 parameter nested logistic models, and spline response models with the bs (default) and ns functions, respectively. User defined item classes can also be defined using the createItem function

guess

fixed pseudo-guessing parameters. Can be entered as a single value to assign a global guessing parameter or may be entered as a numeric vector corresponding to each item

upper

fixed upper bound parameters for 4-PL model. Can be entered as a single value to assign a global guessing parameter or may be entered as a numeric vector corresponding to each item

logical; estimate the standard errors by computing the parameter information matrix? See SE.type for the type of estimates available

covdata

a data.frame of data used for latent regression models

formula

an R formula (or list of formulas) indicating how the latent traits can be regressed using external covariates in covdata. If a named list of formulas is supplied (where the names correspond to the latent trait names in model) then specific regression effects can be estimated for each factor. Supplying a single formula will estimate the regression parameters for all latent traits by default

SE.type

type of estimation method to use for calculating the parameter information matrix for computing standard errors and wald tests. Can be 'Richardson', 'forward', or 'central' for the numerical Richardson, forward difference, and central difference evaluation of observed Hessian, 'Fisher' for the expected information, 'complete' for information based on the complete-data Hessian used in EM algorithm, 'SEM' for the supplemented EM (disables the accelerate option; EM only), 'crossprod' for standard error computations based on the variance of the Fisher scores, 'Louis' for Louis' (1982) computation of the observed information matrix, and 'sandwich' for the sandwich covariance estimate.

Note that for 'SEM' option increasing the number of iterations (NCYCLES and TOL, see below) will help to improve the accuracy, and will be run in parallel if a mirtCluster object has been defined. When method = 'BL' then the option 'numerical' is available to obtain the numerical estimate from a call to optim.

Other options include 'MHRM' and 'FMHRM' for stochastic approximations based on the Robbins-Monro filter or a fixed number of MHRM draws without the RM filter. These are the only options supported when method = 'MHRM'

method

a character object specifying the estimation algorithm to be used. The default is 'EM', for the standard EM algorithm with fixed quadrature, or 'QMCEM' for quasi-Monte Carlo EM estimation. The option 'MHRM' may also be passed to use the MH-RM algorithm, as well as 'BL' for the Bock and Lieberman approach (generally not recommended for longer tests).

The 'EM' is generally effective with 1-3 factors, but methods such as the 'QMCEM' or 'MHRM' should be used when the dimensions are 3 or more

optimizer

a character indicating which numerical optimizer to use. By default, the EM algorithm will use the 'BFGS' when there are no upper and lower bounds, and a penelized version of the BFGS algorithm when there are.

Another good option which supports bound constraints is the 'nlminb'. Other options include the Newton-Raphson ('NR'), which often will be more efficient than the 'BFGS' but not as stable for more complex IRT models (such as the nominal or nested logit models) and does not support upper and lower bound constraints. As well, the 'Nelder-Mead', 'SANN', and 'L-BFGS-B' estimators are also available, but their routine use generally is not required or recommended. The MH-RM algorithm uses the 'NR' by default, and currently cannot be changed.

Additionally, estimation subroutines from the Rsolnp and alabama packages are available by passing the arguments 'solnp' and 'alabama', respectively. This should be used in conjunction with the solnp_args and alabama_args specified below. If equality constraints were specified in the model definition only the parameter with the lowest parnum in the pars = 'values' data.frame is used in the estimation vector passed to the objective function, and group hyper-parameters are omitted. Equality an inequality functions should be of the form function(p, optim_args), where optim_args is a list of internally parameters that largely can be ignored when defining constraints (though use of browser() here may be helpful). Note: for the 'alabama' optimizer, the starting values should be adjusted such that all constraints are met prior to the first maximization-step. The 'solnp' optimizer is less sensative to this initial conditoin restriction, but it may also if the model is unstable early in the EM cycles

pars

a data.frame with the structure of how the starting values, parameter numbers, estimation logical values, etc, are defined. The user may observe how the model defines the values by using pars = 'values', and this object can in turn be modified and input back into the estimation with pars = mymodifiedpars

constrain

a list of user declared equality constraints. To see how to define the parameters correctly use pars = 'values' initially to see how the parameters are labeled. To constrain parameters to be equal create a list with separate concatenated vectors signifying which parameters to constrain. For example, to set parameters 1 and 5 equal, and also set parameters 2, 6, and 10 equal use constrain = list(c(1,5), c(2,6,10)). Constraints can also be specified using the mirt.model syntax (recommended)

parprior

a list of user declared prior item probabilities. To see how to define the parameters correctly use pars = 'values' initially to see how the parameters are labeled. Can define either normal (e.g., intercepts, lower/guessing and upper bounds), log-normal (e.g., for univariate slopes), or beta prior probabilities. To specify a prior the form is c('priortype', ...), where normal priors are parprior = list(c(parnumbers, 'norm', mean, sd)), parprior = list(c(parnumbers, 'lnorm', log_mean, log_sd)) for log-normal, and parprior = list(c(parnumbers, 'beta', alpha, beta)) for beta, and parprior = list(c(parnumbers, 'expbeta', alpha, beta)) for the beta distribution after applying the function plogis to the input value (note, this is specifically for applying a beta prior to the lower/upper-bound parameters in 3/4PL models). Priors can also be specified using mirt.model syntax (recommended)

calcNull

logical; calculate the Null model for additional fit statistics (e.g., TLI)? Only applicable if the data contains no NA's and the data is not overly sparse, otherwise it is ignored

draws

the number of Monte Carlo draws to estimate the log-likelihood for the MH-RM algorithm. Default is 5000

survey.weights

a optional numeric vector of survey weights to apply for each case in the data (EM estimation only). If not specified, all cases are weighted equally (the standard IRT approach). The sum of the survey.weights must equal the total sample size for proper weighting to be applied

quadpts

number of quadrature points per dimension (must be larger than 2). By default the number of quadrature uses the following scheme: switch(as.character(nfact), '1'=61, '2'=31, '3'=15, '4'=9, '5'=7, 3). However, if the method input is set to 'QMCEM' and this argument is left blank then the default number of quasi-Monte Carlo integration nodes will be set to 5000 in total

TOL

convergence threshold for EM or MH-RM; defaults are .0001 and .001. If SE.type = 'SEM' and this value is not specified, the default is set to 1e-5. If empiricalhist = TRUE and TOL is not specified then the default 3e-5 will be used. To evaluate the model using only the starting values pass TOL = NaN, and to evalute the starting values without the log-likelihood pass TOL = NA

gpcm_mats

a list of matrices specifying how the scoring coefficients in the (generalized) partial credit model should be constructed. If omitted, the standard gpcm format will be used (i.e., seq(0, k, by = 1) for each trait). This input should be used if traits should be scored different for each category (e.g., matrix(c(0:3, 1,0,0,0), 4, 2) for a two-dimensional model where the first trait is scored like a gpcm, but the second trait is only positively indicated when the first category is selected). Can be used when itemtypes are 'gpcm' or 'Rasch', but only when the respective element in gpcm_mats is not NULL

grsm.block

an optional numeric vector indicating where the blocking should occur when using the grsm, NA represents items that do not belong to the grsm block (other items that may be estimated in the test data). For example, to specify two blocks of 3 with a 2PL item for the last item: grsm.block = c(rep(1,3), rep(2,3), NA). If NULL the all items are assumed to be within the same group and therefore have the same number of item categories

rsm.block

same as grsm.block, but for 'rsm' blocks

key

a numeric vector of the response scoring key. Required when using nested logit item types, and must be the same length as the number of items used. Items that are not nested logit will ignore this vector, so use NA in item locations that are not applicable

large

either a logical, indicating whether the internal collapsed data should be returned, or a list of internally computed data tables. If TRUE is passed, a list containing the organized tables is returned. This list object can then be passed back into large to avoid reorganizing the data again (useful when the dataset are very large and computing the tabulated data is computationally burdensome).

The best strategy for large data is to always pass the internal data to the estimation function, shown below:

Compute organized data: e.g., internaldat <- mirt(Science, 1, large = TRUE)
Pass the organized data to all estimation functions: e.g., mod <- mirt(Science, 1, large = internaldat)

GenRandomPars

logical; generate random starting values prior to optimization instead of using the fixed internal starting values?

accelerate

a character vector indicating the type of acceleration to use. Default is 'Ramsay', but may also be 'squarem' for the SQUAREM procedure (specifically, the gSqS3 approach) described in Varadhan and Roldand (2008). To disable the acceleration, pass 'none'

empiricalhist

logical; estimate prior distribution using an empirical histogram approach. Only applicable for unidimensional models estimated with the EM algorithm. The number of cycles, TOL, and quadpts are adjusted accomodate for less precision during estimation (TOL = 3e-5, NCYCLES = 2000, quadpts = 199)

verbose

logical; print observed- (EM) or complete-data (MHRM) log-likelihood after each iteration cycle? Default is TRUE

solnp_args

a list of arguments to be passed to the solnp::solnp() function for equality constraints, inequality constraints, etc

alabama_args

a list of arguments to be passed to the alabama::constrOptim.nl() function for equality constraints, inequality constraints, etc

spline_args

a named list of lists containing information to be passed to the bs (default) and ns for each spline itemtype. Each element must refer to the name of the itemtype with the spline, while the internal list names refer to the arguments which are passed. For example, if item 2 were called 'read2', and item 5 were called 'read5', both of which were of itemtype 'spline' but item 5 should use the ns form, then a modified list for each input might be of the form:

spline_args = list(read2 = list(degree = 4), read5 = list(fun = 'ns', knots = c(-2, 2)))

This code input changes the bs() splines function to have a degree = 4 input, while the second element changes to the ns() function with knots set a c(-2, 2)

control

a list passed to the respective optimizers (i.e., optim(), nlminb(), etc)

technical

a list containing lower level technical parameters for estimation. May be:

NCYCLES: maximum number of EM or MH-RM cycles; defaults are 500 and 2000
MAXQUAD: maximum number of quadratures, which you can increase if you have more than 4GB or RAM on your PC; default 20000
theta_lim: range of integration grid for each dimension; default is c(-6, 6)
set.seed: seed number used during estimation. Default is 12345
SEtol: standard error tolerance criteria for the S-EM and MHRM computation of the information matrix. Default is 1e-3
symmetric_SEM: logical; force S-EM information matrix to be symmetric? Default is TRUE so that computation of standard errors are more stable. Setting this to FALSE can help to detect solutions that have not reached the ML estimate
SEM_window: ratio of values used to define the S-EM window based on the observed likelihood differences across EM iterations. The default is c(0, 1 - SEtol), which provides nearly the very full S-EM window (i.e., nearly all EM cycles used). To use the a smaller SEM window change the window to to something like c(.9, .999) to start at a point farther into the EM history
warn: logical; include warning messages during estimation? Default is TRUE
message: logical; include general messages during estimation? Default is TRUE
customK: a numeric vector used to explicitly declare the number of response categories for each item. This should only be used when constructing mirt model for reasons other than parameter estimation (such as to obtain factor scores), and requires that the input data all have 0 as the lowest category. The format is the same as the extract.mirt(mod, 'K') slot in all converged models
customPriorFun: a custom function used to determine the normalized density for integration in the EM algorithm. Must be of the form function(Theta, Etable){...}, and return a numeric vector with the same length as number of rows in Theta. The Etable input contains the aggregated table generated from the current E-step computations. For proper integration, the returned vector should sum to 1 (i.e., normalized). Note that if using the Etable it will be NULL on the first call, therefore the prior will have to deal with this issue accordingly
customTheta: a custom Theta grid, in matrix form, used for integration. If not defined, the grid is determined internally based on the number of quadpts
delta: the deviation term used in numerical estimates when computing the ACOV matrix with the 'forward' or 'central' numerical approaches. Default is 1e-5
parallel: logical; use the parallel cluster defined by mirtCluster? Default is TRUE
removeEmptyRows: logical; remove response vectors that only contain NA's? Default is FALSE
internal_constrains: logical; include the internal constrains when using certain IRT models (e.g., 'grsm' itemtype). Disable this if you want to use special optimizers such as the solnp. Default is TRUE
gain: a vector of two values specifying the numerator and exponent values for the RM gain function $(val1 / cycle)^val2$. Default is c(0.10, 0.75)
BURNIN: number of burn in cycles (stage 1) in MH-RM; default is 150
SEMCYCLES: number of SEM cycles (stage 2) in MH-RM; default is 100
MHDRAWS: number of Metropolis-Hasting draws to use in the MH-RM at each iteration; default is 5
MHcand: a vector of values used to tune the MH sampler. Larger values will cause the acceptance ratio to decrease. One value is required for each group in unconditional item factor analysis (mixedmirt() requires additional values for random effect). If null, these values are determined internally, attempting to tune the acceptance of the draws to be between .1 and .4
MHRM_SE_draws: number of fixed draws to use when SE=TRUE and SE.type = 'FMHRM' and the maximum number of draws when SE.type = 'MHRM'. Default is 2000

...

additional arguments to be passed

Value

function returns an object of class SingleGroupClass (SingleGroupClass-class)

Confirmatory and Exploratory IRT

Specification of the confirmatory item factor analysis model follows many of the rules in the structural equation modeling framework for confirmatory factor analysis. The variances of the latent factors are automatically fixed to 1 to help facilitate model identification. All parameters may be fixed to constant values or set equal to other parameters using the appropriate declarations. Confirmatory models may also contain 'explanatory' person or item level predictors, though including predictors is currently limited to the mixedmirt function.

When specifying a single number greater than 1 as the model input to mirt an exploratory IRT model will be estimated. Rotation and target matrix options are available if they are passed to generic functions such as summary-method and fscores. Factor means and variances are fixed to ensure proper identification.

If the model is an exploratory item factor analysis estimation will begin by computing a matrix of quasi-polychoric correlations. A factor analysis with nfact is then extracted and item parameters are estimated by $a_{ij} = f_{ij}/u_j$, where $f_{ij}$ is the factor loading for the jth item on the ith factor, and $u_j$ is the square root of the factor uniqueness, $\sqrt{1 - h_j^2}$. The initial intercept parameters are determined by calculating the inverse normal of the item facility (i.e., item easiness), $q_j$, to obtain $d_j = q_j / u_j$. A similar implementation is also used for obtaining initial values for polytomous items.

A note on upper and lower bound parameters

Internally the $g$ and $u$ parameters are transformed using a logit transformation ($log(x/(1-x))$), and can be reversed by using $1 / (1 + exp(-x))$ following convergence. This also applies when computing confidence intervals for these parameters, and is done so automatically if coef(mod, rawug = FALSE).

As such, when applying prior distributions to these parameters it is recommended to use a prior that ranges from negative infinity to positive infinity, such as the normally distributed prior via the 'norm' input (see mirt.model).

Convergence for quadrature methods

Unrestricted full-information factor analysis is known to have problems with convergence, and some items may need to be constrained or removed entirely to allow for an acceptable solution. As a general rule dichotomous items with means greater than .95, or items that are only .05 greater than the guessing parameter, should be considered for removal from the analysis or treated with prior parameter distributions. The same type of reasoning is applicable when including upper bound parameters as well. For polytomous items, if categories are rarely endorsed then this will cause similar issues. Also, increasing the number of quadrature points per dimension, or using the quasi-Monte Carlo integration method, may help to stabilize the estimation process in higher dimensions. Finally, solutions that are not well defined also will have difficulty converging, and can indicate that the model has been misspecified (e.g., extracting too many dimensions).

Convergence for MH-RM method

For the MH-RM algorithm, when the number of iterations grows very high (e.g., greater than 1500) or when Max Change = .2500 values are repeatedly printed to the console too often (indicating that the parameters were being constrained since they are naturally moving in steps greater than 0.25) then the model may either be ill defined or have a very flat likelihood surface, and genuine maximum-likelihood parameter estimates may be difficult to find. Standard errors are computed following the model convergence by passing SE = TRUE, to perform an addition MH-RM stage but treating the maximum-likelihood estimates as fixed points.

Additional helper functions

Additional functions are available in the package which can be useful pre- and post-estimation. These are:

mirt.model: Define the IRT model specification use special syntax. Useful for defining between and within group parameter constraints, prior parameter distributions, and specifying the slope coefficients for each factor
coef-method: Extract raw coefficients from the model, along with their standard errors and confidence intervals
summary-method: Extract standardized loadings from model. Accepts a rotate argument for exploratory item response model
anova-method: Compare nested models using likelihood ratio statistics as well as information criteria such as the AIC and BIC
residuals-method: Compute pairwise residuals between each item using methods such as the LD statistic (Chen & Thissen, 1997), as well as response pattern residuals
plot-method: Plot various types of test level plots including the test score and information functions and more
itemplot: Plot various types of item level plots, including the score, standard error, and information functions, and more
createItem: Create a customized itemtype that does not currently exist in the package
imputeMissing: Impute missing data given some computed Theta matrix
fscores: Find predicted scores for the latent traits using estimation methods such as EAP, MAP, ML, WLE, and EAPsum
wald: Compute Wald statistics follow the convergence of a model with a suitable information matrix
M2: Limited information goodness of fit test statistic based to determine how well the model fits the data
itemfit and personfit: Goodness of fit statistics at the item and person levels, such as the S-X2, infit, outfit, and more
boot.mirt: Compute estimated parameter confidence intervals via the bootstrap methods
mirtCluster: Define a cluster for the package functions to use for capitalizing on multi-core architecture to utilize available CPUs when possible. Will help to decrease estimation times for tasks that can be run in parallel

IRT Models

The parameter labels use the follow convention, here using two factors and $k$ as the number of categories.

Rasch

Only one intercept estimated, and the latent variance of $\theta$ is freely estimated. If the data have more than two categories then a partial credit model is used instead (see 'gpcm' below). $$P(x = 1|\theta, d) = \frac{1}{1 + exp(-(\theta + d))}$$

2-4PL

Depending on the model $u$ may be equal to 1 and $g$ may be equal to 0. $$P(x = 1|\theta, \psi) = g + \frac{(u - g)}{ 1 + exp(-(a_1 * \theta_1 + a_2 * \theta_2 + d))}$$

graded

The graded model consists of sequential 2PL models, and here $k$ is the predicted category. $$P(x = k | \theta, \psi) = P(x \ge k | \theta, \phi) - P(x \ge k + 1 | \theta, \phi)$$

grsm

A more constrained version of the graded model where graded spacing is equal accross item blocks and only adjusted by a single 'difficulty' parameter (c) while the latent variance of $\theta$ is freely estimated. Again, $$P(x = k | \theta, \psi) = P(x \ge k | \theta, \phi) - P(x \ge k + 1 | \theta, \phi)$$ but now $$P = \frac{1}{1 + exp(-(a_1 * \theta_1 + a_2 * \theta_2 + d_k + c))}$$

grsmIRT

Similar to the grsm item type, but using the IRT parameterization instead (see Muraki, 1990 for this exact form). This is restricted to unidimensional models only, whereas grsm may be used for unidimensional or multidimensional models and is more consistant with the form of other IRT models in mirt

gpcm/nominal

For the gpcm the $d$ values are treated as fixed and orderd values from 0:(k-1) (in the nominal model $d_0$ is also set to 0). Additionally, for identification in the nominal model $ak_0 = 0$, $ak_{(k-1)} = (k - 1)$. $$P(x = k | \theta, \psi) = \frac{exp(ak_{k-1} * (a_1 * \theta_1 + a_2 * \theta_2) + d_{k-1})} {\sum_1^k exp(ak_{k-1} * (a_1 * \theta_1 + a_2 * \theta_2) + d_{k-1})}$$

For partial credit model (when itemtype = 'Rasch'; unidimensional only) the above model is further constrained so that $ak = (0,1,\ldots, k-1)$, $a_1 = 1$, and the latent variance of $\theta_1$ is freely estimated. However, more specific scoring function may be included by passing a suitable list or matricies to the gpcm_mats input argument.

In the nominal model this parametrization helps to identify the empirical ordering of the categories by inspecting the $ak$ values. Larger values indicate that the item category is more positively related to the latent trait(s) being measured. For instance, if an item was truly ordinal (such as a Likert scale), and had 4 response categories, we would expect to see $ak_0 < ak_1 < ak_2 < ak_3$ following estimation. If on the other hand $ak_0 > ak_1$ then it would appear that the second category is less related to to the trait than the first, and therefore the second category should be understood as the 'lowest score'.

NOTE: The nominal model can become numerical unstable if poor choices for the high and low values are chosen, resulting in ak values greater than abs(10) or more. It is recommended to choose high and low anchors that cause the estimated parameters to fall between 0 and the number of categories - 1 either by theoretical means or by re-estimating the model with better values following convergence.

rsm

A more constrained version of the partial credit model where the spacing is equal across item blocks and only adjusted by a single 'difficulty' parameter (c). Note that this is analogous to the relationship between the graded model and the grsm (with an additional constraint regarding the fixed discrimination parameters; the discrimination constraint can, however, be relaxed by adjusting the starting values specifications manually and applying additional equality constraints).

ideal

The ideal point model has the form, with the upper bound constraint on $d$ set to 0: $$P(x = 1 | \theta, \psi) = exp(-0.5 * (a_1 * \theta_1 + a_2 * \theta_2 + d)^2)$$

partcomp

Partially compensatory models consist of the product of 2PL probability curves. $$P(x = 1 | \theta, \psi) = g + (1 - g) (\frac{1}{1 + exp(-(a_1 * \theta_1 + d_1))} * \frac{1}{1 + exp(-(a_2 * \theta_2 + d_2))})$$

2-4PLNRM

Nested logistic curves for modeling distractor items. Requires a scoring key. The model is broken into two components for the probability of endorsement. For successful endorsement the probability trace is the 1-4PL model, while for unsuccessful endorsement: $$P(x = 0 | \theta, \psi) = (1 - P_{1-4PL}(x = 1 | \theta, \psi)) * P_{nominal}(x = k | \theta, \psi)$$ which is the product of the compliment of the dichotomous trace line with the nominal response model. In the nominal model, the slope parameters defined above are constrained to be 1's, while the last value of the $ak$ is freely estimated.

spline

Spline response models attempt to model the response curves uses non-linear and potentially non-monotonic patterns. The form is $$P(x = 1|\theta, \eta) = \frac{1}{1 + exp(-(\eta_1 * X_1 + \eta_2 * X_2 + \cdots + \eta_n * X_n))}$$ where the $X_n$ are from the spline design matrix $X$ organized from the grid of $\theta$ values. B-splines with a natural or polynomial basis are supported, and the intercept input is set to TRUE by default.

HTML help files, exercises, and examples

To access examples, vignettes, and exercise files that have been generated with knitr please visit https://github.com/philchalmers/mirt/wiki.

References

Andrich, D. (1978). A rating scale formulation for ordered response categories. Psychometrika, 43, 561-573.

Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443-459.

Bock, R. D., Gibbons, R., & Muraki, E. (1988). Full-Information Item Factor Analysis. Applied Psychological Measurement, 12(3), 261-280.

Bock, R. D. & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179-197.

Cai, L. (2010a). High-Dimensional exploratory item factor analysis by a Metropolis-Hastings Robbins-Monro algorithm. Psychometrika, 75, 33-57.

Cai, L. (2010b). Metropolis-Hastings Robbins-Monro algorithm for confirmatory item factor analysis. Journal of Educational and Behavioral Statistics, 35, 307-335.

Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29.

Chalmers, R., P. & Flora, D. (2014). Maximum-likelihood Estimation of Noncompensatory IRT Models with the MH-RM Algorithm. Applied Psychological Measurement, 38, 339-358.

Chen, W. H. & Thissen, D. (1997). Local dependence indices for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265-289.

Lord, F. M. & Novick, M. R. (1968). Statistical theory of mental test scores. Addison-Wesley.

Ramsay, J. O. (1975). Solving implicit equations in psychometric data analysis. Psychometrika, 40, 337-360.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Danish Institute for Educational Research.

Maydeu-Olivares, A., Hernandez, A. & McDonald, R. P. (2006). A Multidimensional Ideal Point Item Response Theory Model for Binary Data. Multivariate Behavioral Research, 41, 445-471.

Muraki, E. (1990). Fitting a polytomous item response model to Likert-type data. Applied Psychological Measurement, 14, 59-71.

Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176.

Muraki, E. & Carlson, E. B. (1995). Full-information factor analysis for polytomous item responses. Applied Psychological Measurement, 19, 73-90.

Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monographs, 34.

Suh, Y. & Bolt, D. (2010). Nested logit models for multiple-choice item response data. Psychometrika, 75, 454-473.

Sympson, J. B. (1977). A model for testing with multidimensional items. Proceedings of the 1977 Computerized Adaptive Testing Conference.

Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47, 175-186.

Varadhan, R. & Roland, C. (2008). Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm. Scandinavian Journal of Statistics, 35, 335-353.

Wood, R., Wilson, D. T., Gibbons, R. D., Schilling, S. G., Muraki, E., & Bock, R. D. (2003). TESTFACT 4 for Windows: Test Scoring, Item Statistics, and Full-information Item Factor Analysis [Computer software]. Lincolnwood, IL: Scientific Software International.

Examples

Run this code

# NOT RUN {
# }
# NOT RUN {
#load LSAT section 7 data and compute 1 and 2 factor models
data <- expand.table(LSAT7)

(mod1 <- mirt(data, 1))
coef(mod1)
(mod2 <- mirt(data, 1, SE = TRUE)) #standard errors with crossprod method
(mod2 <- mirt(data, 1, SE = TRUE, SE.type = 'SEM')) #standard errors with SEM method
coef(mod2)
(mod3 <- mirt(data, 1, SE = TRUE, SE.type = 'Richardson')) #with numerical Richardson method
residuals(mod1)
plot(mod1) #test score function
plot(mod1, type = 'trace') #trace lines
plot(mod2, type = 'info') #test information
plot(mod2, MI=200) #expected total score with 95% confidence intervals

#estimated 3PL model for item 5 only
(mod1.3PL <- mirt(data, 1, itemtype = c('2PL', '2PL', '2PL', '2PL', '3PL')))
coef(mod1.3PL)

#internally g and u pars are stored as logits, so usually a good idea to include normal prior
#  to help stabilize the parameters. For a value around .182 use a mean
#  of -1.5 (since 1 / (1 + exp(-(-1.5))) == .182)
model <- 'F = 1-5
         PRIOR = (5, g, norm, -1.5, 3)'
mod1.3PL.norm <- mirt(data, model, itemtype = c('2PL', '2PL', '2PL', '2PL', '3PL'))
coef(mod1.3PL.norm)
#limited information fit statistics
M2(mod1.3PL.norm)

#unidimensional ideal point model
idealpt <- mirt(data, 1, itemtype = 'ideal')
plot(idealpt, type = 'trace', facet_items = TRUE)
plot(idealpt, type = 'trace', facet_items = FALSE)

#two factors (exploratory)
mod2 <- mirt(data, 2)
coef(mod2)
summary(mod2, rotate = 'oblimin') #oblimin rotation
residuals(mod2)
plot(mod2)
plot(mod2, rotate = 'oblimin')

anova(mod1, mod2) #compare the two models
scoresfull <- fscores(mod2) #factor scores for each response pattern
head(scoresfull)
scorestable <- fscores(mod2, full.scores = FALSE) #save factor score table
head(scorestable)

#confirmatory (as an example, model is not identified since you need 3 items per factor)
# Two ways to define a confirmatory model: with mirt.model, or with a string

# these model definitions are equivalent
cmodel <- mirt.model('
   F1 = 1,4,5
   F2 = 2,3')
cmodel2 <- 'F1 = 1,4,5
            F2 = 2,3'

cmod <- mirt(data, cmodel)
# cmod <- mirt(data, cmodel2) # same as above
coef(cmod)
anova(cmod, mod2)
#check if identified by computing information matrix
(cmod <- mirt(data, cmodel, SE = TRUE))

###########
#data from the 'ltm' package in numeric format
pmod1 <- mirt(Science, 1)
plot(pmod1)
summary(pmod1)

#Constrain all slopes to be equal with the constrain = list() input or mirt.model() syntax
#first obtain parameter index
values <- mirt(Science,1, pars = 'values')
values #note that slopes are numbered 1,5,9,13, or index with values$parnum[values$name == 'a1']
(pmod1_equalslopes <- mirt(Science, 1, constrain = list(c(1,5,9,13))))
coef(pmod1_equalslopes)

# using mirt.model syntax, constrain all item slopes to be equal
model <- 'F = 1-4
          CONSTRAIN = (1-4, a1)'
(pmod1_equalslopes <- mirt(Science, model))
coef(pmod1_equalslopes)

coef(pmod1_equalslopes)
anova(pmod1_equalslopes, pmod1) #significantly worse fit with almost all criteria

pmod2 <- mirt(Science, 2)
summary(pmod2)
plot(pmod2, rotate = 'oblimin')
itemplot(pmod2, 1, rotate = 'oblimin')
anova(pmod1, pmod2)

#unidimensional fit with a generalized partial credit and nominal model
(gpcmod <- mirt(Science, 1, 'gpcm'))
coef(gpcmod)

#for the nominal model the lowest and highest categories are assumed to be the
#  theoretically lowest and highest categories that related to the latent trait(s)
(nomod <- mirt(Science, 1, 'nominal'))
coef(nomod) #ordering of ak values suggest that the items are indeed ordinal
anova(gpcmod, nomod)
itemplot(nomod, 3)

## example applying survey weights.
# weight the first half of the cases to be more representative of population
survey.weights <- c(rep(2, nrow(Science)/2), rep(1, nrow(Science)/2))
survey.weights <- survey.weights/sum(survey.weights) * nrow(Science)
unweighted <- mirt(Science, 1)
weighted <- mirt(Science, 1, survey.weights=survey.weights)

###########
#empirical dimensionality testing that includes 'guessing'

data(SAT12)
data <- key2binary(SAT12,
  key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))

mod1 <- mirt(data, 1)
extract.mirt(mod1, 'time') #time elapsed for each estimation component

#optionally use Newton-Raphson for (generally) faster convergence in the M-step's
mod1 <- mirt(data, 1, optimizer = 'NR')
extract.mirt(mod1, 'time')

mod2 <- mirt(data, 2, optimizer = 'NR')
#difficulty converging with reduced quadpts, reduce TOL
mod3 <- mirt(data, 3, TOL = .001, optimizer = 'NR')
anova(mod1,mod2)
anova(mod2, mod3) #negative AIC, 2 factors probably best

#same as above, but using the QMCEM method for generally better accuracy in mod3
mod3 <- mirt(data, 3, method = 'QMCEM', TOL = .001, optimizer = 'NR')
anova(mod2, mod3)

#with fixed guessing parameters
mod1g <- mirt(data, 1, guess = .1)
coef(mod1g)

###########
#graded rating scale example

#make some data
set.seed(1234)
a <- matrix(rep(1, 10))
d <- matrix(c(1,0.5,-.5,-1), 10, 4, byrow = TRUE)
c <- seq(-1, 1, length.out=10)
data <- simdata(a, d + c, 2000, itemtype = rep('graded',10))

mod1 <- mirt(data, 1)
mod2 <- mirt(data, 1, itemtype = 'grsm')
coef(mod2)
anova(mod2, mod1) #not sig, mod2 should be preferred
itemplot(mod2, 1)
itemplot(mod2, 5)
itemplot(mod2, 10)

###########
# 2PL nominal response model example (Suh and Bolt, 2010)
data(SAT12)
SAT12[SAT12 == 8] <- NA #set 8 as a missing value
head(SAT12)

#correct answer key
key <- c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5)
scoredSAT12 <- key2binary(SAT12, key)
mod0 <- mirt(scoredSAT12, 1)

#for first 5 items use 2PLNRM and nominal
scoredSAT12[,1:5] <- as.matrix(SAT12[,1:5])
mod1 <- mirt(scoredSAT12, 1, c(rep('nominal',5),rep('2PL', 27)))
mod2 <- mirt(scoredSAT12, 1, c(rep('2PLNRM',5),rep('2PL', 27)), key=key)
coef(mod0)$Item.1
coef(mod1)$Item.1
coef(mod2)$Item.1
itemplot(mod0, 1)
itemplot(mod1, 1)
itemplot(mod2, 1)

#compare added information from distractors
Theta <- matrix(seq(-4,4,.01))
par(mfrow = c(2,3))
for(i in 1:5){
    info <- iteminfo(extract.item(mod0,i), Theta)
    info2 <- iteminfo(extract.item(mod2,i), Theta)
    plot(Theta, info2, type = 'l', main = paste('Information for item', i), ylab = 'Information')
    lines(Theta, info, col = 'red')
}
par(mfrow = c(1,1))

#test information
plot(Theta, testinfo(mod2, Theta), type = 'l', main = 'Test information', ylab = 'Information')
lines(Theta, testinfo(mod0, Theta), col = 'red')

###########
# using the MH-RM algorithm
data(LSAT7)
fulldata <- expand.table(LSAT7)
(mod1 <- mirt(fulldata, 1, method = 'MHRM'))

#Confirmatory models

#simulate data
a <- matrix(c(
1.5,NA,
0.5,NA,
1.0,NA,
1.0,0.5,
 NA,1.5,
 NA,0.5,
 NA,1.0,
 NA,1.0),ncol=2,byrow=TRUE)

d <- matrix(c(
-1.0,NA,NA,
-1.5,NA,NA,
 1.5,NA,NA,
 0.0,NA,NA,
3.0,2.0,-0.5,
2.5,1.0,-1,
2.0,0.0,NA,
1.0,NA,NA),ncol=3,byrow=TRUE)

sigma <- diag(2)
sigma[1,2] <- sigma[2,1] <- .4
items <- c(rep('dich',4), rep('graded',3), 'dich')
dataset <- simdata(a,d,2000,items,sigma)

#analyses
#CIFA for 2 factor crossed structure

model.1 <- '
  F1 = 1-4
  F2 = 4-8
  COV = F1*F2'

#compute model, and use parallel computation of the log-likelihood
mirtCluster()
mod1 <- mirt(dataset, model.1, method = 'MHRM')
coef(mod1)
summary(mod1)
residuals(mod1)

#####
#bifactor
model.3 <- '
  G = 1-8
  F1 = 1-4
  F2 = 5-8'

mod3 <- mirt(dataset,model.3, method = 'MHRM')
coef(mod3)
summary(mod3)
residuals(mod3)
anova(mod1,mod3)

#####
#polynomial/combinations
data(SAT12)
data <- key2binary(SAT12,
                  key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))

model.quad <- '
       F1 = 1-32
  (F1*F1) = 1-32'


model.combo <- '
       F1 = 1-16
       F2 = 17-32
  (F1*F2) = 1-8'

(mod.quad <- mirt(data, model.quad))
summary(mod.quad)
(mod.combo <- mirt(data, model.combo))
anova(mod.quad, mod.combo)

#non-linear item and test plots
plot(mod.quad)
plot(mod.combo, type = 'SE')
itemplot(mod.quad, 1, type = 'score')
itemplot(mod.combo, 2, type = 'score')
itemplot(mod.combo, 2, type = 'infocontour')

## empirical histogram examples (normal, skew and bimodality)
#make some data
set.seed(1234)
a <- matrix(rlnorm(50, .2, .2))
d <- matrix(rnorm(50))
ThetaNormal <- matrix(rnorm(2000))
ThetaBimodal <- scale(matrix(c(rnorm(1000, -2), rnorm(1000,2)))) #bimodal
ThetaSkew <- scale(matrix(rchisq(2000, 3))) #positive skew
datNormal <- simdata(a, d, 2000, itemtype = 'dich', Theta=ThetaNormal)
datBimodal <- simdata(a, d, 2000, itemtype = 'dich', Theta=ThetaBimodal)
datSkew <- simdata(a, d, 2000, itemtype = 'dich', Theta=ThetaSkew)

normal <- mirt(datNormal, 1, empiricalhist = TRUE)
plot(normal, type = 'empiricalhist')
histogram(ThetaNormal, breaks=30)

bimodal <- mirt(datBimodal, 1, empiricalhist = TRUE)
plot(bimodal, type = 'empiricalhist')
histogram(ThetaBimodal, breaks=30)

skew <- mirt(datSkew, 1, empiricalhist = TRUE)
plot(skew, type = 'empiricalhist')
histogram(ThetaSkew, breaks=30)

#####
# non-linear parameter constraints with Rsolnp package (alabama supported as well):
# Find Rasch model subject to the constraint that the intercepts sum to 0

dat <- expand.table(LSAT6)

#free latent mean and variance terms
model <- 'Theta = 1-5
          MEAN = Theta
          COV = Theta*Theta'

#view how vector of parameters is organized internally
sv <- mirt(dat, model, itemtype = 'Rasch', pars = 'values')
sv[sv$est, ]

#constraint: create function for solnp to compute constraint, and declare value in eqB
eqfun <- function(p, optim_args) sum(p[1:5]) #could use browser() here, if it helps
LB <- c(rep(-15, 6), 1e-4) # more reasonable lower bound for variance term

mod <- mirt(dat, model, sv=sv, itemtype = 'Rasch', optimizer = 'solnp',
   solnp_args=list(eqfun=eqfun, eqB=0, LB=LB))
print(mod)
coef(mod)
(ds <- sapply(coef(mod)[1:5], function(x) x[,'d']))
sum(ds)

# same likelihood location as: mirt(dat, 1, itemtype = 'Rasch')


#######
# latent regression Rasch model

#simulate data
set.seed(1234)
N <- 1000

# covariates
X1 <- rnorm(N); X2 <- rnorm(N)
covdata <- data.frame(X1, X2)
Theta <- matrix(0.5 * X1 + -1 * X2 + rnorm(N, sd = 0.5))

#items and response data
a <- matrix(1, 20); d <- matrix(rnorm(20))
dat <- simdata(a, d, 1000, itemtype = 'dich', Theta=Theta)

#unconditional Rasch model
mod0 <- mirt(dat, 1, 'Rasch')

#conditional model using X1 and X2 as predictors of Theta
mod1 <- mirt(dat, 1, 'Rasch', covdata=covdata, formula = ~ X1 + X2)
coef(mod1, simplify=TRUE)
anova(mod0, mod1)

#bootstrapped confidence intervals
boot.mirt(mod1, R=5)

#draw plausible values for secondary analyses
pv <- fscores(mod1, plausible.draws = 10)
pvmods <- lapply(pv, function(x, covdata) lm(x ~ covdata$X1 + covdata$X2),
                 covdata=covdata)
#population characteristics recovered well, and can be averaged over
so <- lapply(pvmods, summary)
so

# compute Rubin's multiple imputation average
par <- lapply(so, function(x) x$coefficients[, 'Estimate'])
SEpar <- lapply(so, function(x) x$coefficients[, 'Std. Error'])
averageMI(par, SEpar)

############
# Example using Guass-Hermite quadrature with custom input functions

library(fastGHQuad)
data(SAT12)
data <- key2binary(SAT12,
                   key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
GH <- gaussHermiteData(50)
Theta <- matrix(GH$x)

# This prior works for uni- and multi-dimensional models
prior <- function(Theta, Etable){
    P <- grid <- GH$w / sqrt(pi)
    if(ncol(Theta) > 1)
        for(i in 2:ncol(Theta))
            P <- expand.grid(P, grid)
     if(!is.vector(P)) P <- apply(P, 1, prod)
     P
}

GHmod1 <- mirt(data, 1, optimizer = 'NR',
              technical = list(customTheta = Theta, customPriorFun = prior))
coef(GHmod1, simplify=TRUE)

Theta2 <- as.matrix(expand.grid(Theta, Theta))
GHmod2 <- mirt(data, 2, optimizer = 'NR', TOL = .0002,
              technical = list(customTheta = Theta2, customPriorFun = prior))
summary(GHmod2, suppress=.2)

# }

Run the code above in your browser using DataLab