mirt (version 0.4.2)

mirt: Full-Information Item Factor Analysis (Multidimensional Item Response Theory)

Description

mirt fits an unconditional maximum likelihood factor analysis model to dichotomous and polytomous data under the item response theory paradigm. Fits univariate and multivariate Rasch, 1-4PL, graded, (generalized) partial credit, nominal, multiple choice, and partially compensatory models using the EM algorithm.

Usage

mirt(data, model, itemtype = NULL, guess = 0, upper = 1,
    SE = FALSE, SEtol = .001, pars = NULL, constrain =
    NULL, parprior = NULL, rotate = 'varimax', Target =
    NaN, prev.cor = NULL, quadpts = NULL, grsm.block =
    NULL, D = 1.702, verbose = FALSE, debug = FALSE,
    technical = list(), ...)

## S3 method for class 'ExploratoryClass': summary(object, rotate = '', Target = NULL, suppress = 0, digits = 3, verbose = TRUE, ...)

## S3 method for class 'ExploratoryClass': coef(object, rotate = '', Target = NULL, digits = 3, ...)

## S3 method for class 'ExploratoryClass': anova(object, object2)

## S3 method for class 'ExploratoryClass': fitted(object, digits = 3, ...)

## S3 method for class 'ExploratoryClass': plot(x, y, type = 'info', npts = 50, theta_angle = 45, rot = list(xaxis = -70, yaxis = 30, zaxis = 10), ...)

## S3 method for class 'ExploratoryClass': residuals(object, restype = 'LD', digits = 3, df.p = FALSE, printvalue = NULL, verbose = TRUE, ...)

Arguments

data
a matrix or data.frame that consists of numerically ordered data, with missing data coded as NA
model
an object returned from confmirt.model() declaring how the factor model is to be estimated, or a single numeric value indicating the number of exploratory factors to estimate. See confmirt.
itemtype
type of items to be modeled, declared as a vector for each item or a single value which will be repeated globally. The NULL default assumes that the items follow a graded or 2PL structure, however they may be changed to the following: 'Rasch', '1PL', '2PL', '3PL', '4PL', 'graded', 'grsm', 'gpcm', 'nominal', 'mcm', 'PC2PL', or 'PC3PL'
grsm.block
an optional numeric vector indicating where the blocking should occur when using the grsm; NA represents items that do not belong to the grsm block (other items that may be estimated in the test data). For example, to specify two blocks of 3 items use grsm.block = c(rep(1,3), rep(2,3))
SE
logical; estimate the standard errors? Calls the MHRM subroutine for a stochastic approximation of the information matrix
SEtol
tolerance value used to stop the MHRM estimation when SE = TRUE. Lower values will take longer but may be more stable for computing the information matrix
guess
fixed pseudo-guessing parameters. Can be entered as a single value to assign a global guessing parameter or may be entered as a numeric vector corresponding to each item
upper
fixed upper bound parameters for the 4-PL model. Can be entered as a single value to assign a global upper bound parameter or may be entered as a numeric vector corresponding to each item
prev.cor
an optional previously computed correlation matrix used to estimate starting values for the EM estimation. Default is NULL
rotate
type of rotation to perform after the initial orthogonal parameters have been extracted by using summary; default is 'varimax'. See below for a list of possible rotations. If rotate != '' is specified in summary then the default rotation declared in the object is ignored and the new rotation is used instead
D
a numeric value used to adjust the logistic metric to be more similar to a normal cumulative density curve. Default is 1.702
Target
a dummy variable matrix indicating a target rotation pattern
constrain
a list of user declared equality constraints. To see how the parameters are labeled, use pars = 'values' initially. To constrain parameters to be equal, create a list with separate concatenated vectors signifying which parameters to constrain together
parprior
a list of user declared prior parameter distributions. To see how the parameters are labeled, use pars = 'values' initially. Can define either normal (typically for slopes and intercepts) or beta (typically for guessing and upper bound parameters) priors
pars
a data.frame with the structure of how the starting values, parameter numbers, and estimation logical values are defined. The user may observe how the model defines the values by using pars = 'values', and this object can in turn be modified and input back into the function to use customized settings
quadpts
number of quadrature points per dimension
printvalue
a numeric value to be specified when using the restype = 'exp' option. Only prints patterns that have standardized residuals greater than abs(printvalue). The default (NULL) prints all response patterns
x
an object of class mirt to be plotted or printed
y
an unused variable to be ignored
object
a model estimated from mirt of class ExploratoryClass or ConfirmatoryClass
object2
a second model, estimated from any of the mirt package estimation methods, with more estimated parameters than object
suppress
a numeric value indicating which (possibly rotated) factor loadings should be suppressed. Typical values are around .3 in most statistical software. Default is 0 for no suppression
digits
number of significant digits to be rounded
type
type of plot to view; can be 'info' to show the test information function, 'infocontour' for the test information contours, or 'SE' for the test standard error function
theta_angle
numeric values ranging from 0 to 90 used in plot. If a vector is used then a bubble plot is created with the summed information across the angles specified (e.g., theta_angle = seq(0, 90, by=10))
npts
number of quadrature points to be used for plotting features. Larger values make plots look smoother
rot
allows rotation of the 3D graphics
restype
type of residuals to be displayed. Can be either 'LD' for a local dependence matrix (Chen & Thissen, 1997) or 'exp' for the expected values for the frequencies of every response pattern
df.p
logical; print the degrees of freedom and p-values?
verbose
logical; print observed log-likelihood value at each iteration?
debug
logical; turn on debugging features?
technical
a list containing lower level technical parameters for estimation
...
additional arguments to be passed

Confirmatory IRT

Specification of the confirmatory item factor analysis model follows many of the rules in the SEM framework for confirmatory factor analysis. The variances of the latent factors are automatically fixed to 1 to help facilitate model identification. All parameters may be fixed to constant values or set equal to other parameters using the appropriate declarations. If the model is confirmatory then the returned class will be 'ConfirmatoryClass'.

Exploratory IRT

When a number is specified as the second input to confmirt an exploratory IRT model is estimated, which can be viewed as a stochastic analogue of mirt, with much of the same behaviour and specifications. Rotation and target matrix options are used in this subroutine and are passed to the returned object for use in generic functions such as summary() and fscores. Again, factor means and variances are fixed to ensure proper identification. If the model is exploratory then the returned class will be 'ExploratoryClass'.

Estimation often begins by computing a matrix of quasi-tetrachoric correlations, potentially with Carroll's (1945) adjustment for chance responding. A MINRES factor analysis with nfact factors is then extracted and item parameters are estimated by $a_{ij} = f_{ij}/u_j$, where $f_{ij}$ is the factor loading for the jth item on the ith factor, and $u_j$ is the square root of the factor uniqueness, $\sqrt{1 - h_j^2}$. The initial intercept parameters are determined by calculating the inverse normal of the item facility (i.e., item easiness), $q_j$, to obtain $d_j = q_j / u_j$. A similar implementation is also used for obtaining initial values for polytomous items. Following these initial estimates the model is iterated using the EM estimation strategy with fixed quadrature points. Implicit equation accelerations described by Ramsay (1975) are also added to facilitate parameter convergence speed, and these are adjusted every third cycle.
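The start-value computation above can be sketched in base R for a single-factor model; the loadings f and facilities p below are made-up illustrative values, not output from mirt:

```r
# hypothetical single-factor loadings and item facilities (proportion correct)
f <- c(0.6, 0.7, 0.5)       # MINRES factor loadings f_j
p <- c(0.80, 0.50, 0.65)    # item facilities (item easiness)

u <- sqrt(1 - f^2)          # square roots of the factor uniquenesses
a <- f / u                  # starting slopes:     a_j = f_j / u_j
d <- qnorm(p) / u           # starting intercepts: d_j = qnorm(p_j) / u_j

round(cbind(a, d), 3)
```

Note that an item with facility .5 receives a starting intercept of exactly 0, since qnorm(.5) = 0.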

Convergence

Unrestricted full-information factor analysis is known to have problems with convergence, and some items may need to be constrained or removed entirely to allow for an acceptable solution. As a general rule dichotomous items with means greater than .95, or items that are only .05 greater than the guessing parameter, should be considered for removal from the analysis or treated with prior distributions. The same type of reasoning is applicable when including upper bound parameters as well. Also, increasing the number of quadrature points per dimension may help to stabilize the estimation process.
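A quick pre-estimation screen along these lines can be run in base R; resp is a small made-up 0/1 response matrix used only to illustrate the rule of thumb:

```r
# made-up dichotomous responses: i1 is nearly always correct, i2 is not
resp <- cbind(i1 = rep(c(1, 0), c(39, 1)),
              i2 = rep(c(1, 0), c(20, 20)))

p <- colMeans(resp, na.rm = TRUE)   # item means (facilities)
flag <- which(p > .95 | p < .05)    # candidates for removal or priors
names(flag)                         # → "i1"
```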

Details

mirt follows the item factor analysis strategy by marginal maximum likelihood estimation (MML) outlined in Bock and Aitkin (1981), Bock, Gibbons and Muraki (1988), and Muraki and Carlson (1995). Nested models may be compared via the approximate chi-squared difference test or by a reduction in AIC/BIC values (comparison via anova). The general equation used for multidimensional item response theory is a logistic form with a scaling correction of 1.702. This correction is applied to allow comparison to mainstream programs such as TESTFACT (2003) and POLYFACT.
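The role of the D = 1.702 scaling constant can be checked directly in base R: with this constant the logistic curve stays within about .01 of the normal ogive everywhere.

```r
theta    <- seq(-4, 4, by = 0.001)
logistic <- plogis(1.702 * theta)   # logistic with the D = 1.702 scaling
ogive    <- pnorm(theta)            # normal ogive
max(abs(logistic - ogive))          # maximum discrepancy, roughly 0.0095
```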

Factor scores are estimated assuming a normal prior distribution and can be appended to the input data matrix (full.data = TRUE) or displayed in a summary table for all the unique response patterns. summary and coef allow for all the rotations available from the GPArotation package (e.g., rotate = 'oblimin') as well as a 'promax' rotation.

Using plot will plot the test information function or the test standard errors for 1 and 2 dimensional solutions. To examine individual item plots use itemplot. Residuals are computed using the LD statistic (Chen & Thissen, 1997) in the lower diagonal of the matrix returned by residuals, and Cramer's V above the diagonal.

References

Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443-459.

Bock, R. D., Gibbons, R., & Muraki, E. (1988). Full-Information Item Factor Analysis. Applied Psychological Measurement, 12(3), 261-280.

Carroll, J. B. (1945). The effect of difficulty and chance success on correlations between items and between tests. Psychometrika, 26, 347-372.

Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29.

Muraki, E. & Carlson, E. B. (1995). Full-information factor analysis for polytomous item responses. Applied Psychological Measurement, 19, 73-90.

Ramsay, J. O. (1975). Solving implicit equations in psychometric data analysis. Psychometrika, 40(3), 337-360.

Wood, R., Wilson, D. T., Gibbons, R. D., Schilling, S. G., Muraki, E., & Bock, R. D. (2003). TESTFACT 4 for Windows: Test Scoring, Item Statistics, and Full-information Item Factor Analysis [Computer software]. Lincolnwood, IL: Scientific Software International.

See Also

expand.table, key2binary, confmirt, bfactor, multipleGroup, wald, itemplot, fscores

Examples

#load LSAT section 7 data and compute 1 and 2 factor models
data(LSAT7)
data <- expand.table(LSAT7)

(mod1 <- mirt(data, 1))
summary(mod1)
residuals(mod1)
plot(mod1) #test information function

#estimated 3PL model for item 5 only
(mod1.3PL <- mirt(data, 1, itemtype = c('2PL', '2PL', '2PL', '2PL', '3PL')))
coef(mod1.3PL)

(mod2 <- mirt(data, 2, SE = TRUE))
summary(mod2, rotate = 'oblimin')
coef(mod2)
residuals(mod2)
plot(mod2)

anova(mod1, mod2) #compare the two models
scores <- fscores(mod2) #save factor score table

#confirmatory
#confirmatory two-factor model; confmirt.model() reads the
#specification below interactively from the console
cmodel <- confmirt.model()
   F1 = 1,4,5
   F2 = 2,3


cmod <- mirt(data, cmodel)
coef(cmod)
anova(cmod, mod2)

###########
#data from the 'ltm' package in numeric format
pmod1 <- mirt(Science, 1)
plot(pmod1)
summary(pmod1)

#Constrain all slopes to be equal
#first obtain parameter index
values <- mirt(Science,1, pars = 'values')
values #note that slopes are numbered 1,5,9,13
(pmod1_equalslopes <- mirt(Science, 1, constrain = list(c(1,5,9,13))))
coef(pmod1_equalslopes)

pmod2 <- mirt(Science, 2)
summary(pmod2)
residuals(pmod2)
plot(pmod2, theta_angle = seq(0,90, by = 5)) #sum across angles of theta 1
itemplot(pmod2, 1)
anova(pmod1, pmod2)


###########
data(SAT12)
data <- key2binary(SAT12,
  key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))

mod1 <- mirt(data, 1)
mod2 <- mirt(data, 2, quadpts = 15)
mod3 <- mirt(data, 3, quadpts = 10)
anova(mod1,mod2)
anova(mod2, mod3) #negative AIC, 2 factors probably best

#with fixed guessing parameters
mod1g <- mirt(data, 1, guess = .1)
coef(mod1g)

#with estimated guessing and beta priors (for better stability)
itemtype <- rep('3PL', 32)
sv <- mirt(data, 1, itemtype, pars = 'values')
gindex <- sv$parnum[sv$name == 'g']
parprior <- list(c(gindex, 'beta', 10, 90))
mod1wg <- mirt(data, 1, itemtype, guess = .1, parprior=parprior, verbose=TRUE)
coef(mod1wg)
anova(mod1g, mod1wg)

###########
#graded rating scale example

#make some data
a <- matrix(rep(1/1.702, 10))
d <- matrix(c(1,0.5,-.5,-1), 10, 4, byrow = TRUE)
c <- seq(-1, 1, length.out=10)
data <- simdata(a, d + c, 2000, itemtype = rep('graded',10))

#use much better start values to save iterations
sv <- mirt(data, 1, itemtype = 'grsm', pars = 'values')
sv[,5] <- c(as.vector(t(cbind(a,d,c))),0,1)

mod1 <- mirt(data, 1)
mod2 <- mirt(data, 1, itemtype = 'grsm', verbose = TRUE, pars = sv)
coef(mod2)
anova(mod2, mod1) #not sig, mod2 should be preferred