mirt: Full-Information Item Factor Analysis (Multidimensional Item Response Theory)

Description

mirt fits an unconditional maximum likelihood factor analysis model to dichotomous and polytomous data under the item response theory paradigm. It fits univariate and multivariate Rasch, 1-4PL, graded, (generalized) partial credit, nominal, and multiple choice models using the EM algorithm.

Usage

mirt(data, nfact, itemtype = NULL, guess = 0, upper = 1,
    SE = FALSE, startvalues = NULL, constrain = NULL,
    freepars = NULL, parprior = NULL, rotate = 'varimax',
    Target = NULL, prev.cor = NULL, quadpts = NULL,
    verbose = FALSE, debug = FALSE, technical = list(), ...)

## S3 method for class 'mirt':
summary(object, rotate = '', Target = NULL, suppress = 0,
    digits = 3, print = TRUE, ...)

## S3 method for class 'mirt':
coef(object, rotate = '', Target = NULL, allpars = FALSE,
    digits = 3, ...)

## S3 method for class 'mirt':
anova(object, object2, ...)

## S3 method for class 'mirt':
fitted(object, digits = 3, ...)

## S3 method for class 'mirt':
plot(x, type = 'info', npts = 50, theta_angle = 45,
    rot = list(x = -70, y = 30, z = 10), ...)

## S3 method for class 'mirt':
residuals(object, restype = 'LD', digits = 3,
    printvalue = NULL, ...)

Arguments

data

a matrix or data.frame that consists of numerically ordered data, with missing data coded as NA

nfact

number of factors to be extracted

itemtype

type of items to be modeled, declared as a vector for each item or a single value which will be repeated globally. The NULL default assumes that the items are ordinal or 2PL, however they may be changed to the following: 'Rasch', '1PL', '2PL', '3P

SE

logical; estimate the standard errors?

guess

fixed pseudo-guessing parameters. Can be entered as a single value to assign a global guessing parameter or may be entered as a numeric vector corresponding to each item

upper

fixed upper bound parameters for the 4PL model. Can be entered as a single value to assign a global upper bound parameter or may be entered as a numeric vector corresponding to each item

prev.cor

a previously computed correlation matrix to be used to estimate starting values for the EM estimation. Default is NULL

rotate

type of rotation to perform after the initial orthogonal parameters have been extracted by using summary; default is 'varimax'. See below for list of possible rotations. If rotate != '' in the summary

allpars

logical; print all the item parameters
  instead of just the slopes?

Target

a dummy variable matrix indicating a target
  rotation pattern

constrain

a list of user declared equality
  constraints. To see how to define the parameters
  correctly use constrain = 'index' initially to see
  how the parameters are labeled. To constrain parameters
  to be equal create a list with separate conca

parprior

a list of user declared prior item
  probabilities. To see how to define the parameters
  correctly use parprior = 'index' initially to see
  how the parameters are labeled. Can define either normal
  (normally for slopes and intercepts) or b

freepars

a list of user declared logical values
  indicating which parameters to estimate. To see how to
  define the parameters correctly use freepars =
  'index' initially to see how the parameters are labeled.
  These values may be modified and inp

startvalues

a list of user declared start values
  for parameters. To see how to define the parameters
  correctly use startvalues = 'index' initially to
  see what the defaults would normally be. These values may
  be modified and input back into the fu

quadpts

number of quadrature points per dimension

printvalue

a numeric value to be specified when
  using the restype = 'exp' option. Only prints patterns
  that have standardized residuals greater than
  abs(printvalue). The default (NULL) prints all
  response patterns

print

logical; print output to console?

x

an object of class mirt to be plotted or
  printed

object

a model estimated from mirt of class
  mirtClass

object2

a second model estimated from mirt
  of class mirtClass with more estimated parameters
  than object

suppress

a numeric value indicating which
  (possibly rotated) factor loadings should be suppressed.
  Typical values are around .3 in most statistical
  software. Default is 0 for no suppression

digits

number of significant digits to be rounded

type

type of plot to view; can be 'curve'
  for the total test score as a function of two dimensions,
  or 'info' to show the test information function
  for two dimensions

theta_angle

numeric values ranging from 0 to 90
  used in plot. If a vector is used then a bubble
  plot is created with the summed information across the
  angles specified (e.g., theta_angle = seq(0, 90,
  by=10))

npts

number of quadrature points to be used for
  plotting features. Larger values make plots look
  smoother

rot

allows rotation of the 3D graphics

restype

type of residuals to be displayed. Can be
  either 'LD' for a local dependence matrix (Chen &
  Thissen, 1997) or 'exp' for the expected values
  for the frequencies of every response pattern

verbose

logical; print observed log-likelihood
  value at each iteration?

debug

logical; turn on debugging features?

technical

a list containing lower level technical
  parameters for estimation

...

additional arguments to be passed
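
To make the suppress argument more concrete, the sketch below shows one plausible reading of it: loadings whose absolute value falls below the cutoff are blanked out. The loadings matrix and the helper suppress_loadings are hypothetical and written in base R only; this is not the code that summary runs internally.

```r
# Hypothetical rotated loadings for four items on two factors (made-up values).
loadings <- matrix(c(0.72,  0.10,
                     0.65, -0.25,
                     0.05,  0.81,
                     0.31,  0.44),
                   ncol = 2, byrow = TRUE,
                   dimnames = list(paste0("Item", 1:4), c("F1", "F2")))

# Illustrative helper (not part of mirt): replace loadings whose absolute
# value is below the cutoff with NA so they can be hidden when printed.
suppress_loadings <- function(L, suppress = 0) {
  L[abs(L) < suppress] <- NA
  L
}

suppressed <- suppress_loadings(loadings, suppress = 0.3)
print(round(suppressed, 3), na.print = "")
```

With suppress = 0 nothing is blanked, which matches the documented default of no suppression.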

Convergence

Unrestricted full-information factor analysis is known to
  have problems with convergence, and some items may need
  to be constrained or removed entirely to allow for an
  acceptable solution. As a general rule dichotomous items
  with means greater than .95, or items that are only .05
  greater than the guessing parameter, should be considered
  for removal from the analysis or treated with prior
  distributions. The same type of reasoning is applicable
  when including upper bound parameters as well. Also,
  increasing the number of quadrature points per dimension
  may help to stabilize the estimation process.
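
The screening rule above can be sketched in a few lines of base R. The helper flag_items, its argument names, and the toy response matrix are assumptions made for illustration; the cutoffs of .95 and .05 are the ones quoted in the text.

```r
# Illustrative helper (not part of mirt): flag dichotomous items that may
# destabilise estimation. `resp` is a 0/1 response matrix; `guess` is a
# scalar or a per-item vector of fixed guessing parameters.
flag_items <- function(resp, guess = 0, ceiling = 0.95, margin = 0.05) {
  p <- colMeans(resp, na.rm = TRUE)             # item means (proportion correct)
  guess <- rep(guess, length.out = ncol(resp))
  data.frame(item = colnames(resp),
             mean = p,
             too_easy = p > ceiling,            # mean greater than .95
             near_guess = p <= guess + margin,  # within .05 of the guessing value
             row.names = NULL)
}

# Made-up data: 100 respondents, three items with means .97, .60 and .23.
resp <- cbind(easy = rep(c(1, 0), times = c(97, 3)),
              ok   = rep(c(1, 0), times = c(60, 40)),
              hard = rep(c(1, 0), times = c(23, 77)))
flags <- flag_items(resp, guess = 0.2)
print(flags)
```

Flagged items are candidates for removal or for prior distributions, as described above; the sketch only identifies them and does not change the data.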

Details

mirt follows the item factor analysis strategy by
  marginal maximum likelihood estimation (MML) outlined in
  Bock and Aitkin (1981), Bock, Gibbons and Muraki (1988),
  and Muraki and Carlson (1995). Nested models may be
  compared via the approximate chi-squared difference test
  or by a reduction in AIC/BIC values (comparison via
  anova). The general equation used for
  multidimensional item response theory is a logistic form
  with a scaling correction of 1.702. This correction is
  applied to allow comparison to mainstream programs such
  as TESTFACT (Wood et al., 2003) and POLYFACT.
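
As a rough illustration of a logistic form with the 1.702 scaling correction, the sketch below evaluates a dichotomous response probability with slopes, an intercept, and the fixed guess and upper bounds. The function name p_correct and the exact slope-intercept parameterization are assumptions; the text above only states that a scaled logistic form is used.

```r
# Illustrative sketch (not mirt's internal code).
# theta: latent trait values; a: slopes; d: intercept;
# g: lower asymptote (guess); u: upper asymptote (upper); D: scaling constant.
p_correct <- function(theta, a, d, g = 0, u = 1, D = 1.702) {
  z <- D * (sum(a * theta) + d)   # scaled linear predictor
  g + (u - g) * plogis(z)         # logistic curve bounded by g and u
}

p_correct(theta = c(0, 0), a = c(1.2, 0.5), d = 0)            # 0.5
p_correct(theta = c(0, 0), a = c(1.2, 0.5), d = 0, g = 0.2)   # 0.6

# With D = 1.702 the logistic curve stays close to the normal ogive,
# which is why the correction makes results comparable across programs.
abs(p_correct(theta = 1, a = 1, d = 0) - pnorm(1))            # below 0.01
```

Setting D = 1 in this sketch gives the unscaled logistic metric, in which slopes are larger by a factor of about 1.702.
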

Estimation often begins by computing a matrix of
  quasi-tetrachoric correlations, potentially with
  Carroll's (1945) adjustment for chance responses. A MINRES
  factor analysis with nfact factors is then extracted and
  item parameters are estimated by $a_{ij} =
  f_{ij}/u_j$, where $f_{ij}$ is the factor loading for
  the jth item on the ith factor, and
  $u_j$ is the square root of the factor uniqueness,
  $\sqrt{1 - h_j^2}$. The initial intercept parameters
  are determined by calculating the inverse normal of the
  item facility (i.e., item easiness), $q_j$, to obtain
  $d_j = q_j / u_j$. A similar implementation is also
  used for obtaining initial values for polytomous
  items. Following these initial estimates the model is
  iterated using the EM estimation strategy with fixed
  quadrature points. Implicit equation accelerations
  described by Ramsay (1975) are also added to facilitate
  parameter convergence speed, and these are adjusted every
  third cycle.
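
The starting-value formulas above can be traced with a small base R sketch. The loadings and item facilities are made-up numbers, the variable names are hypothetical, and Carroll's adjustment and the MINRES extraction itself are omitted; only the conversion from loadings to slopes and intercepts is shown.

```r
# Hypothetical loadings for three items (rows) on two factors (columns).
f <- matrix(c(0.6, 0.2,
              0.5, 0.5,
              0.3, 0.7), ncol = 2, byrow = TRUE)
facility <- c(0.80, 0.50, 0.35)   # made-up proportions endorsing each item

h2 <- rowSums(f^2)                # communalities h_j^2
u  <- sqrt(1 - h2)                # u_j = sqrt(1 - h_j^2)
a0 <- f / u                       # starting slopes: row j is divided by u_j
q  <- qnorm(facility)             # inverse normal of the item facility
d0 <- q / u                       # starting intercepts d_j = q_j / u_j

round(cbind(a0, d = d0), 3)
```

An item endorsed by exactly half the sample gets a starting intercept of zero, and items with larger communalities get steeper starting slopes because u_j shrinks.
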

Factor scores are estimated with fscores assuming a normal
  prior distribution and can be appended to the input data
  matrix (full.data = TRUE) or displayed in a summary table
  for all the unique response patterns. summary and
  coef allow for all the rotations available from
  the GPArotation package (e.g., rotate =
  'oblimin') as well as a 'promax' rotation.
  Using plot will plot either the test surface
  function or the test information function for one- and
  two-dimensional solutions. To examine individual item plots
  use itemplot. The matrix returned by residuals contains
  the LD statistic (Chen & Thissen, 1997) below the
  diagonal and Cramer's V above the diagonal.
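
For readers unfamiliar with Cramer's V, the sketch below computes the textbook (unsigned) version for a pair of items from their cross-tabulation. The helper cramers_v and the item vectors are hypothetical, and residuals may compute or sign the value differently.

```r
# Illustrative helper (not part of mirt): Cramer's V for two categorical items.
cramers_v <- function(x, y) {
  tab  <- table(x, y)
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  n    <- sum(tab)
  as.numeric(sqrt(chi2 / (n * (min(dim(tab)) - 1))))
}

# Two made-up dichotomous items observed on 100 respondents.
item1 <- rep(c(0, 0, 1, 1), times = c(30, 10, 10, 50))
item2 <- rep(c(0, 1, 0, 1), times = c(30, 10, 10, 50))

cramers_v(item1, item2)   # moderate association
cramers_v(item1, item1)   # 1: perfect association
```

Values near zero indicate little association between the two items, while values near one indicate strong association.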

References

Bock, R. D., & Aitkin, M. (1981). Marginal maximum
  likelihood estimation of item parameters: Application of
  an EM algorithm. Psychometrika, 46(4), 443-459.

Bock, R. D., Gibbons, R., & Muraki, E. (1988).
  Full-Information Item Factor Analysis. Applied
  Psychological Measurement, 12(3), 261-280.

Carroll, J. B. (1945). The effect of difficulty and
  chance success on correlations between items or between
  tests. Psychometrika, 10, 1-19.

Chalmers, R. P. (2012). mirt: A Multidimensional Item
  Response Theory Package for the R Environment.
  Journal of Statistical Software, 48(6), 1-29.

Muraki, E., & Carlson, J. E. (1995). Full-information
  factor analysis for polytomous item responses.
  Applied Psychological Measurement, 19, 73-90.

Ramsay, J. O. (1975). Solving implicit equations in
  psychometric data analysis. Psychometrika, 40(3),
  337-360.

Wood, R., Wilson, D. T., Gibbons, R. D., Schilling, S.
  G., Muraki, E., & Bock, R. D. (2003). TESTFACT 4 for
  Windows: Test Scoring, Item Statistics, and
  Full-information Item Factor Analysis [Computer
  software]. Lincolnwood, IL: Scientific Software
  International.

See Also

expand.table, key2binary,
  polymirt, confmirt,
  bfactor, itemplot

Examples

#load LSAT section 7 data and compute 1 and 2 factor models
data(LSAT7)
data <- expand.table(LSAT7)

(mod1 <- mirt(data, 1))
summary(mod1)
residuals(mod1)
plot(mod1) #test information function

#estimated 3PL model for item 5 only
(mod1.3PL <- mirt(data, 1, itemtype = c('2PL', '2PL', '2PL', '2PL', '3PL')))
coef(mod1.3PL, allpars = TRUE)

(mod2 <- mirt(data, 2))
summary(mod2)
coef(mod2)
residuals(mod2)
plot(mod2)

anova(mod1, mod2) #compare the two models
scores <- fscores(mod2) #save factor score table

###########
#data from the 'ltm' package in numeric format
pmod1 <- mirt(Science, 1)
plot(pmod1)
summary(pmod1)

#Constrain all slopes to be equal
#first obtain parameter index
mirt(Science,1, constrain = 'index') #note that slopes are numbered 1,5,9,13
(pmod1_equalslopes <- mirt(Science, 1, constrain = list(c(1,5,9,13))))
coef(pmod1_equalslopes)

pmod2 <- mirt(Science, 2)
coef(pmod2)
residuals(pmod2)
plot(pmod2, theta_angle = seq(0,90, by = 5)) #sum across angles of theta 1
itemplot(pmod2, 1)
anova(pmod1, pmod2)


###########
data(SAT12)
data <- key2binary(SAT12,
  key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))

mod1 <- mirt(data, 1)
mod2 <- mirt(data, 2)
mod3 <- mirt(data, 3)
anova(mod1,mod2)
anova(mod2, mod3) #negative AIC, 2 factors probably best

#with fixed guessing parameters
mod1g <- mirt(data, 1, guess = .1)
coef(mod1g)
mod2g <- mirt(data, 2, guess = .1)
coef(mod2g)
anova(mod1g, mod2g)
summary(mod2g, rotate='promax')
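
The anova comparisons in these examples rest on a chi-squared difference test and on AIC/BIC. A minimal base R sketch of that arithmetic follows; the helper lr_compare, the log-likelihoods, and the parameter counts are all made up for illustration, and the sign convention used for the reported differences is an assumption that need not match what anova prints.

```r
# Illustrative helper (not part of mirt): compare two nested models from
# their log-likelihoods. Model 0 is the restricted model, model 1 the larger.
lr_compare <- function(logLik0, npar0, logLik1, npar1, n) {
  X2 <- 2 * (logLik1 - logLik0)          # chi-squared difference statistic
  df <- npar1 - npar0                    # extra parameters in the larger model
  ic <- function(ll, k) c(AIC = -2 * ll + 2 * k,
                          BIC = -2 * ll + k * log(n))
  list(X2 = X2, df = df,
       p = pchisq(X2, df, lower.tail = FALSE),
       # Restricted minus larger: positive values favour the larger model.
       reduction = ic(logLik0, npar0) - ic(logLik1, npar1))
}

# Made-up values, for illustration only.
res <- lr_compare(logLik0 = -2660, npar0 = 10,
                  logLik1 = -2652, npar1 = 14, n = 1000)
res$X2          # 16
res$df          # 4
res$reduction   # AIC reduction is 8; the BIC reduction is negative here
```

In this made-up case AIC favours the larger model while BIC, which penalises parameters more heavily at n = 1000, does not, so the two criteria need not agree.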