bfactor: Full-Information Item Bifactor Analysis

Description

bfactor fits a confirmatory maximum likelihood bifactor model to dichotomous data under the item response theory paradigm. Pseudo-guessing parameters may be included but must be declared as constant, since the estimation of these parameters often leads to unacceptable solutions.

Usage

bfactor(fulldata, specific, guess = 0, prev.cor = NULL, par.prior = FALSE,
  startvalues = NULL, quadpts = NULL, ncycles = 300, EMtol = .001,
  nowarn = TRUE, debug = FALSE, ...)
## S3 method for class 'bfactor':
summary(object, digits = 3, ...)
## S3 method for class 'bfactor':
coef(object, digits = 3, ...)
## S3 method for class 'bfactor':
fitted(object, digits = 3, ...)
## S3 method for class 'bfactor':
residuals(object, type = 'LD', digits = 3, ...)

Arguments

fulldata

a matrix or data.frame that consists of only 0, 1, and NA values to be factor analyzed. If scores have been recorded by the response pattern then they can be recoded to dichotomous format using the

specific

a numeric vector specifying which specific factor loads on which item. For example, for a 4-item test with two specific factors, where the first specific factor loads on the first two items and the second on the last two, the vector may be specified as c(1,1,2,2).

guess

fixed pseudo-guessing parameter. Can be entered as a single value to assign a global guessing parameter or may be entered as a numeric vector for each item.

prev.cor

a previously computed correlation matrix used to estimate starting values for the EM estimation.

par.prior

a list declaring which items should have assumed prior distributions, and what these prior weights are. Elements are slope and int to specify the coefficients of the beta prior for the slopes and of the normal prior for the intercepts, and

startvalues

user declared start values for parameters.

quadpts

number of quadrature points per dimension. If NULL then the number of quadrature points is set to 15.

ncycles

the number of EM iterations to be performed.

EMtol

if the largest change in the EM cycle is less than this value then the EM iterations are stopped early.

object

a model estimated from bfactor of class bfactor.

type

type of residuals to be displayed. Can be either 'LD' for a local dependence matrix (Chen & Thissen, 1997) or 'exp' for the expected values for the frequencies of every response pattern.

digits

number of significant digits to round to.

nowarn

logical; suppress warnings from dependent packages?

debug

logical; turn on debugging features?

...

additional arguments to be passed.

Value

bfactor returns an object of class bfactor, with the following elements:

pars

estimated parameters of the model, the rightmost column being the intercept

guess

a vector of the constant user-supplied pseudo-guessing parameters

facility

a proportion vector of item endorsement

cormat

if not initially specified, a tetrachoric matrix with Carroll's correction for guessing (if applicable); else the user supplied correlation matrix

log.lik

log-likelihood of the model

AIC

Akaike Information Criteria

X2

chi-squared statistic

df

degrees of freedom associated with X2

p

probability associated with df and X2

tabdata

a summary table of unique response patterns with the frequency of occurrence in the rightmost column

Pl

calculated expected probability vector for each unique response pattern

Theta

combination of quadrature points for each factor; contains $quadpts^{nfact}$ number of rows

fulldata

complete data complete with the original item names

empdist

calculated empirical distribution

h2

a vector of communalities

EMiter

number of EM cycles performed; will be less than ncycles if the tolerance is reached

specific

a numeric vector specifying where the specific item loadings are located

logicalfact

a logical matrix indicating where the factor loadings are located

itemnames

the names of the items

Call

the function call

Details

bfactor follows the item factor analysis strategy explicated by Gibbons and Hedeker (1992). Nested models may be compared via the approximate chi-squared difference test or by a reduction in AIC (accessible via anova); note that this only makes sense when comparing class bfactor models to class mirt. The general equation used for item bifactor analysis in this package is in the logistic form with a scaling correction of 1.702. This correction is applied to allow comparison to mainstream programs such as TESTFACT 4 (2003).
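
As a minimal sketch of the logistic form with the 1.702 scaling correction, the probability of endorsing a single item can be written in base R as shown here. The function bifactor_prob and its argument names are hypothetical and not part of the package. The parameterization (one general slope, one specific slope, an intercept, and a fixed guessing value, with the scaling constant multiplying the whole linear predictor) is an assumption based on the description in this section.

```r
# Hypothetical helper, not part of the package: probability of a correct
# response under an assumed bifactor logistic form with scaling constant D.
bifactor_prob <- function(theta_g, theta_s, a_g, a_s, d, guess = 0, D = 1.702) {
  # Linear predictor: general slope, one specific slope, and intercept.
  z <- D * (a_g * theta_g + a_s * theta_s + d)
  # A fixed pseudo-guessing value raises the lower asymptote.
  guess + (1 - guess) * plogis(z)
}

# At average ability on both factors with a zero intercept:
bifactor_prob(0, 0, a_g = 1, a_s = 0.5, d = 0)               # 0.5
bifactor_prob(0, 0, a_g = 1, a_s = 0.5, d = 0, guess = 0.1)  # 0.55
```

Setting D = 1 in this sketch would give the unscaled logistic metric.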

Unlike TESTFACT 4 (2003), start values are computed from the matrix of tetrachoric correlations, potentially with Carroll's (1945) adjustment for chance responses. First, a one-factor MINRES solution is extracted, and the transformed loadings and intercepts (see mirt for more details) are used as starting values for the general factor loadings and item intercepts. Starting values for the specific factor loadings are taken to be half the magnitude of the extracted general factor loadings. Note that while the sign of the loading may be incorrect for specific factors (and possibly for some of the general factor loadings), the intercepts and general factor loadings will be relatively close to the final solution. These values should be an improvement over the TESTFACT 4 starting values of 1.414 for all the general factor slopes, 1 for all the specific factor slopes, and 0 for all the intercepts.
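
The layout of these starting values can be sketched in base R. This is only an illustration: start_loadings is a hypothetical helper, the general loadings are made-up numbers standing in for a one-factor extraction, and the transformation from loadings to slopes and intercepts is not shown.

```r
# Hypothetical helper, not part of the package: arrange starting loadings
# so that each item has a general loading and exactly one specific loading.
start_loadings <- function(general, specific) {
  n_items <- length(general)
  n_spec <- max(specific)
  # One column for the general factor, one per specific factor.
  lambda <- matrix(0, n_items, 1 + n_spec,
                   dimnames = list(NULL,
                                   c("general", paste0("specific", seq_len(n_spec)))))
  lambda[, 1] <- general
  # Specific loadings start at half the magnitude of the general loading.
  lambda[cbind(seq_len(n_items), 1 + specific)] <- abs(general) / 2
  lambda
}

general <- c(0.8, 0.6, -0.4, 0.7)  # illustrative values only
specific <- c(1, 1, 2, 2)          # same format as the specific argument
start_loadings(general, specific)
```

The nonzero pattern of the resulting matrix corresponds to the kind of information stored in the logicalfact element.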

Factor scores are estimated assuming a normal prior distribution and can be appended to the input data matrix (full.scores = TRUE) or displayed in a summary table for all the unique response patterns. Fitted and residual values may be observed via fitted.bfactor and residuals.bfactor. To examine individual item plots use itemplot, which will also plot information and surface functions (although the plink package is much more suitable for IRT graphics). The matrix returned by residuals contains the LD statistic (Chen & Thissen, 1997) below the diagonal and Cramer's V above the diagonal.
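
For a single pair of items, the two residual quantities can be sketched as a Pearson chi-squared statistic comparing observed and model-expected 2 x 2 frequencies, together with the corresponding Cramer's V. The helper pair_ld and the counts below are hypothetical and purely illustrative; the package may compute or sign these values differently.

```r
# Hypothetical helper, not part of the package: Pearson X2 and Cramer's V
# for one item pair, given observed and expected 2 x 2 frequency tables.
pair_ld <- function(observed, expected) {
  x2 <- sum((observed - expected)^2 / expected)
  n <- sum(observed)
  k <- min(dim(observed)) - 1  # equals 1 for a 2 x 2 table
  c(X2 = x2, V = sqrt(x2 / (n * k)))
}

observed <- matrix(c(40, 10, 10, 40), 2, 2)  # made-up counts
expected <- matrix(25, 2, 2)                 # made-up expected frequencies
pair_ld(observed, expected)                  # X2 = 36, V = 0.6
```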

References

Gibbons, R. D., & Hedeker, D. R. (1992). Full-information Item Bi-Factor Analysis. Psychometrika, 57, 423-436.

Carroll, J. B. (1945). The effect of difficulty and chance success on correlations between items or between tests. Psychometrika, 10, 1-19.

Wood, R., Wilson, D. T., Gibbons, R. D., Schilling, S. G., Muraki, E., & Bock, R. D. (2003). TESTFACT 4 for Windows: Test Scoring, Item Statistics, and Full-information Item Factor Analysis [Computer software]. Lincolnwood, IL: Scientific Software International.

Examples

###load SAT12 and compute bifactor model with 3 specific factors
data(SAT12)
fulldata <- key2binary(SAT12,
  key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
specific <- c(2,3,2,3,3,2,1,2,1,1,1,3,1,3,1,2,1,1,3,3,1,1,3,1,3,3,1,3,2,3,1,2)
mod1 <- bfactor(fulldata, specific)
coef(mod1)

###Try with guessing parameters added
guess <- rep(.1,32)
mod2 <- bfactor(fulldata, specific, guess = guess)
coef(mod2) #item 32 too difficult to include guessing par

#fix by imposing a weak intercept prior
mod3a <- bfactor(fulldata, specific, guess = guess, par.prior =
    list(int = c(0,4), int.items = 32))
coef(mod3a)

#...or by removing guessing parameter
guess[32] <- 0
mod3b <- bfactor(fulldata, specific, guess = guess)
coef(mod3b)
