factor.scores: Factor Scores - Ability Estimates

Description

Computation of factor scores for gpcm, grm, ltm, rasch and tpm models.

Usage

factor.scores(object, …)
# S3 method for gpcm
factor.scores(object, resp.patterns = NULL, 
        method = c("EB", "EAP", "MI"), B = 5, robust.se = FALSE, 
        prior = TRUE, return.MIvalues = FALSE, …)
# S3 method for grm
factor.scores(object, resp.patterns = NULL, 
        method = c("EB", "EAP", "MI"), B = 5, prior = TRUE, 
        return.MIvalues = FALSE, …)
# S3 method for ltm
factor.scores(object, resp.patterns = NULL, 
        method = c("EB", "EAP", "MI", "Component"), B = 5, 
        robust.se = FALSE, prior = TRUE, return.MIvalues = FALSE, 
        …)
# S3 method for rasch
factor.scores(object, resp.patterns = NULL, 
        method = c("EB", "EAP", "MI"), B = 5, robust.se = FALSE,
        prior = TRUE, return.MIvalues = FALSE, …)
# S3 method for tpm
factor.scores(object, resp.patterns = NULL, 
        method = c("EB", "EAP", "MI"), B = 5, prior = TRUE, 
        return.MIvalues = FALSE, …)

Arguments

object

an object inheriting from either class gpcm, class grm, class ltm, class rasch or class tpm.

resp.patterns

a matrix or a data.frame of response patterns with columns denoting the items; if NULL the factor scores are computed for the observed response patterns.

method

a character string supplying the scoring method; available methods are Empirical Bayes ("EB"), Expected a Posteriori ("EAP"), Multiple Imputation ("MI"), and, for ltm objects only, Component ("Component"). See the Details section for more information.

B

the number of multiple imputations to be used if method = "MI".

robust.se

logical; if TRUE the sandwich estimator is used for the estimation of the covariance matrix of the MLEs. See Details section for more info.

prior

logical. If TRUE, then the prior normal distribution for the latent abilities is taken into account in the calculation of the posterior modes, when method = "EB".

return.MIvalues

logical. If TRUE, then the estimated z-values and their covariance matrix are contained as extra attributes "zvalues.MI" and "var.zvalues.MI", respectively, in the returned score.dat data frame.

…

additional arguments; currently none is used.

Value

An object of class fscores, which is a list with the following components:

score.dat

the data.frame of observed response patterns, including observed and expected frequencies (only if the observed data response matrix contains no missing values), the factor scores and their standard errors.

method

a character giving the scoring method used.

B

the number of multiple imputations used; relevant only if method = "MI".

call

a copy of the matched call of object.

resp.pats

logical; is TRUE if resp.patterns argument has been specified.

coef

the parameter estimates returned by coef(object); this is NULL when object inherits from class grm.
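
As a brief illustration of the components listed above, the following sketch fits a Rasch model to the LSAT data shipped with ltm and inspects the returned fscores object. It only runs if the ltm package is installed; the object names fit and fs are arbitrary.

```r
# Minimal sketch: inspect the components of an fscores object.
# Guarded so that it is skipped when the ltm package is not available.
if (requireNamespace("ltm", quietly = TRUE)) {
  fit <- ltm::rasch(ltm::LSAT)
  fs <- ltm::factor.scores(fit)   # resp.patterns = NULL, default method

  print(fs$method)                # scoring method that was used
  print(fs$resp.pats)             # FALSE, since resp.patterns was not specified
  print(head(fs$score.dat))       # response patterns with scores and standard errors
}
```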

Details

Factor scores or ability estimates are summary measures of the posterior distribution $p(z|x)$, where $z$ denotes the vector of latent variables and $x$ the vector of manifest variables.

Usually, the factor scores are taken to be the modes of the above posterior distribution evaluated at the MLEs. These Empirical Bayes estimates (use method = "EB") and their associated variances are good summaries of the posterior distribution as $p \rightarrow \infty$, where $p$ is the number of items. This is based on the result $$p(z|x)=p(z|x; \hat{\theta})(1+O(1/p)),$$ where $\hat{\theta}$ are the MLEs. However, when $p$ and/or $n$ (the sample size) is small, this approach ignores the variability introduced by plugging in estimates instead of the true parameter values. A solution to this problem is Multiple Imputation (MI; use method = "MI"). In particular, MI is used the other way around, i.e.,

Step 1: Simulate new parameter values, say $\theta^*$, from $N(\hat{\theta}, C(\hat{\theta}))$, where $C(\hat{\theta})$ is the large sample covariance matrix of $\hat{\theta}$ (if robust.se = TRUE, $C(\hat{\theta})$ is based on the sandwich estimator).
Step 2: Maximize $p(z|x; \theta^*)$ with respect to $z$ and compute the variance associated with this mode.
Step 3: Repeat steps 1-2 B times and combine the estimates using the known formulas of MI.

This scheme explicitly acknowledges that the true parameter values are unknown by drawing from their large sample posterior distribution, thereby taking the sampling error into account. The modes of the posterior distribution $p(z|x; \theta)$ are numerically approximated using the BFGS algorithm in optim().
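
The Empirical Bayes and MI steps described above can be sketched in base R for a single response pattern. This is an illustration under stated assumptions, not the package's internal code: the two-parameter logistic item parameters, the diagonal covariance matrix C_hat, and the helper names neg_log_post() and post_mode() are all hypothetical, and the "known formulas of MI" are assumed here to be Rubin's combining rules.

```r
# Hypothetical 2PL parameters (illustrative values, not fitted estimates)
beta0 <- c(-0.5, 0.2, 1.0, -1.2, 0.4)      # intercepts
beta1 <- c(1.1, 0.8, 1.5, 0.9, 1.3)        # discriminations
theta_hat <- c(beta0, beta1)
C_hat <- diag(0.05, length(theta_hat))     # assumed covariance matrix of theta_hat
x <- c(1, 0, 1, 0, 1)                      # one response pattern

# Negative log posterior of z given x (up to a constant), with a N(0, 1) prior
neg_log_post <- function(z, theta, x) {
  k <- length(x)
  eta <- theta[1:k] + theta[(k + 1):(2 * k)] * z
  -(sum(x * eta - log1p(exp(eta))) + dnorm(z, log = TRUE))
}

# Posterior mode via BFGS; variance from the inverse Hessian at the mode
post_mode <- function(theta, x) {
  opt <- optim(0, neg_log_post, theta = theta, x = x,
               method = "BFGS", hessian = TRUE)
  c(z = opt$par, var = 1 / opt$hessian[1, 1])
}

# Empirical Bayes: mode evaluated at the point estimates
eb <- post_mode(theta_hat, x)

# Multiple Imputation
set.seed(1)
B <- 20
L <- chol(C_hat)                           # C_hat = t(L) %*% L
draws <- t(sapply(seq_len(B), function(b) {
  # Step 1: draw theta* from N(theta_hat, C_hat)
  theta_star <- theta_hat + drop(t(L) %*% rnorm(length(theta_hat)))
  # Step 2: mode and variance of p(z | x; theta*)
  post_mode(theta_star, x)
}))

# Step 3: combine with Rubin's rules (assumed combining formulas)
z_mi <- mean(draws[, "z"])
within <- mean(draws[, "var"])             # average within-imputation variance
between <- var(draws[, "z"])               # between-imputation variance
se_mi <- sqrt(within + (1 + 1 / B) * between)

print(c(EB = eb[["z"]], se.EB = sqrt(eb[["var"]]), MI = z_mi, se.MI = se_mi))
```

Because the between-imputation variance is added to the average within-imputation variance, the MI standard error is never smaller than the square root of the average within-imputation variance.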

The Expected a Posteriori scores (use method = "EAP") computed by factor.scores() are the means of the posterior distribution evaluated at the MLEs: $$\int z p(z | x; \hat{\theta}) dz.$$
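
The integral above can be approximated numerically. The sketch below does so for one response pattern using integrate() from base R; the two-parameter logistic item parameters and the helper lik() are hypothetical, and the quadrature used inside the package may differ.

```r
# Hypothetical 2PL parameters (illustrative values, not fitted estimates)
beta0 <- c(-0.5, 0.2, 1.0, -1.2, 0.4)   # intercepts
beta1 <- c(1.1, 0.8, 1.5, 0.9, 1.3)     # discriminations
x <- c(1, 0, 1, 0, 1)                   # one response pattern

# Likelihood p(x | z), vectorized over z as required by integrate()
lik <- function(z) {
  sapply(z, function(zz) {
    p <- plogis(beta0 + beta1 * zz)
    prod(p^x * (1 - p)^(1 - x))
  })
}

# Normalizing constant: integral of p(x | z) p(z) dz with a N(0, 1) prior
den <- integrate(function(z) lik(z) * dnorm(z), -Inf, Inf)$value

# EAP score: posterior mean of z
eap <- integrate(function(z) z * lik(z) * dnorm(z), -Inf, Inf)$value / den

# Posterior standard deviation, used here as the standard error
m2 <- integrate(function(z) z^2 * lik(z) * dnorm(z), -Inf, Inf)$value / den
se_eap <- sqrt(m2 - eap^2)

print(c(EAP = eap, se = se_eap))
```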

The Component scores (use method = "Component") proposed by Bartholomew (1984) are an alternative way to scale the sample units in the latent dimensions identified by the model that avoids the calculation of the posterior mode. However, this method is not valid in the general case where nonlinear latent terms are assumed.
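
For intuition, a component score can be sketched as a weighted sum of the item responses. The example assumes the usual formulation for a one-factor logistic model, in which the weights are the discrimination parameters; the parameter values are hypothetical and the scaling may differ from what factor.scores() reports.

```r
# Hypothetical discrimination parameters of a one-factor logistic model
beta1 <- c(1.1, 0.8, 1.5, 0.9, 1.3)

# Three response patterns, one per row
patterns <- rbind(c(1, 0, 1, 0, 1),
                  c(0, 0, 0, 0, 0),
                  c(1, 1, 1, 1, 1))

# Component score: discrimination-weighted sum of the responses (assumed form)
comp <- drop(patterns %*% beta1)
print(comp)
```

No optimization or integration is involved, which is why this approach avoids computing the posterior mode.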

References

Bartholomew, D. (1984) Scaling binary data using a factor model. Journal of the Royal Statistical Society, Series B, 46, 120--123.

Bartholomew, D. and Knott, M. (1999) Latent Variable Models and Factor Analysis, 2nd ed. London: Arnold.

Bartholomew, D., Steel, F., Moustaki, I. and Galbraith, J. (2002) The Analysis and Interpretation of Multivariate Data for Social Scientists. London: Chapman and Hall.

Rizopoulos, D. (2006) ltm: An R package for latent variable modelling and item response theory analyses. Journal of Statistical Software, 17(5), 1--25. URL http://www.jstatsoft.org/v17/i05/

Rizopoulos, D. and Moustaki, I. (2008) Generalized latent variable models with nonlinear effects. British Journal of Mathematical and Statistical Psychology, 61, 415--438.

Examples

# NOT RUN {
## Factor Scores for the Rasch model
fit <- rasch(LSAT)
factor.scores(fit) # Empirical Bayes


## Factor scores for all subjects in the
## original dataset LSAT
factor.scores(fit, resp.patterns = LSAT)


## Factor scores for specific patterns,
## including NA's, can be obtained by 
factor.scores(fit, resp.patterns = rbind(c(1,0,1,0,1), c(NA,1,0,NA,1)))


# }
# NOT RUN {
## Factor Scores for the two-parameter logistic model
fit <- ltm(Abortion ~ z1)
factor.scores(fit, method = "MI", B = 20) # Multiple Imputation

## Factor Scores for the graded response model
fit <- grm(Science[c(1,3,4,7)])
factor.scores(fit, resp.patterns = rbind(1:4, c(NA,1,2,3)))
# }