sirt (version 1.5-0)

plausible.value.imputation.raschtype: Plausible Value Imputation in Generalized Logistic Item Response Model

Description

This function performs unidimensional plausible value imputation (Adams & Wu, 2007; Mislevy, 1991).

Usage

plausible.value.imputation.raschtype(data=NULL, f.yi.qk=NULL, X, 
   Z=NULL, beta0=rep(0, ncol(X)), sig0=1, b=rep(1, ncol(X)), 
   a=rep(1, length(b)), c=rep(0, length(b)), d=1+0*b, 
   alpha1=0, alpha2=0, theta.list=seq(-5, 5, len=50), 
   cluster=NULL, iter, burnin, nplausible=1, printprogress=TRUE)

Arguments

data
An $N \times I$ data frame of dichotomous responses
f.yi.qk
An optional matrix which contains the individual likelihood. This matrix is produced by rasch.mml2 or rasch.copula2. The use of this argument
X
A matrix of individual covariates for the latent regression of $\theta$ on $X$
Z
A matrix of individual covariates for the regression of individual residual variances on $Z$
beta0
Initial vector of regression coefficients
sig0
Initial vector of coefficients for the variance heterogeneity model
b
Vector of item difficulties. It must not be provided if the individual likelihood f.yi.qk is specified.
a
Optional vector of item slopes
c
Optional vector of lower item asymptotes
d
Optional vector of upper item asymptotes
alpha1
Parameter $\alpha_1$ in generalized item response model
alpha2
Parameter $\alpha_2$ in generalized item response model
theta.list
Vector of theta values at which the ability distribution should be evaluated
cluster
Cluster identifier (e.g. schools or classes) for including cluster means of theta in the plausible value imputation.
iter
Number of iterations
burnin
Number of burn-in iterations for plausible value imputation
nplausible
Number of plausible values
printprogress
A logical indicating whether iteration progress should be displayed at the console.
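
To make the roles of a, b, c, d and theta.list concrete, the following base R sketch evaluates a four-parameter logistic response function on a theta grid and multiplies the item probabilities into an N x length(theta.list) matrix, which is the shape expected for f.yi.qk. It assumes alpha1 = alpha2 = 0 (logistic link) and the parametrization c + (d - c) * plogis(a * (theta - b)). The helper names irf_4pl and indiv_like are hypothetical and not part of sirt, and the function's internal computation may differ.

```r
# Hypothetical helper: 4PL item response function
# (logistic link assumed, i.e. alpha1 = alpha2 = 0)
irf_4pl <- function(theta, b, a = rep(1, length(b)), c = rep(0, length(b)),
                    d = rep(1, length(b))) {
  z <- outer(theta, b, "-")          # theta_q - b_i
  z <- sweep(z, 2, a, "*")           # a_i * (theta_q - b_i)
  p <- plogis(z)
  # rescale to the interval [c_i, d_i]; result is length(theta) x length(b)
  sweep(sweep(p, 2, d - c, "*"), 2, c, "+")
}

# Hypothetical helper: individual likelihood evaluated on a theta grid
indiv_like <- function(data, theta.list, b, ...) {
  data <- as.matrix(data)
  p <- irf_4pl(theta.list, b, ...)   # grid points x items
  like <- matrix(1, nrow(data), length(theta.list))
  for (i in seq_len(ncol(data))) {
    y <- data[, i]
    # persons x grid points: p if y = 1, 1 - p if y = 0
    term <- outer(y, p[, i]) + outer(1 - y, 1 - p[, i])
    term[is.na(term)] <- 1           # missing responses contribute a factor of 1
    like <- like * term
  }
  like
}

set.seed(1)
b <- seq(-2, 2, length.out = 5)
theta <- rnorm(10)
dat <- 1 * (matrix(runif(10 * 5), 10, 5) < irf_4pl(theta, b))
theta.list <- seq(-5, 5, length.out = 50)
f.yi.qk <- indiv_like(dat, theta.list, b)
dim(f.yi.qk)
```

Each row of the resulting matrix holds one person's likelihood across the grid points in theta.list.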

Value

A list with the following entries:

  • coefs.X: Sampled regression coefficients for covariates $X$
  • coefs.Z: Sampled coefficients for modeling variance heterogeneity for covariates $Z$
  • pvdraws: Matrix with drawn plausible values
  • posterior: Posterior distribution from last iteration
  • EAP: Individual EAP estimate
  • SE.EAP: Standard error of the EAP estimate
  • pv.indexes: Index of iterations for which plausible values were drawn
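
The relationship between the posterior, EAP, SE.EAP and a plausible value draw can be illustrated on a discrete theta grid. The sketch below uses base R only and assumes a normal prior with person-specific means and a common standard deviation. The function name eap_from_grid and all inputs are hypothetical, and the package's internal sampler may differ in detail.

```r
# Hypothetical helper: posterior summaries on a theta grid
eap_from_grid <- function(like, theta.list, mu, sigma = 1) {
  # like: N x Q individual likelihood; mu: length-N prior means
  prior <- dnorm(outer(mu, theta.list, function(m, t) t - m) / sigma)
  post <- like * prior
  post <- post / rowSums(post)                          # normalize each row
  eap <- as.vector(post %*% theta.list)                 # posterior mean
  se <- sqrt(as.vector(post %*% theta.list^2) - eap^2)  # posterior SD
  # one plausible value per person: sample a grid point from the posterior
  pv <- apply(post, 1, function(w) sample(theta.list, 1, prob = w))
  list(posterior = post, EAP = eap, SE.EAP = se, pv = pv)
}

set.seed(2)
theta.list <- seq(-5, 5, length.out = 41)
theta_obs <- c(-1, 0, 2)                                # error-prone observed scores
like <- dnorm(outer(theta_obs, theta.list, "-") / 0.6)  # measurement error SD 0.6
res <- eap_from_grid(like, theta.list, mu = rep(0, 3), sigma = 1)
round(cbind(EAP = res$EAP, SE.EAP = res$SE.EAP), 2)
```

The EAP values are shrunken toward the prior mean relative to the observed scores, while a plausible value is a random draw from the same posterior rather than its mean.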

Details

Plausible values are drawn from the latent regression model with heterogeneous variances:

$$\theta_p = X_p \beta + \epsilon_p \quad , \quad \epsilon_p \sim N( 0 , \sigma_p^2 ) \quad , \quad \log( \sigma_p ) = Z_p \gamma + \nu_p$$
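
As an illustration of this model, the following base R sketch simulates abilities from a latent regression with a log-linear model for the residual standard deviation. The coefficient values beta and gamma are invented for the demonstration, and the term $\nu_p$ is set to zero for simplicity. The sketch does not reproduce the function's internals.

```r
set.seed(3)
n <- 4000
x <- rbinom(n, 1, 0.5)
X <- cbind(1, x)            # covariates for the mean of theta
Z <- cbind(1, x)            # covariates for the log standard deviation
beta <- c(0, 0.4)           # assumed regression coefficients
gamma <- c(0, log(0.6))     # assumed: sigma = 1 for x = 0, sigma = 0.6 for x = 1

# log(sigma_p) = Z_p gamma, with nu_p fixed at 0 in this sketch
sigma_p <- exp(as.vector(Z %*% gamma))
theta <- as.vector(X %*% beta) + rnorm(n, sd = sigma_p)

# conditional standard deviations of theta within the groups x = 0 and x = 1
sds <- tapply(theta, x, sd)
sds
```

Supplying Z lets the residual variance differ between groups, whereas leaving Z as NULL corresponds to a single residual variance for all persons.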

References

Adams, R., & Wu, M. (2007). The mixed-coefficients multinomial logit model: A generalized form of the Rasch model. In M. von Davier & C. H. Carstensen (Eds.), Multivariate and Mixture Distribution Rasch Models: Extensions and Applications (pp. 57-76). New York: Springer.

Mislevy, R. J. (1991). Randomization-based inference about latent variables from complex samples. Psychometrika, 56, 177-196.

See Also

For estimating the latent regression model see latent.regression.em.raschtype.

Examples

#############################################################################
# SIMULATED EXAMPLE 1: Rasch model with covariates
#############################################################################

library(sirt)   # provides sim.raschtype and plausible.value.imputation.raschtype

set.seed(899)
I <- 21     # number of items
b <- seq(-2,2, len=I)   # item difficulties
n <- 2000       # number of students

# simulate theta and covariates
theta <- rnorm( n )
x <- .7 * theta + rnorm( n , .5 )
y <- .2 * x+ .3*theta + rnorm( n , .4 )
dfr <- data.frame( theta , 1 , x , y )

# simulate Rasch model
dat1 <- sim.raschtype( theta = theta , b = b )

# Plausible value draws
pv1 <- plausible.value.imputation.raschtype(data=dat1 , X=dfr[,-1] , b = b ,
            nplausible=3 , iter=10 , burnin=5)
# estimate linear regression based on first plausible value
mod1 <- lm( pv1$pvdraws[,1] ~ x+y )
summary(mod1)
  ##               Estimate Std. Error t value Pr(>|t|)    
  ##   (Intercept) -0.27755    0.02121  -13.09   <2e-16 ***
  ##   x            0.40483    0.01640   24.69   <2e-16 ***
  ##   y            0.20307    0.01822   11.15   <2e-16 ***

# true regression estimate
summary( lm( theta ~ x + y ) )
  ## Coefficients:
  ##             Estimate Std. Error t value Pr(>|t|)    
  ## (Intercept) -0.27821    0.01984  -14.02   <2e-16 ***
  ## x            0.40747    0.01534   26.56   <2e-16 ***
  ## y            0.18189    0.01704   10.67   <2e-16 ***

#############################################################################
# SIMULATED EXAMPLE 2: Classical test theory, homogeneous regression variance
#############################################################################

set.seed(899)
n <- 3000       # number of students
x <- round( runif( n , 0 ,1 ) )
y <- rnorm(n)
# simulate true score theta
theta <- .4*x + .5 * y + rnorm(n)
# simulate observed score by adding measurement error
sig.e <- rep( sqrt(.40) , n )
theta_obs <- theta + rnorm( n , sd=sig.e)

# define theta grid for evaluation of density
theta.list <- mean(theta_obs) + sd(theta_obs) * seq( - 5 , 5 , length=21)
# compute individual likelihood
f.yi.qk <- dnorm( outer( theta_obs , theta.list , "-" ) / sig.e )
f.yi.qk <- f.yi.qk / rowSums(f.yi.qk)
# define covariates
X <- cbind( 1 , x , y )
# draw plausible values        
mod2 <- plausible.value.imputation.raschtype( f.yi.qk =f.yi.qk , 
                  theta.list=theta.list , X=X , iter=10 , burnin=5)

# linear regression
mod1 <- lm( mod2$pvdraws[,1] ~ x+y )
summary(mod1)
  ##             Estimate Std. Error t value Pr(>|t|)    
  ## (Intercept) -0.01393    0.02655  -0.525      0.6    
  ## x            0.35686    0.03739   9.544   <2e-16 ***
  ## y            0.53759    0.01872  28.718   <2e-16 ***

# true regression model
summary( lm( theta ~ x + y ) )
  ##             Estimate Std. Error t value Pr(>|t|)    
  ## (Intercept) 0.002931   0.026171   0.112    0.911    
  ## x           0.359954   0.036864   9.764   <2e-16 ***
  ## y           0.509073   0.018456  27.584   <2e-16 ***

#############################################################################
# SIMULATED EXAMPLE 3: Classical test theory, heterogeneous regression variance
#############################################################################

set.seed(899)
n <- 5000       # number of students
x <- round( runif( n , 0 ,1 ) )
y <- rnorm(n)
# simulate true score theta
theta <- .4*x + .5 * y + rnorm(n) * ( 1 - .4 * x )
# simulate observed score by adding measurement error
sig.e <- rep( sqrt(.40) , n )
theta_obs <- theta + rnorm( n , sd=sig.e)

# define theta grid for evaluation of density
theta.list <- mean(theta_obs) + sd(theta_obs) * seq( - 5 , 5 , length=21)
# compute individual likelihood
f.yi.qk <- dnorm( outer( theta_obs , theta.list , "-" ) / sig.e )
f.yi.qk <- f.yi.qk / rowSums(f.yi.qk)
# define covariates
X <- cbind( 1 , x , y )
# draw plausible values (assuming variance homogeneity)
mod3a <- plausible.value.imputation.raschtype( f.yi.qk =f.yi.qk , 
                  theta.list=theta.list , X=X , iter=10 , burnin=5)
# draw plausible values (assuming variance heterogeneity)
#  -> include predictor Z
mod3b <- plausible.value.imputation.raschtype( f.yi.qk =f.yi.qk , 
                  theta.list=theta.list , X=X , Z=X , iter=10 , burnin=5)

# investigate variance of theta conditional on x
res3 <- sapply( 0:1 , FUN = function(vv){
        c( var(theta[x==vv]), var(mod3b$pvdraws[x==vv,1]),
              var(mod3a$pvdraws[x==vv,1]))})
rownames(res3) <- c("true" , "pv(hetero)" , "pv(homog)" )
colnames(res3) <- c("x=0","x=1")
round( res3 , 2 )
  ##             x=0  x=1
  ## true       1.30 0.58
  ## pv(hetero) 1.29 0.55
  ## pv(homog)  1.06 0.77
## -> assuming heteroscedastic variances recovers true conditional variance
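
The examples above fit regressions on the first plausible value only. When several plausible values are drawn (nplausible > 1), analyses are commonly repeated for each draw and combined with Rubin's rules for multiple imputation. The sketch below shows that pooling step in base R. pool_pv_lm is a hypothetical helper, not a sirt function, and the matrix pv is simulated as a stand-in for a pvdraws matrix so that the block runs without sirt.

```r
# Hypothetical helper: pool lm() results across the columns of a
# plausible value matrix using Rubin's rules
pool_pv_lm <- function(pv, predictors, data) {
  fits <- lapply(seq_len(ncol(pv)), function(m) {
    data$pv_m <- pv[, m]
    lm(reformulate(predictors, response = "pv_m"), data = data)
  })
  M <- ncol(pv)
  est <- sapply(fits, coef)                        # coefficients x draws
  wvar <- sapply(fits, function(f) diag(vcov(f)))  # within-draw sampling variances
  qbar <- rowMeans(est)                            # pooled point estimate
  ubar <- rowMeans(wvar)                           # average within-draw variance
  bvar <- apply(est, 1, var)                       # between-draw variance
  cbind(Estimate = qbar, SE = sqrt(ubar + (1 + 1 / M) * bvar))
}

set.seed(4)
n <- 500
x <- rnorm(n)
y <- rnorm(n)
theta <- 0.4 * x + 0.2 * y + rnorm(n)
# simulated stand-in for a pvdraws matrix with three plausible values
pv <- sapply(1:3, function(m) theta + rnorm(n, sd = 0.5))
pooled <- pool_pv_lm(pv, c("x", "y"), data.frame(x, y))
pooled
```

With real output, the first argument would be the pvdraws matrix returned by the function, and the pooled standard errors then reflect the uncertainty in the latent ability in addition to sampling error.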
