CDM (version 1.4-16)

gdm: General Diagnostic Model

Description

This function estimates the general diagnostic model (von Davier, 2008; Xu & von Davier, 2008), which handles multidimensional item response models with ordered discrete or continuous latent variables for polytomous item responses.

Usage

gdm( data, theta.k, irtmodel="2PL", group=NULL, weights=rep(1, nrow(data)), 
    Qmatrix=NULL , thetaDes = NULL , skillspace="loglinear",  
    b.constraint=NULL, a.constraint=NULL, 
    mean.constraint=NULL , Sigma.constraint=NULL , delta.designmatrix=NULL, 
    standardized.latent=FALSE , centered.latent=FALSE ,  maxiter=1000, 
    conv=10^(-5), globconv=10^(-5), msteps=8 , convM=.0005 ,
    decrease.increments = FALSE , use.freqpatt=FALSE , ...)

Arguments

data
An $N$ times $I$ matrix of polytomous item responses with categories $k=0,1,...,K$
theta.k
In the one-dimensional case it must be a vector. For multidimensional models it has to be a list of skill vectors if the theta grid differs between dimensions. If not, a vector input can be supplied.
irtmodel
The default 2PL corresponds to the model where item slopes on dimensions are equal for all item categories. If item-category slopes should be estimated, use 2PLcat. If no item slopes should be estimated, use 1PL.
group
An optional vector of group identifiers for multiple group estimation
weights
Sample weights
Qmatrix
An optional array $I$ times $D$ times $K$ which indicates pre-specified item loadings on dimensions. The default for category $k$ is the score $k$, i.e. the scoring in the (generalized) partial credit model.
thetaDes
A design matrix for specifying nonlinear item response functions (see Example 1, Models 4 and 5)
skillspace
The parametric assumption of the skillspace. If skillspace="normal" then a univariate or multivariate normal distribution is assumed. The default "loglinear" corresponds to log-linear smoothing of the skillspace distribution.
b.constraint
In this optional matrix with $C_b$ rows and three columns, $C_b$ item intercepts $b_{ik}$ can be fixed. 1st column: item index, 2nd column: category index, 3rd column: fixed item thresholds
a.constraint
In this optional matrix with $C_a$ rows and four columns, $C_a$ item slopes $a_{idk}$ can be fixed. 1st column: item index, 2nd column: dimension index, 3rd column: category index, 4th column: fixed item slopes
mean.constraint
A $C$ times 3 matrix for constraining $C$ means in the normal distribution assumption (skillspace="normal"). 1st column: Dimension, 2nd column: Group, 3rd column: Value
Sigma.constraint
A $C$ times 4 matrix for constraining $C$ covariances in the normal distribution assumption (skillspace="normal"). 1st column: Dimension 1, 2nd column: Dimension 2, 3rd column: Group, 4th column: Value
delta.designmatrix
The design matrix of $\delta$ parameters for the reduced skillspace estimation (see Xu & von Davier, 2008)
standardized.latent
Should all latent variables of the first group in a uni- or multidimensional model be normally distributed and standardized? The default is FALSE.
centered.latent
Should all latent variables of the first group in a uni- or multidimensional model be normally distributed and have zero means? The default is FALSE.
maxiter
Maximum number of iterations
conv
Convergence criterion for item parameters and distribution parameters
globconv
Global deviance convergence criterion
msteps
Maximum number of M steps in estimating $b$ and $a$ item parameters. The default is 8 M steps.
convM
Convergence criterion in M step
decrease.increments
Should the increments of $a$ and $b$ parameters in the M step decrease during iterations? The default is FALSE. If the deviance increases during estimation, setting decrease.increments to TRUE can lead to more stable estimates (see Example 1, Model 4).
use.freqpatt
Should frequencies of unique item response patterns be used? For large data sets, use.freqpatt=TRUE can speed up calculations. Note that in this case, not all person parameters are calculated as usual in the output.
...
Further arguments to be passed
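
The loading and constraint arguments above are plain arrays and matrices. The following base R sketch shows one way to lay them out, following the column orders listed above. The sizes (5 items, 2 dimensions, 2 categories, 1 group), the item-to-dimension assignment and the fixed values are illustrative assumptions; the objects are only constructed here and are not passed to gdm.

```r
# Illustrative sizes (assumptions, not taken from a real dataset)
n_items <- 5   # I
n_dims  <- 2   # D
n_cats  <- 2   # K, i.e. categories 0, 1, ..., K

# Q array (I x D x K): entry [i, d, k] is the pre-specified loading of
# category k of item i on dimension d. Here items 1-3 load on dimension 1
# and items 4-5 on dimension 2 (an assumption); a loaded dimension gets
# the score k, which is the default scoring described for Qmatrix.
Qmatrix <- array(0, dim = c(n_items, n_dims, n_cats))
for (k in seq_len(n_cats)) {
  Qmatrix[1:3, 1, k] <- k
  Qmatrix[4:5, 2, k] <- k
}

# b.constraint: item index, category index, fixed value
b.constraint <- rbind(c(1, 1, 0),
                      c(1, 2, 0))

# a.constraint: item index, dimension index, category index, fixed slope
a.constraint <- rbind(c(1, 1, 1, 1))

# mean.constraint: dimension, group, value (for skillspace="normal")
mean.constraint <- rbind(c(1, 1, 0),
                         c(2, 1, 0))

# Sigma.constraint: dimension 1, dimension 2, group, value
# (here: covariance of dimensions 1 and 2 in group 1 fixed to 0)
Sigma.constraint <- rbind(c(1, 2, 1, 0))
```

Each row of a constraint matrix fixes exactly one parameter, so the number of rows equals the number of constraints ($C_b$, $C_a$ or $C$ in the descriptions above).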

Value

An object of class gdm. The list contains the following entries:
  • item: Data frame with item parameters
  • person: Data frame with person parameters: EAP denotes the mean of the individual posterior distribution, SE.EAP the corresponding standard error, MLE the maximum likelihood estimate at theta.k and MAP the mode of the posterior distribution
  • EAP.rel: Reliability of the EAP
  • deviance: Deviance
  • ic: Information criteria, number of estimated parameters
  • b: Item intercepts $b_{jk}$
  • se.b: Standard error of item intercepts $b_{jk}$
  • a: Item slopes $a_{jd}$ resp. $a_{jdk}$
  • se.a: Standard error of item slopes $a_{jd}$ resp. $a_{jdk}$
  • itemfit.rmsea: The RMSEA item fit index (see itemfit.rmsea). This entry comes as a list with total and group-wise item fit statistics.
  • mean.rmsea: Mean of RMSEA item fit indexes.
  • Qmatrix: Used $Q$ matrix
  • pi.k: Trait distribution
  • mean.trait: Means of trait distribution
  • sd.trait: Standard deviations of trait distribution
  • skewness.trait: Skewnesses of trait distribution
  • correlation.trait: List of correlation matrices of trait distribution corresponding to each group
  • pjk: Item response probabilities evaluated at grid theta.k
  • n.ik: An array of expected counts $n_{cikg}$ of ability class $c$ at item $i$ at category $k$ in group $g$
  • G: Number of groups
  • D: Number of $\theta$ dimensions
  • I: Number of items
  • N: Number of persons
  • delta: Parameter estimates for skillspace representation
  • covdelta: Covariance matrix of parameter estimates for skillspace representation
  • data: Original dataframe
  • p.xi.aj: Individual likelihood
  • posterior: Individual posterior distribution
  • skill.levels: Number of skill levels per dimension
  • K.item: Maximal category per item
  • theta.k: Used theta design
  • thetaDes: Used theta design for item responses
  • time: Info about computation time
  • skillspace: Used skillspace parametrization
  • iter: Number of iterations
  • ...
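
The person parameters listed above are summaries of the individual likelihood and posterior on the theta.k grid. The base R sketch below illustrates how such summaries can be computed for a toy Rasch-type example. The intercepts, responses and the normal prior are made-up assumptions, and the code is not claimed to reproduce gdm's internal computations.

```r
# Toy setup (all values are illustrative assumptions)
theta.k <- seq(-4, 4, length.out = 11)    # discretized ability grid
prior <- dnorm(theta.k)
prior <- prior / sum(prior)               # discrete trait distribution on the grid

b <- c(-1, 0, 0.5, 1)                     # hypothetical item intercepts
resp <- rbind(c(1, 1, 1, 0),              # three persons, four dichotomous items
              c(0, 0, 1, 0),
              c(1, 0, 1, 1))

# P(X = 1 | theta) under a Rasch-type model: rows = grid points, columns = items
p1 <- plogis(outer(theta.k, b, "+"))

# Individual likelihood on the grid: rows = persons, columns = grid points
loglike <- resp %*% t(log(p1)) + (1 - resp) %*% t(log(1 - p1))
like <- exp(loglike)

# Individual posterior: likelihood times prior, normalized per person
posterior <- sweep(like, 2, prior, "*")
posterior <- posterior / rowSums(posterior)

# Person parameter summaries on the grid
EAP    <- as.vector(posterior %*% theta.k)                    # posterior mean
SE.EAP <- sqrt(as.vector(posterior %*% theta.k^2) - EAP^2)    # posterior SD
MLE    <- theta.k[max.col(like, ties.method = "first")]       # likelihood mode
MAP    <- theta.k[max.col(posterior, ties.method = "first")]  # posterior mode

data.frame(EAP, SE.EAP, MLE, MAP)
```

Because MLE and MAP are taken over the grid, they can only take values contained in theta.k, whereas EAP varies continuously.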

Details

Case irtmodel="1PL": Equal item slopes of 1 are assumed in this model. Therefore, it corresponds to a generalized multidimensional Rasch model.

$$\mathrm{logit}\, P( X_{nj} = k | \theta_n ) = b_{jk} + \sum_d q_{jdk} \theta_{nd}$$

The $Q$ matrix entries $q_{jdk}$ are pre-specified by the user.

Case irtmodel="2PL": For each item and each dimension, different item slopes $a_{jd}$ are estimated:

$$\mathrm{logit}\, P( X_{nj} = k | \theta_n ) = b_{jk} + \sum_d a_{jd} q_{jdk} \theta_{nd}$$

Case irtmodel="2PLcat": For each item, each dimension and each category, different item slopes $a_{jdk}$ are estimated:

$$\mathrm{logit}\, P( X_{nj} = k | \theta_n ) = b_{jk} + \sum_d a_{jdk} q_{jdk} \theta_{nd}$$

Note that this model can be generalized to include terms of any transformation $t_h$ of the $\theta_n$ vector (e.g. quadratic terms, step functions or interactions) such that the model can be formulated as

$$\mathrm{logit}\, P( X_{nj} = k | \theta_n ) = b_{jk} + \sum_h a_{jhk} q_{jhk} t_h( \theta_{n} )$$

In general, the number of functions $t_1 , ... , t_H$ will be larger than the number of $\theta$ dimensions $D$.
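
The equations above give the linear predictor for category $k$ of an item. One common reading, assumed here, is a multinomial logit with category 0 as the reference, in line with the partial credit scoring mentioned for Qmatrix. The base R sketch below evaluates the resulting category probabilities for a single item on a one-dimensional grid. The helper gdm_category_probs and its arguments are hypothetical and not part of CDM.

```r
# Hypothetical helper (not part of CDM): category probabilities of one item
# in the unidimensional case.
#   theta: vector of grid points
#   b:     intercepts for categories 1, ..., K
#   a:     item slope (a = 1 corresponds to the 1PL case)
#   q:     pre-specified scores for categories 1, ..., K
gdm_category_probs <- function(theta, b, a = 1, q = seq_along(b)) {
  n <- length(theta)
  K <- length(b)
  # Linear predictor per category; category 0 is the reference with value 0
  eta <- cbind(0, outer(theta, a * q) + matrix(b, nrow = n, ncol = K, byrow = TRUE))
  # Subtract the row maximum before exponentiating for numerical stability
  expeta <- exp(eta - apply(eta, 1, max))
  probs <- expeta / rowSums(expeta)
  colnames(probs) <- paste0("Cat", 0:K)
  probs
}

theta.k <- seq(-4, 4, length.out = 11)

# Polytomous item with categories 0, 1, 2 and default scores q = 1, 2
probs_pcm <- gdm_category_probs(theta.k, b = c(0.5, -0.3))

# Dichotomous item: reduces to the usual logistic response function
probs_2pl <- gdm_category_probs(theta.k, b = 0.2, a = 1.5)
```

Under this reading, each row of the returned matrix sums to 1, and for a dichotomous item the second column equals plogis(b + a * theta).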

References

von Davier, M. (2008). A general diagnostic model applied to language testing data. British Journal of Mathematical and Statistical Psychology, 61, 287-307.

Xu, X. & von Davier, M. (2008). Fitting the structured general diagnostic model to NAEP data. ETS Research Report ETS RR-08-27. Princeton, ETS.

See Also

Cognitive diagnostic models for dichotomous data can be estimated with din (DINA or DINO model) or gdina (GDINA model, which contains many CDMs as special cases). For assessment of model fit see modelfit.cor.din and anova.gdm.

Examples

###################################################################
# EXAMPLE 1: Fraction Dataset 1
#      Unidimensional Models for dichotomous data
###################################################################

library(CDM)   # provides gdm, din and the example datasets

data( data.fraction1 )
dat <- data.fraction1$data
theta.k <- seq( -6 , 6 , len=15 )   # discretized ability

#***
# Model 1: Rasch model (normal distribution)
mod1 <- gdm( dat , irtmodel="1PL" , theta.k=theta.k , skillspace="normal" ,
        centered.latent=TRUE)
summary(mod1)

#***
# Model 2: Rasch model (log-linear smoothing)
# set the item difficulty of the 8th item to zero
b.constraint <- matrix( c(8,1,0) , 1 , 3 )  
mod2 <- gdm( dat , irtmodel="1PL" , theta.k=theta.k , 
          skillspace="loglinear" , b.constraint=b.constraint  )
summary(mod2)

#***
# Model 3: 2PL model
mod3 <- gdm( dat , irtmodel="2PL" , theta.k=theta.k , 
          skillspace="normal" , standardized.latent=TRUE  )
summary(mod3)

#***
# Model 4: include quadratic term in item response function
#   using the argument decrease.increments=TRUE leads to a more
#   stable estimate
thetaDes <- cbind( theta.k , theta.k^2 )
colnames(thetaDes) <- c( "F1" , "F1q" )
mod4 <- gdm( dat , irtmodel="2PL" , theta.k=theta.k , 
          thetaDes = thetaDes , skillspace="normal" ,
          standardized.latent=TRUE , decrease.increments=TRUE)
summary(mod4)

#***
# Model 5: step function for ICC
#          two different probabilities for theta <= 0 and theta > 0
thetaDes <- matrix( 1*(theta.k>0) , ncol=1 )
colnames(thetaDes) <- c( "Fgrm1" )
mod5 <- gdm( dat , irtmodel="2PL" , theta.k=theta.k , 
          thetaDes = thetaDes , skillspace="normal" )
summary(mod5)

#***
# Model 6: DINA model with din function
mod6 <- din( dat , q.matrix = matrix( 1 , nrow=ncol(dat),ncol=1 ) )
summary(mod6)

#***
# Model 7: Estimating a version of the DINA model with gdm
theta.k <- c(-.5,.5)
mod7 <- gdm( dat , irtmodel="2PL" , theta.k=theta.k , 
           skillspace="loglinear" )
summary(mod7)

###################################################################
# EXAMPLE 2: Cultural Activities - data.Students
#      Unidimensional Models for polytomous data
###################################################################

data( data.Students )
dat <- data.Students[ , grep( "act" , colnames(data.Students) ) ]
theta.k <- seq( -4 , 4 , len=11 )   # discretized ability

#***
# Model 1: Partial Credit Model (PCM)
mod1 <- gdm( dat , irtmodel="1PL" , theta.k=theta.k , skillspace="normal" ,
           centered.latent=TRUE)
summary(mod1)

#***
# Model 1b: PCM using frequency patterns
mod1b <- gdm( dat , irtmodel="1PL" , theta.k=theta.k , skillspace="normal" ,
           centered.latent=TRUE , use.freqpatt=TRUE)
summary(mod1b)

#***
# Model 2: PCM with two groups
mod2 <- gdm( dat , irtmodel="1PL" , theta.k=theta.k , 
            group=data.Students$urban + 1 , skillspace="normal" ,
            centered.latent=TRUE)
summary(mod2)

#***
# Model 3: PCM with loglinear smoothing
b.constraint <- matrix( c(1,2,0) , ncol=3 )
mod3 <- gdm( dat , irtmodel="1PL" , theta.k=theta.k , 
    skillspace="loglinear" , b.constraint=b.constraint )
summary(mod3)

#***
# Model 4: Model with pre-specified item weights in Q matrix
Qmatrix <- array( 1 , dim=c(5,1,2) )
Qmatrix[,1,2] <- 2
Qmatrix[c(2,4),1,1] <- .74
Qmatrix[c(2,4),1,2] <- 2.3
# for items 2 and 4, the score for category 1 is .74 and for category 2 is 2.3
mod4 <- gdm( dat , irtmodel="1PL" , theta.k=theta.k , Qmatrix=Qmatrix ,
           skillspace="normal"  , centered.latent=TRUE)
summary(mod4)

#***
# Model 5: Generalized partial credit model
mod5 <- gdm( dat , irtmodel="2PL" , theta.k=theta.k ,  
          skillspace="normal" , standardized.latent=TRUE )
summary(mod5)

#***
# Model 6: Item-category slope estimation
mod6 <- gdm( dat , irtmodel="2PLcat" , theta.k=theta.k ,  
          skillspace="normal" , standardized.latent=TRUE ,
          decrease.increments=TRUE)        
summary(mod6)

#***
# Models 7: items with different number of categories
dat0 <- dat
dat0[ paste(dat0[,1]) == 2 , 1 ] <- 1 # 1st item has only two categories
dat0[ paste(dat0[,3]) == 2 , 3 ] <- 1 # 3rd item has only two categories

# Model 7a: PCM
mod7a <- gdm( dat0 , irtmodel="1PL" , theta.k=theta.k ,  
           centered.latent=TRUE )        
summary(mod7a)

# Model 7b: Item category slopes
mod7b <- gdm( dat0 , irtmodel="2PLcat" , theta.k=theta.k ,  
            standardized.latent=TRUE , decrease.increments=TRUE )        
summary(mod7b)

###################################################################
# EXAMPLE 3: Fraction Dataset 2
#      Multidimensional Models for dichotomous data
###################################################################

data( data.fraction2 )
dat <- data.fraction2$data
Qmatrix <- data.fraction2$q.matrix3

#***
# Model 1: One-dimensional Rasch model
theta.k <- seq( -4 , 4 , len=11 )   # discretized ability
mod1 <- gdm( dat , irtmodel = "1PL" , theta.k=theta.k ,
           centered.latent=TRUE) 
summary(mod1)

#***
# Model 2: One-dimensional 2PL model
mod2 <- gdm( dat , irtmodel = "2PL" , theta.k=theta.k ,
           standardized.latent=TRUE) 
summary(mod2)

#***
# Model 3: 3-dimensional Rasch Model (normal distribution)
mod3 <- gdm( dat , irtmodel="1PL" , theta.k=theta.k , 
            Qmatrix=Qmatrix , centered.latent=TRUE , 
            globconv=5*10^(-3) , conv=10^(-4) , maxiter=10 )
summary(mod3)            

#***
# Model 4: 3-dimensional Rasch model (loglinear smoothing)
# set some item parameters of items 4,1 and 2 to zero
b.constraint <- cbind( c(4,1,2) , 1 , 0 )
mod4 <- gdm( dat , irtmodel="1PL" , theta.k=theta.k , 
            Qmatrix=Qmatrix , b.constraint=b.constraint , 
            skillspace="loglinear" , globconv= 5*10^(-2) , conv=10^(-3) )
summary(mod4)            

#***
# Model 5: define a different theta grid for each dimension
theta.k <- list( "Dim1"= seq( -5 , 5 , len=11 ) , 
                 "Dim2"= seq(-5,5,len=8) , 
                 "Dim3"=seq( -3,3,len=6) )
mod5 <- gdm( dat , irtmodel="1PL" , theta.k=theta.k , 
            Qmatrix=Qmatrix , b.constraint=b.constraint , 
            skillspace="loglinear" , globconv= 5*10^(-2) , conv=10^(-3) )
summary(mod5)            

#***
# Model 6: multidimensional 2PL model (normal distribution)
theta.k <- seq( -5 , 5 , len=13 )
a.constraint <- cbind( c(8,1,3) , 1:3 , 1 , 1 ) # fix some slopes to 1
mod6 <- gdm( dat , irtmodel="2PL" , theta.k=theta.k , 
            Qmatrix=Qmatrix , centered.latent=TRUE , 
            a.constraint=a.constraint , decrease.increments=TRUE , 
            skillspace="normal" , maxiter=400)
summary(mod6)

#***
# Model 7: multidimensional 2PL model (loglinear distribution)
a.constraint <- cbind( c(8,1,3) , 1:3 , 1 , 1 )
b.constraint <- cbind( c(8,1,3) , 1 , 0 )
mod7 <- gdm( dat , irtmodel="2PL" , theta.k=theta.k , 
            Qmatrix=Qmatrix , b.constraint=b.constraint , 
            a.constraint=a.constraint , decrease.increments=FALSE , 
            skillspace="loglinear" , maxiter=400)
summary(mod7)