CDM (version 1.4-16)

gdina: Function for Estimating the Generalized DINA (GDINA) Model

Description

This function implements the generalized DINA model (GDINA; de la Torre, 2011). See the paper for details about the cognitive diagnosis models that can be estimated. The gdina function also supports multiple group estimation and the estimation of a higher order GDINA model (de la Torre & Douglas, 2004).

Usage

gdina(data, q.matrix, conv.crit=0.0001, dev.crit=.1 ,  maxit=1000, 
    linkfct="identity", Mj=NULL, group=NULL , method="WLS" , 
    delta.designmatrix=NULL, delta.basispar.lower=NULL, 
    delta.basispar.upper=NULL, delta.basispar.init=NULL, 
    zeroprob.skillclasses=NULL, reduced.skillspace=TRUE , HOGDINA=-1,
    Z.skillspace=NULL, weights=rep(1, nrow(data)), rule="GDINA", progress=TRUE,
    progress.item=FALSE , ...)

Arguments

data
A required $N$ times $J$ data matrix containing the binary responses, 0 or 1, of $N$ respondents to $J$ test items, where 1 denotes a correct answer and 0 an incorrect one. The $n$th row of the matrix represents the binary response pattern
q.matrix
A required binary $J$ times $K$ matrix indicating which attributes are not required (0) or required (1) to master the items. The $j$th row of the matrix is a binary indicator vector indicating which attributes are not required (coded by 0) and which a
conv.crit
Convergence criterion for maximum absolute change in item parameters
dev.crit
Convergence criterion for maximum absolute change in deviance
maxit
Maximum number of iterations
linkfct
A string which indicates the link function for the GDINA model. Options are "identity" (identity link), "logit" (logit link) and "log" (log link). The default is the "identity" link. Not
Mj
A list of design matrices and labels for each item. The definition of Mj follows the definition of $M_j$ in de la Torre (2011). To see the expected structure, inspect the value Mj returned by a default analysis. See Example 3.
group
A vector of group identifiers for multiple group estimation. Default is NULL (no multiple group estimation).
method
Estimation method for item parameters as described in de la Torre (2011). The default "WLS" weights probabilities attribute classes by a weighting matrix $W_j$ of expected frequencies, whereas the method "ULS"
delta.designmatrix
A design matrix for restrictions on delta. See Example 4.
delta.basispar.lower
Lower bounds for delta basis parameters.
delta.basispar.upper
Upper bounds for delta basis parameters.
delta.basispar.init
An optional vector of starting values for the basis parameters of delta. This argument only applies when using a designmatrix for delta, i.e. delta.designmatrix is not NULL.
zeroprob.skillclasses
An optional vector of integers which indicates which skill classes should have zero probability. Default is NULL (no skill classes with zero probability).
reduced.skillspace
Logical which indicates if the latent class skill space should be dimensionally reduced (see Xu & von Davier, 2008). Default is TRUE. The dimensional reduction is only well defined for more than three skills. The di
HOGDINA
Values of -1, 0 or 1 which indicate if a higher order GDINA model (see Details) should be estimated. The default value of -1 corresponds to the case that no higher order factor is assumed to exist. A value of 0 corresponds to independent attr
Z.skillspace
A user-specified design matrix for the skill space reduction as described in Xu and von Davier (2008). See Example 6 for an application.
weights
An optional vector of sample weights.
rule
A string or a vector of itemwise condensation rules. Allowed entries are GDINA, DINA, DINO, ACDM (additive cognitive diagnostic model) and RRUM (reduced reparametrized unifi
progress
Display progress on the R console?
progress.item
Display itemwise progress?
...
Further arguments to be passed
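
The shapes expected for data, q.matrix and weights can be illustrated with a small base R sketch. All names and values below are hypothetical and chosen only for illustration; the responses are random and are not generated from a cognitive diagnosis model.

```r
# Hypothetical dimensions: N respondents, J items, K attributes
set.seed(1)
N <- 5; J <- 4; K <- 2

# J x K Q-matrix: a 1 in row j, column k means item j requires attribute k
q.matrix <- matrix(c(1, 0,
                     0, 1,
                     1, 1,
                     1, 0),
                   nrow = J, ncol = K, byrow = TRUE,
                   dimnames = list(paste0("Item", 1:J), paste0("Attr", 1:K)))

# N x J binary response matrix (random 0/1 entries, for illustration only)
data <- matrix(rbinom(N * J, size = 1, prob = 0.5), nrow = N, ncol = J,
               dimnames = list(NULL, rownames(q.matrix)))

# Basic consistency checks on the two required inputs
stopifnot(all(data %in% c(0, 1)),
          all(q.matrix %in% c(0, 1)),
          ncol(data) == nrow(q.matrix))

# Equal sample weights, mirroring the default weights=rep(1, nrow(data))
weights <- rep(1, nrow(data))
```

The key constraint is that the number of columns of data equals the number of rows of q.matrix, since both index the $J$ items.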

Value

An object of class gdina with the following entries:

  • coef: Item parameters
  • delta: Basis item parameters
  • se.delta: Standard errors of basis item parameters
  • itemfit.rmsea: The RMSEA item fit index (see itemfit.rmsea).
  • mean.rmsea: Mean of RMSEA item fit indexes.
  • loglike: Log-likelihood
  • deviance: Deviance
  • G: Number of groups
  • N: Sample size
  • AIC: AIC
  • BIC: BIC
  • CAIC: CAIC
  • Npars: Total number of parameters
  • Nipar: Number of item parameters
  • Nskillpar: Number of parameters for skill class distribution
  • Nskillclasses: Number of skill classes
  • varmat.delta: Covariance matrix of $\delta$ item parameters
  • varmat.palj: XXX
  • posterior: Individual posterior distribution
  • like: Individual likelihood
  • data: Original data
  • q.matrix: Used $Q$ matrix
  • pattern: Individual patterns, individual MLE and MAP classifications and their corresponding probabilities
  • attribute.patt: Probabilities of skill classes
  • skill.patt: Marginal skill probabilities
  • subj.pattern: Individual subject pattern
  • attribute.patt.splitted: Split attribute pattern
  • pjk: Array of item response probabilities
  • Mj: Design matrix $M_j$ in GDINA algorithm (see de la Torre, 2011)
  • Aj: Design matrix $A_j$ in GDINA algorithm (see de la Torre, 2011)
  • delta.designmatrix: Design matrix for item parameters
  • reduced.skillspace: A logical indicating whether skill space reduction was performed
  • Z.skillspace: Design matrix for skill space reduction
  • beta: Parameters $\delta$ for skill class representation
  • covbeta: Standard errors of $\delta$ parameters
  • model.type: XXX
  • iter: Number of iterations
  • rrum.params: Parameters in the parametrization of the reduced RUM model if rule="RRUM".
  • HOGDINA: The used value of HOGDINA
  • a.attr: Attribute parameters $a_k$ in case of HOGDINA>=0
  • b.attr: Attribute parameters $b_k$ in case of HOGDINA>=0
  • attr.rf: Attribute response functions. This matrix contains all $a_k$ and $b_k$ parameters
  • ...: Further values
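
How the entries deviance, Npars and N relate to the information criteria can be sketched as follows. This assumes the textbook definitions of AIC, BIC and CAIC; it has not been checked against the package source, and the numeric values are hypothetical.

```r
# Hypothetical fit summary (illustrative values, not output of gdina)
dev   <- 1000   # deviance, i.e. -2 * log-likelihood
npars <- 20     # total number of estimated parameters
n     <- 400    # sample size

loglike <- -dev / 2

# Textbook definitions (assumed, not taken from the package source)
aic  <- dev + 2 * npars
bic  <- dev + log(n) * npars
caic <- dev + (log(n) + 1) * npars

print(c(loglike = loglike, AIC = aic, BIC = bic, CAIC = caic))
```

Under these definitions CAIC exceeds BIC by exactly the number of parameters, so the two criteria penalize model complexity similarly for large samples.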

Details

The estimation is based on an EM algorithm as described in de la Torre (2011). Item parameters are contained in the delta vector, which is a list whose $j$th entry holds the item parameters of the $j$th item.

Assume that two skills $\alpha_1$ and $\alpha_2$ are required for mastering item $j$. Then the GDINA model can be written as

$$g [ P( X_{nj} = 1 | \alpha_n ) ] = \delta_{j0} + \delta_{j1} \alpha_{n1} + \delta_{j2} \alpha_{n2} + \delta_{j12} \alpha_{n1} \alpha_{n2}$$

which is a two-way GDINA model (the rule="GDINA2" specification) with a link function $g$. If the specification ACDM is chosen, then $\delta_{j12}=0$. The DINA model (rule="DINA") assumes $\delta_{j1} = \delta_{j2} = 0$.

For the reduced RUM model (rule="RRUM"), the item response model is

$$P(X_{nj}=1 | \alpha_n ) = \pi_j^\ast \cdot r_{j1}^{1-\alpha_{n1} } \cdot r_{j2}^{1-\alpha_{n2} }$$

Taking logarithms of both sides shows that this model is equivalent to an additive model (rule="ACDM") with a logarithmic link function (linkfct="log").

If a reduced skill space (reduced.skillspace=TRUE) is employed, then the logarithm of the probability distribution of the attributes is modelled as a log-linear model:

$$\log P[ ( \alpha_{n1} , \alpha_{n2} , \ldots , \alpha_{nK} ) ] = \gamma_0 + \sum_k \gamma_k \alpha_{nk} + \sum_{k < l} \gamma_{kl} \alpha_{nk} \alpha_{nl}$$

If a higher order GDINA model is assumed (HOGDINA=1), then a higher order factor $\theta_n$ for the attributes is assumed:

$$P( \alpha_{nk} = 1 | \theta_n ) = \Phi ( a_k \theta_n + b_k )$$

where $\Phi$ denotes the standard normal distribution function. For HOGDINA=0, all attributes $\alpha_{nk}$ are assumed to be independent of each other:

$$P[ ( \alpha_{n1} , \alpha_{n2} , \ldots , \alpha_{nK} ) ] = \prod_k P( \alpha_{nk} )$$
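
The item response functions described in this section can be made concrete with a small base R sketch for one item requiring two attributes. The parameter values are hypothetical and chosen only so that all probabilities stay within [0, 1]; the code does not call gdina and is not meant to reproduce the package's internal computations.

```r
# The four attribute patterns (alpha1, alpha2) for an item with two skills
alpha <- expand.grid(alpha1 = c(0, 1), alpha2 = c(0, 1))

# Design matrix: intercept, two main effects, two-way interaction
M <- cbind(1, alpha$alpha1, alpha$alpha2, alpha$alpha1 * alpha$alpha2)

# GDINA with identity link: P(X = 1 | alpha) = M %*% delta
# (hypothetical delta values)
p_gdina <- as.vector(M %*% c(0.2, 0.1, 0.2, 0.4))

# ACDM: interaction delta_12 fixed at zero
p_acdm <- as.vector(M %*% c(0.2, 0.1, 0.2, 0))

# DINA: main effects delta_1 and delta_2 fixed at zero
p_dina <- as.vector(M %*% c(0.2, 0, 0, 0.6))

# Reduced RUM with hypothetical pi* = 0.9, r1 = 0.5, r2 = 0.4
pi_star <- 0.9
r <- c(0.5, 0.4)
p_rrum <- pi_star * r[1]^(1 - alpha$alpha1) * r[2]^(1 - alpha$alpha2)

# The same probabilities from an additive model with log link:
# log P = d0 + d1 * alpha1 + d2 * alpha2
d_log <- c(log(pi_star) + sum(log(r)), -log(r[1]), -log(r[2]))
p_loglink <- exp(as.vector(M[, 1:3] %*% d_log))

print(cbind(alpha, p_gdina, p_acdm, p_dina, p_rrum, p_loglink))
```

The last two columns agree, which illustrates the stated equivalence between the reduced RUM and an ACDM with log link. The DINA column takes only two distinct values, since only respondents mastering both attributes have an increased success probability.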

References

de la Torre, J. & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69, 333-353.

de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76, 179-199.

Xu, X. & von Davier, M. (2008). Fitting the structured general diagnostic model to NAEP data. ETS Research Report ETS RR-08-27. Princeton, ETS.

See Also

See also the din function (for DINA and DINO estimation). For assessment of model fit see modelfit.cor.din and anova.gdina. See sim.gdina for simulating the GDINA model.

Examples

###################################################################
# EXAMPLE 1: Simulated DINA data
#    different condensation rules 
###################################################################
library(CDM)
data(sim.dina)
data(sim.qmatrix)

#***
# Model 1: estimation of the GDINA model (identity link)
mod1 <- gdina( data = sim.dina ,  q.matrix = sim.qmatrix , maxit=700)
summary(mod1)

#***
# Model 2: estimation of the DINA model with gdina function
mod2 <- gdina( data = sim.dina ,  q.matrix = sim.qmatrix , rule="DINA")
summary(mod2)

#***
# Model 3: compare results with din function
mod2b <- din( data = sim.dina ,  q.matrix = sim.qmatrix , rule="DINA")
summary(mod2b)
cbind( mod2$coef , mod2b$coef )

#***
# Model 4: DINA model with logit link
mod4 <- gdina( data = sim.dina ,  q.matrix = sim.qmatrix , maxit= 20 , 
                rule="DINA" , linkfct = "logit" )
summary(mod4)

#***
# Model 5: DINA model log link
mod5 <- gdina( data = sim.dina ,  q.matrix = sim.qmatrix , maxit=100 , 
                    rule="DINA" , linkfct = "log" )
summary(mod5)

#***
# Model 6: RRUM model
mod6 <- gdina( data = sim.dina, q.matrix = sim.qmatrix, maxit=100,  rule="RRUM")
summary(mod6)

#***
# Model 7: Higher order GDINA model
mod7 <- gdina( data = sim.dina, q.matrix = sim.qmatrix, maxit=100,  HOGDINA=1)
summary(mod7)

#***
# Model 8: Independence GDINA model
mod8 <- gdina( data = sim.dina, q.matrix = sim.qmatrix, maxit=100,  HOGDINA=0)
summary(mod8)

###################################################################
# EXAMPLE 2: Simulated DINO data
#    additive cognitive diagnosis model
#    with different link functions
###################################################################

data(sim.dino)
data(sim.qmatrix)

#***
# Model 1: additive cognitive diagnosis model (ACDM; identity link)
mod1 <- gdina( data=sim.dino,  q.matrix=sim.qmatrix,  
                    rule="ACDM")
summary(mod1)

#***
# Model 2: ACDM logit link
mod2 <- gdina( data=sim.dino, q.matrix=sim.qmatrix,  
                    rule="ACDM", linkfct="logit" )
summary(mod2)

#***
# Model 3: ACDM log link
mod3 <- gdina( data=sim.dino,  q.matrix=sim.qmatrix,  
                rule="ACDM", linkfct="log" )
summary(mod3)

#***
# Model 4: Different condensation rules per item
I <- 9      # number of items
rule <- rep( "GDINA" , I )
rule[1] <- "DINO"   # 1st item: DINO model
rule[7] <- "GDINA2" # 7th item: GDINA model with first- 
                    #           and second-order interactions
rule[8] <- "ACDM"   # 8th item: additive CDM
rule[9] <- "DINA"   # 9th item: DINA model
mod4 <- gdina( data=sim.dino, q.matrix=sim.qmatrix, rule=rule )
summary(mod4)

###################################################################
# EXAMPLE 3: Model with user-specified design matrices
###################################################################

# do a preliminary analysis and modify obtained design matrices
mod0 <- gdina( data = sim.dino ,  q.matrix = sim.qmatrix ,  maxit=1)

# extract default design matrices
Mj <- mod0$Mj
Mj.user <- Mj   # these user defined design matrices are modified.
#~~~
# For the second item, the following model should hold
# X1 ~ V2 + V2*V3
mj <- Mj[[2]][[1]]
mj.lab <- Mj[[2]][[2]]
mj <- mj[,-3]
mj.lab <- mj.lab[-3]
Mj.user[[2]] <- list( mj , mj.lab )
#    [[1]]
#        [,1] [,2] [,3]
#    [1,]    1    0    0
#    [2,]    1    1    0
#    [3,]    1    0    0
#    [4,]    1    1    1
#    [[2]]
#    [1] "0"   "1"   "1-2"    
#~~~
# For the eighth item, an equality constraint should hold
# X8 ~ a*V2 + a*V3 + V2*V3
mj <- Mj[[8]][[1]]
mj.lab <- Mj[[8]][[2]]
mj[,2] <- mj[,2] + mj[,3]
mj <- mj[,-3]
mj.lab <- c("0" , "1=2" , "1-2" )
Mj.user[[8]] <- list( mj , mj.lab )
Mj.user
mod <- gdina( data = sim.dino ,  q.matrix = sim.qmatrix ,
                    Mj = Mj.user ,  maxit=200 )
summary(mod)

###################################################################
# EXAMPLE 4: Design matrix for delta parameters
###################################################################

#~~~
# estimate an initial model
mod0 <- gdina( data = sim.dino ,  q.matrix = sim.qmatrix , 
            rule="ACDM" , maxit=1)
# extract coefficients
c0 <- mod0$coef
I <- 9  # number of items
delta.designmatrix <- matrix( 0 , nrow= nrow(c0) , ncol = nrow(c0) )
diag( delta.designmatrix) <- 1
# set intercept of item 1 and item 3 equal to each other
delta.designmatrix[ 7 , 1 ] <- 1 ; delta.designmatrix[,7] <- 0
# set loading of V1 of item 1 and item 3 equal to each other
delta.designmatrix[ 8 , 2 ] <- 1 ; delta.designmatrix[,8] <- 0
delta.designmatrix <- delta.designmatrix[ , -c(7:8) ]       
                # exclude original parameters with indices 7 and 8

#***
# Model 1: ACDM with designmatrix
mod1 <- gdina( data = sim.dino ,  q.matrix = sim.qmatrix ,  rule="ACDM" , 
            delta.designmatrix = delta.designmatrix )
summary(mod1)            

#***
# Model 2: Same model, but with logit link instead of identity link function
mod2 <- gdina( data = sim.dino ,  q.matrix = sim.qmatrix ,  rule="ACDM" , 
            delta.designmatrix = delta.designmatrix , 
            maxit=100 , linkfct = "logit")
summary(mod2)            

###################################################################
# EXAMPLE 5: Multiple group estimation (simulated data)
###################################################################

# simulate data
set.seed(9279)
N1 <- 200 ; N2 <- 100   # group sizes
I <- 10                 # number of items
q.matrix <- matrix(0,I,2)   # create Q matrix
q.matrix[1:7,1] <- 1 ; q.matrix[ 5:10,2] <- 1
# simulate first group
dat1 <- sim.din(N1, q.matrix=q.matrix , mean = c(0,0) )$dat
# simulate second group
dat2 <- sim.din(N2, q.matrix=q.matrix , mean = c(-.3 , -.7) )$dat
# merge data
dat <- rbind( dat1 , dat2 )
# group indicator 
group <- c( rep(1,N1) , rep(2,N2) )

# estimate GDINA model
mod <- gdina( data = dat , q.matrix = q.matrix ,  group= group)
summary(mod)

# estimate DINA model
mod2 <- gdina( data = dat , q.matrix = q.matrix , 
                group= group , rule="DINA")
summary(mod2)                       

###################################################################
# EXAMPLE 6: User specified reduced skill space
###################################################################

#   Some correlations between attributes should be set to zero.
q.matrix <- expand.grid( c(0,1) , c(0,1) , c(0,1) , c(0,1) )
colnames(q.matrix) <- paste("Attr" , 1:4 , sep="")
q.matrix <- q.matrix[ -1 , ]
Sigma <- matrix( .5 , nrow=4 , ncol=4 )
diag(Sigma) <- 1
Sigma[3,2] <- Sigma[2,3] <- 0 # set correlation of attribute A2 and A3 to zero
dat <- sim.din( N=1000 , q.matrix = q.matrix , Sigma = Sigma)$dat

#~~~ Step 1: initial estimation
mod1a <- gdina( data=dat , q.matrix = q.matrix , maxit= 1 , rule="DINA")
# estimate also "full" model
mod1 <- gdina( data=dat , q.matrix = q.matrix , rule="DINA")

#~~~ Step2: modify designmatrix for reduced skillspace
Z.skillspace <- data.frame( mod1a$Z.skillspace )
# set correlations of A2/A3 and A2/A4 to zero
vars <- c("A2_A3","A2_A4") 
for (vv in vars){ Z.skillspace[,vv] <- NULL }

#~~~ Step 3: estimate model with reduced skillspace
mod2 <- gdina( data=dat , q.matrix = q.matrix , 
        Z.skillspace=Z.skillspace , rule="DINA")

#~~~ eliminate all covariances
Z.skillspace <- data.frame( mod1$Z.skillspace )
colnames(Z.skillspace)
Z.skillspace <- Z.skillspace[ , - 
	grep( "_" , colnames(Z.skillspace ) , fixed=TRUE)]
colnames(Z.skillspace)

mod3 <- gdina( data=dat , q.matrix = q.matrix , 
        Z.skillspace=Z.skillspace , rule="DINA")
summary(mod1); summary(mod2); summary(mod3)
