fraction.subtraction.data: Fraction Subtraction Data

Description

Tatsuoka's (1984) fraction subtraction data set is comprised of responses to $J=20$ fraction subtraction test items from $N=536$ middle school students.

Usage

data(fraction.subtraction.data)

Arguments

format

The fraction.subtraction.data data frame consists of 536 rows and 20 columns, representing the responses of the $N=536$ students to each of the eqn{J=20} test items. Each row in the data set corresponds to the responses of a particular student. Thereby a "1" denotes that a correct response was recorded, while "0" denotes an incorrect response. The other way round, each column corresponds to all responses to a particular item.

source

The Royal Statistical Society Datasets Website, Series C, Applied Statistics, Data analytic methods for latent partially ordered classification models: URL: http://www.blackwellpublishing.com/rss/Volumes/Cv51p2_read2.htm

Details

The items used for the fraction subtraction test originally appeared in Tatsuoka (1984) and are published in Tatsuoka (2002). They can also be found in DeCarlo (2011). All test items are based on 8 attributes (e.g. convert a whole number to a fraction, separate a whole number from a fraction or simplify before subtracting). The complete list of skills can be found in fraction.subtraction.qmatrix.

References

DeCarlo, L. T. (2011) On the Analysis of Fraction Subtraction Data: The DINA Model, Classification, Latent Class Sizes, and the Q-Matrix. Applied Psychological Measurement, 35, 8--26. Tatsuoka, C. (2002) Data analytic methods for latent partially ordered classification models. Journal of the Royal Statistical Society, Series C, Applied Statistics, 51, 337--350. Tatsuoka, K. (1984) Analysis of errors in fraction addition and subtraction problems. Final Report for NIE-G-81-0002, University of Illinois, Urbana-Champaign.

Examples

Run this code

##
## (1) examples based on dataset fractions.subtraction.data
##

## dataset fractions.subtraction.data and corresponding Q-Matrix
fraction.subtraction.data
fraction.subtraction.qmatrix

## Misspecification in parameter specification for method din()
## leads to warnings and terminates estimation procedure. E.g.,

# See Q-Matrix specification
fractions.dina.warning1 <- din(data = fraction.subtraction.data,
  q.matrix = t(fraction.subtraction.qmatrix)) 
  
# See guess.init specification
fractions.dina.warning2 <- din(data = fraction.subtraction.data,
  q.matrix = fraction.subtraction.qmatrix, guess.init = rep(1.2,
  ncol(fraction.subtraction.data)))
  
# See rule specification   
fractions.dina.warning3 <- din(data = fraction.subtraction.data,
  q.matrix = fraction.subtraction.qmatrix, rule = c(rep("DINA",
  10), rep("DINO", 9)))

## Parameter estimation of DINA model
# rule = "DINA" is default
fractions.dina <- din(data = fraction.subtraction.data,
  q.matrix = fraction.subtraction.qmatrix, rule = "DINA")
fractions.dina # see print.din

attributes(fractions.dina)
str(fractions.dina)	  

## For instance accessing the guessing parameters through
## assignment
fractions.dina$guess

## corresponding summaries, including diagnostic accuracies,
## summary of skill pattern distribution and information 
## criteria AIC and BIC
summary(fractions.dina)

## In particular, accessing detailed summary through assignment
detailed.summary.fs <- summary(fractions.dina)
str(detailed.summary.fs)

## Diagnostic accuracy of item 8 seems to low. This is also
## visualized in the first plot 
plot(fractions.dina)

## The reason therefore is a high guessing parameter
round(fractions.dina$guess[,1], 2)

## Set an upper boundary for the guessing parameter of 
## item 5, 8 and 9
fractions.dina.bound <- din(data = fraction.subtraction.data, 
  q.matrix = fraction.subtraction.qmatrix, constraint.guess =
  matrix(c(5,8,9, rep(0.2, 3)), ncol = 2))
fractions.dina.bound
detailed.summary.fs.bound <- summary(fractions.dina.bound)

## This improves the diagnostic accuracies
summary(detailed.summary.fs$IDI[1,])
summary(detailed.summary.fs.bound$IDI[1,])

## The second plot shows the expected (MAP) and observed skill 
## probabilities. The third plot visualizes the skill pattern
## occurrence probabilities; Only the 'highest' are labeled; it
## is obvious that the skill class '11111111' (all skills are
## mastered) is the most probable in this population. The fourth
## plot shows the skill probabilities conditional on response
## patterns; in this population the skills 3 and 6 seem to be
## mastered easier than the others. The fifth plot shows the
## skill probabilities conditional on a specified response
## pattern; it is shown whether a skill is mastered (above 
## .5+'uncertainty') unclassifiable (within the boundaries) or
## not mastered (below .5-'uncertainty'). In this case, the
## fifteenth respondent was chosen; if no response pattern is 
## specified, the plot will not be shown (of course)
pattern <- paste(fraction.subtraction.data[15,], collapse = "")

#uncertainty = 0.1, highest = 0.05 are default
plot(fractions.dina.bound, uncertainty = 0.1, highest = 0.05, 
  pattern = pattern)

Run the code above in your browser using DataLab