rte.test.cheating: Performs statistical tests for cheating using CopyDetect

Description

Uses the function CopyDetect1 from the package CopyDetect to test for visual cheating (answer copying) in exams, based on the students' correct answers. CopyDetect1 performs the following statistical tests:

Omega index (Wollack, 1996)
Generalized Binomial Test (GBT; van der Linden & Sotaridona, 2006)
K index (Holland, 1996)
K1 and K2 indices (Sotaridona & Meijer, 2002)
S1 and S2 indices (Sotaridona & Meijer, 2003)

The function rte.test.cheating takes as input a dataframe with the names and corrections of the students and returns a list summarizing the cheating tests, including the suspicious pairs.

Usage

rte.test.cheating(df.grade, p.level = 0.05, print.suspects = TRUE,
  do.cheat.plot = TRUE, suspicion.threshold = 0.5)

Arguments

df.grade

A dataframe whose first column holds the names of the students (item exam.names) and whose remaining columns hold the correct (TRUE) and incorrect (FALSE) answers. Each column other than exam.names should correspond to one question.

p.level

Critical level of the statistical tests (Default=0.05)

print.suspects

Print testing information and suspects on screen? (TRUE or FALSE) (Default=TRUE)

do.cheat.plot

Print plot of cheating tests? (TRUE or FALSE) (Default=TRUE)

suspicion.threshold

Proportion of failed cheating tests that justifies suspicion, between 0 and 1 (Default=0.5)
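
The interaction between p.level and suspicion.threshold can be sketched as follows. This is only an illustration under stated assumptions, not the package's internal code: the p-values are made up, the names of the vector elements are hypothetical, and the use of a strict "greater than" comparison against the threshold is an assumption.

```r
# Made-up p-values for ONE pair of students, one value per test.
# The element names are illustrative and not taken from the package.
p.values <- c(omega = 0.01, gbt = 0.03, k = 0.20, k1 = 0.04,
              k2 = 0.06, s1 = 0.02, s2 = 0.30)

p.level <- 0.05              # critical level of each test
suspicion.threshold <- 0.5   # proportion of failed tests that justifies suspicion

# A test "fails" (points towards copying) when its p-value is below p.level
failed <- p.values < p.level

# Proportion of failed tests for this pair
prop.failed <- mean(failed)

# Assumed decision rule: flag the pair when the proportion of failed
# tests is above the threshold
is.suspect <- prop.failed > suspicion.threshold

print(prop.failed)
print(is.suspect)
```

With these made-up values, four of the seven tests fall below p.level, so the pair would be flagged under the assumed rule. Raising suspicion.threshold makes the flagging more conservative.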

Value

A list with the following items:

df.pvalue: A dataframe with the statistical results for all pairs of students from the upper triangle (1 test for each pair)
df.suspects: A dataframe with the suspicious pairs of students
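
To make the "upper triangle" wording concrete, the sketch below enumerates each unordered pair of students exactly once using base R. The student names and the column names student.1 and student.2 are hypothetical and are not necessarily the ones used in df.pvalue.

```r
# Hypothetical class list
students <- c("John Smith", "Marcelo Smith", "Ricardo Smith", "Tarcizio Smith")

# combn() returns every unordered pair once, which corresponds to the
# upper triangle of the student-by-student matrix
pairs <- t(combn(students, 2))

df.pairs <- data.frame(student.1 = pairs[, 1],
                       student.2 = pairs[, 2],
                       stringsAsFactors = FALSE)

# For n students there are n * (n - 1) / 2 pairs
n <- length(students)
stopifnot(nrow(df.pairs) == n * (n - 1) / 2)

print(df.pairs)
```

Since the number of pairs grows quadratically with the class size, the number of rows to expect in df.pvalue can be estimated in advance with n * (n - 1) / 2.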

Details

More details regarding the tests can be found in:

Zopluoglu, C. (2013). CopyDetect: An R Package for Computing Statistical Indices to Detect Answer Copying on Multiple-Choice Examinations. Applied Psychological Measurement, 37(1), 93-95.

Examples

# NOT RUN {
# number of simulated questions in exam
n.sim.questions <- 10

base.names <- c('John', 'Marcelo','Ricardo', 'Tarcizio')
last.names <- c('Smith', 'P.')

name.grid <- expand.grid(base.names,last.names)

my.names <- paste(name.grid[,1], name.grid[,2])
# official names from the university system (will assume it is equal to my.names)
# In a practical situation, this list of official names will come from the university system
exam.names <- my.names

set.seed(10)

correction.mat <- matrix(sample(c(TRUE,FALSE),
                                size = length(exam.names)*n.sim.questions,
                                replace = TRUE),nrow = length(exam.names))

idx.cheater.1 <- 5 # students 5 and 6 have similar correct answers
idx.cheater.2 <- 6
proportion.to.cheat <- 0.5  # proportion of same correct answers
q.to.cheat <- floor(proportion.to.cheat*n.sim.questions)
correction.mat[idx.cheater.1, ] <-  c(rep(TRUE,q.to.cheat),
                                      rep(FALSE,n.sim.questions-q.to.cheat))

correction.mat[idx.cheater.2, ] <- correction.mat[idx.cheater.1, ]


df.grade <- cbind(data.frame(exam.names),correction.mat)


test.cheating.out <- rte.test.cheating(df.grade, do.cheat.plot = FALSE )
# }
