btm: Extended Bradley-Terry Model

Description

The function btm estimates an extended Bradley-Terry model (Hunter, 2004; see Details). Parameter estimation uses a bias corrected joint maximum likelihood estimation method based on $\varepsilon$-adjustment (see Bertoli-Barsotti, Lando & Punzo, 2014). See Details for the algorithm.

The function btm_sim simulated data from the extended Bradley-Terry model.

Usage

btm(data, judge=NULL, ignore.ties=FALSE, fix.eta=NULL, fix.delta=NULL, fix.theta=NULL,
       maxiter=100, conv=1e-04, eps=0.3)
# S3 method for btm
summary(object, file=NULL, digits=4,...)
btm_sim(theta, eta=0, delta=-99, repeated=FALSE)

Arguments

data

Data frame with three columns. The first two columns contain labels from the units in the pair comparison. The third column contains the result of the comparison. "1" means that the first units wins, "0" means that the second unit wins and "0.5" means a draw (a tie).

judge

Optional vector of judge identifiers (if multiple judges are available)

ignore.ties

Logical indicating whether ties should be ignored.

fix.eta

Numeric value for a fixed $\eta$ value

fix.delta

Numeric value for a fixed $\delta$ value

fix.theta

A vector with entries for fixed theta values.

maxiter

Maximum number of iterations

conv

Convergence criterion

eps

The $\varepsilon$ parameter for the $\varepsilon$-adjustment method (see Bertoli-Barsotti, Lando & Punzo, 2014) which reduces bias in ability estimates. In case of $\varepsilon=0$, persons with extreme scores are removed from the pairwise comparison.

object

Object of class btm

file

Optional file name for sinking the summary into

digits

Number of digits after decimal to print

…

Further arguments to be passed.

theta

Vector of abilities

eta

Value of $\eta$ parameter

delta

Value of $\delta$ parameter

repeated

Logical indicating whether repeated ratings of dyads (for home advantage effect) should be simulated

Value

List with following entries

pars

Parameter summary for $\eta$ and $\delta$

effects

Parameter estimates for $\theta$ and outfit and infit statistics

summary.effects

Summary of $\theta$ parameter estimates

mle.rel

MLE reliability, also known as separation reliability

sepG

Separation index $G$

probs

Estimated probabilities

data

Used dataset with integer identifiers

fit_judges

Fit statistics (outfit and infit) for judges if judge is provided. In addition, average agreement of the rating with the mode of the ratings is calculated for each judge (at least three ratings per dyad has to be available for computing the agreement).

residuals

Unstandardized and standardized residuals for each observation

Details

The extended Bradley-Terry model for the comparison of individuals $i$ and $j$ is defined as $$P(X_{ij}=1 ) \propto \exp( \eta + \theta_i ) $$ $$P(X_{ij}=0 ) \propto \exp( \theta_j ) $$ $$P(X_{ij}=0.5) \propto \exp( \delta + ( \eta + \theta_i +\theta_j )/2 ) $$ The parameters $\theta_i$ denote the abilities, $\delta$ is the tendency of the occurrence of ties and $\eta$ is the home-advantage effect.

A joint maximum likelihood (JML) estimation is applied for simulataneous estimation of $\eta$, $\delta$ and all $\theta_i$ parameters. In the Rasch model, it was shown that JML can result in biased parameter estimates. The $\varepsilon$-adjustment approach has been proposed to reduce the bias in parameter estimates (Bertoli-Bersotti, Lando & Punzo, 2014). This estimation approach is adapted to the Bradley-Terry model in the btm function. To this end, the likelihood function is modified for the purpose of bias reduction. It can be easily shown that there exist sufficient statistics for $\eta$, $\delta$ and all $\theta_i$ parameters. In the $\varepsilon$-adjustment approach, the sufficient statistic for the $\theta_i$ parameter is modified. In JML estimation of the Bradley-Terry model, $S_i=\sum_{j \ne i} ( x_{ij} + x_{ji} )$ is a sufficient statistic for $\theta_i$. Let $M_i$ the maximum score for person $i$ which is the number of $x_{ij}$ terms appearing in $S_i$. In the $\varepsilon$-adjustment approach, the sufficient statistic $S_i$ is modified to $$S_{i, \varepsilon}=\varepsilon + \frac{M_i - 2 \varepsilon}{M_i} S_i $$ and $S_{i, \varepsilon}$ instead of $S_{i}$ is used in JML estimation. Hence, original scores $S_i$ are linearly transformed for all persons $i$.

References

Bertoli-Barsotti, L., Lando, T., & Punzo, A. (2014). Estimating a Rasch Model via fuzzy empirical probability functions. In D. Vicari, A. Okada, G. Ragozini & C. Weihs (Eds.). Analysis and Modeling of Complex Data in Behavioral and Social Sciences. Springer.

Hunter, D. R. (2004). MM algorithms for generalized Bradley-Terry models. Annals of Statistics, 32, 384-406. 10.1214/aos/1079120141

Examples

Run this code

# NOT RUN {
#############################################################################
# EXAMPLE 1: Bradley-Terry model | data.pw01
#############################################################################

data(data.pw01)

dat <- data.pw01
dat <- dat[, c("home_team", "away_team", "result") ]

# recode results according to needed input
dat$result[ dat$result==0 ] <- 1/2   # code for ties
dat$result[ dat$result==2 ] <- 0     # code for victory of away team

#********************
# Model 1: Estimation with ties and home advantage
mod1 <- sirt::btm( dat)
summary(mod1)

# }
# NOT RUN {
#********************
# Model 2: Estimation with ties, no epsilon adjustment
mod2 <- sirt::btm( dat, eps=0, fix.eta=0)
summary(mod2)

#********************
# Model 3: Some fixed abilities
fix.theta <- c("Anhalt Dessau"=-1 )
mod3 <- sirt::btm( dat, eps=0, fix.theta=fix.theta)
summary(mod3)

#********************
# Model 4: Ignoring ties, no home advantage effect
mod4 <- sirt::btm( dat, ignore.ties=TRUE, fix.eta=0)
summary(mod4)

#********************
# Model 5: Ignoring ties, no home advantage effect (JML approach and eps=0)
mod5 <- sirt::btm( dat, ignore.ties=TRUE, fix.eta=0, eps=0)
summary(mod5)

#############################################################################
# EXAMPLE 2: Venice chess data
#############################################################################

# See http://www.rasch.org/rmt/rmt113o.htm
# Linacre, J. M. (1997). Paired Comparisons with Standard Rasch Software.
# Rasch Measurement Transactions, 11:3, 584-585.

# dataset with chess games -> "D" denotes a draw (tie)
chessdata <- scan( what="character")
    1D.0..1...1....1.....1......D.......D........1.........1.......... Browne
    0.1.D..0...1....1.....1......D.......1........D.........1......... Mariotti
    .D0..0..1...D....D.....1......1.......1........1.........D........ Tatai
    ...1D1...D...D....1.....D......D.......D........1.........0....... Hort
    ......010D....D....D.....1......D.......1........1.........D...... Kavalek
    ..........00DDD.....D.....D......D.......1........D.........1..... Damjanovic
    ...............00D0DD......D......1.......1........1.........0.... Gligoric
    .....................000D0DD.......D.......1........D.........1... Radulov
    ............................DD0DDD0D........0........0.........1.. Bobotsov
    ....................................D00D00001.........1.........1. Cosulich
    .............................................0D000D0D10..........1 Westerinen
    .......................................................00D1D010000 Zichichi

L <- length(chessdata) / 2
games <- matrix( chessdata, nrow=L, ncol=2, byrow=TRUE )
G <- nchar(games[1,1])
# create matrix with results
results <- matrix( NA, nrow=G, ncol=3 )
for (gg in 1:G){
    games.gg <- substring( games[,1], gg, gg )
    ind.gg <- which( games.gg !="." )
    results[gg, 1:2 ] <- games[ ind.gg, 2]
    results[gg, 3 ] <- games.gg[ ind.gg[1] ]
}
results <- as.data.frame(results)
results[,3] <- paste(results[,3] )
results[ results[,3]=="D", 3] <- 1/2
results[,3] <- as.numeric( results[,3] )

# fit model ignoring draws
mod1 <- sirt::btm( results, ignore.ties=TRUE, fix.eta=0, eps=0 )
summary(mod1)

# fit model with draws
mod2 <- sirt::btm( results, fix.eta=0, eps=0 )
summary(mod2)

#############################################################################
# EXAMPLE 3: Simulated data from the Bradley-Terry model
#############################################################################

set.seed(9098)
N <- 22
theta <- seq(2,-2, len=N)

#** simulate and estimate data without repeated dyads
dat1 <- sirt::btm_sim(theta=theta)
mod1 <- sirt::btm( dat1, ignore.ties=TRUE, fix.delta=-99, fix.eta=0)
summary(mod1)

#*** simulate data with home advantage effect and ties
dat2 <- sirt::btm_sim(theta=theta, eta=.8, delta=-.6, repeated=TRUE)
mod2 <- sirt::btm(dat2)
summary(mod2)

#############################################################################
# EXAMPLE 4: Estimating the Bradley-Terry model with multiple judges
#############################################################################

#*** simulating data with multiple judges
set.seed(987)
N <- 26  # number of objects to be rated
theta <- seq(2,-2, len=N)
s1 <- stats::sd(theta)
dat <- NULL
# judge discriminations which define tendency to provide reliable ratings
discrim <- c( rep(.9,10), rep(.5,2), rep(0,2) )
#=> last four raters provide less reliable ratings

RR <- length(discrim)
for (rr in 1:RR){
    theta1 <- discrim[rr]*theta + stats::rnorm(N, mean=0, sd=s1*sqrt(1-discrim[rr]))
    dat1 <- sirt::btm_sim(theta1)
    dat1$judge <- rr
    dat <- rbind(dat, dat1)
}

#** estimate the Bradley-Terry model and compute judge-specific fit statistics
mod <- sirt::btm( dat[,1:3], judge=paste0("J",100+dat[,4]), fix.eta=0, ignore.ties=TRUE)
summary(mod)
# }

Run the code above in your browser using DataLab