mcbo: Median Constrained Bucket Order (MCBO)

Description

Find the median ranking constrained to exactly b buckets (tied groups) according to Kemeny's axiomatic approach. This implements the algorithms described in D'Ambrosio et al. (2019) for rank aggregation with fixed bucket constraints.

Usage

mcbo(
  X,
  nbuckets,
  wk = NULL,
  ps = TRUE,
  algorithm = "BB",
  itermax = 10,
  np = 10,
  gl = 100,
  ff = 0.4,
  cr = 0.8,
  use_cpp = TRUE
)

Value

A list containing the following components:

Consensus		The consensus ranking with exactly nbuckets buckets
Tau		Averaged TauX rank correlation coefficient
Eltime		Elapsed time in seconds

Arguments

X

A N by M data matrix, in which there are N judges and M objects to be judged. Each row is a ranking of the objects which are represented by the columns. If X contains the rankings observed only once, the argument wk can be used

nbuckets

Integer. The number of buckets (tied groups) the consensus ranking must contain. Must be between 2 and M-1 (where M is the number of objects)

wk

Optional: the frequency of each ranking in the data

ps

Logical. If TRUE, displays progress information on screen. Default TRUE

algorithm

Character string specifying the algorithm to use. One of:

"BB" - Branch-and-Bound (exact, default). Best for M <= 15
"quick" - Quick algorithm (heuristic). Best for 15 < M <= 50
"decor" - Differential Evolution (metaheuristic). Best for M > 50

itermax

Integer. Maximum number of iterations for "quick" and "decor" algorithms. Default 10

np

Integer. Number of population individuals for "decor" algorithm. Default 10

gl

Integer. Generations limit for "decor": maximum number of consecutive generations without improvement. Default 100

ff

Numeric. Scaling rate for mutation in "decor". Must be in [0,1]. Default 0.4

cr

Numeric. Crossover range for "decor". Must be in [0,1]. Default 0.8

use_cpp

Logical. If TRUE (default), use optimized C++ implementations for core functions (combinpmatr, scorematrix, PenaltyBB2, etc.)

Author

Antonio D'Ambrosio antdambr@unina.it

Details

The median constrained bucket order problem finds the ranking that minimizes the sum of Kemeny distances to the input rankings, subject to the constraint that the solution must have exactly nbuckets groups of tied items.

This is useful in applications where the output must conform to predetermined categories, such as:

Wine quality classifications (e.g., 5 fixed tiers)
Medical triage systems (fixed severity codes)
Educational grading (fixed letter grades: A, B, C, D, F)

The search space is restricted to $Z^{n/b}$, which contains $$\sum_{b=1}^{n} b! S(n,b)$$ rankings, where S(n,b) is the Stirling number of the second kind.

Algorithm Selection Guidelines:

BB: Exact solution, guaranteed optimal. Use for M <= 15 items
quick: Fast heuristic, near-optimal. Use for 15 < M <= 50 items
decor: Best for large problems. Use for M > 50 items

For stochastic algorithms (quick, decor), consider running multiple times (controlled by itermax) to avoid local optima.

References

D'Ambrosio, A., Iorio, C., Staiano, M., and Siciliano, R. (2019). Median constrained bucket order rank aggregation. Computational Statistics, 34(2), 787-802. tools:::Rd_expr_doi("10.1007/s00180-018-0858-z")

Examples

Run this code

if (FALSE) {
# Simple example with 98 judges ranking 5 items into 3 buckets
data(Idea)
RevIdea <- 6 - Idea  # Reverse ranking
CR <- mcbo(RevIdea, nbuckets = 3, algorithm = "BB")
print(CR$Consensus)
print(CR$Tau)

# Large dataset with Quick algorithm
data(EMD)
CR_quick <- mcbo(EMD[,1:15], nbuckets = 5, wk = EMD[,16], 
                 algorithm = "quick", itermax = 20)
}

Run the code above in your browser using DataLab