btdata: Create a btdata object

Description

Creates a btdata object, primarily for use in the btfit function.

Usage

btdata(x, return_graph = FALSE)
# S3 method for btdata
summary(object, ...)

Arguments

x

The data, which is either a three- or four-column data frame, a directed igraph object, a square matrix or a square contingency table. See Details.

return_graph

Logical. If TRUE, an igraph object representing the comparison graph will be returned.

object

An object of class "btdata", typically the result ob of ob <- btdata(..).

...

Other arguments

Value

An object of class "btdata", which is a list containing:

wins

A \(K\) by \(K\) square matrix, where \(K\) is the total number of items. The \(i,j\)-th element is \(w_{ij}\), the number of times item \(i\) has beaten item \(j\). If the items in x are unnamed, the wins matrix will be assigned row and column names 1:K.

components

A list of the fully-connected components.

graph

The comparison graph of the data (if return_graph = TRUE). See Details.
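
As a small illustration of the list elements described above, the following sketch builds a btdata object from the citations data used in the Examples and inspects each element. The variable names has_pkg and cit are chosen here for illustration, and the call is guarded so that the snippet does nothing if the package is not installed.

```r
# Guarded so the snippet is a no-op when the package is not installed.
has_pkg <- requireNamespace("BradleyTerryScalable", quietly = TRUE)

if (has_pkg) {
  # citations is the example data set also used in the Examples section.
  cit <- BradleyTerryScalable::btdata(BradleyTerryScalable::citations,
                                      return_graph = TRUE)

  print(names(cit))              # element names of the returned list
  print(dim(cit$wins))           # dimensions of the K by K wins matrix
  print(length(cit$components))  # number of fully-connected components
  print(class(cit$graph))        # present because return_graph = TRUE
}
```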

Details

The x argument to btdata can be one of four types:

A matrix (either a base matrix or a class from the Matrix package), dimension \(K\) by \(K\), where \(K\) is the number of items. The \(i,j\)-th element is \(w_{ij}\), the number of times item \(i\) has beaten item \(j\). Ties can be accounted for by assigning half a win (i.e. 0.5) to each item.
A contingency table of class table, similar to the matrix described in the above point.
An igraph, representing the comparison graph, with the \(K\) items as nodes. For the edges:
- If the graph is unweighted, a directed edge from node \(i\) to node \(j\) for every time item \(i\) has beaten item \(j\)
- If the graph is weighted, then one edge from node \(i\) to node \(j\) if item \(i\) has beaten item \(j\) at least once, with the weight attribute of that edge set to the number of times \(i\) has beaten \(j\).
If x is a data frame, it must have three or four columns:
- 3-column data frame: The first column contains the name of the winning item, the second column contains the name of the losing item and the third column contains the number of times that the winner has beaten the loser. Multiple entries for the same pair of items are handled correctly. If x is a three-column data frame, but the third column gives a code for who won, rather than a count, see codes_to_counts.
- 4-column data frame: The first column contains the name of item 1, the second column contains the name of item 2, the third column contains the number of times that item 1 has beaten item 2 and the fourth column contains the number of times item 2 has beaten item 1. Multiple entries for the same pair of items are handled correctly. This kind of data frame is also the output of codes_to_counts.
- In either of these cases, the data can be aggregated, or there can be one row per comparison.
- Ties can be accounted for by assigning half a win (i.e. 0.5) to each item.
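
As a rough illustration of the input formats above, the following sketch encodes the same made-up results for three hypothetical items (A, B and C) as a wins matrix, a three-column data frame and a four-column data frame. The item names, column names and counts are invented for illustration. The calls to btdata are guarded so that the snippet still runs if the package is not installed.

```r
# Hypothetical results: A beat B 3 times, B beat A once,
# B beat C twice, and A and C tied once (half a win each).
items <- c("A", "B", "C")

# 1. Wins matrix: element [i, j] is the number of times i has beaten j.
wins <- matrix(0, nrow = 3, ncol = 3, dimnames = list(items, items))
wins["A", "B"] <- 3
wins["B", "A"] <- 1
wins["B", "C"] <- 2
wins["A", "C"] <- 0.5  # tie recorded as half a win for each item
wins["C", "A"] <- 0.5

# 2. Three-column data frame: winner, loser, number of wins.
df_3col <- data.frame(
  winner = c("A", "B", "B", "A", "C"),
  loser  = c("B", "A", "C", "C", "A"),
  count  = c(3, 1, 2, 0.5, 0.5),
  stringsAsFactors = FALSE
)

# 3. Four-column data frame: item 1, item 2, wins for item 1, wins for item 2.
df_4col <- data.frame(
  item1 = c("A", "B", "A"),
  item2 = c("B", "C", "C"),
  wins1 = c(3, 2, 0.5),
  wins2 = c(1, 0, 0.5),
  stringsAsFactors = FALSE
)

# Only call btdata if the package is available.
if (requireNamespace("BradleyTerryScalable", quietly = TRUE)) {
  bt_from_matrix <- BradleyTerryScalable::btdata(wins)
  bt_from_df3    <- BradleyTerryScalable::btdata(df_3col)
  bt_from_df4    <- BradleyTerryScalable::btdata(df_4col)
}
```

All three encodings record the same seven wins in total, so they should describe the same comparison data.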

summary.btdata shows the number of items, the density of the wins matrix and whether the underlying comparison graph is fully connected. If it is not fully connected, summary.btdata will additionally show the number of fully-connected components and a table giving the frequency of components of different sizes. For more details on the comparison graph, and how its structure affects how the Bradley-Terry model is fitted, see btfit and the vignette: https://ellakaye.github.io/BradleyTerryScalable/articles/BradleyTerryScalable.html.
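
To make the quantities reported by summary.btdata more concrete, the following base R sketch computes comparable values by hand for a small invented wins matrix. It is not the package's implementation. It assumes that the density is the proportion of non-zero entries of the wins matrix, and that a fully-connected component is a set of items that can all reach one another along directed edges of the comparison graph. The item names and counts are hypothetical.

```r
# Toy wins matrix for four hypothetical items.
# D has never been compared with A, B or C.
items <- c("A", "B", "C", "D")
wins <- matrix(0, 4, 4, dimnames = list(items, items))
wins["A", "B"] <- 2
wins["B", "C"] <- 1
wins["C", "A"] <- 1

# Number of items.
K <- nrow(wins)

# Density, assumed here to be the proportion of non-zero entries.
density <- mean(wins != 0)

# Reachability by transitive closure (Warshall's algorithm):
# reach[i, j] is TRUE if there is a directed path from i to j.
reach <- (wins > 0) | (diag(K) > 0)
for (k in seq_len(K)) {
  reach <- reach | outer(reach[, k], reach[k, ], "&")
}

# Two items share a component if each can reach the other.
mutual <- reach & t(reach)
component_id <- apply(mutual, 1, function(row) min(which(row)))
components <- unname(split(items, component_id))

fully_connected <- length(components) == 1

# Frequency of components of different sizes.
size_table <- table(lengths(components))
```

In this toy example A, B and C form a cycle and so end up in one component, while D is left in a component of its own, so the graph is not fully connected.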

Examples

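library(BradleyTerryScalable)

# From a contingency table of citation counts
citations_btdata <- btdata(BradleyTerryScalable::citations)
summary(citations_btdata)

# From a four-column data frame produced by codes_to_counts
toy_df_4col <- codes_to_counts(BradleyTerryScalable::toy_data, c("W1", "W2", "D"))
toy_btdata <- btdata(toy_df_4col)
summary(toy_btdata)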