llbt.design: Loglinear Bradley-Terry Model (LLBT) - Design Matrix Generation

Description

The function llbt.design returns a data frame containing the design matrix for a loglinear paired comparison model. Additionally, the frequencies of the pairwise comparisons are computed and are stored in the first column of the data frame.

Usage

llbt.design(data, nitems=NULL, objnames="", objcovs=NULL,
          cat.scovs=NULL, num.scovs=NULL, casewise=FALSE, ...)

Arguments

data

either a data frame or a data file name.

nitems

number of items (objects).

objnames

an optional character vector with names for the objects These names are the columns names in the ouput data frame. If objnames is not specified o1,o2, etc. will be used.

objcovs

an optional data frame with object specific covariates. The rows correspond to the objects, the columns define the covariates. The column names of this data frame are later used to fit the covariates. Factors are not allowed. In th

cat.scovs

a character vector with the names of the categorical subject covariates in the data file to be included into the design matrix. (example: cat.scovs = c("SEX", "WORK")). If all covariates in the data are cat

num.scovs

analogous to cat.scovs for numerical (continuous) subject covariates. If any numerical covariates are specified, casewise is set to TRUE

casewise

If casewise = TRUE a separate design structure is set up for each subject in the data. This is required when fitting continuous subject covariates. However, the design can become very large in the case of many subjects

...

deprecated options to allow for backwards compatibility (see Deprecated below)

Value

The output is a dataframe of class llbtdes. Each row represents a decision in a certain comparison. Dependent on the number of response categories, comparisons are made up of two or three rows in the design matrix. If subject covariates are specified, the design matrix is duplicated as many times as there are combinations of the levels of each categorical covariate or, if casewise = TRUE, as there are subjects in the data. Each individual design matrix consists of rows for all comparisons. The first column contains the counts for the paired comparison response patterns and is labelled with y. The next columns are the covariates for the categories (labelled as g0,g1, etc.). In case of an odd number of categories, g1 can be used to model an undecided effect. The subsequent columns are covariates for the items. If specified, subject covariates and then object specific covariates follow.

encoding

UTF-8

Input Data

Responses have to be coded as consecutive integers (e.g., (0,1), or (1,2,3,...), where the smallest value corresponds to (highest) preference for the first object in a comparison. For (ordinal) paired comparison data (resptype = "paircomp") the codings (1,-1), (2,1,-1,-2), (1,0,-1), (2,1,0,-1,-2) etc. can also be used. Then negative numbers correspond to not preferred, 0 to undecided. Missing responses (for paired comparisons but not for subject covariates) are allowed under a missing at random assumption and specified via NA. Input data (via the first argument obj in the function call) is specified either through a dataframe or a datafile in which case obj is a path/filename. The input data file if specified must be a plain text file with variable names in the first row as readable via the command

read.table(datafilename,
        header = TRUE)

. The leftmost columns must be the responses to the paired comparisons (where the mandatory order of comparisons is (12) (13) (23) (14) (24) (34) (15) (25) etc.), optionally followed by columns for subject covariates. If categorical, these have to be specified such that the categories are represented by consecutive integers starting with 1. Missing values for subject covariates are not allowed and treated such that rows with NAs are removed from the resulting design structure and a message is printed. For an example see xmpl.

Details

The function llbt.design allows for different scenarios mainly concerning

paired comparison data.Responses can be either simplypreferred--not preferredor ordinal (strongly preferred--...--not at all preferred). In both cases an undecided category may or may not occur. If there are more than three categories a they are reduced to two or three response categories.
item covariates.The design matrix for the basic model has columns for the items (objects) and for each response category.
object specific covariates.For modelling certain characteristics of objects a reparameterisation can be included in the design. This is sometimes called conjoint analysis. The object specific covariates can be continuous or dummy variables. For the specification see Argumentobjcovsabove.
subject covariates.For modelling different preference scales for the items according to characteristics of the respondents categorical and/or continuous subject covariates can be included in the design.Categorical subject covariates: The corresponding variables are defined as numerical vectors where the levels are specified with consecutive integers starting with 1. This format must be used in the input data file. These variables arefactor(s)in the output data frame.Continuous subject covariates: also defined as numerical vectors in the input data frame. If present, the resulting design structure is automatically expanded, i.e., there are as many design blocks as there are subjects.
object specific covariates.The objects (items) can be reparameterised using an object specific design matrix. This allows for scenarios such as conjoint analysis or for modelling some characteristics shared by the objects. The number of such characteristics must not exceed the number of objects minus one.

References

R. Dittrich, R. Hatzinger and W. Katzenbeisser. Modelling the effect of subject-specific covariates in paired comparison studies with an application to university rankings. Applied Statististics (1998), 47, Part 4, pp. 511-525

Examples

Run this code

# cems universities example
des <- llbt.design(cemspc, nitems = 6, cat.scovs = "ENG")

res0 <- gnm(y ~ o1+o2+o3+o4+o5+o6 + ENG:(o1+o2+o3+o4+o5+o6),
    eliminate = mu:ENG, data = des, family = poisson)
summary(res0)

# inclusion of g1 allows for an undecided effect
res <- gnm(y ~ o1+o2+o3+o4+o5+o6 + ENG:(o1+o2+o3+o4+o5+o6) + g1,
    eliminate = mu:ENG, data = des, family = poisson)
summary(res)

# calculating and plotting worth parameters
wmat <- llbt.worth(res)
plot(wmat)

# object specific covariates
LAT  <- c(0, 1, 1, 0, 1, 0)        # latin cities
EC   <- c(1, 0, 1, 0, 0, 1)
OBJ  <- data.frame(LAT,EC)
des2 <- llbt.design(cemspc, nitems = 6, objcovs = OBJ, cat.scovs = "ENG")
res2 <- gnm(y ~ LAT + EC + LAT:ENG + g1,
    eliminate = mu:ENG, data = des2, family = poisson)

# calculating and plotting worth parameters
wmat2 <- llbt.worth(res2)
wmat2
plot(wmat2)

Run the code above in your browser using DataLab